1.
Wellcome Open Res ; 8: 24, 2023.
Article in English | MEDLINE | ID: mdl-36864925

ABSTRACT

As genomic data transform our understanding of biodiversity, the Earth BioGenome Project (EBP) has set a goal of generating reference quality genome assemblies for all ~1.9 million described eukaryotic taxa. Meeting this goal requires coordination among many individual regional and taxon-focussed projects working under the EBP umbrella. Large-scale sequencing projects require ready access to validated genome-relevant metadata, such as genome sizes and karyotypes, but these data are dispersed across the literature, and directly measured values are lacking for most taxa. To meet these needs, we have developed Genomes on a Tree (GoaT), an Elasticsearch-powered datastore and search index for genome-relevant metadata and sequencing project plans and statuses. GoaT indexes publicly available metadata for all eukaryotic species and interpolates missing values through phylogenetic comparison. GoaT also holds target priority and sequencing status information for many projects affiliated to the EBP to aid project coordination. Metadata and status attributes in GoaT can be queried through a mature API, a web front end, and a command line interface. The web front end additionally provides summary visualisations for data exploration and reporting (see https://goat.genomehubs.org). GoaT currently holds direct or estimated values for over 70 taxon attributes and over 30 assembly attributes across 1.5 million eukaryotic species. The depth and breadth of curated data, frequent updates, and a versatile query interface make GoaT a powerful data aggregator and portal to explore and report underlying data for the eukaryotic tree of life. We illustrate this utility through a series of use cases from planning through to completion of a genome-sequencing project.
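The abstract says GoaT interpolates missing values through phylogenetic comparison but does not describe the procedure. The sketch below is one simple way such an estimate could work, not GoaT's actual method: an unmeasured taxon takes the median of the direct values found under its nearest ancestor that has any. The toy taxonomy, the taxon names, the values, and the functions `descendants_with_values` and `estimate` are all hypothetical.

```python
from statistics import median

# Hypothetical toy taxonomy (child -> parent); names are illustrative only.
PARENT = {
    "species_a": "genus_x",
    "species_b": "genus_x",
    "species_c": "genus_x",
    "genus_x": "family_y",
    "species_d": "family_y",
}

# Hypothetical directly measured values (arbitrary units); species_c is unmeasured.
DIRECT = {"species_a": 400, "species_b": 500, "species_d": 900}


def descendants_with_values(taxon, parent, direct):
    """Collect direct values from every measured taxon whose lineage passes through `taxon`."""
    values = []
    for leaf, value in direct.items():
        node = leaf
        while node in parent:
            node = parent[node]
            if node == taxon:
                values.append(value)
                break
    return values


def estimate(taxon, parent, direct):
    """Return (value, source).

    A direct value is used when available; otherwise walk up the lineage and
    take the median of direct values under the nearest ancestor that has any.
    This median rule is an assumption made for illustration.
    """
    if taxon in direct:
        return direct[taxon], "direct"
    node = taxon
    while node in parent:
        node = parent[node]
        values = descendants_with_values(node, parent, direct)
        if values:
            return median(values), f"ancestor:{node}"
    return None, "missing"
```

Returning the source alongside the value keeps direct and estimated figures distinguishable, in line with the abstract's distinction between "direct or estimated values".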

2.
Wellcome Open Res ; 6: 108, 2021.
Article in English | MEDLINE | ID: mdl-34632087

ABSTRACT

We present a genome assembly from an individual female Salmo trutta (the brown trout; Chordata; Actinopteri; Salmoniformes; Salmonidae). The genome sequence is 2.37 gigabases in span. The majority of the assembly is scaffolded into 40 chromosomal pseudomolecules. Gene annotation of this assembly on Ensembl has identified 43,935 protein coding genes.

3.
Wellcome Open Res ; 6: 118, 2021.
Article in English | MEDLINE | ID: mdl-34660910

ABSTRACT

We present a genome assembly from an individual male Rattus norvegicus (the Norway rat; Chordata; Mammalia; Rodentia; Muridae). The genome sequence is 2.44 gigabases in span. The majority of the assembly is scaffolded into 20 chromosomal pseudomolecules, with both X and Y sex chromosomes assembled. This genome assembly, mRatBN7.2, represents the new reference genome for R. norvegicus and has been adopted by the Genome Reference Consortium.

4.
Wellcome Open Res ; 6: 112, 2021.
Article in English | MEDLINE | ID: mdl-34671705

ABSTRACT

We present a genome assembly from an individual female Aquila chrysaetos chrysaetos (the European golden eagle; Chordata; Aves; Accipitridae). The genome sequence is 1.23 gigabases in span. The majority of the assembly is scaffolded into 28 chromosomal pseudomolecules, including the W and Z sex chromosomes.

5.
Wellcome Open Res ; 6: 225, 2021.
Article in English | MEDLINE | ID: mdl-34703904

ABSTRACT

We present a genome assembly from a clonal population of Eimeria tenella Houghton parasites (Apicomplexa; Conoidasida; Eucoccidiorida; Eimeriidae). The genome sequence is 53.25 megabases in span. The entire assembly is scaffolded into 15 chromosomal pseudomolecules, with complete mitochondrion and apicoplast organellar genomes also present.

6.
Philos Trans R Soc Lond B Biol Sci ; 376(1825): 20200157, 2021 05 24.
Article in English | MEDLINE | ID: mdl-33813885

ABSTRACT

As sequencing becomes more accessible and affordable, the analysis of genomic and transcriptomic data has become a cornerstone of many research initiatives. Communities with a focus on particular taxa or ecosystems need solutions capable of aggregating genomic resources and serving them in a standardized and analysis-friendly manner. Taxon-focussed resources can be more flexible in addressing the needs of a research community than can universal or general databases. Here, we present MolluscDB, a genome and transcriptome database for molluscs. MolluscDB offers a rich ecosystem of tools, including an Ensembl browser, a BLAST server for homology searches and an HTTP server from which any dataset present in the database can be downloaded. To demonstrate the utility of the database and verify the quality of its data, we imported data from assembled genomes and transcriptomes of 22 species, estimated the phylogeny of Mollusca using single-copy orthologues, explored patterns of gene family size change and interrogated the data for biomineralization-associated enzymes and shell matrix proteins. MolluscDB provides an easy-to-use and openly accessible data resource for the research community. This article is part of the Theo Murphy meeting issue 'Molluscan genomics: broad insights and future directions for a neglected phylum'.


Subject(s)
Databases, Genetic , Genome , Mollusca/genetics , Transcriptome , Animals , Gene Expression Profiling , Genomics
7.
Wellcome Open Res ; 6: 162, 2021.
Article in English | MEDLINE | ID: mdl-35600244

ABSTRACT

We present a genome assembly from an individual male Arvicola amphibius (the European water vole; Chordata; Mammalia; Rodentia; Cricetidae). The genome sequence is 2.30 gigabases in span. The majority of the assembly is scaffolded into 18 chromosomal pseudomolecules, including the X sex chromosome. Gene annotation of this assembly on Ensembl has identified 21,394 protein coding genes.

8.
Wellcome Open Res ; 5: 27, 2020.
Article in English | MEDLINE | ID: mdl-33215047

ABSTRACT

We present a genome assembly from an individual male Sciurus carolinensis (the eastern grey squirrel; Vertebrata; Mammalia; Eutheria; Rodentia; Sciuridae). The genome sequence is 2.82 gigabases in span. The majority of the assembly (92.3%) is scaffolded into 21 chromosomal-level scaffolds, with both X and Y sex chromosomes assembled.

9.
Wellcome Open Res ; 5: 18, 2020.
Article in English | MEDLINE | ID: mdl-32587897

ABSTRACT

We present a genome assembly from an individual male Sciurus vulgaris (the Eurasian red squirrel; Vertebrata; Mammalia; Eutheria; Rodentia; Sciuridae). The genome sequence is 2.88 gigabases in span. The majority of the assembly is scaffolded into 21 chromosomal-level scaffolds, with both X and Y sex chromosomes assembled.

10.
Wellcome Open Res ; 5: 33, 2020.
Article in English | MEDLINE | ID: mdl-32258427

ABSTRACT

We present a genome assembly from an individual male Lutra lutra (the Eurasian river otter; Vertebrata; Mammalia; Eutheria; Carnivora; Mustelidae). The genome sequence is 2.44 gigabases in span. The majority of the assembly is scaffolded into 20 chromosomal pseudomolecules, with both X and Y sex chromosomes assembled.

11.
Evol Lett ; 4(1): 19-33, 2020 Feb.
Article in English | MEDLINE | ID: mdl-32055408

ABSTRACT

Evolutionary adaptation is generally thought to occur through incremental mutational steps, but large mutational leaps can occur during its early stages. These are challenging to study in nature due to the difficulty of observing new genetic variants as they arise and spread, but characterizing their genomic dynamics is important for understanding factors favoring rapid adaptation. Here, we report genomic consequences of recent, adaptive song loss in a Hawaiian population of field crickets (Teleogryllus oceanicus). A discrete genetic variant, flatwing, appeared and spread approximately 15 years ago. Flatwing erases sound-producing veins on male wings. These silent flatwing males are protected from a lethal, eavesdropping parasitoid fly. We sequenced, assembled and annotated the cricket genome, produced a linkage map, and identified a flatwing quantitative trait locus covering a large region of the X chromosome. Gene expression profiling showed that flatwing is associated with extensive genome-wide effects on embryonic gene expression. We found that flatwing male crickets express feminized chemical pheromones. This male feminizing effect, on a different sexual signaling modality, is genetically associated with the flatwing genotype. Our findings suggest that the early stages of evolutionary adaptation to extreme pressures can be accompanied by greater genomic and phenotypic disruption than previously appreciated, and highlight how abrupt adaptation might involve suites of traits that arise through pleiotropy or genomic hitchhiking.

12.
G3 (Bethesda) ; 10(4): 1361-1374, 2020 04 09.
Article in English | MEDLINE | ID: mdl-32071071

ABSTRACT

Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the assembly process, the final assembled genome may be a mixture of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of underlying problems. We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility. We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Data Collaboration and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then maintain currency with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.


Subject(s)
Genome , Software , Sequence Analysis, DNA
13.
Science ; 366(6465): 594-599, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31672890

ABSTRACT

We used 20 de novo genome assemblies to probe the speciation history and architecture of gene flow in rapidly radiating Heliconius butterflies. Our tests to distinguish incomplete lineage sorting from introgression indicate that gene flow has obscured several ancient phylogenetic relationships in this group over large swathes of the genome. Introgressed loci are underrepresented in low-recombination and gene-rich regions, consistent with the purging of foreign alleles more tightly linked to incompatibility loci. Here, we identify a hitherto unknown inversion that traps a color pattern switch locus. We infer that this inversion was transferred between lineages by introgression and is convergent with a similar rearrangement in another part of the genus. These multiple de novo genome sequences enable improved understanding of the importance of introgression and selective processes in adaptive radiation.


Subject(s)
Butterflies/genetics , Gene Flow , Genetic Introgression , Genome, Insect , Animals , Biological Evolution , Butterflies/anatomy & histology , Chromosome Inversion , Genes, Insect , Genetic Speciation , Phylogeny , Wings, Animal/anatomy & histology
14.
PLoS Genet ; 15(9): e1008366, 2019 09.
Article in English | MEDLINE | ID: mdl-31539368

ABSTRACT

The capacity of organisms to tune their development in response to environmental cues is pervasive in nature. This phenotypic plasticity is particularly striking in plants, enabled by their modular and continuous development. A good example is the activation of lateral shoot branches in Arabidopsis, which develop from axillary meristems at the base of leaves. The activity and elongation of lateral shoots depend on the integration of many signals both external (e.g. light, nutrient supply) and internal (e.g. the phytohormones auxin, strigolactone and cytokinin). Here, we characterise natural variation in plasticity of shoot branching in response to nitrate supply using two diverse panels of Arabidopsis lines. We find extensive variation in nitrate sensitivity across these lines, suggesting a genetic basis for variation in branching plasticity. High plasticity is associated with extreme branching phenotypes such that lines with the most branches on high nitrate have the fewest under nitrate deficient conditions. Conversely, low plasticity is associated with a constitutively moderate level of branching. Furthermore, variation in plasticity is associated with alternative life histories with the low plasticity lines flowering significantly earlier than high plasticity lines. In Arabidopsis, branching is highly correlated with fruit yield, and thus low plasticity lines produce more fruit than high plasticity lines under nitrate deficient conditions, whereas highly plastic lines produce more fruit under high nitrate conditions. Low and high plasticity, associated with early and late flowering respectively, can therefore be interpreted as alternative escape vs mitigate strategies to low N environments. The genetic architecture of these traits appears to be highly complex, with only a small proportion of the estimated genetic variance detected in association mapping.


Subject(s)
Arabidopsis/genetics , Nitrates/metabolism , Plant Shoots/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Gene Expression Regulation, Plant/genetics , Genes, Plant/genetics , Meristem/growth & development , Phenotype , Plant Leaves/metabolism , Plant Roots/genetics , Plant Shoots/growth & development , Plant Shoots/metabolism
15.
Mol Biol Evol ; 36(12): 2922-2924, 2019 12 01.
Article in English | MEDLINE | ID: mdl-31411700

ABSTRACT

Comparing newly obtained and previously known nucleotide and amino-acid sequences underpins modern biological research. BLAST is a well-established tool for such comparisons but is challenging to use on new data sets. We combined a user-centric design philosophy with sustainable software development approaches to create Sequenceserver, a tool for running BLAST and visually inspecting BLAST results for biological interpretation. Sequenceserver uses simple algorithms to prevent potential analysis errors and provides flexible text-based and visual outputs to support researcher productivity. Our software can be rapidly installed for use by individuals or on shared servers.


Subject(s)
Computational Biology/methods , Genetic Techniques , Software
16.
Database (Oxford) ; 2017, 2017 01 01.
Article in English | MEDLINE | ID: mdl-28605774

ABSTRACT

Database URL: http://GenomeHubs.org. As the generation and use of genomic datasets is becoming increasingly common in all areas of biology, the need for resources to collate, analyse and present data from one or more genome projects is becoming more pressing. The Ensembl platform is a powerful tool to make genome data and cross-species analyses easily accessible through a web interface and a comprehensive application programming interface. Here we introduce GenomeHubs, which provide a containerized environment to facilitate the setup and hosting of custom Ensembl genome browsers. This simplifies mirroring of existing content and import of new genomic data into the Ensembl database schema. GenomeHubs also provide a set of analysis containers to decorate imported genomes with results of standard analyses and functional annotations and support export to flat files, including EMBL format for submission of assemblies and annotations to the International Nucleotide Sequence Database Collaboration.


Subject(s)
Data Warehousing/methods , Databases, Nucleic Acid , Genome , Internet , Sequence Analysis, DNA/methods , Web Browser , Animals , Humans
17.
Gigascience ; 6(7): 1-7, 2017 07 01.
Article in English | MEDLINE | ID: mdl-28486658

ABSTRACT

The mycalesine butterfly Bicyclus anynana, the "Squinting bush brown," is a model organism in the study of lepidopteran ecology, development, and evolution. Here, we present a draft genome sequence for B. anynana to serve as a genomics resource for current and future studies of this important model species. Seven libraries with insert sizes ranging from 350 bp to 20 kb were constructed using DNA from an inbred female and sequenced using both Illumina and PacBio technology; 128 Gb of raw Illumina data was filtered to 124 Gb and assembled to a final size of 475 Mb (∼×260 assembly coverage). Contigs were scaffolded using mate-pair, transcriptome, and PacBio data into 10 800 sequences with an N50 of 638 kb (longest scaffold 5 Mb). The genome is comprised of 26% repetitive elements and encodes a total of 22 642 predicted protein-coding genes. Recovery of a BUSCO set of core metazoan genes was almost complete (98%). Overall, these metrics compare well with other recently published lepidopteran genomes. We report a high-quality draft genome sequence for Bicyclus anynana. The genome assembly and annotated gene models are available at LepBase (http://ensembl.lepbase.org/index.html).
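The assembly metrics quoted above follow standard definitions, which can be sketched in a few lines. The function names `n50` and `coverage` and the toy scaffold lengths are illustrative assumptions; only the 124 Gb and 475 Mb figures come from the abstract, and the authors' own tooling is not stated there.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together contain at least half of the total bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def coverage(read_bases, assembly_bases):
    """Mean depth of coverage: sequenced bases divided by assembly span."""
    return read_bases / assembly_bases


# Toy scaffold lengths (hypothetical): total 290, so half is 145;
# 80 + 70 = 150 is the first running sum to reach it.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_scaffolds)

# Figures from the abstract: 124 Gb of filtered Illumina data, 475 Mb assembly.
depth = coverage(124e9, 475e6)
```

With the abstract's figures the depth comes out at roughly 261, consistent with the stated coverage of about ×260.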


Subject(s)
Butterflies/genetics , Genome, Insect , Animals , Molecular Sequence Annotation , Whole Genome Sequencing
18.
Science ; 350(6267): 1493-1498, 2015 Dec 18.
Article in English | MEDLINE | ID: mdl-26680190

ABSTRACT

The genomic causes and effects of divergent ecological selection during speciation are still poorly understood. Here we report the discovery and detailed characterization of early-stage adaptive divergence of two cichlid fish ecomorphs in a small (700 meters in diameter) isolated crater lake in Tanzania. The ecomorphs differ in depth preference, male breeding color, body shape, diet, and trophic morphology. With whole-genome sequences of 146 fish, we identified 98 clearly demarcated genomic "islands" of high differentiation and demonstrated the association of genotypes across these islands with divergent mate preferences. The islands contain candidate adaptive genes enriched for functions in sensory perception (including rhodopsin and other twilight-vision-associated genes), hormone signaling, and morphogenesis. Our study suggests mechanisms and genomic regions that may play a role in the closely related mega-radiation of Lake Malawi.


Subject(s)
Adaptation, Physiological/genetics , Cichlids/genetics , Cichlids/physiology , Genomic Islands , Mating Preference, Animal , Animals , Cichlids/classification , Lakes , Phylogeny , Polymorphism, Single Nucleotide , Species Specificity , Tanzania
19.
J Acoust Soc Am ; 138(2): 1023-9, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26328718

ABSTRACT

Estimates of particle size distributions (PSDs) in solid-in-liquid suspensions can be made on the basis of measurements of ultrasonic wave attenuation combined with a mathematical propagation model, which typically requires seven physical parameters to describe each phase of the mixture. The estimation process is insensitive to all of these except the density of the solid particles, which may be unknown or difficult to measure. This paper proposes that an unknown density value be incorporated into the sizing computation as a free variable. It is shown that this leads to an accurate estimate of the PSD, as well as of the unknown density.

20.
Article in English | MEDLINE | ID: mdl-25389162

ABSTRACT

Measurements of the frequency dependence of ultrasonic attenuation can be used as the basis for the estimation of particle size distributions (PSDs) in solid-in-liquid suspensions. The method requires matching the attenuation simulated by a candidate PSD in combination with a wave propagation model to the measured function in a fitting procedure. Uncertainty in the type of candidate PSD, whether based on fractional volume or fractional number of the dispersed particles, can cause errors in the overall estimation process, particularly for the median particle size. These uncertainties are investigated in the first part of this paper. The second part deals with uncertainties associated with the values for the physical properties of the suspended particles, seven of which are required in the simulation stage. It is shown that the particle sizing exercise is relatively insensitive to all of the physical properties except density, for which values are necessary to an accuracy commensurable with that required for the two principal parameters associated with the PSD: median size and standard deviation. The discussion is limited to small (less than 1-µm) silica particles dispersed in water. The results will have more general application.
