Results 1 - 20 of 37
1.
Proc Natl Acad Sci U S A ; 119(4), 2022 01 25.
Article in English | MEDLINE | ID: mdl-35042802

ABSTRACT

A global international initiative, such as the Earth BioGenome Project (EBP), requires both agreement and coordination on standards to ensure that the collective effort generates rapid progress toward its goals. To this end, the EBP initiated five technical standards committees comprising volunteer members from the global genomics scientific community: Sample Collection and Processing, Sequencing and Assembly, Annotation, Analysis, and IT and Informatics. The current versions of the resulting standards documents are available on the EBP website, with the recognition that opportunities, technologies, and challenges may improve or change in the future, requiring flexibility for the EBP to meet its goals. Here, we describe some highlights from the proposed standards, and areas where additional challenges will need to be met.


Subjects
Base Sequence/genetics, Eukaryota/genetics, Genomics/standards, Animals, Biodiversity, Genomics/methods, Humans, Reference Standards, Reference Values, DNA Sequence Analysis/methods, DNA Sequence Analysis/standards
2.
Brief Bioinform ; 22(6), 2021 11 05.
Article in English | MEDLINE | ID: mdl-34251419

ABSTRACT

Online, open access databases for biological knowledge serve as central repositories for research communities to store, find and analyze integrated, multi-disciplinary datasets. With increasing volumes, complexity and the need to integrate genomic, transcriptomic, metabolomic, proteomic, phenomic and environmental data, community databases face tremendous challenges in ongoing maintenance, expansion and upgrades. A common infrastructure framework using community standards shared by many databases can reduce development burden, provide interoperability, ensure use of common standards and support long-term sustainability. Tripal is a mature, open source platform built to meet this need. With ongoing improvement since its first release in 2009, Tripal provides full functionality for searching, browsing, loading and curating numerous types of data and is a primary technology powering at least 31 publicly available databases spanning plants, animals and human data, primarily storing genomics, genetics and breeding data. Tripal software development is managed by a shared, inclusive governance structure including both project management and advisory teams. Here, we report on the most important and innovative aspects of Tripal after 11 years of development, including integration of diverse types of biological data, successful collaborative projects across member databases, and support for implementing FAIR principles.


Subjects
Breeding, Computational Biology/methods, Genetic Databases, Genomics/methods, Plants/genetics, Software, Agricultural Crops/genetics, Genetic Variation, Phylogeny, Plants/metabolism, Proteomics, Web Browser
3.
BMC Biol ; 18(1): 142, 2020 10 19.
Article in English | MEDLINE | ID: mdl-33070780

ABSTRACT

BACKGROUND: The western flower thrips, Frankliniella occidentalis (Pergande), is a globally invasive pest and plant virus vector on a wide array of food, fiber, and ornamental crops. The underlying genetic mechanisms of the processes governing thrips pest and vector biology, feeding behaviors, ecology, and insecticide resistance are largely unknown. To address this gap, we present the F. occidentalis draft genome assembly and official gene set. RESULTS: We report on the first genome sequence for any member of the insect order Thysanoptera. Benchmarking Universal Single-Copy Ortholog (BUSCO) assessments of the genome assembly (size = 415.8 Mb, scaffold N50 = 948.9 kb) revealed a relatively complete and well-annotated assembly in comparison to other insect genomes. The genome is unusually GC-rich (50%) compared to other insect genomes to date. The official gene set (OGS v1.0) contains 16,859 genes, of which ~10% were manually verified and corrected by our consortium. We focused on manual annotation, phylogenetic, and expression evidence analyses for gene sets centered on primary themes in the life histories and activities of plant-colonizing insects. Highlights include the following: (1) divergent clades and large expansions in genes associated with environmental sensing (chemosensory receptors) and detoxification (CYP4, CYP6, and CCE enzymes) of substances encountered in agricultural environments; (2) a comprehensive set of salivary gland genes supported by enriched expression; (3) apparent absence of members of the IMD innate immune defense pathway; and (4) developmental- and sex-specific expression analyses of genes associated with progression from larvae to adulthood through neometaboly, a distinct form of maturation differing from either incomplete or complete metamorphosis in the Insecta. CONCLUSIONS: Analysis of the F. occidentalis genome offers insights into the polyphagous behavior of this insect pest that finds, colonizes, and survives on a widely diverse array of plants. The genomic resources presented here enable a more complete analysis of insect evolution and biology, providing a missing taxon for contemporary insect genomics-based analyses. Our study also offers a genomic benchmark for molecular and evolutionary investigations of other Thysanoptera species.


Subjects
Insect Genome, Life History Traits, Thysanoptera/physiology, Transcriptome, Animals, Agricultural Crops, Feeding Behavior, Food Chain, Innate Immunity/genetics, Perception, Phylogeny, Reproduction/genetics, Thysanoptera/genetics, Thysanoptera/immunology
5.
BMC Genomics ; 21(1): 227, 2020 Mar 14.
Article in English | MEDLINE | ID: mdl-32171258

ABSTRACT

BACKGROUND: Halyomorpha halys (Stål), the brown marmorated stink bug, is a highly invasive insect species due in part to its exceptionally high levels of polyphagy. This species is also a nuisance due to overwintering in human-made structures. It has caused significant agricultural losses in recent years along the Atlantic seaboard of North America and in continental Europe. Genomic resources will assist with determining the molecular basis for this species' feeding and habitat traits, defining potential targets for pest management strategies. RESULTS: Analysis of the 1.15-Gb draft genome assembly has identified a wide variety of genetic elements underpinning the biological characteristics of this formidable pest species, encompassing the roles of sensory functions, digestion, immunity, detoxification and development, all of which likely support H. halys' capacity for invasiveness. Many of the genes identified herein have potential for biomolecular pesticide applications. CONCLUSIONS: Availability of the H. halys genome sequence will be useful for the development of environmentally friendly biomolecular pesticides to be applied in concert with more traditional, synthetic chemical-based controls.


Subjects
Heteroptera/genetics, Insect Proteins/genetics, Insecticide Resistance, Whole Genome Sequencing/methods, Animals, Ecosystem, Horizontal Gene Transfer, Genome Size, Heteroptera/classification, Introduced Species, Phylogeny
6.
BMC Genomics ; 19(1): 832, 2018 Nov 21.
Article in English | MEDLINE | ID: mdl-30463532

ABSTRACT

BACKGROUND: Having conquered water surfaces worldwide, the semi-aquatic bugs occupy ponds, streams, lakes, mangroves, and even open oceans. The diversity of this group has inspired a range of scientific studies from ecology and evolution to developmental genetics and hydrodynamics of fluid locomotion. However, the lack of a representative water strider genome hinders our ability to more thoroughly investigate the molecular mechanisms underlying the processes of adaptation and diversification within this group. RESULTS: Here we report the sequencing and manual annotation of the Gerris buenoi (G. buenoi) genome, the first water strider genome to be sequenced thus far. The size of the G. buenoi genome is approximately 1,000 Mb, and this sequencing effort has recovered 20,949 predicted protein-coding genes. Manual annotation uncovered a number of local (tandem and proximal) gene duplications and expansions of gene families known for their importance in a variety of processes associated with morphological and physiological adaptations to a water surface lifestyle. These expansions may affect key processes associated with growth, vision, desiccation resistance, detoxification, olfaction and epigenetic regulation. Strikingly, the G. buenoi genome contains three insulin receptors, suggesting key changes in the rewiring and function of the insulin pathway. Other genomic changes affecting opsin genes may be associated with wavelength sensitivity shifts in opsins, which is likely to be key in facilitating specific adaptations in vision for diverse water habitats. CONCLUSIONS: Our findings suggest that local gene duplications might have played an important role during the evolution of water striders. Along with these findings, the sequencing of the G. buenoi genome now provides us the opportunity to pursue exciting research opportunities to further understand the genomic underpinnings of traits associated with the extreme body plan and life history of water striders.


Subjects
Genome, Heteroptera/genetics, Heteroptera/physiology, Insect Proteins/genetics, Physiological Adaptation, Animals, Molecular Evolution, Genomics, Heteroptera/classification, Phenotype, Phylogeny
7.
Environ Sci Technol ; 52(10): 6009-6022, 2018 05 15.
Article in English | MEDLINE | ID: mdl-29634279

ABSTRACT

Hyalella azteca is a cryptic species complex of epibenthic amphipods of interest to ecotoxicology and evolutionary biology. It is the primary crustacean used in North America for sediment toxicity testing and an emerging model for molecular ecotoxicology. To provide molecular resources for sediment quality assessments and evolutionary studies, we sequenced, assembled, and annotated the genome of the H. azteca U.S. Lab Strain. The genome quality and completeness are comparable with other ecotoxicological model species. Through targeted investigation and use of gene expression data sets of H. azteca exposed to pesticides, metals, and other emerging contaminants, we annotated and characterized the major gene families involved in sequestration, detoxification, oxidative stress, and toxicant response. Our results revealed gene loss related to light sensing, but a large expansion in chemoreceptors, likely underlying sensory shifts necessary in their low light habitats. Gene family expansions were also noted for cytochrome P450 genes, cuticle proteins, and ion transporters, and include recent gene duplications in the metal sequestration protein, metallothionein. Mapping of differentially expressed transcripts to the genome significantly increased the ability to functionally annotate toxicant responsive genes. The H. azteca genome will greatly facilitate development of genomic tools for environmental assessments and promote an understanding of how evolution shapes toxicological pathways with implications for environmental and human health.


Subjects
Amphipoda, Chemical Water Pollutants, Animals, Ecotoxicology, Geologic Sediments, North America, Toxicity Tests
8.
Nucleic Acids Res ; 43(Database issue): D714-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25332403

ABSTRACT

The 5000 arthropod genomes initiative (i5k) has tasked itself with coordinating the sequencing of 5000 insect or related arthropod genomes. The resulting influx of data, mostly from small research groups or communities with little bioinformatics experience, will require visualization, dissemination and curation, preferably from a centralized platform. The National Agricultural Library (NAL) has implemented the i5k Workspace@NAL (http://i5k.nal.usda.gov/) to help meet the i5k initiative's genome hosting needs. Any i5k member is encouraged to contact the i5k Workspace with their genome project details. Once submitted, new content will be accessible via organism pages, genome browsers and BLAST search engines, which are implemented via the open-source Tripal framework, a web interface for the underlying Chado database schema. We also implement the Web Apollo software for groups that choose to curate gene models. New content will add to the existing body of 35 arthropod species, which include species relevant for many aspects of arthropod genomic research, including agriculture, invasion biology, systematics, ecology and evolution, and developmental research.


Subjects
Arthropods/genetics, Genetic Databases, Genomics, Animals, Computer Graphics, Genome, Internet, Molecular Sequence Annotation
9.
Proc Biol Sci ; 280(1759): 20130143, 2013 May 22.
Article in English | MEDLINE | ID: mdl-23516243

ABSTRACT

Seasonal environments present fundamental physiological challenges to a wide range of insects. Many temperate insects surmount the exigencies of winter by undergoing photoperiodic diapause, in which photoperiod provides a token cue that initiates an alternative developmental programme leading to dormancy. Pre-diapause is a crucial preparatory phase of this process, preceding developmental arrest. However, the regulatory and physiological mechanisms of diapause preparation are largely unknown. Using high-throughput gene expression profiling in the Asian tiger mosquito, Aedes albopictus, we reveal major shifts in endocrine signalling, cell proliferation, metabolism, energy production and cellular structure across pre-diapause development. While some hallmarks of diapause, such as insulin signalling and stress response, were not important at the transcriptional level, two genes, Pepck and PCNA, appear to show diapause-induced transcriptional changes across insect taxa. These processes demonstrate physiological commonalities between Ae. albopictus pre-diapause and diapause strategies across insects, and support the idea of a genetic 'toolkit' for diapause. Observations of gene expression trends from a comparative developmental perspective suggest that individual physiological processes are delayed against a background of a fixed morphological ontogeny. Our results demonstrate how deep sequencing can provide new insights into elusive molecular bases of complex ecological adaptations.


Subjects
Aedes/physiology, Transcriptome, Physiological Adaptation, Aedes/genetics, Aedes/growth & development, Animals, Drosophila melanogaster/genetics, Drosophila melanogaster/growth & development, Drosophila melanogaster/physiology, Gene Expression Regulation, High-Throughput Nucleotide Sequencing, Molecular Sequence Data, Photoperiod, Polymerase Chain Reaction, Pupa/genetics, Pupa/growth & development, Pupa/physiology, Seasons, DNA Sequence Analysis, Time Factors
10.
J Exp Biol ; 216(Pt 21): 4082-90, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-23913949

ABSTRACT

Dormancy is a crucial adaptation allowing insects to withstand harsh environmental conditions. The pre-programmed developmental arrest of diapause is a form of dormancy that is distinct from quiescence, in which development arrests in immediate response to hardship. Much progress has been made in understanding the environmental and hormonal controls of diapause. However, studies identifying transcriptional changes unique to diapause, rather than quiescence, are lacking, making it difficult to disentangle the transcriptional profiles of diapause from dormancy in general. The Asian tiger mosquito, Aedes albopictus, presents an ideal model for such a study, as diapausing and quiescent eggs can be staged and collected for global gene expression profiling using a newly developed transcriptome. Here, we use RNA-Seq to contrast gene expression during diapause with quiescence to identify transcriptional changes specific to the diapause response. We identify global trends in gene expression that show gradual convergence of diapause gene expression upon gene expression during quiescence. Functionally, early diapause A. albopictus show strong expression differences of genes involved in metabolism, which diminish over time. Of these, only expression of lipid metabolism genes remained distinct in late diapause. We identify several genes putatively related to hormonal control of development that are persistently differentially expressed throughout diapause, suggesting these might be involved in the maintenance of diapause. Our results identify key biological differences between diapausing and quiescent pharate larvae, and suggest candidate pathways for studying metabolism and the hormonal control of development during diapause in other species.


Subjects
Aedes/growth & development, Aedes/genetics, Gene Expression Regulation, Transcriptome, Aedes/metabolism, Animals, Female, Larva/genetics, Larva/growth & development, Larva/metabolism, Lipid Metabolism, Biological Metamorphosis, Molecular Sequence Data, RNA Sequence Analysis, Time Factors
11.
MicroPubl Biol ; 2023, 2023.
Article in English | MEDLINE | ID: mdl-37662054

ABSTRACT

JBrowse 2 is a next-generation genome browser that can be run as a web or desktop application. We describe a new plugin, the Linkout Plugin, that enables users to link features to external databases based on their IDs and the remote URLs on JBrowse 2 desktop or web. As a result, genome analysis time and effort are reduced, enabling researchers to gain insights more quickly. The Linkout Plugin fills a common need scientists have: looking for more information on a gene. Overall, the Linkout Plugin is a valuable and practical addition to the JBrowse functionality.

12.
Database (Oxford) ; 2023, 2023 11 15.
Article in English | MEDLINE | ID: mdl-37971715

ABSTRACT

Over the last couple of decades, there has been a rapid growth in the number and scope of agricultural genetics, genomics and breeding databases and resources. The AgBioData Consortium (https://www.agbiodata.org/) currently represents 44 databases and resources (https://www.agbiodata.org/databases) covering model or crop plant and animal GGB data, ontologies, pathways, genetic variation and breeding platforms (referred to as 'databases' throughout). One of the goals of the Consortium is to facilitate FAIR (Findable, Accessible, Interoperable, and Reusable) data management and the integration of datasets, which requires data sharing, along with structured vocabularies and/or ontologies. Two AgBioData working groups, focused on Data Sharing and Ontologies, respectively, conducted a Consortium-wide survey to assess the current status and future needs of the members in those areas. A total of 33 researchers responded to the survey, representing 37 databases. Results suggest that data-sharing practices by AgBioData databases are in a fairly healthy state, but it is not clear whether this is true for all metadata and data types across all databases; and that ontology use has not substantially changed since a similar survey was conducted in 2017. Based on our evaluation of the survey results, we recommend (i) providing training for database personnel in specific data-sharing techniques, as well as in ontology use; (ii) further study on what metadata is shared, and how well it is shared among databases; (iii) promoting an understanding of data sharing and ontologies in the stakeholder community; (iv) improving data sharing and ontologies for specific phenotypic data types and formats; and (v) lowering specific barriers to data sharing and ontology use, by identifying sustainability solutions, and the identification, promotion, or development of data standards. Combined, these improvements are likely to help AgBioData databases increase development efforts towards improved ontology use, and data sharing via programmatic means. Database URL: https://www.agbiodata.org/databases.


Subjects
Data Management, Plant Breeding, Animals, Genomics/methods, Factual Databases, Information Dissemination
13.
Mol Ecol ; 21(20): 4970-82, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22988889

ABSTRACT

Landscape genetic studies use spatially explicit population genetic information to determine the physical and environmental causes of population genetic structure on regional scales. Comparative studies that identify common barriers to gene flow across multiple species within a community are important to both understand the evolutionary trajectories of populations and prioritize habitat conservation. Here, we use a comparative landscape genetic approach to ask whether gradients in temperature or precipitation seasonality structure genetic variation across three codistributed tree species in Central America, or whether a simpler (geographic distance) or more complex, species-specific environmental niche model is necessary to individually explain population genetic structure. Using descriptive statistics and causal modelling, we find that different factors best explain genetic distance in each of the three species: environmental niche distance in Bursera simaruba, geographic distance in Ficus insipida and historical barriers to gene flow or cryptic reproductive barriers for Brosimum alicastrum. This study confirms suggestions from previous studies of Central American tree species that imply that population genetic structure of trees in this region is determined by complex interactions of both historical and current barriers to gene flow.


Subjects
Environment, Gene Flow, Genetic Variation, Trees/genetics, Bursera/genetics, Central America, Plant DNA/genetics, Ecosystem, Ficus/genetics, Population Genetics, Genetic Models, Moraceae/genetics
14.
J Med Entomol ; 49(3): 777-82, 2012 May.
Article in English | MEDLINE | ID: mdl-22679889

ABSTRACT

The genes period (per) and timeless (tim) are core components of the circadian clock that regulates a wide range of rhythmic biochemical, physiological, and behavioral processes in prokaryotes and eukaryotes. We used degenerate polymerase chain reaction (PCR) and Rapid Amplification of cDNA Ends (RACE) to clone and sequence the entire cDNAs of both the per and tim genes in Aedes albopictus (Skuse). We also used the 5' end of the Ae. albopictus per cDNA to identify previously unannotated sequence coding for the N-terminal region of the PERIOD protein in Aedes aegypti L. We sequenced genomic DNA of one mosquito from each of three geographically distinct populations (New Jersey, Florida, and Brazil), and identified three introns in the per gene and eight introns in the tim gene. Phylogenetic analyses and comparison of functional domains support the orthology of the newly identified per and tim genes. Analysis of nonsynonymous to synonymous substitution rates indicates that both the per and tim genes have evolved under strong selective constraint subsequent to the divergence of Ae. albopictus and Ae. aegypti. Taken together, these results provide resources that can be used to investigate the molecular genetics of circadian phenotypes in Ae. albopictus and other culicids, to perform comparative analyses of insect circadian clock function, and also to conduct phylogeographic analyses using single-copy nuclear introns.


Subjects
Aedes/genetics, Insect Proteins/genetics, Period Circadian Proteins/genetics, Animals, Circadian Clocks, Molecular Cloning, Female, Introns, Male, Phylogeography, DNA Sequence Analysis
15.
Genes (Basel) ; 13(3), 2022 02 28.
Article in English | MEDLINE | ID: mdl-35328000

ABSTRACT

The lesser grain borer, Rhyzopertha dominica (F.) (Coleoptera: Bostrichidae), is a major global pest of cereal grains. Infestations are difficult to control as larvae feed inside grain kernels, and many populations are resistant to both contact insecticides and fumigants. We sequenced the genome of R. dominica to identify genes responsible for important biological functions and develop more targeted and efficacious management strategies. The genome was assembled from long read sequencing and long-range scaffolding technologies. The genome assembly is 479.1 Mb, close to the predicted genome size of 480.4 Mb by flow cytometry. This assembly is among the most contiguous beetle assemblies published to date, with 139 scaffolds, an N50 of 53.6 Mb, and L50 of 4, indicating chromosome-scale scaffolds. Predicted genes from biologically relevant groups were manually annotated using transcriptome data from adults and different larval tissues to guide annotation. The expansion of carbohydrase and serine peptidase genes suggest that they combine to enable efficient digestion of cereal proteins. A reduction in the copy number of several detoxification gene families relative to other coleopterans may reflect the low selective pressure on these genes in an insect that spends most of its life feeding internally. Chemoreceptor genes contain elevated numbers of pseudogenes for odorant receptors that also may be related to the recent ontogenetic shift of R. dominica to a diet consisting primarily of stored grains. Analysis of repetitive sequences will further define the evolution of bostrichid beetles compared to other species. The data overall contribute significantly to coleopteran genetic research.
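
The contiguity statistics quoted in this abstract (an N50 of 53.6 Mb and an L50 of 4) can be illustrated with a small sketch. It assumes the common definition: sort scaffolds from longest to shortest, accumulate lengths until at least half of the total assembly length is covered, then report the length of the last scaffold added (N50) and how many scaffolds were needed (L50). The function name and the toy lengths are hypothetical and are not taken from the R. dominica assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold or contig lengths.

    Assumed definition: N50 is the length of the scaffold at which the
    running total (longest first) reaches half the assembly size; L50 is
    the number of scaffolds needed to get there.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


# Toy example (hypothetical lengths): total = 300, half = 150.
# 80 + 70 = 150 reaches the halfway point after two scaffolds.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (70, 2)
```

Under this definition, an L50 of 4 means that the four longest scaffolds alone cover at least half of the assembly, which is why the abstract reads it as a sign of chromosome-scale scaffolds.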


Subjects
Coleoptera, Insecticides, Acclimatization, Animals, Coleoptera/genetics, Dominica, Larva/genetics
16.
BMC Genomics ; 12: 619, 2011 Dec 20.
Article in English | MEDLINE | ID: mdl-22185595

ABSTRACT

BACKGROUND: Many temperate insects survive the harsh conditions of winter by undergoing photoperiodic diapause, a pre-programmed developmental arrest initiated by short day lengths. Despite the well-established ecological significance of photoperiodic diapause, the molecular basis of this crucial adaptation remains largely unresolved. The Asian tiger mosquito, Aedes albopictus (Skuse), represents an outstanding emerging model to investigate the molecular basis of photoperiodic diapause in a well-defined ecological and evolutionary context. Ae. albopictus is a medically significant vector and is currently considered the most invasive mosquito in the world. Traits related to diapause appear to be important factors contributing to the rapid spread of this mosquito. To generate novel sequence information for this species, as well as to discover transcripts involved in diapause preparation, we sequenced the transcriptome of Ae. albopictus oocytes destined to become diapausing or non-diapausing pharate larvae. RESULTS: 454 GS-FLX transcriptome sequencing yielded >1.1 million quality-filtered reads, which we assembled into 69,474 contigs (N50 = 1,009 bp). Our contig filtering approach, where we took advantage of strong sequence similarity to the fully sequenced genome of Aedes aegypti, as well as other reference organisms, resulted in 11,561 high-quality, conservative ESTs. Differential expression estimates based on normalized read counts revealed 57 genes with higher expression, and 257 with lower expression under diapause-inducing conditions. Analysis of expression by qPCR for 47 of these genes indicated a high correlation of expression levels between 454 sequence data and qPCR, but congruence of statistically significant differential expression was low. Seven genes identified as differentially expressed based on qPCR have putative functions that are consistent with the insect diapause syndrome; three genes have unknown function and represent novel candidates for the transcriptional basis of diapause. CONCLUSIONS: Our transcriptome database provides a rich resource for the comparative genomics and functional genetics of Ae. albopictus, an invasive and medically important mosquito. Additionally, the identification of differentially expressed transcripts related to diapause enriches the limited knowledge base for the molecular basis of insect diapause, in particular for the preparatory stage. Finally, our analysis illustrates a useful approach that draws from a closely related reference genome to generate high-confidence ESTs in a non-model organism.


Subjects
Aedes/genetics, Messenger RNA/genetics, Transcriptome, Aedes/physiology, Animals, Insect Vectors, Photoperiod, Polymerase Chain Reaction
17.
Insects ; 12(8), 2021 Aug 19.
Article in English | MEDLINE | ID: mdl-34442314

ABSTRACT

Genome sequencing of a diverse array of arthropod genomes is already underway, and these genomes will be used to study human health, agriculture, biodiversity, and ecology. These new genomes are intended to serve as community resources and provide the foundational information required to apply 'omics technologies to a more diverse set of species. However, biologists require genome annotation to use these genomes and derive a better understanding of complex biological systems. Genome annotation incorporates two related, but distinct, processes: Demarcating genes and other elements present in genome sequences (structural annotation); and associating a function with genetic elements (functional annotation). While there are well-established and freely available workflows for structural annotation of gene identification in newly assembled genomes, workflows for providing the functional annotation required to support functional genomics studies are less well understood. Genome-scale functional annotation is required for functional modeling (enrichment, networks, etc.). A first-pass genome-wide functional annotation effort can rapidly identify under-represented gene sets for focused community annotation efforts. We present an open-source, open access, and containerized pipeline for genome-scale functional annotation of insect proteomes and apply it to various arthropod species. We show that the performance of the predictions is consistent across a set of arthropod genomes with varying assembly and annotation quality.

18.
Insects ; 12(7), 2021 Jul 09.
Article in English | MEDLINE | ID: mdl-34357286

ABSTRACT

The phylum Arthropoda includes species crucial for ecosystem stability, soil health, crop production, and others that present obstacles to crop and animal agriculture. The United States Department of Agriculture's Agricultural Research Service initiated the Ag100Pest Initiative to generate reference genome assemblies of arthropods that are (or may become) pests to agricultural production and global food security. We describe the project goals, process, status, and future. The first three years of the project were focused on species selection, specimen collection, and the construction of lab and bioinformatics pipelines for the efficient production of assemblies at scale. Contig-level assemblies of 47 species are presented, all of which were generated from single specimens. Lessons learned and optimizations leading to the current pipeline are discussed. The project name implies a target of 100 species, but the efficiencies gained during the project have supported an expansion of the original goal and a total of 158 species are currently in the pipeline. We anticipate that the processes described in the paper will help other arthropod research groups or other consortia considering genome assembly at scale.

19.
Genome Biol ; 21(1): 15, 2020 01 23.
Article in English | MEDLINE | ID: mdl-31969194

ABSTRACT

BACKGROUND: Arthropods comprise the largest and most diverse phylum on Earth and play vital roles in nearly every ecosystem. Their diversity stems in part from variations on a conserved body plan, resulting from and recorded in adaptive changes in the genome. Dissection of the genomic record of sequence change enables broad questions regarding genome evolution to be addressed, even across hyper-diverse taxa within arthropods. RESULTS: Using 76 whole genome sequences representing 21 orders spanning more than 500 million years of arthropod evolution, we document changes in gene and protein domain content and provide temporal and phylogenetic context for interpreting these innovations. We identify many novel gene families that arose early in the evolution of arthropods and during the diversification of insects into modern orders. We reveal unexpected variation in patterns of DNA methylation across arthropods and examples of gene family and protein domain evolution coincident with the appearance of notable phenotypic and physiological adaptations such as flight, metamorphosis, sociality, and chemoperception. CONCLUSIONS: These analyses demonstrate how large-scale comparative genomics can provide broad new insights into the genotype to phenotype map and generate testable hypotheses about the evolution of animal diversity.


Subjects
Arthropods/genetics, Molecular Evolution, Animals, Arthropods/classification, DNA Methylation, Genetic Speciation, Genetic Variation, Phylogeny
20.
Methods Mol Biol ; 1858: 75-87, 2019.
Article in English | MEDLINE | ID: mdl-30414112

ABSTRACT

The GFF3toolkit (https://github.com/NAL-i5K/GFF3toolkit) supported by the i5k Workspace@NAL provides a suite of tools to handle gene annotations in GFF3 format from arthropod genome projects and their research communities. To improve GFF3 formatting of gene annotations, a quality control and merge procedure is proposed along with the GFF3toolkit. In particular, the toolkit provides functions to sort a GFF3 file, detect GFF3 format errors, merge two GFF3 files, and generate biological sequences from a GFF3 file. This chapter explains when and how to use the provided tools to obtain high-quality, nonredundant arthropod gene sets.
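
Two of the operations named in this abstract, sorting a GFF3 file and detecting format errors, can be sketched in a few lines. This is a minimal illustration and not the GFF3toolkit implementation or its API: the function names are hypothetical, the checks cover only a handful of basic rules (nine tab-separated columns, integer coordinates with start not greater than end, a valid strand symbol), and the sort is a naive ordering by sequence ID and coordinates that assumes every feature line has already passed the check.

```python
VALID_STRANDS = {"+", "-", ".", "?"}


def check_gff3_line(line):
    """Return a list of basic format problems found in one GFF3 feature line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return [f"expected 9 tab-separated columns, found {len(cols)}"]
    problems = []
    start, end = cols[3], cols[4]
    if not (start.isdigit() and end.isdigit()):
        problems.append("start and end must be integers")
    elif int(start) > int(end):
        problems.append("start is greater than end")
    if cols[6] not in VALID_STRANDS:
        problems.append(f"invalid strand {cols[6]!r}")
    return problems


def sort_gff3(lines):
    """Naively sort feature lines by (seqid, start, end).

    Lines starting with '#' (directives and comments) are moved to the top
    in their original order. Parent/child grouping is NOT preserved here.
    """
    header = [l for l in lines if l.startswith("#")]
    features = [l for l in lines if l.strip() and not l.startswith("#")]

    def key(line):
        cols = line.split("\t")
        return cols[0], int(cols[3]), int(cols[4])

    return header + sorted(features, key=key)


# Hypothetical records for illustration only.
records = [
    "##gff-version 3",
    "chr2\t.\tgene\t5\t9\t.\t+\t.\tID=g2",
    "chr1\t.\tgene\t20\t30\t.\t-\t.\tID=g3",
    "chr1\t.\tgene\t1\t10\t.\t+\t.\tID=g1",
]
for row in sort_gff3(records):
    print(row)
```

A production sorter would also need to keep child features (mRNA, exon, CDS) next to their parent gene, which this coordinate-only ordering does not attempt.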


Subjects
Computational Biology/methods, Insect Genome, Insects/genetics, Molecular Sequence Annotation/methods, Quality Control, DNA Sequence Analysis/methods, Software, Animals, High-Throughput Nucleotide Sequencing/methods