ABSTRACT
The Evidence and Conclusion Ontology (ECO) is a community resource that provides an ontology of terms used to capture the type of evidence that supports biomedical annotations and assertions. Consistent capture of evidence information with ECO allows tracking of annotation provenance, establishment of quality control measures, and evidence-based data mining. ECO is in use by dozens of data repositories and resources with both specific and general areas of focus. ECO is continually being expanded and enhanced in response to user requests as well as our aim to adhere to community best practices for ontology development. The ECO support team engages in multiple collaborations with other ontologies and annotating groups. Here we report on recent updates to the ECO ontology itself as well as associated resources that are available through this project. ECO project products are freely available for download from the project website (https://evidenceontology.org/) and GitHub (https://github.com/evidenceontology/evidenceontology). ECO is released into the public domain under a CC0 1.0 Universal license.
Subjects
Computational Biology/standards; Databases, Genetic; Gene Ontology; Software; Humans; Molecular Sequence Annotation
ABSTRACT
BACKGROUND: With the emergence and spread of SARS-CoV-2 variants, genomic epidemiology and surveillance have proven invaluable tools for variant tracking. Here, we analyzed SARS-CoV-2 samples collected from personnel located at the US/NATO bases across Afghanistan. RESULTS: Sequencing and phylogenetic analyses revealed at least 16 independent introductions of SARS-CoV-2 into four of these relatively isolated compounds during April and May 2021, including multiple introductions of Alpha and Delta variants. Four of the introductions resulted in sustained spread of the virus within, and in two cases between, the compounds. Three of these outbreaks, one Delta and two Alpha, occurred simultaneously. CONCLUSIONS: Even in rigorously controlled and segregated environments, SARS-CoV-2 introduction and spread may occur frequently.
Subjects
COVID-19; Military Personnel; Afghanistan/epidemiology; COVID-19/epidemiology; Disease Outbreaks; Genomics; Humans; Phylogeny; SARS-CoV-2/genetics
ABSTRACT
The Evidence and Conclusion Ontology (ECO) contains terms (classes) that describe types of evidence and assertion methods. ECO terms are used in the process of biocuration to capture the evidence that supports biological assertions (e.g. gene product X has function Y as supported by evidence Z). Capture of this information allows tracking of annotation provenance, establishment of quality control measures and query of evidence. ECO contains over 1500 terms and is in use by many leading biological resources including the Gene Ontology, UniProt and several model organism databases. ECO is continually being expanded and revised based on the needs of the biocuration community. The ontology is freely available for download from GitHub (https://github.com/evidenceontology/) or the project's website (http://evidenceontology.org/). Users can request new terms or changes to existing terms through the project's GitHub site. ECO is released into the public domain under CC0 1.0 Universal.
Subjects
Computational Biology/methods; Databases, Genetic; Gene Ontology; Proteins/genetics; Animals; Humans; Information Storage and Retrieval/methods; Internet; Proteins/metabolism; Sequence Analysis, Protein; User-Computer Interface
ABSTRACT
BACKGROUND: More than 20% of the world's population is at risk for infection by filarial nematodes and >180 million people worldwide are already infected. Along with infection comes significant morbidity that has a socioeconomic impact. The eight filarial nematodes that infect humans are Wuchereria bancrofti, Brugia malayi, Brugia timori, Onchocerca volvulus, Loa loa, Mansonella perstans, Mansonella streptocerca, and Mansonella ozzardi, of which three have published draft genome sequences. Since all have humans as the definitive host, standard avenues of research that rely on culturing and genetics have often not been possible. Therefore, genome sequencing provides an important window into understanding the biology of these parasites. The need for large amounts of high-quality genomic DNA from homozygous, inbred lines; the availability of only short sequence reads from next-generation sequencing platforms at a reasonable expense; and the lack of random large insert libraries have limited our ability to generate high-quality genome sequences for these parasites. However, the Pacific Biosciences single molecule, real-time sequencing platform holds great promise in reducing input amounts and generating sufficiently long sequences that bypass the need for large insert paired libraries. RESULTS: Here, we report on efforts to generate a more complete genome assembly for L. loa using genetically heterogeneous DNA isolated from a single clinical sample and sequenced on the Pacific Biosciences platform. To obtain the best assembly, numerous assemblers and sequencing datasets were analyzed, combined, and compared. Quiver-informed trimming of an assembly of only Pacific Biosciences reads by HGAP2 was selected as the final assembly of 96.4 Mbp in 2,250 contigs. This results in ~9% more of the genome in ~85% fewer contigs from ~80% less starting material at a fraction of the cost of previous Roche 454-based sequencing efforts. CONCLUSIONS: The result is the most complete filarial nematode assembly produced thus far and demonstrates the utility of single molecule sequencing on the Pacific Biosciences platform for genetically heterogeneous metazoan genomes.
Subjects
Genome, Helminth; Loa/isolation & purification; Loiasis/parasitology; Sequence Analysis, DNA/methods; Animals; Humans; Loa/genetics; Molecular Sequence Data; Sequence Analysis, DNA/economics; Sequence Analysis, DNA/instrumentation
ABSTRACT
BACKGROUND: Halyomorpha halys (Stål) (Insecta: Hemiptera: Pentatomidae), commonly known as the Brown Marmorated Stink Bug (BMSB), is an invasive pest of the mid-Atlantic region of the United States, causing economically important damage to a wide range of crops. Native to Asia, BMSB was first observed in Allentown, PA, USA, in 1996, and this pest is now well-established throughout the US mid-Atlantic region and beyond. In addition to the serious threat BMSB poses to agriculture, BMSB has become a nuisance to homeowners, invading home gardens and congregating in large numbers in human-made structures, including homes, to overwinter. Despite its significance as an agricultural pest with limited control options, only 100 bp of BMSB sequence data was available in public databases when this project began. RESULTS: Transcriptome sequencing was undertaken to provide a molecular resource to the research community to inform the development of pest control strategies and to provide molecular data for population genetics studies of BMSB. Using normalized, strand-specific libraries, we sequenced pools of all BMSB life stages on the Illumina HiSeq. Trinity was used to assemble 200,000 putative transcripts in >100,000 components. A novel bioinformatic method that analyzed the strand-specificity of the data reduced this to 53,071 putative transcripts from 18,573 components. By integrating multiple other data types, we narrowed this further to 13,211 representative transcripts. CONCLUSIONS: Bacterial endosymbiont genes were identified in this dataset, some of which have a copy number consistent with being lateral gene transfers between endosymbiont genomes and Hemiptera, including ankyrin-repeat related proteins, lysozyme, and mannanase. Such genes and endosymbionts may provide novel targets for BMSB-specific biocontrol. This study demonstrates the utility of strand-specific sequencing in generating shotgun transcriptomes and shows that rapid sequencing of shotgun transcriptomes is possible without the need for extensive inbreeding to generate homozygous lines. Such sequencing can provide a rapid response to pest invasions similar to that already described for disease epidemiology.
Subjects
Gene Expression Profiling/methods; Heteroptera/genetics; Insect Proteins/genetics; Sequence Analysis, RNA/methods; Animals; Bacteria/genetics; Bacterial Proteins/genetics; Computational Biology/methods; Female; Gene Transfer, Horizontal; Heteroptera/microbiology; Introduced Species; Male; Molecular Sequence Data; Phylogeny; Symbiosis
ABSTRACT
BACKGROUND: Phenotypic data are routinely used to elucidate gene function in organisms amenable to genetic manipulation. However, prior to this work, there was no generalizable system in place for the structured storage and retrieval of phenotypic information for bacteria. RESULTS: The Ontology of Microbial Phenotypes (OMP) has been created to standardize the capture of such phenotypic information from microbes. OMP has been built on the foundations of the Basic Formal Ontology and the Phenotype and Trait Ontology. Terms have logical definitions that can facilitate computational searching of phenotypes and their associated genes. OMP can be accessed via a wiki page as well as downloaded from SourceForge. Initial annotations with OMP are being made for Escherichia coli using a wiki-based annotation capture system. New OMP terms are being concurrently developed as annotation proceeds. CONCLUSIONS: We anticipate that diverse groups studying microbial genetics and associated phenotypes will employ OMP for standardizing microbial phenotype annotation, much as the Gene Ontology has standardized gene product annotation. The resulting OMP resource and associated annotations will facilitate prediction of phenotypes for unknown genes and result in new experimental characterization of phenotypes and functions.
Subjects
Bacterial Physiological Phenomena; Computational Biology/methods; Software; Phenotype
ABSTRACT
The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available, web-based resource for researchers studying fungi of the genus Aspergillus, which includes organisms of clinical, agricultural and industrial importance. AspGD curators have now completed a comprehensive review of the entire published literature about Aspergillus nidulans and Aspergillus fumigatus, and this annotation is provided with streamlined, ortholog-based navigation of the multispecies information. AspGD facilitates comparative genomics by providing a full-featured genomics viewer, as well as matched and standardized sets of genomic information for the sequenced aspergilli. AspGD also provides resources to foster interaction and dissemination of community information and resources. We welcome and encourage feedback at aspergillus-curator@lists.stanford.edu.
Subjects
Aspergillus/genetics; Databases, Genetic; Genome, Fungal; Aspergillus fumigatus/genetics; Aspergillus nidulans/genetics; Genes, Fungal; Genomics; Molecular Sequence Annotation
ABSTRACT
We report the draft genome sequences of the collection referred to as the Escherichia coli DECA collection, which was assembled to contain representative isolates of the 15 most common diarrheagenic clones in humans (http://shigatox.net/new/). These genomes represent a valuable resource to the community of researchers who examine these enteric pathogens.
Subjects
Diarrhea/microbiology; Escherichia coli Infections/microbiology; Escherichia coli/genetics; Genome, Bacterial; Base Sequence; Child, Preschool; Escherichia coli/classification; Escherichia coli/isolation & purification; Female; Humans; Infant; Male; Molecular Sequence Data
ABSTRACT
The rice gene Polyamine Uptake Transporter1 (PUT1) was originally identified based on its homology to the polyamine uptake transporters LmPOT1 and TcPAT12 in Leishmania major and Trypanosoma cruzi, respectively. Here we show that five additional transporters from rice and Arabidopsis that cluster in the same clade as PUT1 all function as high-affinity spermidine uptake transporters. Yeast expression assays of these genes confirmed that uptake of spermidine was minimally affected by 166-fold or greater concentrations of amino acids. Characterized polyamine transporters from both Arabidopsis thaliana and Oryza sativa along with the two polyamine transporters from L. major and T. cruzi were aligned and used to generate a hidden Markov model. This model was used to identify significant matches to proteins in other angiosperms, bryophytes, chlorophyta, discicristates, excavates, stramenopiles and amoebozoa. No significant matches were identified in fungal or metazoan genomes. Phylogenetic analysis showed that some sequences from the haptophyte, Emiliania huxleyi, as well as sequences from oomycetes and diatoms clustered closer to sequences from plant genomes than to a homologous sequence in the red algal genome Galdieria sulphuraria, consistent with the hypothesis that these polyamine transporters were acquired by horizontal transfer from green algae. Leishmania and Trypanosoma formed a separate cluster with genes from other Discicristates and two Entamoeba species. We surmise that the genes in Entamoeba species were acquired by phagotrophy of Discicristates. In summary, phylogenetic and functional analysis has identified two clades of genes that are predictive of polyamine transport activity.
Subjects
Arabidopsis/genetics; Membrane Transport Proteins/genetics; Oryza/genetics; Phylogeny; Polyamines/metabolism; Arabidopsis/metabolism; Biological Transport; Cation Transport Proteins/genetics; Cation Transport Proteins/metabolism; Evolution, Molecular; Gene Transfer, Horizontal; Genetic Complementation Test; Kinetics; Leishmania major/genetics; Leishmania major/metabolism; Membrane Transport Proteins/metabolism; Organ Specificity; Oryza/metabolism; Plant Proteins/genetics; Plant Proteins/metabolism; Protozoan Proteins/genetics; Protozoan Proteins/metabolism; Putrescine/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/growth & development; Saccharomyces cerevisiae/metabolism; Spermidine/metabolism; Substrate Specificity; Time Factors; Trypanosoma cruzi/genetics; Trypanosoma cruzi/metabolism
ABSTRACT
The Aspergillus Genome Database (AspGD) is an online genomics resource for researchers studying the genetics and molecular biology of the Aspergilli. AspGD combines high-quality manual curation of the experimental scientific literature examining the genetics and molecular biology of Aspergilli, cutting-edge comparative genomics approaches to iteratively refine and improve structural gene annotations across multiple Aspergillus species, and web-based research tools for accessing and exploring the data. All of these data are freely available at http://www.aspgd.org. We welcome feedback from users and the research community at aspergillus-curator@genome.stanford.edu.
Subjects
Aspergillus nidulans/genetics; Computational Biology/methods; Databases, Genetic; Databases, Nucleic Acid; Genome, Fungal; Computational Biology/trends; Databases, Protein; Fungal Proteins/metabolism; Genes, Fungal; Genetics; Information Storage and Retrieval/methods; Internet; Models, Genetic; Phenotype; Protein Structure, Tertiary; Software
ABSTRACT
On 28 May 2021, leisure travel restrictions in place to control coronavirus disease 2019 (COVID-19) were eased among vaccinated U.S. military personnel and beneficiaries stationed in South Korea (USFK), allowing access to bars and clubs that had previously been off limits. We describe results from an investigation of the largest severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak as of November 2021 among USFK personnel following this change in policy. Data such as SARS-CoV-2 real-time polymerase chain reaction (RT-PCR) test results, demographic characteristics, symptom and vaccination histories, and genome sequences were analyzed. Of a total of 207 new cases of COVID-19 diagnosed among USFK members from 15 June to 27 July 2021, 113 (57%) eligible cases were fully vaccinated, of whom 86 (76%) were symptomatic. RT-PCR cycling threshold values were similar among vaccinated and unvaccinated members. Whole-genome sequencing of 54 outbreak samples indicated all infections were due to the Delta variant. Phylogenetic analysis revealed two sources of SARS-CoV-2 accounted for 41% of infections among vaccinated and unvaccinated members. Vaccinated personnel were not at risk of severe illness; however, 86% experienced symptoms following infection. There were no hospitalizations among COVID-19 cases, most of whom were young military service members. Rescinded restrictions were reinstated to control the outbreak. Masking was mandated among all personnel, predating U.S. national recommendations for indoor masking in high COVID-19 transmission areas. Increased vaccination with continued vigilance and extension of COVID-19 mitigation measures are warranted to contain the spread of SARS-CoV-2 variants of concern.
ABSTRACT
The Ontology for Biomedical Investigations (OBI) underwent a focused review of assay term annotations, logic and hierarchy with the goal of improving and standardizing these terms. As a result, inconsistencies in W3C Web Ontology Language (OWL) expressions were identified and corrected, and standardized design patterns and a formalized template to maintain them were developed. Here we describe this informative and productive process, highlighting the specific benefits and obstacles for OBI and the universal lessons for similar projects.
Subjects
Biological Ontologies; Language; Reference Standards
ABSTRACT
Olfactory receptors (ORs), encoded by the largest vertebrate multigene family, enable the detection of thousands of unique odorants in the environment and consequently play a critical role in species survival. Here, we advance our knowledge of OR gene evolution in procellariiform seabirds, an avian group which relies on the sense of olfaction for critical ecological functions. We built a cosmid library of Cory's Shearwater (Calonectris borealis) genomic DNA, a model species for the study of olfaction-based navigation, and sequenced OR gene-positive cosmid clones with a combination of sequencing technologies. We identified 220 OR open reading frames, 20 of which are full-length, intact OR genes, and found a large ratio of partial and pseudogenes to intact OR genes (2:1), suggestive of a dynamic mode of evolution. Phylogenetic analyses revealed that while a few genes cluster with those of other sauropsid species in a γ (gamma) clade that predates the divergence of different avian lineages, most genes belong to an avian-specific γ-c clade, within which sequences cluster by species, suggesting frequent duplication and/or gene conversion events. We identified evidence of positive selection on full-length γ-c clade genes. These patterns are consistent with a key role of adaptation in the functional diversification of olfactory receptor genes in a bird lineage that relies extensively on olfaction.
Subjects
Adaptation, Physiological/genetics; Birds/genetics; Birds/physiology; Evolution, Molecular; Receptors, Odorant/genetics; Animals; Models, Molecular; Phylogeny; Protein Structure, Secondary; Receptors, Odorant/chemistry; Receptors, Odorant/metabolism
ABSTRACT
A critical function for symbionts is the acquisition of nutrients from their host. Relationships between hosts and symbionts range from biotrophic mutualism to necrotrophic parasitism, with a corresponding range of structures to facilitate nutrient flow between host and symbiont. Here, we review common themes among the nutrient acquisition strategies of a range of plant symbiotic microorganisms, including mutualistic symbionts, biotrophic pathogens that feed from living tissue, necrotrophic pathogens that kill host tissue, and hemibiotrophic pathogens that switch from biotrophy to necrotrophy. We show how Gene Ontology (GO) terms developed by the Plant-Associated Microbe Gene Ontology (PAMGO) Consortium can be used for describing commonalities in nutrient acquisition among diverse plant symbionts. Where appropriate, parallels found among animal symbionts are also highlighted.
Subjects
Plants/microbiology; Symbiosis; Terminology as Topic; Animals; Bacteria/metabolism; Host-Pathogen Interactions; Mycorrhizae/metabolism; Nematoda/metabolism; Nutritional Physiological Phenomena; Oomycetes/metabolism; Plants/metabolism; Vocabulary, Controlled
ABSTRACT
Plant diseases caused by fungi and oomycetes result in significant economic losses every year. Although phylogenetically distant, the infection processes by these organisms share many common features. These include dispersal of an infectious particle, host adhesion, recognition, penetration, invasive growth, and lesion development. Previously, many of these common processes did not have corresponding Gene Ontology (GO) terms. For example, no GO terms existed to describe processes related to the appressorium, an important structure for infection by many fungi and oomycetes. In this mini-review, we identify common features of the pathogenic processes of fungi and oomycetes and create a pathogenesis model using 256 newly developed and 38 extant GO terms, with an emphasis on the appressorium and signal transduction. This set of standardized GO terms provides a solid base to further compare and contrast the molecular underpinnings of fungal and oomycete pathogenesis.
Subjects
Fungi/pathogenicity; Oomycetes/pathogenicity; Plant Diseases/microbiology; Plants/microbiology; Terminology as Topic; Host-Pathogen Interactions; Signal Transduction; Spores/pathogenicity; Vocabulary, Controlled
ABSTRACT
Manipulation of programmed cell death (PCD) is central to many host-microbe interactions. Both plant and animal cells use PCD as a powerful weapon against biotrophic pathogens, including viruses, which draw their nutrition from living tissue. Thus, diverse biotrophic pathogens have evolved many mechanisms to suppress PCD, and mutualistic and commensal microbes may employ similar mechanisms. Necrotrophic pathogens derive their nutrition from dead tissue, and many produce toxins specifically to trigger PCD in their hosts. Hemibiotrophic pathogens manipulate PCD in a most exquisite way, suppressing PCD during the biotrophic phase and stimulating it during the necrotrophic phase. This mini-review summarizes the mechanisms that have evolved in diverse microbes and hosts for controlling PCD and the Gene Ontology terms developed by the Plant-Associated Microbe Gene Ontology (PAMGO) Consortium for describing those mechanisms.
Subjects
Apoptosis; Host-Pathogen Interactions; Symbiosis; Terminology as Topic; Bacteria/metabolism; Bacteria/pathogenicity; Fungi/metabolism; Fungi/pathogenicity; Oomycetes/metabolism; Oomycetes/pathogenicity; Plant Diseases/genetics; Plant Diseases/microbiology; Viruses/metabolism; Viruses/pathogenicity; Vocabulary, Controlled
ABSTRACT
BACKGROUND: Microbial genetics has formed a foundation for understanding many aspects of biology. Systematic annotation that supports computational data mining should reveal further insights for microbes, microbiomes, and conserved functions beyond microbes. The Ontology of Microbial Phenotypes (OMP) was created to support such annotation. RESULTS: We define standards for an OMP-based annotation framework that supports the capture of a variety of phenotypes and provides flexibility for different levels of detail based on a combination of pre- and post-composition using OMP and other Open Biomedical Ontology (OBO) projects. A system for entering and viewing OMP annotations has been added to our online, public, web-based data portal. CONCLUSIONS: The annotation framework described here is ready to support projects to capture phenotypes from the experimental literature for a variety of microbes. Defining the OMP annotation standard should support the development of new software tools for data mining and analysis in comparative phenomics.
Subjects
Biological Ontologies; Data Curation/methods; Microbiology; Phenotype; Metadata
ABSTRACT
High-throughput studies constitute an essential and valued source of information for researchers. However, high-throughput experimental workflows are often complex, with multiple data sets that may contain large numbers of false positives. The representation of high-throughput data in the Gene Ontology (GO) therefore presents a challenging annotation problem, when the overarching goal of GO curation is to provide the most precise view of a gene's role in biology. To address this, representatives from annotation teams within the GO Consortium reviewed high-throughput data annotation practices. We present an annotation framework for high-throughput studies that will facilitate good standards in GO curation and, through the use of new high-throughput evidence codes, increase the visibility of these annotations to the research community.
Subjects
Databases, Genetic; Gene Ontology; Genomics/methods; Molecular Sequence Annotation/methods; Animals; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA
ABSTRACT
Six unique expressed sequence tag (EST) libraries were generated from four developmental stages of Phytophthora sojae P6497. RNA was extracted from mycelia, swimming zoospores, germinating cysts, and soybean (Glycine max (L.) Merr.) cv. Harosoy tissues heavily infected with P. sojae. Three libraries were created from mycelia growing on defined medium, complex medium, and nutrient-limited medium. The 26,943 high-quality sequences obtained clustered into 7,863 unigenes composed of 2,845 contigs and 5,018 singletons. The total number of P. sojae unigenes matching sequences in the genome assembly was 7,412 (94%). Of these unigenes, 7,088 (90%) matched gene models predicted from the P. sojae sequence assembly, but only 2,047 (26%) matched P. ramorum gene models. Analysis of EST frequency from different growth conditions and morphological stages revealed genes that were specific to or highly represented in particular growth conditions and life stages. Additionally, our results indicate that, during infection, the pathogen derives most of its carbon and energy via glycolysis of sugars in the plant. Sequences identified with putative roles in pathogenesis included avirulence homologs possessing the RxLR motif, elicitins, and hydrolytic enzymes. This large collection of P. sojae ESTs will serve as a valuable public genomic resource.
Subjects
Expressed Sequence Tags; Gene Expression Profiling; Genes, Fungal; Phytophthora/genetics; Cluster Analysis; Gene Library; Molecular Sequence Data; Phytophthora/growth & development; Sequence Analysis, DNA; Glycine max/microbiology
ABSTRACT
The Evidence and Conclusion Ontology (ECO) is a community resource for describing the various types of evidence that are generated during the course of a scientific study and which are typically used to support assertions made by researchers. ECO describes multiple evidence types, including evidence resulting from experimental (i.e., wet lab) techniques, evidence arising from computational methods, statements made by authors (whether or not supported by evidence), and inferences drawn by researchers curating the literature. In addition to summarizing the evidence that supports a particular assertion, ECO also offers a means to document whether a computer or a human performed the process of making the annotation. Incorporating ECO into an annotation system makes it possible to leverage the structure of the ontology such that associated data can be grouped hierarchically, users can select data associated with particular evidence types, and quality control pipelines can be optimized. Today, over 30 resources, including the Gene Ontology, use the Evidence and Conclusion Ontology to represent both evidence and how annotations are made.
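The hierarchical grouping and evidence-based selection described in the abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the term labels, the toy parent map, the gene names, and the helper functions are hypothetical and are not real ECO identifiers or any ECO project API.

```python
from collections import defaultdict

# Toy child -> parent hierarchy. Labels are illustrative only,
# not actual ECO term identifiers.
PARENT = {
    "experimental evidence": "evidence",
    "computational evidence": "evidence",
    "author statement": "evidence",
    "enzyme assay evidence": "experimental evidence",
    "sequence similarity evidence": "computational evidence",
}


def ancestors(term):
    """Return the term followed by all of its ancestors up to the root."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def select_by_evidence(annotations, evidence_type):
    """Keep annotations whose evidence term is evidence_type or a descendant of it."""
    return [a for a in annotations if evidence_type in ancestors(a["evidence"])]


def group_by_top_level(annotations):
    """Group gene names under the top-level evidence branch that covers them."""
    groups = defaultdict(list)
    for a in annotations:
        chain = ancestors(a["evidence"])
        # chain[-1] is the root; chain[-2] is the branch directly below it.
        branch = chain[-2] if len(chain) > 1 else chain[-1]
        groups[branch].append(a["gene"])
    return dict(groups)


# Hypothetical annotations: each pairs a gene with one evidence term.
ANNOTATIONS = [
    {"gene": "geneA", "evidence": "enzyme assay evidence"},
    {"gene": "geneB", "evidence": "sequence similarity evidence"},
    {"gene": "geneC", "evidence": "author statement"},
    {"gene": "geneD", "evidence": "experimental evidence"},
]

if __name__ == "__main__":
    print(select_by_evidence(ANNOTATIONS, "experimental evidence"))
    print(group_by_top_level(ANNOTATIONS))
```

Because selection walks up the hierarchy, a query for a broad evidence type also returns annotations recorded with a more specific child term, which is the practical benefit of storing evidence as ontology terms instead of free text.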