Search | BVS Colombia Search Portal

Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation.

Schäffer, Alejandro A; McVeigh, Richard; Robbertse, Barbara; Schoch, Conrad L; Johnston, Anjanette; Underwood, Beverly A; Karsch-Mizrachi, Ilene; Nawrocki, Eric P.

BMC Bioinformatics ; 22(1): 400, 2021 Aug 12.

Article in English | MEDLINE | ID: mdl-34384346

ABSTRACT

BACKGROUND: The DNA sequences encoding ribosomal RNA genes (rRNAs) are commonly used as markers to identify species, including in metagenomics samples that may combine many organismal communities. The 16S small subunit ribosomal RNA (SSU rRNA) gene is typically used to identify bacterial and archaeal species. The nuclear 18S SSU rRNA gene and 28S large subunit (LSU) rRNA gene have been used as DNA barcodes and for phylogenetic studies in different eukaryote taxonomic groups. Because of their popularity, the National Center for Biotechnology Information (NCBI) receives a disproportionate number of rRNA sequence submissions and BLAST queries. These sequences vary in quality, length, origin (nuclear, mitochondrial, plastid), and organism source and can represent any region of the ribosomal cistron.

RESULTS: To improve the timely verification of quality, origin and loci boundaries, we developed Ribovore, a software package for sequence analysis of rRNA sequences. The ribotyper and ribosensor programs are used to validate incoming sequences of bacterial and archaeal SSU rRNA. The ribodbmaker program is used to create high-quality datasets of rRNAs from different taxonomic groups. Key algorithmic steps include comparing candidate sequences against rRNA sequence profile hidden Markov models (HMMs) and covariance models of rRNA sequence and secondary-structure conservation, as well as other tests. Nine freely available blastn rRNA databases created and maintained with Ribovore are used for checking incoming GenBank submissions and used by the blastn browser interface at NCBI. Since 2018, Ribovore has been used to analyze more than 50 million prokaryotic SSU rRNA sequences submitted to GenBank, and to select at least 10,435 fungal rRNA RefSeq records from type material of 8,350 taxa.

CONCLUSION: Ribovore combines single-sequence and profile-based methods to improve GenBank processing and analysis of rRNA sequences. It is a standalone, portable, and extensible software package for the alignment, classification and validation of rRNA sequences. Researchers planning on submitting SSU rRNA sequences to GenBank are encouraged to download and use Ribovore to analyze their sequences prior to submission to determine which sequences are likely to be automatically accepted into GenBank.

Subject(s)

Nucleic Acid Databases, Ribosomal RNA, Ribosomal DNA, Phylogeny, 16S Ribosomal RNA/genetics, 18S Ribosomal RNA/genetics, RNA Sequence Analysis

Future-proofing and maximizing the utility of metadata: The PHA4GE SARS-CoV-2 contextual data specification package.

Griffiths, Emma J; Timme, Ruth E; Mendes, Catarina Inês; Page, Andrew J; Alikhan, Nabil-Fareed; Fornika, Dan; Maguire, Finlay; Campos, Josefina; Park, Daniel; Olawoye, Idowu B; Oluniyi, Paul E; Anderson, Dominique; Christoffels, Alan; da Silva, Anders Gonçalves; Cameron, Rhiannon; Dooley, Damion; Katz, Lee S; Black, Allison; Karsch-Mizrachi, Ilene; Barrett, Tanya; Johnston, Anjanette; Connor, Thomas R; Nicholls, Samuel M; Witney, Adam A; Tyson, Gregory H; Tausch, Simon H; Raphenya, Amogelang R; Alcock, Brian; Aanensen, David M; Hodcroft, Emma; Hsiao, William W L; Vasconcelos, Ana Tereza R; MacCannell, Duncan R.

Gigascience ; 11, 2022 Feb 16.

Article in English | MEDLINE | ID: mdl-35169842

ABSTRACT

BACKGROUND: The Public Health Alliance for Genomic Epidemiology (PHA4GE) (https://pha4ge.org) is a global coalition that is actively working to establish consensus standards, document and share best practices, improve the availability of critical bioinformatics tools and resources, and advocate for greater openness, interoperability, accessibility, and reproducibility in public health microbial bioinformatics. In the face of the current pandemic, PHA4GE has identified a need for a fit-for-purpose, open-source SARS-CoV-2 contextual data standard.

RESULTS: As such, we have developed a SARS-CoV-2 contextual data specification package based on harmonizable, publicly available community standards. The specification can be implemented via a collection template, as well as an array of protocols and tools to support both the harmonization and submission of sequence data and contextual information to public biorepositories.

CONCLUSIONS: Well-structured, rich contextual data add value, promote reuse, and enable aggregation and integration of disparate datasets. Adoption of the proposed standard and practices will better enable interoperability between datasets and systems, improve the consistency and utility of generated data, and ultimately facilitate novel insights and discoveries in SARS-CoV-2 and COVID-19. The package is now supported by the NCBI's BioSample database.

Subject(s)

COVID-19, SARS-CoV-2, Genomics, Humans, Metadata, Public Health, Reproducibility of Results

Microbiome Metadata Standards: Report of the National Microbiome Data Collaborative's Workshop and Follow-On Activities.

Vangay, Pajau; Burgin, Josephine; Johnston, Anjanette; Beck, Kristen L; Berrios, Daniel C; Blumberg, Kai; Canon, Shane; Chain, Patrick; Chandonia, John-Marc; Christianson, Danielle; Costes, Sylvain V; Damerow, Joan; Duncan, William D; Dundore-Arias, Jose Pablo; Fagnan, Kjiersten; Galazka, Jonathan M; Gibbons, Sean M; Hays, David; Hervey, Judson; Hu, Bin; Hurwitz, Bonnie L; Jaiswal, Pankaj; Joachimiak, Marcin P; Kinkel, Linda; Ladau, Joshua; Martin, Stanton L; McCue, Lee Ann; Miller, Kayd; Mouncey, Nigel; Mungall, Chris; Pafilis, Evangelos; Reddy, T B K; Richardson, Lorna; Roux, Simon; Schriml, Lynn M; Shaffer, Justin P; Sundaramurthi, Jagadish Chandrabose; Thompson, Luke R; Timme, Ruth E; Zheng, Jie; Wood-Charlson, Elisha M; Eloe-Fadrosh, Emiley A.

mSystems ; 6(1), 2021 Feb 23.

Article in English | MEDLINE | ID: mdl-33622857

ABSTRACT

Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical. Important contributions have been made in the development of community-driven metadata standards; however, these standards have not been uniformly embraced by the microbiome research community. To understand how these standards are being adopted, or the barriers to adoption, across research domains, institutions, and funding agencies, the National Microbiome Data Collaborative (NMDC) hosted a workshop in October 2019. This report provides a summary of discussions that took place throughout the workshop, as well as outcomes of the working groups initiated at the workshop.

Plant specimen contextual data consensus.

Ten Hoopen, Petra; Walls, Ramona L; Cannon, Ethalinda K S; Cochrane, Guy; Cole, James; Johnston, Anjanette; Karsch-Mizrachi, Ilene; Yilmaz, Pelin.

Gigascience ; 5(1): 1-4, 2016 Dec 01.

Article in English | MEDLINE | ID: mdl-28369359

ABSTRACT

The Compliance and Interoperability Working Group of the Genomic Standards Consortium facilitates the establishment of a community of experts and the development of recommendations to describe genomic data and associated information. Here we present our ongoing effort to harmonise the reporting of contextual plant specimen data associated with genomics and functional genomics. This commentary summarises the current state of our plant sample contextual data harmonisation efforts to engage a broad plant science community.

Subject(s)

Plant Genome, Genomics/standards, Metadata, Plants/genetics, Scientific Societies, Genomics/methods

Correction for Vangay et al., "Microbiome Metadata Standards: Report of the National Microbiome Data Collaborative's Workshop and Follow-On Activities".

Vangay, Pajau; Burgin, Josephine; Johnston, Anjanette; Beck, Kristen L; Berrios, Daniel C; Blumberg, Kai; Canon, Shane; Chain, Patrick; Chandonia, John-Marc; Christianson, Danielle; Costes, Sylvain V; Damerow, Joan; Duncan, William D; Dundore-Arias, Jose Pablo; Fagnan, Kjiersten; Galazka, Jonathan M; Gibbons, Sean M; Hays, David; Hervey, Judson; Hu, Bin; Hurwitz, Bonnie L; Jaiswal, Pankaj; Joachimiak, Marcin P; Kinkel, Linda; Ladau, Joshua; Martin, Stanton L; McCue, Lee Ann; Miller, Kayd; Mouncey, Nigel; Mungall, Chris; Pafilis, Evangelos; Reddy, T B K; Richardson, Lorna; Roux, Simon; Schriml, Lynn M; Shaffer, Justin P; Sundaramurthi, Jagadish Chandrabose; Thompson, Luke R; Timme, Ruth E; Zheng, Jie; Wood-Charlson, Elisha M; Eloe-Fadrosh, Emiley A.

mSystems ; 6(3), 2021 May 04.

Article in English | MEDLINE | ID: mdl-33947809

Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications.

Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn; Knight, Rob; Cole, James R; Amaral-Zettler, Linda; Gilbert, Jack A; Karsch-Mizrachi, Ilene; Johnston, Anjanette; Cochrane, Guy; Vaughan, Robert; Hunter, Christopher; Park, Joonhong; Morrison, Norman; Rocca-Serra, Philippe; Sterk, Peter; Arumugam, Manimozhiyan; Bailey, Mark; Baumgartner, Laura; Birren, Bruce W; Blaser, Martin J; Bonazzi, Vivien; Booth, Tim; Bork, Peer; Bushman, Frederic D; Buttigieg, Pier Luigi; Chain, Patrick S G; Charlson, Emily; Costello, Elizabeth K; Huot-Creasy, Heather; Dawyndt, Peter; DeSantis, Todd; Fierer, Noah; Fuhrman, Jed A; Gallery, Rachel E; Gevers, Dirk; Gibbs, Richard A; San Gil, Inigo; Gonzalez, Antonio; Gordon, Jeffrey I; Guralnick, Robert; Hankeln, Wolfgang; Highlander, Sarah; Hugenholtz, Philip; Jansson, Janet; Kau, Andrew L; Kelley, Scott T; Kennedy, Jerry; Knights, Dan; Koren, Omry.

Nat Biotechnol ; 29(5): 415-20, 2011 May.

Article in English | MEDLINE | ID: mdl-21552244

ABSTRACT

Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences: the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environmental packages' apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.

Subject(s)

Biomarkers, Environment, Metagenomics/standards, DNA Sequence Analysis/standards, Checklist, Genetic Databases, rRNA Genes, Genetic Variation, Humans, Information Storage and Retrieval/standards, Internet, Programming Languages, Software
