Search | Biblioteca Virtual em Saúde (Virtual Health Library)

NCI Cancer Research Data Commons: Core Standards and Services.

Brady, Arthur; Charbonneau, Amanda; Grossman, Robert L; Creasy, Heather H; Renner, Robinette; Pihl, Todd; Otridge, John; Kim, Erika; Barnholtz-Sloan, Jill S; Kerlavage, Anthony R.

Cancer Res ; 84(9): 1384-1387, 2024 May 02.

Article in English | MEDLINE | ID: mdl-38488505

ABSTRACT

The NCI Cancer Research Data Commons (CRDC) is a collection of data commons, analysis platforms, and tools that make existing cancer data more findable and accessible by the cancer research community. In practice, the two biggest hurdles to finding and using data for discovery are the wide variety of models and ontologies used to describe data, and the dispersed storage of that data. Here, we outline core CRDC services to aggregate descriptive information from multiple studies for findability via a single interface and to provide a single access method that spans multiple data commons. See related articles by Wang et al., p. 1388, Pot et al., p. 1396, and Kim et al., p. 1404.

Subjects

National Cancer Institute (U.S.); Neoplasms; Humans; United States; Neoplasms/therapy; Biomedical Research/standards; Databases, Factual

Matrix and analysis metadata standards (MAMS) to facilitate harmonization and reproducibility of single-cell data.

Wang, Yichen; Sarfraz, Irzam; Teh, Wei Kheng; Sokolov, Artem; Herb, Brian R; Creasy, Heather H; Virshup, Isaac; Dries, Ruben; Degatano, Kylee; Mahurkar, Anup; Schnell, Daniel J; Madrigal, Pedro; Hilton, Jason; Gehlenborg, Nils; Tickle, Timothy; Campbell, Joshua D.

bioRxiv ; 2023 Mar 07.

Article in English | MEDLINE | ID: mdl-36945543

ABSTRACT

A large number of genomic and imaging datasets are being produced by consortia that seek to characterize healthy and disease tissues at single-cell resolution. While much effort has been devoted to capturing information related to biospecimen information and experimental procedures, the metadata standards that describe data matrices and the analysis workflows that produced them are relatively lacking. Detailed metadata schema related to data analysis are needed to facilitate sharing and interoperability across groups and to promote data provenance for reproducibility. To address this need, we developed the Matrix and Analysis Metadata Standards (MAMS) to serve as a resource for data coordinating centers and tool developers. We first curated several simple and complex "use cases" to characterize the types of feature-observation matrices (FOMs), annotations, and analysis metadata produced in different workflows. Based on these use cases, metadata fields were defined to describe the data contained within each matrix including those related to processing, modality, and subsets. Suggested terms were created for the majority of fields to aid in harmonization of metadata terms across groups. Additional provenance metadata fields were also defined to describe the software and workflows that produced each FOM. Finally, we developed a simple list-like schema that can be used to store MAMS information and implemented in multiple formats. Overall, MAMS can be used as a guide to harmonize analysis-related metadata which will ultimately facilitate integration of datasets across tools and consortia. MAMS specifications, use cases, and examples can be found at https://github.com/single-cell-mams/mams/.

The Neuroscience Multi-Omic Archive: a BRAIN Initiative resource for single-cell transcriptomic and epigenomic data from the mammalian brain.

Ament, Seth A; Adkins, Ricky S; Carter, Robert; Chrysostomou, Elena; Colantuoni, Carlo; Crabtree, Jonathan; Creasy, Heather H; Degatano, Kylee; Felix, Victor; Gandt, Peter; Garden, Gwenn A; Giglio, Michelle; Herb, Brian R; Khajouei, Farzaneh; Kiernan, Elizabeth; McCracken, Carrie; McDaniel, Kennedy; Nadendla, Suvarna; Nickel, Lance; Olley, Dustin; Orvis, Joshua; Receveur, Joseph P; Schor, Mike; Sonthalia, Shreyash; Tickle, Timothy L; Way, Jessica; Hertzano, Ronna; Mahurkar, Anup A; White, Owen R.

Nucleic Acids Res ; 51(D1): D1075-D1085, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36318260

ABSTRACT

Scalable technologies to sequence the transcriptomes and epigenomes of single cells are transforming our understanding of cell types and cell states. The Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative Cell Census Network (BICCN) is applying these technologies at unprecedented scale to map the cell types in the mammalian brain. In an effort to increase data FAIRness (Findable, Accessible, Interoperable, Reusable), the NIH has established repositories to make data generated by the BICCN and related BRAIN Initiative projects accessible to the broader research community. Here, we describe the Neuroscience Multi-Omic Archive (NeMO Archive; nemoarchive.org), which serves as the primary repository for genomics data from the BRAIN Initiative. Working closely with other BRAIN Initiative researchers, we have organized these data into a continually expanding, curated repository, which contains transcriptomic and epigenomic data from over 50 million brain cells, including single-cell genomic data from all of the major regions of the adult and prenatal human and mouse brains, as well as substantial single-cell genomic data from non-human primates. We make available several tools for accessing these data, including a searchable web portal, a cloud-computing interface for large-scale data processing (implemented on Terra, terra.bio), and a visualization and analysis platform, NeMO Analytics (nemoanalytics.org).

Subjects

Brain; Databases, Genetic; Epigenomics; Multiomics; Transcriptome; Animals; Mice; Genomics; Mammals; Primates; Brain/cytology; Brain/metabolism

Making Common Fund data more findable: catalyzing a data ecosystem.

Charbonneau, Amanda L; Brady, Arthur; Czajkowski, Karl; Aluvathingal, Jain; Canchi, Saranya; Carter, Robert; Chard, Kyle; Clarke, Daniel J B; Crabtree, Jonathan; Creasy, Heather H; D'Arcy, Mike; Felix, Victor; Giglio, Michelle; Gingrich, Alicia; Harris, Rayna M; Hodges, Theresa K; Ifeonu, Olukemi; Jeon, Minji; Kropiwnicki, Eryk; Lim, Marisa C W; Liming, R Lee; Lumian, Jessica; Mahurkar, Anup A; Mandal, Meisha; Munro, James B; Nadendla, Suvarna; Richter, Rudyard; Romano, Cia; Rocca-Serra, Philippe; Schor, Michael; Schuler, Robert E; Tangmunarunkit, Hongsuda; Waldrop, Alex; Williams, Cris; Word, Karen; Sansone, Susanna-Assunta; Ma'ayan, Avi; Wagner, Rick; Foster, Ian; Kesselman, Carl; Brown, C Titus; White, Owen.

Gigascience ; 11, 2022 11 21.

Article in English | MEDLINE | ID: mdl-36409836

ABSTRACT

The Common Fund Data Ecosystem (CFDE) has created a flexible system of data federation that enables researchers to discover datasets from across the US National Institutes of Health Common Fund without requiring that data owners move, reformat, or rehost those data. This system is centered on a catalog that integrates detailed descriptions of biomedical datasets from individual Common Fund Programs' Data Coordination Centers (DCCs) into a uniform metadata model that can then be indexed and searched from a centralized portal. This Crosscut Metadata Model (C2M2) supports the wide variety of data types and metadata terms used by individual DCCs and can readily describe nearly all forms of biomedical research data. We detail its use to ingest and index data from 11 DCCs.

Subjects

Ecosystem; Financial Management; Metadata

Erratum: Strains, functions and dynamics in the expanded Human Microbiome Project.

Lloyd-Price, Jason; Mahurkar, Anup; Rahnavard, Gholamali; Crabtree, Jonathan; Orvis, Joshua; Hall, A Brantley; Brady, Arthur; Creasy, Heather H; McCracken, Carrie; Giglio, Michelle G; McDonald, Daniel; Franzosa, Eric A; Knight, Rob; White, Owen; Huttenhower, Curtis.

Nature ; 551(7679): 256, 2017 11 09.

Article in English | MEDLINE | ID: mdl-29022944

ABSTRACT

This corrects the article DOI: 10.1038/nature23889.

Strains, functions and dynamics in the expanded Human Microbiome Project.

Nature ; 550(7674): 61-66, 2017 10 05.

Article in English | MEDLINE | ID: mdl-28953883

ABSTRACT

The characterization of baseline microbial and functional diversity in the human microbiome has enabled studies of microbiome-related disease, diversity, biogeography, and molecular function. The National Institutes of Health Human Microbiome Project has provided one of the broadest such characterizations so far. Here we introduce a second wave of data from the study, comprising 1,631 new metagenomes (2,355 total) targeting diverse body sites with multiple time points in 265 individuals. We applied updated profiling and assembly methods to provide new characterizations of microbiome personalization. Strain identification revealed subspecies clades specific to body sites; it also quantified species with phylogenetic diversity under-represented in isolate genomes. Body-wide functional profiling classified pathways into universal, human-enriched, and body site-enriched subsets. Finally, temporal analysis decomposed microbial variation into rapidly variable, moderately variable, and stable subsets. This study furthers our knowledge of baseline human microbial diversity and enables an understanding of personalized microbiome function and dynamics.

Subjects

Microbiota/physiology; Phylogeny; Datasets as Topic; Humans; Metagenome/genetics; Metagenome/physiology; Microbiota/genetics; Molecular Sequence Annotation; National Institutes of Health (U.S.); Organ Specificity; Spatio-Temporal Analysis; Time Factors; United States

The IGS Standard Operating Procedure for Automated Prokaryotic Annotation.

Galens, Kevin; Orvis, Joshua; Daugherty, Sean; Creasy, Heather H; Angiuoli, Sam; White, Owen; Wortman, Jennifer; Mahurkar, Anup; Giglio, Michelle Gwinn.

Stand Genomic Sci ; 4(2): 244-51, 2011 Apr 29.

Article in English | MEDLINE | ID: mdl-21677861

ABSTRACT

The Institute for Genome Sciences (IGS) has developed a prokaryotic annotation pipeline that is used for coding gene/RNA prediction and functional annotation of Bacteria and Archaea. The fully automated pipeline accepts one or many genomic sequences as input and produces output in a variety of standard formats. Functional annotation is primarily based on similarity searches and motif finding combined with a hierarchical rule based annotation system. The output annotations can also be loaded into a relational database and accessed through visualization tools.
