1.
Bioinformatics ; 35(9): 1562-1565, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30256906

ABSTRACT

MOTIVATION: Standardization and semantic alignment are considered among the major challenges for data integration in clinical research. The inclusion of the CDISC SDTM clinical data standard into tranSMART i2b2 via a guiding master ontology tree supports data sharing, visualization and exploration across datasets. RESULTS: We present here a schema for the organization of SDTM variables into the tranSMART i2b2 tree along with a script and test dataset to exemplify the mapping strategy. The eTRIKS master tree concept is demonstrated by making use of fictitious data generated for four patients, including 16 SDTM clinical domains. We describe how the usage of correct visit names and data labels can help to integrate multiple readouts per patient and avoid ETL crashes when running a tranSMART loading routine. AVAILABILITY AND IMPLEMENTATION: The eTRIKS Master Tree package and test datasets are publicly available at https://doi.org/10.5281/zenodo.1009098 and a functional demo installation at https://public.etriks.org/transmart/datasetExplorer/ under eTRIKS-Master Tree branch, where the discussed examples can be visualized.


Subjects
Information Storage and Retrieval , Data Accuracy , Data Collection , Humans , Information Dissemination
2.
Bioinformatics ; 31(10): 1655-62, 2015 May 15.
Article in English | MEDLINE | ID: mdl-25573920

ABSTRACT

MOTIVATION: The probability of effective treatment of cancer with a targeted therapeutic can be improved for patients with defined genotypes containing actionable mutations. To this end, many human cancer biobanks are integrating more tightly with genomic sequencing facilities and with those creating and maintaining patient-derived xenografts (PDX) and cell lines to provide renewable resources for translational research. RESULTS: To support the complex data management needs and workflows of several such biobanks, we developed Acquire. It is a robust, secure, web-based, database-backed open-source system that supports all major needs of a modern cancer biobank. Its modules allow for i) up-to-the-minute 'scoreboard' and graphical reporting of collections; ii) end user roles and permissions; iii) specimen inventory through caTissue Suite; iv) shipping forms for distribution of specimens to pathology, genomic analysis and PDX/cell line creation facilities; v) robust ad hoc querying; vi) molecular and cellular quality control metrics to track specimens' progress and quality; vii) public researcher request; viii) resource allocation committee distribution request review and oversight and ix) linkage to available derivatives of specimen.


Subjects
Biological Specimen Banks , Data Mining/methods , Information Storage and Retrieval/methods , Neoplasms , Quality Control , Software , Computational Biology/methods , Database Management Systems , Databases, Factual , Genomics , Humans , User-Computer Interface
3.
Physiol Genomics ; 44(17): 853-63, 2012 Sep 01.
Article in English | MEDLINE | ID: mdl-22786849

ABSTRACT

The nuclear receptor (NR) superfamily of ligand-regulated transcription factors directs ligand- and tissue-specific transcriptomes in myriad developmental, metabolic, immunological, and reproductive processes. The NR signaling field has generated a wealth of genome-wide expression data points, but due to deficits in their accessibility, annotation, and integration, the full potential of these studies has not yet been realized. We searched public gene expression databases and MEDLINE for global transcriptomic datasets relevant to NRs, their ligands, and coregulators. We carried out extensive, deep reannotation of the datasets using controlled vocabularies for RNA Source and regulating molecule and resolved disparate gene identifiers to official gene symbols to facilitate comparison of fold changes and their significance across multiple datasets. We assembled these data points into a database, Transcriptomine (http://www.nursa.org/transcriptomine), that allows for multiple, menu-driven querying strategies of this transcriptomic "superdataset," including single and multiple genes, Gene Ontology terms, disease terms, and uploaded custom gene lists. Experimental variables such as regulating molecule, RNA Source, as well as fold-change and P value cutoff values can be modified, and full data records can be either browsed or downloaded for downstream analysis. We demonstrate the utility of Transcriptomine as a hypothesis generation and validation tool using in silico and experimental use cases. Our resource empowers users to instantly and routinely mine the collective biology of millions of previously disparate transcriptomic data points. By incorporating future transcriptome-wide datasets in the NR signaling field, we anticipate Transcriptomine developing into a powerful resource for the NR- and other signal transduction research communities.


Subjects
Databases, Genetic , Internet , Receptors, Cytoplasmic and Nuclear/metabolism , Signal Transduction/genetics , Software , Transcriptome/genetics , Animals , Cell Differentiation/physiology , DNA Primers/genetics , Embryonic Stem Cells/cytology , Humans , Mice , Rats , Real-Time Polymerase Chain Reaction
4.
Learn Health Syst ; 3(1): e10076, 2019 Jan.
Article in English | MEDLINE | ID: mdl-31245598

ABSTRACT

The benefits of reusing EHR data for clinical research studies are numerous. They portend the opportunity to bring new therapies to patients sooner, potentially at a lower cost, and to accelerate learning health cycles through faster data acquisition in clinical research studies. Metrics have proven that time can be saved, workflow and processes streamlined, and data quality increased significantly. Pilot projects and now actual investigational trials used for regulatory submissions have shown that these benefits support the transformation of clinical research by leveraging EHRs for research. Panelists at a recent collaborative focused on bridging clinical research and clinical care offered varying perspectives on how the latest standards and technologies could be leveraged to facilitate data transfer from EHR systems into clinical research databases, as well as the associated improvements in data quality. Panelists also discussed other avenues to leverage EHR in clinical research. Improvements and exciting possibilities notwithstanding, much work remains. Data ownership and access, attention to metadata and structured data for data sharing, and broader adoption of global standards are key areas for collaboration. With the steady increase in adoption of EHRs around the world, this is an excellent time for all stakeholders to work together and create an environment such that EHRs can be used more readily for research. The capacity for research can thus be increased to provide more high-quality information that will contribute to rapid continuous learning health systems from which all patients can benefit.

5.
Sci Data ; 6(1): 252, 2019 10 31.
Article in English | MEDLINE | ID: mdl-31672983

ABSTRACT

Mining of integrated public transcriptomic and ChIP-Seq (cistromic) datasets can illuminate functions of mammalian cellular signaling pathways not yet explored in the research literature. Here, we designed a web knowledgebase, the Signaling Pathways Project (SPP), which incorporates community classifications of signaling pathway nodes (receptors, enzymes, transcription factors and co-nodes) and their cognate bioactive small molecules. We then mapped over 10,000 public transcriptomic or cistromic experiments to their pathway node or biosample of study. To enable prediction of pathway node-gene target transcriptional regulatory relationships through SPP, we generated consensus 'omics signatures, or consensomes, which ranked genes based on measures of their significant differential expression or promoter occupancy across transcriptomic or cistromic experiments mapped to a specific node family. Consensomes were validated using alignment with canonical literature knowledge, gene target-level integration of transcriptomic and cistromic data points, and in bench experiments confirming previously uncharacterized node-gene target regulatory relationships. To expose the SPP knowledgebase to researchers, a web browser interface was designed that accommodates numerous routine data mining strategies. SPP is freely accessible at https://www.signalingpathways.org .


Subjects
Databases, Factual , Signal Transduction , Animals , Humans , Knowledge Bases , Mammals , Transcriptome
6.
AMIA Jt Summits Transl Sci Proc ; 2017: 94-103, 2018.
Article in English | MEDLINE | ID: mdl-29888049

ABSTRACT

The Clinical Data Interchange Standards Consortium (CDISC) is a global non-profit standards development organization that creates consensus-based standards for clinical and translational research. Several of these standards are now required by regulators for electronic submissions of regulated clinical trials' data and by government funding agencies. These standards are free and open, available for download on the CDISC website as PDFs. While these documents are human readable, they are not amenable to ready use by electronic systems. CDISC launched the CDISC Shared Health And Research Electronic library (SHARE) to provide the standards metadata in machine-readable formats to facilitate the automated management and implementation of the standards. This paper describes how CDISC SHARE's standards can facilitate collecting, aggregating and analyzing standardized data from early design to end analysis, and its role as a central resource providing information systems with metadata that drives process automation including study setup and data pipelining.

7.
J Am Med Inform Assoc ; 24(2): 388-393, 2017 03 01.
Article in English | MEDLINE | ID: mdl-27413121

ABSTRACT

Although omics datasets represent valuable assets for hypothesis generation, model testing, and data validation, the infrastructure supporting their reuse lacks organization and consistency. Using nuclear receptor signaling transcriptomic datasets as proof of principle, we developed a model to improve the discoverability, accessibility, and citability of published omics datasets. Primary datasets were retrieved from archives, processed to extract data points, then subjected to metadata enrichment and gap filling. The resulting secondary datasets were exposed on responsive web pages to support mining of gene lists, discovery of related datasets, and single-click citation integration with popular reference managers. Automated processes were established to embed digital object identifier-driven links to the secondary datasets in associated journal articles, small molecule and gene-centric databases, and a dataset search engine. Our model creates multiple points of access to reprocessed and reannotated derivative datasets across the digital biomedical research ecosystem, promoting their visibility and usability across disparate research communities.


Subjects
Datasets as Topic , Transcriptome , Biomedical Research , Databases, Genetic , Genomics , Humans , Metadata
8.
Sci Signal ; 10(476), 2017 Apr 25.
Article in English | MEDLINE | ID: mdl-28442630

ABSTRACT

We previously developed a web tool, Transcriptomine, to explore expression profiling data sets involving small-molecule or genetic manipulations of nuclear receptor signaling pathways. We describe advances in biocuration, query interface design, and data visualization that enhance the discovery of uncharacterized biology in these pathways using this tool. Transcriptomine currently contains about 45 million data points encompassing more than 2000 experiments in a reference library of nearly 550 data sets retrieved from public archives and systematically curated. To make the underlying data points more accessible to bench biologists, we classified experimental small molecules and gene manipulations into signaling pathways and experimental tissues and cell lines into physiological systems and organs. Incorporation of these mappings into Transcriptomine enables the user to readily evaluate tissue-specific regulation of gene expression by nuclear receptor signaling pathways. Data points from animal and cell model experiments and from clinical data sets elucidate the roles of nuclear receptor pathways in gene expression events accompanying various normal and pathological cellular processes. In addition, data sets targeting non-nuclear receptor signaling pathways highlight transcriptional cross-talk between nuclear receptors and other signaling pathways. We demonstrate with specific examples how data points that exist in isolation in individual data sets validate each other when connected and made accessible to the user in a single interface. In summary, Transcriptomine allows bench biologists to routinely develop research hypotheses, validate experimental data, or model relationships between signaling pathways, genes, and tissues.


Subjects
Computational Biology/methods , Databases, Genetic , Gene Expression Regulation , Genes , Receptors, Cytoplasmic and Nuclear/genetics , Software , Transcriptome , Animals , Gene Expression Profiling , Gene Regulatory Networks , Humans , Internet , Organ Specificity , Receptors, Cytoplasmic and Nuclear/metabolism , Signal Transduction
9.
J Am Med Inform Assoc ; 24(5): 882-890, 2017 Sep 01.
Article in English | MEDLINE | ID: mdl-28339791

ABSTRACT

BACKGROUND: It is critical to integrate and analyze data from biological, translational, and clinical studies with data from health systems; however, electronic artifacts are stored in thousands of disparate systems that are often unable to readily exchange data. OBJECTIVE: To facilitate meaningful data exchange, a model that presents a common understanding of biomedical research concepts and their relationships with health care semantics is required. The Biomedical Research Integrated Domain Group (BRIDG) domain information model fulfills this need. Software systems created from BRIDG have shared meaning "baked in," enabling interoperability among disparate systems. For nearly 10 years, the Clinical Data Interchange Standards Consortium, the National Cancer Institute, the US Food and Drug Administration, and Health Level 7 International have been key stakeholders in developing BRIDG. METHODS: BRIDG is an open-source Unified Modeling Language-class model developed through use cases and harmonization with other models. RESULTS: With its 4+ releases, BRIDG includes clinical and now translational research concepts in its Common, Protocol Representation, Study Conduct, Adverse Events, Regulatory, Statistical Analysis, Experiment, Biospecimen, and Molecular Biology subdomains. INTERPRETATION: The model is a Clinical Data Interchange Standards Consortium, Health Level 7 International, and International Organization for Standardization standard that has been utilized in national and international standards-based software development projects. It will continue to mature and evolve in the areas of clinical imaging, pathology, ontology, and vocabulary support. BRIDG 4.1.1 and prior releases are freely available at https://bridgmodel.nci.nih.gov .


Subjects
Biomedical Research , Health Information Interoperability/standards , Semantic Web , Semantic Web/standards , Software , Terminology as Topic
10.
Sci Data ; 3: 160010, 2016 Feb 16.
Article in English | MEDLINE | ID: mdl-26882539

ABSTRACT

Genomic data sharing in cancer has been restricted to aggregate or controlled-access initiatives to protect the privacy of research participants. By limiting access to these data, it has been argued that the autonomy of individuals who decide to participate in data sharing efforts has been superseded and the utility of the data as research and educational tools reduced. In a pilot Open Access (OA) project from the CPRIT-funded Texas Cancer Research Biobank, many Texas cancer patients were willing to openly share genomic data from tumor and normal matched pair specimens. For the first time, genetic data from 7 human cancer cases with matched normal are freely available without any requirement for data use agreements or any major restriction, except that end users cannot attempt to re-identify the participants (http://txcrb.org/open.html).


Subjects
DNA, Neoplasm , Databases, Genetic , Genome, Human , Pancreatic Neoplasms/genetics , Access to Information , Biological Specimen Banks , Humans , Information Dissemination , Texas
11.
PLoS One ; 10(9): e0135615, 2015.
Article in English | MEDLINE | ID: mdl-26325041

ABSTRACT

Signaling pathways involving nuclear receptors (NRs), their ligands and coregulators, regulate tissue-specific transcriptomes in diverse processes, including development, metabolism, reproduction, the immune response and neuronal function, as well as in their associated pathologies. The Nuclear Receptor Signaling Atlas (NURSA) is a Consortium focused around a Hub website (www.nursa.org) that annotates and integrates diverse 'omics datasets originating from the published literature and NURSA-funded Data Source Projects (NDSPs). These datasets are then exposed to the scientific community on an Open Access basis through user-friendly data browsing and search interfaces. Here, we describe the redesign of the Hub, version 3.0, to deploy "Web 2.0" technologies and add richer, more diverse content. The Molecule Pages, which aggregate information relevant to NR signaling pathways from myriad external databases, have been enhanced to include resources for basic scientists, such as post-translational modification sites and targeting miRNAs, and for clinicians, such as clinical trials. A portal to NURSA's Open Access, PubMed-indexed journal Nuclear Receptor Signaling has been added to facilitate manuscript submissions. Datasets and information on reagents generated by NDSPs are available, as is information concerning periodic new NDSP funding solicitations. Finally, the new website integrates the Transcriptomine analysis tool, which allows for mining of millions of richly annotated public transcriptomic data points in the field, providing an environment for dataset re-use and citation, bench data validation and hypothesis generation. We anticipate that this new release of the NURSA database will have tangible, long term benefits for both basic and clinical research in this field.


Subjects
Atlases as Topic , Receptors, Cytoplasmic and Nuclear/physiology , Signal Transduction/physiology , Animals , Datasets as Topic , Humans , Information Dissemination , Internet
13.
PLoS One ; 8(6): e65961, 2013.
Article in English | MEDLINE | ID: mdl-23824211

ABSTRACT

Current efforts to understand antibiotic resistance on the whole genome scale tend to focus on known genes even as high throughput sequencing strategies uncover novel mechanisms. To identify genomic variations associated with antibiotic resistance, we employed a modified genome-wide association study; we sequenced genomic DNA from pools of E. coli clinical isolates with similar antibiotic resistance phenotypes using SOLiD technology to uncover single nucleotide polymorphisms (SNPs) unanimously conserved in each pool. The multidrug-resistant pools were genotypically similar to SMS-3-5, a previously sequenced multidrug-resistant isolate from a polluted environment. The similarity was evenly spread across the entire genome and not limited to plasmid or pathogenicity island loci. Among the pools of clinical isolates, genomic variation was concentrated adjacent to previously reported inversion and duplication differences between the SMS-3-5 isolate and the drug-susceptible laboratory strain, DH10B. SNPs that result in non-synonymous changes in gyrA (encoding the well-known S83L allele associated with fluoroquinolone resistance), mutM, ligB, and recG were unanimously conserved in every fluoroquinolone-resistant pool. Alleles of the latter three genes are tightly linked among most sequenced E. coli genomes, and had not been implicated in antibiotic resistance previously. The changes in these genes map to amino acid positions in alpha helices that are involved in DNA binding. Plasmid-encoded complementation of null strains with either allelic variant of mutM or ligB resulted in variable responses to ultraviolet light or hydrogen peroxide treatment as markers of induced DNA damage, indicating their importance in DNA metabolism and revealing a potential mechanism for fluoroquinolone resistance. Our approach uncovered evidence that additional DNA binding enzymes may contribute to fluoroquinolone resistance and further implicate environmental bacteria as a reservoir for antibiotic resistance.


Subjects
Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial/genetics , Escherichia coli/drug effects , Fluoroquinolones/pharmacology , Genotype , DNA, Bacterial/genetics , Escherichia coli/genetics , Microbial Sensitivity Tests , Polymorphism, Single Nucleotide
14.
Mol Endocrinol ; 26(10): 1660-74, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22902541

ABSTRACT

The proteome represents the identity, expression levels, interacting partners, and posttranslational modifications of proteins expressed within any given cell. Proteomic studies aim to census the quantitative and qualitative factors regulating the biological relationships of proteins acting in concert as functional cellular networks. In the field of endocrinology, proteomics has been of considerable value in determining the function and mechanism of action of endocrine signaling molecules in the cell membrane, cytoplasm, and nucleus and for the discovery of proteins as candidates for clinical biomarkers. The volume of data that can be generated by proteomics methodologies, up to gigabytes of data within a few hours, brings with it its own logistical hurdles and presents significant challenges to realizing the full potential of these datasets. In this minireview, we describe selected current proteomics methodologies and their application in basic and translational endocrinology before focusing on mass spectrometry as a model for current progress and challenges in data analysis, management, sharing, and integration.


Subjects
Information Management , Information Storage and Retrieval , Animals , Data Interpretation, Statistical , Humans , Information Dissemination , Mass Spectrometry , Protein Array Analysis , Proteome/chemistry , Proteome/metabolism , Proteomics , Signal Transduction , Two-Dimensional Difference Gel Electrophoresis
15.
Mol Endocrinol ; 26(10): 1675-81, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22734043

ABSTRACT

The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) supports multiple basic science consortia that generate high-content datasets, reagent resources, and methodologies in the fields of kidney, urology, hematology, digestive, and endocrine diseases, as well as metabolic diseases such as diabetes and obesity. These currently include the Beta Cell Biology Consortium, the Nuclear Receptor Signaling Atlas, the Diabetic Complications Consortium, and the Mouse Metabolic Phenotyping Centers. Recognizing the synergy that would accrue from aggregating information generated and curated by these initiatives in a contiguous informatics network, we created the NIDDK Consortium Interconnectivity Network (dkCOIN; www.dkcoin.org). The goal of this pilot project, organized by the NIDDK, was to establish a single point of access to a toolkit of interconnected resources (datasets, reagents, and protocols) generated from individual consortia that could be readily accessed by biologists of diverse backgrounds and research interests. During the pilot phase of this activity, dkCOIN collected nearly 2000 consortium-curated resources, including datasets (functional genomics) and reagents (mouse strains, antibodies, and adenoviral constructs), and built nearly 3000 resource-to-resource connections, thereby demonstrating the feasibility of further extending this database in the future. Thus, dkCOIN promises to be a useful informatics solution for rapidly identifying useful resources generated by participating research consortia.


Subjects
Information Dissemination , Information Management , Academies and Institutes , Animals , Data Collection , Databases, Factual/statistics & numerical data , Humans , Internet , Mice , National Institute of Diabetes and Digestive and Kidney Diseases (U.S.) , Pilot Projects , Research , United States