Results 1 - 9 of 9
1.
Biol Psychiatry ; 86(5): 365-376, 2019 Sep 01.
Article in English | MEDLINE | ID: mdl-31151762

ABSTRACT

BACKGROUND: Habitual alcohol use can be an indicator of alcohol dependence, which is associated with a wide range of serious health problems. METHODS: We completed a genome-wide association study in 126,936 European American and 17,029 African American subjects in the Veterans Affairs Million Veteran Program for a quantitative phenotype based on maximum habitual alcohol consumption. RESULTS: ADH1B, on chromosome 4, was the lead locus for both populations: for the European American sample, rs1229984 (p = 4.9 × 10⁻⁴⁷); for African American, rs2066702 (p = 2.3 × 10⁻¹²). In the European American sample, we identified three additional genome-wide-significant maximum habitual alcohol consumption loci: on chromosome 17, rs77804065 (p = 1.5 × 10⁻¹²), at CRHR1 (corticotropin-releasing hormone receptor 1); the protein product of this gene is involved in stress and immune responses; and on chromosomes 8 and 10. European American and African American samples were then meta-analyzed; the associated region at CRHR1 increased in significance to p = 1.02 × 10⁻¹³, and we identified two additional genome-wide significant loci, FGF14 (p = 9.86 × 10⁻⁹) (chromosome 13) and a locus on chromosome 11. Besides ADH1B, none of the five loci have prior genome-wide significant support. Post-genome-wide association study analysis identified genetic correlation to other alcohol-related traits, smoking-related traits, and many others. Replications were observed in UK Biobank data. Genetic correlation between maximum habitual alcohol consumption and alcohol dependence was 0.87 (p = 4.78 × 10⁻⁹). Enrichment for cell types included dopaminergic and gamma-aminobutyric acidergic neurons in midbrain, and pancreatic delta cells. CONCLUSIONS: The present study supports five novel alcohol-use risk loci, with particularly strong statistical support for CRHR1. Additionally, we provide novel insight regarding the biology of harmful alcohol use.
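
As a quick check on the figures reported in this abstract, the sketch below compares each reported p-value with the conventional genome-wide significance cutoff of 5 × 10⁻⁸. The cutoff is an assumption: the abstract does not say which threshold the authors applied. The dictionary layout and function name are illustrative only.

```python
# Conventional genome-wide significance cutoff.
# ASSUMPTION: the abstract does not state the threshold the authors used.
GENOME_WIDE_THRESHOLD = 5e-8

# p-values copied from the abstract, keyed by (sample, locus).
reported = {
    ("European American", "ADH1B rs1229984"): 4.9e-47,
    ("African American", "ADH1B rs2066702"): 2.3e-12,
    ("European American", "CRHR1 rs77804065"): 1.5e-12,
    ("Meta-analysis", "CRHR1"): 1.02e-13,
    ("Meta-analysis", "FGF14"): 9.86e-9,
}


def is_genome_wide_significant(p, threshold=GENOME_WIDE_THRESHOLD):
    """Return True when p falls below the significance threshold."""
    return p < threshold


for (sample, locus), p in reported.items():
    flag = "yes" if is_genome_wide_significant(p) else "no"
    print(f"{sample:18s} {locus:18s} p={p:.2e} significant={flag}")
```

Under this assumed cutoff, every p-value listed in the abstract counts as genome-wide significant, including the FGF14 locus, which has the weakest support of the five.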


Subjects
Alcohol Drinking/genetics , Black or African American/statistics & numerical data , Receptors, Corticotropin-Releasing Hormone/genetics , White People/statistics & numerical data , Adolescent , Adult , Aged , Aged, 80 and over , Alcohol Drinking/ethnology , Alcoholism/ethnology , Alcoholism/genetics , Female , Genome-Wide Association Study , Humans , Linear Models , Male , Middle Aged , United States , Veterans , Young Adult
2.
Front Immunol ; 9: 1877, 2018.
Article in English | MEDLINE | ID: mdl-30166985

ABSTRACT

The adaptation of high-throughput sequencing to the B cell receptor and T cell receptor has made it possible to characterize the adaptive immune receptor repertoire (AIRR) at unprecedented depth. These AIRR sequencing (AIRR-seq) studies offer tremendous potential to increase the understanding of adaptive immune responses in vaccinology, infectious disease, autoimmunity, and cancer. The increasingly wide application of AIRR-seq is leading to a critical mass of studies being deposited in the public domain, offering the possibility of novel scientific insights through secondary analyses and meta-analyses. However, effective sharing of these large-scale data remains a challenge. The AIRR community has proposed minimal information about adaptive immune receptor repertoire (MiAIRR), a standard for reporting AIRR-seq studies. The MiAIRR standard has been operationalized using the National Center for Biotechnology Information (NCBI) repositories. Submissions of AIRR-seq data to the NCBI repositories typically use a combination of web-based and flat-file templates and include only a minimal amount of terminology validation. As a result, AIRR-seq studies at the NCBI are often described using inconsistent terminologies, limiting scientists' ability to access, find, interoperate, and reuse the data sets. In order to improve metadata quality and ease submission of AIRR-seq studies to the NCBI, we have leveraged the software framework developed by the Center for Expanded Data Annotation and Retrieval (CEDAR), which develops technologies involving the use of data standards and ontologies to improve metadata quality. The resulting CEDAR-AIRR (CAIRR) pipeline enables data submitters to: (i) create web-based templates whose entries are controlled by ontology terms, (ii) generate and validate metadata, and (iii) submit the ontology-linked metadata and sequence files (FASTQ) to the NCBI BioProject, BioSample, and Sequence Read Archive databases. 
Overall, CAIRR provides a web-based metadata submission interface that supports compliance with the MiAIRR standard. This pipeline is available at http://cairr.miairr.org, and will facilitate the NCBI submission process and improve the metadata quality of AIRR-seq studies.
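
The idea of template entries controlled by ontology terms can be illustrated with a minimal sketch. This is not the CAIRR or CEDAR implementation: the field names, the allowed terms, and the `validate` function are all hypothetical, chosen only to show how a controlled vocabulary rejects free-text values before submission.

```python
# HYPOTHETICAL template: each field maps to its set of allowed (controlled) terms.
# Real MiAIRR/CEDAR templates define their own fields and ontology bindings.
template = {
    "organism": {"Homo sapiens", "Mus musculus"},
    "cell_subset": {"B cell", "T cell"},
}


def validate(record, template):
    """Return a list of error strings; an empty list means the record passes."""
    errors = []
    for field, allowed in template.items():
        value = record.get(field)
        if value is None:
            errors.append(f"missing field: {field}")
        elif value not in allowed:
            errors.append(f"{field}: {value!r} is not a controlled term")
    return errors


good = {"organism": "Homo sapiens", "cell_subset": "B cell"}
bad = {"organism": "human"}  # free-text synonym, and cell_subset is absent

print(validate(good, template))
print(validate(bad, template))
```

Rejecting the free-text value "human" in favor of a single controlled term is the kind of check that keeps terminology consistent across submissions.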


Subjects
Computational Biology/methods , Databases, Nucleic Acid , Receptors, Antigen, B-Cell/genetics , Receptors, Antigen, T-Cell/genetics , Software , Computational Biology/organization & administration , Data Mining , Gene Ontology , Humans , Metadata , Reproducibility of Results , User-Computer Interface , Workflow
3.
AMIA Jt Summits Transl Sci Proc ; 2017: 295-301, 2017.
Article in English | MEDLINE | ID: mdl-28815144

ABSTRACT

This paper describes a natural language processing (NLP)-based clinical decision support (CDS) system that is geared towards colon cancer care coordinators as the end users. The system is implemented using a metadata-driven Structured Query Language (SQL) function (discriminant function). For our pilot study, we have developed a training corpus consisting of 2,085 pathology reports from the VA Connecticut Health Care System (VACHS). We categorized reports as "actionable" (requiring close follow-up) or "non-actionable" (requiring standard or no follow-up). We then used 600 distinct pathology reports from 6 different VA sites as our test corpus. Analysis of our test corpus shows that our NLP approach yields 98.5% accuracy in identifying cases that required close clinical follow-up. By integrating this into our cancer care tracking system, our goal is to ensure that patients with worrisome pathology receive appropriate and timely follow-up and care.
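
One way to picture a metadata-driven SQL discriminant function is sketched below with SQLite. It is an assumption-laden illustration, not the system described in the abstract: the table layout, the terms, the weights, and the threshold are all invented. Keeping the rules as rows rather than as code means the classifier can be changed by editing data.

```python
import sqlite3


def classify_reports(reports, rules, threshold=1.0):
    """Label each report 'actionable' or 'non-actionable'.

    reports: iterable of (id, text); rules: iterable of (term, weight).
    HYPOTHETICAL scheme: a report's score is the sum of the weights of the
    terms it contains; scores at or above the threshold are 'actionable'.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE report (id INTEGER PRIMARY KEY, body TEXT)")
    con.execute("CREATE TABLE rule (term TEXT, weight REAL)")
    con.executemany("INSERT INTO report VALUES (?, ?)", reports)
    con.executemany("INSERT INTO rule VALUES (?, ?)", rules)
    rows = con.execute(
        """
        SELECT r.id,
               CASE WHEN COALESCE(SUM(k.weight), 0) >= ?
                    THEN 'actionable' ELSE 'non-actionable' END
        FROM report AS r
        LEFT JOIN rule AS k
               ON instr(lower(r.body), lower(k.term)) > 0
        GROUP BY r.id
        """,
        (threshold,),
    ).fetchall()
    con.close()
    return dict(rows)


# Invented example data, for illustration only.
rules = [("adenocarcinoma", 1.0), ("high-grade dysplasia", 1.0), ("negative for", -1.0)]
reports = [
    (1, "Invasive adenocarcinoma of the sigmoid colon."),
    (2, "Hyperplastic polyp; negative for dysplasia."),
]
labels = classify_reports(reports, rules)
print(labels)
```

Substring matching with a negative weight is a crude stand-in for negation handling; a production system would need considerably more careful text processing.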

4.
BMC Bioinformatics ; 15: 231, 2014 Jul 03.
Article in English | MEDLINE | ID: mdl-24990767

ABSTRACT

BACKGROUND: Current research suggests that a small set of "driver" mutations are responsible for tumorigenesis while a larger body of "passenger" mutations occur in the tumor but do not progress the disease. Due to recent pharmacological successes in treating cancers caused by driver mutations, a variety of methodologies that attempt to identify such mutations have been developed. Based on the hypothesis that driver mutations tend to cluster in key regions of the protein, the development of cluster identification algorithms has become critical. RESULTS: We have developed a novel methodology, SpacePAC (Spatial Protein Amino acid Clustering), that identifies mutational clustering by considering the protein tertiary structure directly in 3D space. By combining the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC) and the spatial information in the Protein Data Bank (PDB), SpacePAC is able to identify novel mutation clusters in many proteins such as FGFR3 and CHRM2. In addition, SpacePAC is better able to localize the most significant mutational hotspots as demonstrated in the cases of BRAF and ALK. The R package is available on Bioconductor at: http://www.bioconductor.org/packages/release/bioc/html/SpacePAC.html. CONCLUSION: SpacePAC adds a valuable tool to the identification of mutational clusters while considering protein tertiary structure.
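
The intuition of looking for mutation clusters directly in 3D space can be sketched as a simple sphere scan. This is a toy illustration and not the SpacePAC algorithm itself, whose statistical procedure the abstract does not describe. The coordinates, mutation counts, and radius below are invented, and `best_sphere` is a hypothetical helper.

```python
import math


def best_sphere(coords, mutation_counts, radius):
    """Return (center_residue, mutations_inside) for the densest sphere.

    coords: {residue_number: (x, y, z)}; mutation_counts: {residue_number: count}.
    Each residue is tried as a sphere center; ties keep the first center found.
    """
    best = (None, -1)
    for center, c_xyz in coords.items():
        inside = sum(
            mutation_counts.get(res, 0)
            for res, xyz in coords.items()
            if math.dist(c_xyz, xyz) <= radius
        )
        if inside > best[1]:
            best = (center, inside)
    return best


# Invented toy structure: residues 1, 2 and 4 sit close together in space,
# while residue 3 is far away even though it is adjacent in sequence.
coords = {1: (0, 0, 0), 2: (1, 0, 0), 3: (10, 0, 0), 4: (1, 1, 0)}
mutation_counts = {1: 3, 3: 2, 4: 5}

result = best_sphere(coords, mutation_counts, radius=2.0)
print(result)
```

In this toy case the mutations at residues 1 and 4 fall inside one sphere even though residue 3 lies between them in the sequence, which is the kind of cluster a purely linear scan could miss. A real method would also need a significance test, for example against simulated mutation placements.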


Subjects
Computational Biology/methods , Mutation , Proteins/chemistry , Proteins/genetics , Algorithms , Cluster Analysis , Databases, Protein , Genes, Neoplasm/genetics , Humans , Neoplasms/genetics , Protein Structure, Tertiary
5.
BMC Bioinformatics ; 15: 86, 2014 Mar 26.
Article in English | MEDLINE | ID: mdl-24669769

ABSTRACT

BACKGROUND: It is well known that the development of cancer is caused by the accumulation of somatic mutations within the genome. For oncogenes specifically, current research suggests that there is a small set of "driver" mutations that are primarily responsible for tumorigenesis. Further, due to recent pharmacological successes in treating these driver mutations and their resulting tumors, a variety of approaches have been developed to identify potential driver mutations using methods such as machine learning and mutational clustering. We propose a novel methodology that increases our power to identify mutational clusters by taking into account protein tertiary structure via a graph theoretical approach. RESULTS: We have designed and implemented GraphPAC (Graph Protein Amino acid Clustering) to identify mutational clustering while considering protein spatial structure. Using GraphPAC, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of prior clustering based on current methods. Specifically, by utilizing the spatial information available in the Protein Data Bank (PDB) along with the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC), GraphPAC identifies new mutational clusters in well known oncogenes such as EGFR and KRAS. Further, by utilizing graph theory to account for the tertiary structure, GraphPAC discovers clusters in DPP4, NRP1 and other proteins not identified by existing methods. The R package is available at: http://bioconductor.org/packages/release/bioc/html/GraphPAC.html. CONCLUSION: GraphPAC provides an alternative to iPAC and an extension to current methodology when identifying potential activating driver mutations by utilizing a graph theoretic approach when considering protein tertiary structure.


Subjects
Mutation , Protein Structure, Tertiary/genetics , Cluster Analysis , Genes, Neoplasm , Proteins/genetics
6.
BMC Bioinformatics ; 14: 190, 2013 Jun 13.
Article in English | MEDLINE | ID: mdl-23758891

ABSTRACT

BACKGROUND: Human cancer is caused by the accumulation of somatic mutations in tumor suppressors and oncogenes within the genome. In the case of oncogenes, recent theory suggests that there are only a few key "driver" mutations responsible for tumorigenesis. As there have been significant pharmacological successes in developing drugs that treat cancers that carry these driver mutations, several methods that rely on mutational clustering have been developed to identify them. However, these methods consider proteins as a single strand without taking their spatial structures into account. We propose an extension to current methodology that incorporates protein tertiary structure in order to increase our power when identifying mutation clustering. RESULTS: We have developed iPAC (identification of Protein Amino acid Clustering), an algorithm that identifies non-random somatic mutations in proteins while taking into account the three dimensional protein structure. By using the tertiary information, we are able to detect both novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of clustering based on existing methods. For example, by combining the data in the Protein Data Bank (PDB) and the Catalogue of Somatic Mutations in Cancer, our algorithm identifies new mutational clusters in well known cancer proteins such as KRAS and PI3KC α. Further, by utilizing the tertiary structure, our algorithm also identifies clusters in EGFR, EIF2AK2, and other proteins that are not identified by current methodology. The R package is available at: http://www.bioconductor.org/packages/2.12/bioc/html/iPAC.html. CONCLUSION: Our algorithm extends the current methodology to identify oncogenic activating driver mutations by utilizing tertiary protein structure when identifying nonrandom somatic residue mutation clusters.


Subjects
Algorithms , Mutation , Neoplasm Proteins/genetics , Protein Structure, Tertiary , Cluster Analysis , Humans , Neoplasm Proteins/chemistry
7.
BMC Bioinformatics ; 13 Suppl 1: S10, 2012 Jan 25.
Article in English | MEDLINE | ID: mdl-22373303

ABSTRACT

BACKGROUND: The RDF triple provides a simple linguistic means of describing limitless types of information. Triples can be flexibly combined into a unified data source we call a semantic model. Semantic models open new possibilities for the integration of variegated biological data. We use Semantic Web technology to explicate high throughput clinical data in the context of fundamental biological knowledge. We have extended Corvus, a data warehouse which provides a uniform interface to various forms of Omics data, by providing a SPARQL endpoint. With the querying and reasoning tools made possible by the Semantic Web, we were able to explore quantitative semantic models retrieved from Corvus in the light of systematic biological knowledge. RESULTS: For this paper, we merged semantic models containing genomic, transcriptomic and epigenomic data from melanoma samples with two semantic models of functional data - one containing Gene Ontology (GO) data, the other, regulatory networks constructed from transcription factor binding information. These two semantic models were created in an ad hoc manner but support a common interface for integration with the quantitative semantic models. Such combined semantic models allow us to pose significant translational medicine questions. Here, we study the interplay between a cell's molecular state and its response to anti-cancer therapy by exploring the resistance of cancer cells to Decitabine, a demethylating agent. CONCLUSIONS: We were able to generate a testable hypothesis to explain how Decitabine fights cancer - namely, that it targets apoptosis-related gene promoters predominantly in Decitabine-sensitive cell lines, thus conveying its cytotoxic effect by activating the apoptosis pathway. Our research provides a framework whereby similar hypotheses can be developed easily.
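
The way triples from separate semantic models combine into one queryable source can be shown with a minimal in-memory sketch. It uses plain Python tuples rather than an RDF library or a SPARQL endpoint, and every subject, predicate, and object name is hypothetical; none is taken from Corvus or from the paper's data.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# HYPOTHETICAL quantitative model (measurements) and functional model (annotations).
quantitative = {
    ("geneA", "hasPromoterMethylation", "high"),
    ("geneB", "hasPromoterMethylation", "low"),
}
functional = {
    ("geneA", "annotatedWith", "apoptosis"),
    ("geneB", "annotatedWith", "apoptosis"),
    ("geneC", "annotatedWith", "cell cycle"),
}

# Because both models are just sets of triples, merging is a set union.
merged = quantitative | functional

# Join across the two models: apoptosis genes whose promoter methylation is high.
genes = sorted(
    t[0]
    for t in match(merged, p="annotatedWith", o="apoptosis")
    if match(merged, s=t[0], p="hasPromoterMethylation", o="high")
)
print(genes)
```

The join in the last step plays the role that a SPARQL query with two triple patterns sharing a variable would play against a real endpoint.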


Subjects
Epigenomics , Gene Expression Profiling , Internet , Melanoma/genetics , Semantics , Azacitidine/analogs & derivatives , Azacitidine/pharmacology , Database Management Systems , Decitabine , Drug Resistance, Neoplasm/genetics , Gene Ontology , Gene Regulatory Networks , Humans , Melanoma/drug therapy , Melanoma/pathology , Transcription Factors/metabolism , Translational Research, Biomedical
8.
Cancer Inform ; 8: 19-30, 2009 Jan 15.
Article in English | MEDLINE | ID: mdl-19458791

ABSTRACT

We demonstrate the use of Semantic Web technology to integrate the ALFRED allele frequency database and the Starpath pathway resource. The linking of population-specific genotype data with cancer-related pathway data is potentially useful given the growing interest in personalized medicine and the exploitation of pathway knowledge for cancer drug discovery. We model our data using the Web Ontology Language (OWL), drawing upon ideas from existing standard formats BioPAX for pathway data and PML for allele frequency data. We store our data within an Oracle database, using Oracle Semantic Technologies. We then query the data using Oracle's rule-based inference engine and SPARQL-like RDF query language. The ability to perform queries across the domains of population genetics and pathways offers the potential to answer a number of cancer-related research questions. Among the possibilities is the ability to identify genetic variants which are associated with cancer pathways and whose frequency varies significantly between ethnic groups. This sort of information could be useful for designing clinical studies and for providing background data in personalized medicine. It could also assist with the interpretation of genetic analysis results such as those from genome-wide association studies.

9.
J Biomed Inform ; 41(5): 694-705, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18487092

ABSTRACT

We describe the potential of current Web 2.0 technologies to achieve data mashup in the health care and life sciences (HCLS) domains, and compare that potential to the nascent trend of performing semantic mashup. After providing an overview of Web 2.0, we demonstrate two scenarios of data mashup, facilitated by the following Web 2.0 tools and sites: Yahoo! Pipes, Dapper, Google Maps and GeoCommons. In the first scenario, we exploited Dapper and Yahoo! Pipes to implement a challenging data integration task in the context of DNA microarray research. In the second scenario, we exploited Yahoo! Pipes, Google Maps, and GeoCommons to create a geographic information system (GIS) interface that allows visualization and integration of diverse categories of public health data, including cancer incidence and pollution prevalence data. Based on these two scenarios, we discuss the strengths and weaknesses of these Web 2.0 mashup technologies. We then describe Semantic Web, the mainstream Web 3.0 technology that enables more powerful data integration over the Web. We discuss the areas of intersection of Web 2.0 and Semantic Web, and describe the potential benefits that can be brought to HCLS research by combining these two sets of technologies.


Subjects
Biological Science Disciplines/trends , Database Management Systems , Delivery of Health Care/trends , Internet/organization & administration , Software Design , Database Management Systems/supply & distribution , Database Management Systems/trends , Environmental Pollution/statistics & numerical data , Geographic Information Systems/supply & distribution , Humans , Hypermedia/supply & distribution , Information Dissemination/methods , Information Storage and Retrieval/statistics & numerical data , Information Storage and Retrieval/trends , Interdisciplinary Communication , Internet/trends , Natural Language Processing , Neoplasms/epidemiology , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Public Health Informatics/organization & administration , Public Health Informatics/trends , Systems Integration , User-Computer Interface