ABSTRACT
Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.
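The abstract above notes that family-wide haplotype phasing allows compound heterozygosity to be quantified. A minimal sketch of that idea follows, assuming each variant has already been phased to a maternal or paternal haplotype. The `PhasedVariant` record, the `compound_het_genes` function, and the gene labels are hypothetical illustrations, not the authors' pipeline.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class PhasedVariant(NamedTuple):
    """Hypothetical record for one phased variant call."""
    gene: str
    position: int
    maternal_alt: bool  # alternate allele on the maternally inherited haplotype
    paternal_alt: bool  # alternate allele on the paternally inherited haplotype


def compound_het_genes(variants: Iterable[PhasedVariant]) -> List[str]:
    """Return genes carrying heterozygous alternate alleles on both haplotypes.

    A gene qualifies when at least one heterozygous site has its alternate
    allele on the maternal haplotype and at least one other heterozygous
    site has its alternate allele on the paternal haplotype.
    """
    hits = defaultdict(lambda: {"maternal": set(), "paternal": set()})
    for v in variants:
        # Skip homozygous sites: only heterozygous calls contribute.
        if v.maternal_alt == v.paternal_alt:
            continue
        side = "maternal" if v.maternal_alt else "paternal"
        hits[v.gene][side].add(v.position)
    return sorted(g for g, s in hits.items() if s["maternal"] and s["paternal"])


if __name__ == "__main__":
    calls = [
        PhasedVariant("GENE_A", 100, True, False),
        PhasedVariant("GENE_A", 200, False, True),
        PhasedVariant("GENE_B", 300, True, False),
        PhasedVariant("GENE_B", 400, True, False),
    ]
    print(compound_het_genes(calls))
```

Without phase information the two GENE_B variants would be indistinguishable from the GENE_A pair, which is why phasing is a prerequisite for this count. A real analysis would also restrict the input to variants predicted to be damaging.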
Subjects
DNA Mutational Analysis/methods , Synthetic Genes , Genetic Variation , Genome-Wide Association Study/methods , Thrombophilia/genetics , Alleles , Base Sequence , Female , Genetic Predisposition to Disease , Human Genome , Genotype , Haplotypes , Humans , Male , Pedigree , Reference Standards , Risk Assessment , Sequence Alignment , DNA Sequence Analysis
ABSTRACT
The drug-metabolizing enzyme thiopurine methyltransferase (TPMT) has become one of the best examples of pharmacogenomics to be translated into routine clinical practice. TPMT metabolizes the thiopurines 6-mercaptopurine, 6-thioguanine, and azathioprine, drugs that are widely used for treatment of acute leukemias, inflammatory bowel diseases, and other disorders of immune regulation. Since the discovery of genetic polymorphisms in the TPMT gene, many sequence variants that cause a decreased enzyme activity have been identified and characterized. Increasingly, to optimize dose, pretreatment determination of TPMT status before commencing thiopurine therapy is now routine in many countries. Novel TPMT sequence variants are currently numbered sequentially using PubMed as a source of information; however, this has caused some problems as exemplified by two instances in which authors' articles appeared on PubMed at the same time, resulting in the same allele numbers given to different polymorphisms. Hence, there is an urgent need to establish an order and consensus to the numbering of known and novel TPMT sequence variants. To address this problem, a TPMT nomenclature committee was formed in 2010, to define the nomenclature and numbering of novel variants for the TPMT gene. A website (http://www.imh.liu.se/tpmtalleles) serves as a platform for this work. Researchers are encouraged to submit novel TPMT alleles to the committee for designation and reservation of unique allele numbers. The committee has decided to renumber two alleles: nucleotide position 106 (G>A) from TPMT*24 to TPMT*30 and position 611 (T>C, rs79901429) from TPMT*28 to TPMT*31. Nomenclature for all other known alleles remains unchanged.
Subjects
Inflammatory Bowel Diseases/enzymology , Methyltransferases/classification , Methyltransferases/genetics , Genetic Polymorphism , Alleles , Azathioprine/metabolism , Genotype , Humans , Mercaptopurine/metabolism , Methyltransferases/metabolism , Pharmacogenetics , Thioguanine/metabolism
ABSTRACT
BACKGROUND: The cost of genomic information has fallen steeply, but the clinical translation of genetic risk estimates remains unclear. We aimed to undertake an integrated analysis of a complete human genome in a clinical context.
METHODS: We assessed a patient with a family history of vascular disease and early sudden death. Clinical assessment included analysis of this patient's full genome sequence, risk prediction for coronary artery disease, screening for causes of sudden cardiac death, and genetic counselling. Genetic analysis included the development of novel methods for the integration of whole genome and clinical risk. Disease and risk analysis focused on prediction of genetic risk of variants associated with mendelian disease, recognised drug responses, and pathogenicity for novel variants. We queried disease-specific mutation databases and pharmacogenomics databases to identify genes and mutations with known associations with disease and drug response. We estimated post-test probabilities of disease by applying likelihood ratios derived from integration of multiple common variants to age-appropriate and sex-appropriate pre-test probabilities. We also accounted for gene-environment interactions and conditionally dependent risks.
FINDINGS: Analysis of 2.6 million single nucleotide polymorphisms and 752 copy number variations showed increased genetic risk for myocardial infarction, type 2 diabetes, and some cancers. We discovered rare variants in three genes that are clinically associated with sudden cardiac death: TMEM43, DSP, and MYBPC3. A variant in LPA was consistent with a family history of coronary artery disease. The patient had a heterozygous null mutation in CYP2C19 suggesting probable clopidogrel resistance, several variants associated with a positive response to lipid-lowering therapy, and variants in CYP4F2 and VKORC1 that suggest he might have a low initial dosing requirement for warfarin. Many variants of uncertain importance were reported.
INTERPRETATION: Although challenges remain, our results suggest that whole-genome sequencing can yield useful and clinically relevant information for individual patients.
FUNDING: National Institute of General Medical Sciences; National Heart, Lung, and Blood Institute; National Human Genome Research Institute; Howard Hughes Medical Institute; National Library of Medicine; Lucile Packard Foundation for Children's Health; Hewlett Packard Foundation; Breetwor Family Foundation.
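The abstract above describes estimating post-test probabilities by applying likelihood ratios to pre-test probabilities. The standard odds form of that calculation can be sketched as follows. The function name and the numbers in the usage example are illustrative assumptions, and multiplying the likelihood ratios treats the variants as independent, a simplification of the conditionally dependent risks the authors say they accounted for.

```python
import math
from typing import Iterable


def post_test_probability(pre_test_probability: float,
                          likelihood_ratios: Iterable[float]) -> float:
    """Convert a pre-test probability into a post-test probability.

    Steps: probability -> odds, multiply by the combined likelihood
    ratio, then odds -> probability. Assumes the likelihood ratios are
    independent of one another (an assumption of this sketch).
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must lie strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    combined_lr = math.prod(likelihood_ratios)  # empty input gives 1.0
    post_test_odds = pre_test_odds * combined_lr
    return post_test_odds / (1.0 + post_test_odds)


if __name__ == "__main__":
    # Illustrative values only: a 20% pre-test probability and two
    # hypothetical per-variant likelihood ratios.
    print(round(post_test_probability(0.20, [1.3, 0.9]), 4))
```

Working in odds rather than probabilities is what lets the evidence from each variant combine by simple multiplication.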
Subjects
Genetic Predisposition to Disease/genetics , Genetic Testing , Human Genome , DNA Sequence Analysis , Vascular Diseases/genetics , Adult , Aryl Hydrocarbon Hydroxylases/genetics , Carrier Proteins/genetics , Cytochrome P-450 CYP2C19 , Cytochrome P-450 Enzyme System/genetics , Cytochrome P450 Family 4 , Sudden Cardiac Death , Desmoplakins/genetics , Environment , Family Health , Genetic Counseling , Humans , Lipoprotein(a)/genetics , Male , Membrane Proteins/genetics , Mixed Function Oxygenases/genetics , Mutation , Osteoarthritis/genetics , Pedigree , Pharmacogenetics , Single Nucleotide Polymorphism , Risk Assessment , Vitamin K Epoxide Reductases
ABSTRACT
PharmGKB is a knowledge base that captures the relationships between drugs, diseases/phenotypes and genes involved in pharmacokinetics (PK) and pharmacodynamics (PD). This information includes literature annotations, primary data sets, PK and PD pathways, and expert-generated summaries of PK/PD relationships between drugs, diseases/phenotypes and genes. PharmGKB's website is designed to effectively disseminate knowledge to meet the needs of our users. PharmGKB currently has literature annotations documenting the relationship of over 500 drugs, 450 diseases and 600 variant genes. In order to meet the needs of whole genome studies, PharmGKB has added new functionalities, including browsing the variant display by chromosome and cytogenetic locations, allowing the user to view variants not located within a gene. We have developed new infrastructure for handling whole genome data, including increased methods for quality control and tools for comparison across other data sources, such as dbSNP, JSNP and HapMap data. PharmGKB has also added functionality to accept, store, display and query high throughput SNP array data. These changes allow us to capture more structured information on phenotypes for better cataloging and comparison of data. PharmGKB is available at www.pharmgkb.org.
Subjects
Factual Databases , Pharmacogenetics , Genes , Genetic Predisposition to Disease , Genetic Variation , Genomics , Internet , Pharmaceutical Preparations/metabolism , Phenotype , User-Computer Interface
Subjects
Antineoplastic Agents/antagonists & inhibitors , Non-Small-Cell Lung Carcinoma/genetics , ErbB Receptors/genetics , Monoclonal Antibodies/therapeutic use , Antineoplastic Agents/therapeutic use , Tumor Biomarkers , Non-Small-Cell Lung Carcinoma/drug therapy , Phase III Clinical Trials as Topic , Neoplasm Drug Resistance/genetics , Genetic Variation , Humans , Molecular Targeted Therapy , Pharmacogenetics , Single Nucleotide Polymorphism , Prospective Studies , Protein Kinase Inhibitors/therapeutic use
ABSTRACT
The Stanford Microarray Database (SMD) (http://smd.stanford.edu) is a research tool for hundreds of Stanford researchers and their collaborators. In addition, SMD functions as a resource for the entire biological research community by providing unrestricted access to microarray data published by SMD users and by disseminating its source code. In addition to storing GenePix (Axon Instruments) and ScanAlyze output from spotted microarrays, SMD has recently added the ability to store, retrieve, display and analyze the complete raw data produced by several additional microarray platforms and image analysis software packages, so that we can also now accept data from Affymetrix GeneChips (MAS5/GCOS or dChip), Agilent Catalog or Custom arrays (using Agilent's Feature Extraction software) or data created by SpotReader (Niles Scientific). We have implemented software that allows us to accept MAGE-ML documents from array manufacturers and to submit MIAME-compliant data in MAGE-ML format directly to ArrayExpress and GEO, greatly increasing the ease with which data from SMD can be published adhering to accepted standards and also increasing the accessibility of published microarray data to the general public. We have introduced a new tool to facilitate data sharing among our users, so that datasets can be shared during, before or after the completion of data analysis. The latest version of the source code for the complete database package was released in November 2004 (http://smd.stanford.edu/download/), allowing researchers around the world to deploy their own installations of SMD.
Subjects
Genetic Databases , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , California , Database Management Systems , Systems Integration
ABSTRACT
The Stanford Microarray Database (SMD; http://genome-www.stanford.edu/microarray/) serves as a microarray research database for Stanford investigators and their collaborators. In addition, SMD functions as a resource for the entire scientific community, by making freely available all of its source code and providing full public access to data published by SMD users, along with many tools to explore and analyze those data. SMD currently provides public access to data from 3500 microarrays, including data from 85 publications, and this total is increasing rapidly. In this article, we describe some of SMD's newer tools for accessing public data, assessing data quality and for data analysis.
Subjects
Genetic Databases , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Animals , California , Computer Graphics , Genetic Databases/standards , Humans , Information Storage and Retrieval , Quality Control , Software
Subjects
Methyltransferases , Alleles , Gene Frequency , Genetic Variation , Haplotypes , Humans , Methyltransferases/genetics , Methyltransferases/metabolism , Methyltransferases/therapeutic use , Pharmacogenetics , Single Nucleotide Polymorphism , Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy
ABSTRACT
The need for efficient text-mining tools that support curation of the biomedical literature is ever increasing. In this article, we describe an experiment aimed at verifying whether a text-mining tool capable of extracting meaningful relationships among domain entities can be successfully integrated into the curation workflow of a major biological database. We evaluate in particular (i) the usability of the system's interface, as perceived by users, and (ii) the correlation of the ranking of interactions, as provided by the text-mining system, with the choices of the curators.
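The abstract above evaluates how well the text-mining system's ranking of interactions agrees with the curators' choices, without naming the measure used. One simple way to quantify such agreement is precision at k: the fraction of the top k ranked candidates that curators accepted. The sketch below is an illustration under that assumption, not the authors' evaluation. The function name and the interaction identifiers are hypothetical.

```python
from typing import Iterable, Sequence


def precision_at_k(ranked_ids: Sequence[str],
                   accepted_ids: Iterable[str],
                   k: int) -> float:
    """Fraction of the top-k ranked candidates that curators accepted.

    ranked_ids   -- candidate interactions, best-ranked first
    accepted_ids -- candidates the curators chose to keep
    k            -- how many top-ranked candidates to inspect
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_ids[:k]
    if not top_k:
        return 0.0
    accepted = set(accepted_ids)
    return sum(1 for candidate in top_k if candidate in accepted) / len(top_k)


if __name__ == "__main__":
    ranking = ["int-1", "int-2", "int-3", "int-4"]  # hypothetical identifiers
    curated = {"int-1", "int-3"}
    for k in (1, 2, 4):
        print(k, precision_at_k(ranking, curated, k))
```

A ranking that places curator-accepted interactions near the top scores high at small k, which is the behaviour a curation workflow benefits from.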