ABSTRACT
The Drug-Gene Interaction database (DGIdb) mines existing resources that generate hypotheses about how mutated genes might be targeted therapeutically or prioritized for drug development. It provides an interface for searching lists of genes against a compendium of drug-gene interactions and potentially 'druggable' genes. DGIdb can be accessed at http://dgidb.org/.
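The gene-list search described above can be pictured as a lookup against an indexed table of interactions. The following is a minimal sketch under stated assumptions, not DGIdb's actual implementation or API: the record layout, the function names (`build_index`, `search_genes`), and the placeholder gene and drug names are all hypothetical and do not correspond to real DGIdb entries.

```python
from collections import defaultdict

# Hypothetical interaction records: (gene, drug, interaction type).
# Placeholder names only; these are NOT real DGIdb entries.
INTERACTIONS = [
    ("GENE_A", "DRUG_X", "inhibitor"),
    ("GENE_A", "DRUG_Y", "antagonist"),
    ("GENE_B", "DRUG_Z", "inhibitor"),
]

# Hypothetical set of genes flagged as potentially druggable.
DRUGGABLE = {"GENE_A", "GENE_C"}


def build_index(records):
    """Group (drug, interaction type) pairs by upper-cased gene symbol."""
    index = defaultdict(list)
    for gene, drug, kind in records:
        index[gene.upper()].append((drug, kind))
    return index


def search_genes(genes, index, druggable):
    """Look up each query gene; report known interactions and druggability."""
    results = {}
    for gene in genes:
        key = gene.strip().upper()
        results[key] = {
            "interactions": list(index.get(key, [])),
            "druggable": key in druggable,
        }
    return results


INDEX = build_index(INTERACTIONS)
hits = search_genes(["gene_a", "GENE_C", "GENE_D"], INDEX, DRUGGABLE)
for gene, info in hits.items():
    print(gene, info)
```

In this sketch a gene can be reported as potentially druggable even when no specific interaction is recorded for it, which mirrors the two kinds of result the abstract mentions.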
Subject(s)
Data Mining/methods; Databases, Genetic; Drug Discovery/methods; Antineoplastic Agents/chemistry; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Computational Biology/methods; Drug Interactions; Gene Expression Regulation/drug effects; Genetic Variation; Genome; Genomics/methods; Humans; Lung Neoplasms/drug therapy; Lung Neoplasms/genetics; Mutation; Software; Technology, Pharmaceutical/methods
ABSTRACT
In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.
Subject(s)
Chromosome Mapping/methods; Genome, Human/genetics; Knowledge Bases; Models, Genetic; Sequence Analysis, DNA/methods; User-Computer Interface; Algorithms; Computer Simulation; Database Management Systems; Databases, Genetic; Humans; Sequence Alignment/methods
ABSTRACT
Massively parallel sequencing technology and the associated rapidly decreasing sequencing costs have enabled systemic analyses of somatic mutations in large cohorts of cancer cases. Here we introduce a comprehensive mutational analysis pipeline that uses standardized sequence-based inputs along with multiple types of clinical data to establish correlations among mutation sites, affected genes and pathways, and to ultimately separate the commonly abundant passenger mutations from the truly significant events. In other words, we aim to determine the Mutational Significance in Cancer (MuSiC) for these large data sets. The integration of analytical operations in the MuSiC framework is widely applicable to a broad set of tumor types and offers the benefits of automation as well as standardization. Herein, we describe the computational structure and statistical underpinnings of the MuSiC pipeline and demonstrate its performance using 316 ovarian cancer samples from the TCGA ovarian cancer project. MuSiC correctly confirms many expected results, and identifies several potentially novel avenues for discovery.
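To make the idea of separating passenger mutations from significant events concrete, the sketch below tests whether a gene carries more mutations than a uniform background rate would predict. It is a deliberately simplified binomial model swapped in for illustration and should not be read as the statistical test MuSiC itself uses. The background rate, the gene length, the mutation counts, and the gene names are hypothetical; only the cohort size of 316 samples comes from the abstract.

```python
from math import exp, lgamma, log, log1p


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are computed in log space so large n does not overflow.
    """
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log1p(-p)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


def gene_p_value(mutations, gene_length, samples, background_rate):
    """P-value for seeing at least `mutations` hits across all covered bases."""
    covered_bases = gene_length * samples
    return binomial_upper_tail(mutations, covered_bases, background_rate)


SAMPLES = 316            # cohort size taken from the abstract
BACKGROUND_RATE = 1e-6   # hypothetical per-base mutation probability
GENE_LENGTH = 1000       # hypothetical coding length in bases

# Hypothetical observed mutation counts per gene (placeholder names).
observed = {"GENE_A": 8, "GENE_B": 1}

p_values = {
    gene: gene_p_value(count, GENE_LENGTH, SAMPLES, BACKGROUND_RATE)
    for gene, count in observed.items()
}
for gene, p in p_values.items():
    print(gene, p)
```

A real analysis would also need a correction for testing many genes at once and a background rate that varies by mutation type and sample, both of which this sketch omits.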