2.
Nat Commun ; 11(1): 729, 2020 02 05.
Article in English | MEDLINE | ID: mdl-32024854

ABSTRACT

The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project, motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, are altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit gene expression signatures similar to those of samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.


Subjects
Gene Expression Regulation, Neoplastic; Mutation; Neoplasms/genetics; RNA Splicing; Chromatin Assembly and Disassembly; Computational Biology/methods; Databases, Genetic; Genome, Human; Humans; Metabolic Networks and Pathways/genetics; Neoplasms/metabolism; Promoter Regions, Genetic
3.
Clin Nutr ; 39(1): 265-275, 2020 01.
Article in English | MEDLINE | ID: mdl-30857909

ABSTRACT

BACKGROUND: Individuals respond differently to dietary intake, leading to different associations between diet and traits. Most studies have investigated large cohorts without subgrouping them. OBJECTIVE: The purpose was to identify non-uniform associations between diets and anthropometric traits that appeared to be in conflict with one another across subgroups. DESIGN: We used a cohort comprising 43,790 women and men, the Danish Diet, Cancer and Health study, which includes a baseline examination at age 50-64 years and a follow-up about 5 years later. The baseline examination involved anthropometrics, body fat percentage, a food frequency questionnaire and information on lifestyle. From the questionnaire data we computed association rules between the intake of food groups and changes in waist circumference and body weight. Using association rule mining on subgroups and gender-specific cohorts, we identified non-uniform associations. The two gender-specific cohorts were stratified into subgroups using a non-linear, self-organizing map based method. RESULTS: We found 22 and 7 cases of conflicting rules in 8 participant subgroups for different anthropometric traits in women and men, respectively. For example, in a subgroup of women moderate waist loss was associated with a dietary pattern characterized by low intake of both cabbages and wine, in conflict with the association trends of both dietary factors in the female cohort. The finding of more conflicting rules in women suggests that inter-individual differences in response to dietary intake are stronger in women than in men. CONCLUSIONS: This combined stratification and association discovery approach revealed epidemiological relationships between dietary factors and changes in anthropometric traits in subgroups that take food group interactions into account. Conflicting rules add a layer of complexity that should be integrated into the study of these relationships, for example in relation to genotypes.


Subjects
Anthropometry; Diet/methods; Diet/statistics & numerical data; Cohort Studies; Data Mining; Denmark; Female; Humans; Life Style; Machine Learning; Male; Middle Aged; Sex Factors
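
The abstract of this record relies on association rules and on rules that conflict between a subgroup and the whole cohort. As a rough illustration only, the sketch below computes the standard support and confidence measures for one rule on a toy data set. The item names (`low_wine`, `waist_loss`, and so on), the data and the function names are hypothetical and are not taken from the study; the self-organizing map stratification mentioned in the abstract is not reproduced here.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(
    transactions: List[Transaction],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimated P(consequent | antecedent) over the transactions."""
    a = frozenset(antecedent)
    s_a = support(transactions, a)
    if s_a == 0.0:
        return 0.0
    return support(transactions, a | frozenset(consequent)) / s_a


# Hypothetical toy cohort: one set of observed items per participant.
cohort: List[Transaction] = [
    frozenset({"low_wine", "waist_loss"}),
    frozenset({"low_wine", "waist_loss"}),
    frozenset({"low_wine", "waist_gain"}),
    frozenset({"low_wine", "waist_gain"}),
    frozenset({"low_wine", "waist_gain"}),
    frozenset({"low_wine", "waist_gain"}),
    frozenset({"low_wine", "waist_gain"}),
    frozenset({"high_wine", "waist_loss"}),
]
# Hypothetical subgroup: the first three participants of the cohort.
subgroup = cohort[:3]

# Rule under test: low wine intake -> waist loss.
cohort_conf = confidence(cohort, {"low_wine"}, {"waist_loss"})
subgroup_conf = confidence(subgroup, {"low_wine"}, {"waist_loss"})
print(f"cohort confidence:   {cohort_conf:.3f}")
print(f"subgroup confidence: {subgroup_conf:.3f}")
```

In this toy example the rule is weak in the cohort but strong in the subgroup, which is the kind of disagreement the abstract calls a conflicting rule. A real analysis would use a rule-mining algorithm over many item combinations and a principled threshold for what counts as a conflict.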
4.
NPJ Syst Biol Appl ; 5: 27, 2019.
Article in English | MEDLINE | ID: mdl-31396397

ABSTRACT

Non-oncogene addiction (NOA) genes are essential for supporting the stress-burdened phenotype of tumours and thus vital for their survival. Although NOA genes are acknowledged to be potential drug targets, there has been no large-scale attempt to identify and characterise them as a group across cancer types. Here we provide the first method for the identification of conditional NOA genes and their rewired neighbours using a systems approach. Using copy number data and expression profiles from The Cancer Genome Atlas (TCGA) we performed comparative analyses between high and low genomic stress tumours for 15 cancer types. We identified 101 condition-specific differential coexpression modules, mapped to a high-confidence human interactome, comprising 133 candidate NOA rewiring hub genes. We observe that most modules lose coexpression in the high-stress state and that activated stress modules and hubs take part in homoeostasis maintenance processes such as chromosome segregation, oxidoreductase activity, mitotic checkpoint (PLK1 signalling), DNA replication initiation and synaptic signalling. We furthermore show that candidate NOA rewiring hubs are unique for each cancer type, but that their respective rewired neighbour genes are largely shared across cancer types.


Subjects
Computational Biology/methods; Neoplasms/genetics; Oncogene Addiction/genetics; Algorithms; Databases, Genetic; Gene Regulatory Networks; Genomics; Humans; Protein Interaction Mapping; Transcriptome
5.
J Infect Dis ; 220(8): 1312-1324, 2019 09 13.
Article in English | MEDLINE | ID: mdl-31253993

ABSTRACT

BACKGROUND: Viruses and other infectious agents cause more than 15% of human cancer cases. High-throughput sequencing-based studies of virus-cancer associations have mainly focused on cancer transcriptome data. METHODS: In this study, we applied a diverse selection of presequencing enrichment methods targeting all major viral groups, to characterize the viruses present in 197 samples from 18 sample types of cancerous origin. Using high-throughput sequencing, we generated 710 datasets constituting 57 billion sequencing reads. RESULTS: Detailed in silico investigation of the viral content, including exclusion of viral artefacts, from de novo assembled contigs and individual sequencing reads yielded a map of the viruses detected. Our data reveal a virome dominated by papillomaviruses, anelloviruses, herpesviruses, and parvoviruses. More than half of the included samples contained 1 or more viruses; however, no link between specific viruses and cancer types was found. CONCLUSIONS: Our study sheds light on viral presence in cancers and provides highly relevant virome data for future reference.


Subjects
High-Throughput Nucleotide Sequencing/methods; Metagenome/genetics; Neoplasms/virology; Anelloviridae/genetics; Anelloviridae/isolation & purification; Biopsy; Datasets as Topic; Female; Herpesviridae/genetics; Herpesviridae/isolation & purification; Humans; Male; Neoplasms/pathology; Papillomaviridae/genetics; Papillomaviridae/isolation & purification; Parvovirus/genetics; Parvovirus/isolation & purification
6.
J Immunol ; 201(2): 524-532, 2018 07 15.
Article in English | MEDLINE | ID: mdl-29848752

ABSTRACT

Despite the essential role of thymic epithelial cells (TEC) in T cell development, the signals regulating TEC differentiation and homeostasis remain incompletely understood. In this study, we show a key in vivo role for the vitamin A metabolite, retinoic acid (RA), in TEC homeostasis. In the absence of RA signaling in TEC, cortical TEC (cTEC) and CD80(lo) MHC class II(lo) medullary TEC displayed subset-specific alterations in gene expression, which in cTEC included genes involved in epithelial proliferation, development, and differentiation. Mice whose TEC were unable to respond to RA showed increased cTEC proliferation, an accumulation of stem cell Ag-1(hi) cTEC, and, in early life, a decrease in medullary TEC numbers. These alterations resulted in reduced thymic cellularity in early life, a reduction in CD4 single-positive and CD8 single-positive numbers in both young and adult mice, and enhanced peripheral CD8+ T cell survival upon TCR stimulation. Collectively, our results identify RA as a regulator of TEC homeostasis that is essential for TEC function and normal thymopoiesis.


Subjects
Epithelial Cells/immunology; Signal Transduction/immunology; Thymus Gland/immunology; Tretinoin/immunology; Animals; CD4-Positive T-Lymphocytes/immunology; CD8-Positive T-Lymphocytes/immunology; Cell Differentiation/immunology; Cell Lineage/immunology; Cell Proliferation/physiology; Female; Homeostasis/immunology; Male; Mice; Mice, Inbred C57BL
7.
Oncotarget ; 9(10): 9043-9060, 2018 Feb 06.
Article in English | MEDLINE | ID: mdl-29507673

ABSTRACT

Colorectal cancer (CRC) is a leading cause of death worldwide. Surgical intervention is a successful treatment for stage I patients, whereas other more advanced cases may require adjuvant chemotherapy. The selection of effective adjuvant treatments remains, however, challenging. Accurate patient stratification is necessary for the identification of the subset of patients likely to respond to treatment, while sparing others from pernicious treatment. Targeted sequencing approaches may help in this regard, enabling rapid genetic investigation while being easily applicable in routine diagnosis. We propose a set of guidelines for the identification, including variant calling and filtering, of somatic mutations driving tumorigenesis in the absence of matched healthy tissue. We also discuss the inclusion criteria for the generation of our gene panel. Furthermore, we evaluate the prognostic impact of individual genes, using Cox regression models in the context of overall survival and disease-free survival. These analyses confirmed the role of commonly used biomarkers, and shed light on controversial genes such as CYP2C8. Applying those guidelines, we created a novel gene panel to investigate the onset and progression of CRC in 273 patients. Our comprehensive biomarker set includes 266 genes that may play a role in the progression through the different stages of the disease. Tracing the developmental state of the tumour, and its resistances, is instrumental in patient stratification and reliable decision making in precision clinical practice.

8.
Emerg Infect Dis ; 23(2): 363-365, 2017 02.
Article in English | MEDLINE | ID: mdl-28098541

ABSTRACT

A novel human protoparvovirus related to human bufavirus and preliminarily named cutavirus has been discovered. We detected cutavirus in a sample of cutaneous malignant melanoma by using viral enrichment and high-throughput sequencing. The role of cutaviruses in cutaneous cancers remains to be investigated.


Subjects
Melanoma/etiology; Parvoviridae Infections/complications; Parvoviridae Infections/virology; Parvovirus; Skin Neoplasms/etiology; DNA, Viral; Genes, Viral; Humans; Melanoma/diagnosis; Parvoviridae Infections/diagnosis; Phylogeny; Sequence Analysis, DNA; Skin Neoplasms/diagnosis; Melanoma, Cutaneous Malignant
9.
Curr Opin Rheumatol ; 28(4): 398-404, 2016 Jul.
Article in English | MEDLINE | ID: mdl-26986247

ABSTRACT

PURPOSE OF REVIEW: Systemic lupus erythematosus (SLE) is caused by a combination of genetic and acquired immunodeficiencies and environmental factors including infections. An association with Epstein-Barr virus (EBV) has been established by numerous studies over the past decades. Here, we review recent experimental studies on EBV, and present our integrated theory of SLE development. RECENT FINDINGS: SLE patients have dysfunctional control of EBV infection resulting in frequent reactivations and disease progression. These comprise impaired functions of EBV-specific T-cells with an inverse correlation to disease activity and elevated serum levels of antibodies against lytic cycle EBV antigens. The presence of EBV proteins in renal tissue from SLE patients with nephritis suggests direct involvement of EBV in SLE development. As expected for patients with immunodeficiencies, studies reveal that SLE patients show dysfunctional responses to other viruses as well. An association with EBV infection has also been demonstrated for other autoimmune diseases, including Sjögren's syndrome, rheumatoid arthritis, and multiple sclerosis. SUMMARY: Collectively, the interplay between an impaired immune system and the cumulative effects of EBV and other viruses results in frequent reactivation of EBV and enhanced cell death, causing development of SLE and concomitant autoreactivities.


Subjects
Autoimmune Diseases/virology; Epstein-Barr Virus Infections/complications; Lupus Erythematosus, Systemic/virology; Arthritis, Rheumatoid/immunology; Arthritis, Rheumatoid/virology; Autoimmune Diseases/immunology; Evidence-Based Medicine/methods; Herpesvirus 4, Human/isolation & purification; Herpesvirus 4, Human/physiology; Humans; Lupus Erythematosus, Systemic/immunology; Sjogren's Syndrome/immunology; Sjogren's Syndrome/virology; Virus Activation/immunology
10.
Viruses ; 8(2)2016 Feb 19.
Article in English | MEDLINE | ID: mdl-26907326

ABSTRACT

Virus discovery from high-throughput sequencing data often follows a bottom-up approach where taxonomic annotation takes place prior to association with disease. Albeit effective in some cases, the approach fails to detect novel pathogens and remote variants not present in reference databases. We have developed a species-independent pipeline that utilises sequence clustering for the identification of nucleotide sequences that co-occur across multiple sequencing data instances. We applied the workflow to 686 sequencing libraries from 252 cancer samples of different cancer and tissue types, 32 non-template controls, and 24 test samples. Recurrent sequences were statistically associated with biological, methodological or technical features with the aim of identifying novel pathogens or plausible contaminants that may be associated with a particular kit or method. We provide examples of identified inhabitants of the healthy tissue flora as well as experimental contaminants. Unmapped sequences that co-occur with high statistical significance potentially represent the unknown sequence space where novel pathogens can be identified.


Subjects
Neoplasms/virology; Viruses/genetics; Viruses/isolation & purification; Computational Biology; Conserved Sequence; High-Throughput Nucleotide Sequencing; Humans; RNA, Viral/genetics; Viruses/classification
11.
J Clin Microbiol ; 54(4): 980-7, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26818667

ABSTRACT

Propionibacterium acnes is the most abundant bacterium on human skin, particularly in sebaceous areas. P. acnes is suggested to be an opportunistic pathogen involved in the development of diverse medical conditions but is also a proven contaminant of human clinical samples and surgical wounds. Its significance as a pathogen is consequently a matter of debate. In the present study, we investigated the presence of P. acnes DNA in 250 next-generation sequencing data sets generated from 180 samples of 20 different sample types, mostly of cancerous origin. The samples were subjected to either microbial enrichment, involving nuclease treatment to reduce the amount of host nucleic acids, or shotgun sequencing. We detected high proportions of P. acnes DNA in enriched samples, particularly skin tissue-derived and other tissue samples, with the levels being higher in enriched samples than in shotgun-sequenced samples. P. acnes reads were detected in most samples analyzed, though the proportions in most shotgun-sequenced samples were low. Our results show that P. acnes can be detected in practically all sample types when molecular methods, such as next-generation sequencing, are employed. The possibility of contamination from the patient or other sources, including laboratory reagents or environment, should therefore always be considered carefully when P. acnes is detected in clinical samples. We advocate that detection of P. acnes always be accompanied by experiments validating the association between this bacterium and any clinical condition.


Subjects
Bacterial Infections/microbiology; High-Throughput Nucleotide Sequencing; Neoplasms/complications; Propionibacterium acnes/isolation & purification; Humans; Propionibacterium acnes/genetics
12.
Hum Mutat ; 37(1): 36-42, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26443060

ABSTRACT

Most genomic alterations are tolerated while only a minor fraction disrupts molecular function sufficiently to drive disease. Protein kinases play a central biological function and the functional consequences of their variants are abundantly characterized. However, this heterogeneous information is often scattered across different sources, which makes the integrative analysis complex and laborious. wKinMut-2 constitutes a solution to facilitate the interpretation of the consequences of human protein kinase variation. Nine methods predict their pathogenicity, including a kinase-specific random forest approach. To understand the biological mechanisms causative of human diseases and cancer, information from pertinent reference knowledge bases and the literature is automatically mined, digested, and homogenized. Variants are visualized in their structural contexts and residues affecting catalytic and drug binding are identified. Known protein-protein interactions are reported. Altogether, this information is intended to assist the generation of new working hypotheses to be corroborated with subsequent experimental work. The wKinMut-2 system, along with a user manual and examples, is freely accessible at http://kinmut2.bioinfo.cnio.es; the code for local installations can be downloaded from https://github.com/Rbbt-Workflows/KinMut2.


Subjects
Computational Biology/methods; Genetic Association Studies/methods; Genetic Predisposition to Disease; Genetic Variation; Genomics/methods; Protein Kinases/genetics; Software; Data Mining; Databases, Genetic; Fibroblast Growth Factor 1/chemistry; Fibroblast Growth Factor 1/genetics; Fibroblast Growth Factor 1/metabolism; Humans; Protein Binding; Protein Interaction Domains and Motifs; Protein Kinases/chemistry; Protein Kinases/metabolism; Proto-Oncogene Proteins B-raf/chemistry; Proto-Oncogene Proteins B-raf/genetics; Proto-Oncogene Proteins B-raf/metabolism; Receptor, Fibroblast Growth Factor, Type 3/chemistry; Receptor, Fibroblast Growth Factor, Type 3/genetics; Receptor, Fibroblast Growth Factor, Type 3/metabolism; Reproducibility of Results; Structure-Activity Relationship; Web Browser
13.
Sci Rep ; 5: 13201, 2015 Aug 19.
Article in English | MEDLINE | ID: mdl-26285800

ABSTRACT

Although nearly one fifth of all human cancers have an infectious aetiology, the causes for the majority of cancers remain unexplained. Despite the enormous data output from high-throughput shotgun sequencing, viral DNA in a clinical sample typically constitutes a proportion of host DNA that is too small to be detected. Sequence variation among virus genomes complicates application of sequence-specific, and highly sensitive, PCR methods. Therefore, we aimed to develop and characterize a method that permits sensitive detection of sequences despite considerable variation. We demonstrate that our low-stringency in-solution hybridization method enables detection of <100 viral copies. Furthermore, distantly related proviral sequences may be enriched by orders of magnitude, enabling discovery of hitherto unknown viral sequences by high-throughput sequencing. The sensitivity was sufficient to detect retroviral sequences in clinical samples. We used this method to search for novel retroviruses in samples from three cancer types. In accordance with recent studies, our investigation revealed no retroviral infections in human B-cell lymphoma cells, cutaneous T-cell lymphoma or colorectal cancer biopsies. Nonetheless, our generally applicable method makes sensitive detection possible and permits sequencing of distantly related sequences from complex material.


Subjects
High-Throughput Nucleotide Sequencing/methods; Neoplasms/virology; Retroviridae/genetics; Animals; Base Sequence; DNA Probes/metabolism; DNA, Viral/genetics; Gene Library; Genome, Human; HEK293 Cells; HIV-1/genetics; Humans; Proviruses/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Real-Time Polymerase Chain Reaction; Sus scrofa
14.
BMC Bioinformatics ; 14: 345, 2013 Nov 29.
Article in English | MEDLINE | ID: mdl-24289158

ABSTRACT

BACKGROUND: Protein kinases are involved in relevant physiological functions and a broad number of mutations in this superfamily have been reported in the literature to affect protein function and stability. Unfortunately, the exploration of the consequences on the phenotypes of each individual mutation remains a considerable challenge. RESULTS: The wKinMut web-server offers direct prediction of the potential pathogenicity of the mutations from a number of methods, including our recently developed prediction method based on the combination of information from a range of diverse sources, including physicochemical properties and functional annotations from FireDB and Swissprot and kinase-specific characteristics such as the membership to specific kinase groups, the annotation with disease-associated GO terms or the occurrence of the mutation in PFAM domains, and the relevance of the residues in determining kinase subfamily specificity from S3Det. This predictor yields interesting results that compare favourably with other methods in the field when applied to protein kinases. Together with the predictions, wKinMut offers a number of integrated services for the analysis of mutations. These include: the classification of the kinase, information about associations of the kinase with other proteins extracted from iHop, the mapping of the mutations onto PDB structures, pathogenicity records from a number of databases and the classification of mutations in large-scale cancer studies. Importantly, wKinMut is connected with the SNP2L system that extracts mentions of mutations directly from the literature, and therefore increases the possibilities of finding interesting functional information associated with the studied mutations. CONCLUSIONS: wKinMut facilitates the exploration of the information available about individual mutations by integrating prediction approaches with the automatic extraction of information from the literature (text mining) and several state-of-the-art databases. wKinMut has been used during the last year for the analysis of the consequences of mutations in the context of a number of cancer genome projects, including the recent analysis of Chronic Lymphocytic Leukemia cases and is publicly available at http://wkinmut.bioinfo.cnio.es.


Subjects
Computational Biology/methods; Leukemia, Lymphocytic, Chronic, B-Cell/enzymology; Leukemia, Lymphocytic, Chronic, B-Cell/genetics; Mutation/genetics; Protein Kinases/chemistry; Databases, Protein/trends; ErbB Receptors/genetics; Humans; Information Storage and Retrieval/methods; Leukemia, Lymphocytic, Chronic, B-Cell/etiology; Phenotype; Predictive Value of Tests; Protein Kinases/classification; Protein Kinases/genetics; Protein Stability
15.
PLoS One ; 8(11): e80023, 2013.
Article in English | MEDLINE | ID: mdl-24265793

ABSTRACT

BACKGROUND: An increased number of single nucleotide substitutions is seen in breast and ovarian cancer genomes carrying disease-associated mutations in BRCA1 or BRCA2. The significance of these genome-wide mutations is unknown. We hypothesize that genome-wide mutation burden mirrors deficiencies in DNA repair and is associated with treatment outcome in ovarian cancer. METHODS AND RESULTS: The total number of synonymous and non-synonymous exome mutations (Nmut), and the presence of germline or somatic mutation in BRCA1 or BRCA2 (mBRCA) were extracted from whole-exome sequences of high-grade serous ovarian cancers from The Cancer Genome Atlas (TCGA). Cox regression and Kaplan-Meier methods were used to correlate Nmut with chemotherapy response and outcome. Higher Nmut correlated with a better response to chemotherapy after surgery. In patients with mBRCA-associated cancer, low Nmut was associated with shorter progression-free survival (PFS) and overall survival (OS), independent of other prognostic factors in multivariate analysis. Patients with mBRCA-associated cancers and a high Nmut had remarkably favorable PFS and OS. The association with survival was similar in cancers with either BRCA1 or BRCA2 mutations. In cancers with wild-type BRCA, tumor Nmut was associated with treatment response in patients with no residual disease after surgery. CONCLUSIONS: Tumor Nmut was associated with treatment response and with both PFS and OS in patients with high-grade serous ovarian cancer carrying BRCA1 or BRCA2 mutations. In the TCGA cohort, low Nmut predicted resistance to chemotherapy and shorter PFS and OS, while high Nmut forecast a remarkably favorable outcome in mBRCA-associated ovarian cancer. Our observations suggest that the total mutation burden coupled with BRCA1 or BRCA2 mutations in ovarian cancer is a genomic marker of prognosis and predictor of treatment response. This marker may reflect the degree of deficiency in BRCA-mediated pathways, or the extent of compensation for the deficiency by alternative mechanisms.


Subjects
Genes, BRCA1; Genes, BRCA2; Mutation; Ovarian Neoplasms/genetics; Age Factors; Chromosome Aberrations; Drug Resistance, Neoplasm/genetics; Exome; Female; Genome-Wide Association Study; Germ-Line Mutation; Humans; Loss of Heterozygosity; Neoplasm Grading; Neoplasm Staging; Ovarian Neoplasms/mortality; Ovarian Neoplasms/pathology; Ovarian Neoplasms/therapy; Prognosis; Treatment Outcome
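
The abstract of this record names Kaplan-Meier methods as one of the tools used to relate Nmut to outcome. As a minimal sketch of what that estimator computes, the code below implements the textbook product-limit formula on made-up follow-up data. The function name, the variable names and the numbers are hypothetical and do not come from the study or from TCGA; the Cox regression part of the analysis is not shown.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. progression or death) was observed,
              0 if the patient was censored at that time
    Returns (time, estimated survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []

    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= n_leaving
    return curve


# Made-up follow-up times (months) and event indicators for five patients.
example_times = [1, 2, 2, 3, 4]
example_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(example_times, example_events):
    print(f"t={t}: S(t)={s:.2f}")
```

To compare groups, such as high versus low Nmut, one would compute one curve per group and test the difference, for example with a log-rank test. A maintained survival-analysis library is preferable to hand-written code for real data.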
16.
PLoS One ; 8(7): e68370, 2013.
Article in English | MEDLINE | ID: mdl-23935863

ABSTRACT

We have developed a sequence conservation-based artificial neural network predictor called NetDiseaseSNP which classifies nsSNPs as disease-causing or neutral. Our method uses the excellent alignment generation algorithm of SIFT to identify related sequences and a combination of 31 features assessing sequence conservation and the predicted surface accessibility to produce a single score which can be used to rank nsSNPs based on their potential to cause disease. NetDiseaseSNP classifies successfully disease-causing and neutral mutations. In addition, we show that NetDiseaseSNP discriminates cancer driver and passenger mutations satisfactorily. Our method outperforms other state-of-the-art methods on several disease/neutral datasets as well as on cancer driver/passenger mutation datasets and can thus be used to pinpoint and prioritize plausible disease candidates among nsSNPs for further investigation. NetDiseaseSNP is publicly available as an online tool as well as a web service: http://www.cbs.dtu.dk/services/NetDiseaseSNP.


Subjects
Computational Biology/methods; Genetic Predisposition to Disease; Mutation; Neoplasms/diagnosis; Neoplasms/genetics; Polymorphism, Single Nucleotide; Algorithms; Base Sequence; Conserved Sequence; Female; Humans; Male; Molecular Sequence Data; Neural Networks, Computer; Sequence Alignment
17.
Front Physiol ; 3: 323, 2012.
Article in English | MEDLINE | ID: mdl-23055974

ABSTRACT

Protein kinases play a crucial role in a plethora of significant physiological functions and a number of mutations in this superfamily have been reported in the literature to disrupt protein structure and/or function. Computational and experimental research aims to discover the mechanistic connection between mutations in protein kinases and disease with the final aim of predicting the consequences of mutations on protein function and the subsequent phenotypic alterations. In this article, we will review the possibilities and limitations of current computational methods for the prediction of the pathogenicity of mutations in the protein kinase superfamily. In particular we will focus on the problem of benchmarking the predictions with independent gold standard datasets. We will propose a pipeline for the curation of mutations automatically extracted from the literature. Since many of these mutations are not included in the databases that are commonly used to train the computational methods to predict the pathogenicity of protein kinase mutations, we propose using them to build a valuable gold standard dataset for benchmarking a number of these predictors. Finally, we will discuss how text mining approaches constitute a powerful tool for the interpretation of the consequences of mutations in the context of disease genome analysis with particular focus on cancer.

18.
BMC Genomics ; 13 Suppl 4: S3, 2012 Jun 18.
Article in English | MEDLINE | ID: mdl-22759651

ABSTRACT

BACKGROUND: Most of the many mutations described in human protein kinases are tolerated without significant disruption of the corresponding structures or molecular functions, while some of them have been associated with a variety of human diseases, including cancer. In the last decade, a plethora of computational methods to predict the effect of missense single-nucleotide variants (SNVs) have been developed. Still, current high-throughput sequencing efforts and the concomitant need for massive interpretation of protein sequence variants will demand more efficient and/or accurate computational methods in the forthcoming years. RESULTS: We present KinMut, a support vector machine (SVM) approach, to identify pathogenic mutations in the protein kinase superfamily. KinMut relies on a combination of sequence-derived features that describe mutations at different levels: (1) Gene level: membership to a specific group in Kinbase and the annotation with GO terms; (2) Domain level: annotated PFAM domains; and (3) Residue level: physicochemical features of amino acids, specificity determining positions, and functional annotations from SwissProt and FireDB. The system has been trained with the set of 3492 human kinase mutations in UniProt for which experimental validation of their pathogenic or neutral character exists. In addition, we discuss the relative importance of these independent properties and their combination for the development of a kinase-specific predictor. Finally, we compare KinMut with other state-of-the-art prediction methods. CONCLUSIONS: Family-specific features appear among the most discriminative information sources, which allow us to produce accurate results in a reliable and very simple way with minimal supervision. Our study aims to broaden the knowledge on the mechanisms by which mutations in the human kinome contribute to disease with a particular focus on cancer. The classifier as well as further documentation is available at http://kinmut.bioinfo.cnio.es/.


Subjects
Protein Kinases/genetics , Databases, Protein , Humans , Mutation , Neoplasms/genetics , Support Vector Machine
19.
BMC Bioinformatics ; 12 Suppl 4: S1, 2011.
Article in English | MEDLINE | ID: mdl-21992016

ABSTRACT

BACKGROUND: Protein Kinases are a superfamily of proteins involved in crucial cellular processes such as cell cycle regulation and signal transduction. Accordingly, they play an important role in cancer biology. To contribute to the study of the relation between kinases and disease, we compared pathogenic mutations to neutral mutations as an extension of our previous analysis of cancer somatic mutations. First, we analyzed native and mutant proteins in terms of amino acid composition. Second, mutations were characterized according to their potential structural effects. Finally, we assessed the location of the different classes of polymorphisms with respect to kinase-relevant positions in terms of subfamily specificity, conservation, accessibility and functional sites. RESULTS: Pathogenic Protein Kinase mutations perturb essential aspects of protein function, including disruption of substrate binding and/or effector recognition at family-specific positions. Interestingly, these mutations in Protein Kinases display a tendency to avoid structurally relevant positions, which represents a significant difference with respect to the average distribution of pathogenic mutations in other protein families. CONCLUSIONS: Disease-associated mutations display sound differences with respect to neutral mutations: several amino acids are specific to each mutation type, different structural properties characterize each class, and the distribution of pathogenic mutations within the consensus structure of the Protein Kinase domain is substantially different from that of non-pathogenic mutations. This preferential distribution confirms previous observations about the functional and structural distribution of the controversial cancer driver and passenger somatic mutations and their use as a proxy for the study of the involvement of somatic mutations in cancer development.


Subjects
Germ-Line Mutation , Point Mutation , Protein Kinases/genetics , Humans , Models, Molecular , Neoplasms/genetics , Nuclear Receptor Subfamily 4, Group A, Member 2 , Protein Binding , Protein Kinases/chemistry , Protein Kinases/metabolism , Protein Structure, Tertiary , Signal Transduction
20.
BMC Bioinformatics ; 10 Suppl 8: S1, 2009 Aug 27.
Article in English | MEDLINE | ID: mdl-19758464

ABSTRACT

BACKGROUND: There is considerable interest in characterizing the biological role of specific protein residue substitutions through mutagenesis experiments. Additionally, recent efforts related to the detection of disease-associated SNPs have motivated both the manual annotation and the automatic extraction of naturally occurring sequence variations from the literature, especially for protein families that play a significant role in signaling processes, such as kinases. Systematic integration and comparison of kinase mutation information from multiple sources, covering literature, manual annotation databases and large-scale experiments, can result in a more comprehensive view of functional, structural and disease-associated aspects of protein sequence variants. Previously published mutation extraction approaches did not sufficiently distinguish between two fundamentally different categories of variation origin, namely naturally occurring mutations and induced mutations generated through in vitro experiments. RESULTS: We present a literature mining pipeline for the automatic extraction and disambiguation of single-point mutation mentions from both abstracts and full-text articles, followed by a sequence validation check to link mutations to their corresponding kinase protein sequences. Each mutation is scored according to whether it corresponds to an induced mutation or a natural sequence variant. We were able to provide direct literature links for a considerable fraction of previously annotated kinase mutations, thus enabling more efficient interpretation of their biological characterization and experimental context. In order to test the capabilities of the presented pipeline, the mutations in the protein kinase domain of the kinase family were analyzed. Using our literature extraction system, we were able to recover a total of 643 mutation-protein associations from PubMed abstracts and 6,970 from a large collection of full-text articles. When compared to state-of-the-art annotation databases and high-throughput genotyping studies, the mutation mentions extracted from the literature overlap to a good extent with the existing knowledgebases, whereas the remaining mentions suggest new mutation records that were not previously annotated in the databases. CONCLUSION: Using the proposed residue disambiguation and classification approach, we were able to differentiate between natural variant and mutagenesis types of mutations with an accuracy of 93.88. The resulting system is useful for constructing a Gold Standard set of mutations extracted from the literature by human experts with minimal manual curation effort, providing direct pointers to relevant evidence sentences. Our system is able to recover mutations from the literature that are not present in state-of-the-art databases. Manual validation by human experts of 100 literature-extracted mutations from PubMed abstracts showed that almost three quarters (72%) were correct, and more than half of these had not been previously annotated in databases.
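The two pipeline stages named in the abstract above (extraction of single-point mutation mentions, then a sequence validation check) can be illustrated with a minimal sketch. This is not the authors' implementation: the regular expression, the function names `extract_mentions` and `validate`, and the toy sequence are all assumptions made for illustration, and the induced-versus-natural scoring step is not covered.

```python
import re

# Three-letter to one-letter amino acid codes (standard 20 residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

AA = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical pattern: matches mentions such as "L858R" or "Gly719Ser".
MENTION = re.compile(
    rf"\b(?:([{AA}])(\d+)([{AA}])|([A-Z][a-z]{{2}})(\d+)([A-Z][a-z]{{2}}))\b"
)


def extract_mentions(text):
    """Return (wild_type, position, mutant) tuples found in free text."""
    mentions = []
    for m in MENTION.finditer(text):
        if m.group(1):  # one-letter form, e.g. L858R
            wt, pos, mut = m.group(1), int(m.group(2)), m.group(3)
        else:  # three-letter form, e.g. Gly719Ser
            wt = THREE_TO_ONE.get(m.group(4))
            mut = THREE_TO_ONE.get(m.group(6))
            pos = int(m.group(5))
            if wt is None or mut is None:
                continue  # not a recognized residue name
        if wt != mut:
            mentions.append((wt, pos, mut))
    return mentions


def validate(mention, sequence):
    """Sequence check: the wild-type residue must sit at the 1-based position."""
    wt, pos, _ = mention
    return 1 <= pos <= len(sequence) and sequence[pos - 1] == wt


if __name__ == "__main__":
    found = extract_mentions("The L858R and Gly719Ser substitutions; H2O is ignored.")
    print(found)
    # Toy sequence for illustration only, not a real kinase.
    print(validate(("L", 3, "R"), "MKLV"))
```

A mention that passes `validate` against a candidate protein sequence is consistent with that protein; a mention that fails is either linked to the wrong protein or uses a different numbering scheme, which is why a validation step of this kind helps disambiguate which kinase a mention refers to.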


Subjects
Information Storage and Retrieval/methods , Mutation , Protein Kinases/genetics , Databases, Genetic , ErbB Receptors/genetics , Genomics , Genotype , Humans , Periodicals as Topic , Reproducibility of Results , Sequence Analysis, DNA