1.
Stat Appl Genet Mol Biol ; 18(6), 2019 11 16.
Article in English | MEDLINE | ID: mdl-31734658

ABSTRACT

DNA methylation and gene expression are interdependent and both implicated in cancer development and progression, with many individual biomarkers discovered. A joint analysis of the two data types can potentially lead to biological insights that are not discoverable with separate analyses. To optimally leverage the joint data for identifying perturbed genes and classifying clinical cancer samples, it is important to accurately model the interactions between the two data types. Here, we present EBADIMEX for jointly identifying differential expression and methylation and classifying samples. The moderated t-test widely used with empirical Bayes priors in current differential expression methods is generalised to a multivariate setting by developing: (1) a moderated Welch t-test for equality of means with unequal variances; (2) a moderated F-test for equality of variances; and (3) a multivariate test for equality of means with equal variances. This leads to parametric models with prior distributions for the parameters, which allow fast evaluation and robust analysis of small data sets. EBADIMEX is demonstrated on simulated data as well as a large breast cancer (BRCA) cohort from TCGA. We show that the use of empirical Bayes priors and moderated tests works particularly well on small data sets.
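
To illustrate the standard moderated t-test that the abstract takes as its starting point, the following is a minimal sketch of the univariate, equal-variance case. It is not the EBADIMEX implementation and does not cover the Welch, F-test, or multivariate extensions. The function name `moderated_t` is hypothetical, and the prior degrees of freedom `d0` and prior variance `s0_sq` are assumed to be given; in an empirical Bayes analysis they would be estimated from all genes.

```python
import math
from statistics import mean, variance


def moderated_t(x, y, d0, s0_sq):
    """Two-sample moderated t-statistic assuming equal variances.

    d0 and s0_sq are the prior degrees of freedom and prior variance
    (assumed known here; normally estimated across all genes).
    Returns the statistic and its degrees of freedom under the null.
    """
    n1, n2 = len(x), len(y)
    d = n1 + n2 - 2  # residual degrees of freedom
    # Ordinary pooled sample variance
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / d
    # Shrink the sample variance towards the prior variance
    s_post = (d0 * s0_sq + d * pooled) / (d0 + d)
    t = (mean(x) - mean(y)) / math.sqrt(s_post * (1 / n1 + 1 / n2))
    return t, d0 + d


if __name__ == "__main__":
    tumour = [1.0, 2.0, 3.0]   # made-up values for illustration only
    normal = [2.0, 3.0, 4.0]
    print(moderated_t(tumour, normal, d0=4, s0_sq=0.5))
```

With `d0 = 0` the statistic reduces to the ordinary pooled two-sample t-statistic; larger `d0` pulls the per-gene variance more strongly towards the prior, which is what stabilises the test on small data sets.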


Subjects
Bayes Theorem , Computational Biology/methods , DNA Methylation , Epigenomics/methods , Gene Expression Profiling/methods , Algorithms , Genetic Databases , Gene Expression Regulation , Humans , Statistical Models , Reproducibility of Results , Transcriptome
2.
Bioinformatics ; 32(17): 2626-35, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27153612

ABSTRACT

MOTIVATION: Recently, new RNA secondary structure probing techniques have been developed, including Next Generation Sequencing based methods capable of probing transcriptome-wide. These techniques hold great promise for improving structure prediction accuracy. However, each new data type comes with its own signal properties and biases, which may even be experiment specific. There is therefore a growing need for RNA structure prediction methods that can be automatically trained on new data types and readily extended to integrate and fully exploit multiple types of data. RESULTS: Here, we develop and explore a modular probabilistic approach for integrating probing data in RNA structure prediction. It can be automatically trained given a set of known structures with probing data. The approach is demonstrated on SHAPE datasets, where we evaluate and selectively model specific correlations. The approach often makes superior use of the probing data signal compared to other methods. We illustrate the use of ProbFold on multiple data types using both simulations and a small set of structures with SHAPE, DMS and CMCT data. Technically, the approach combines stochastic context-free grammars (SCFGs) with probabilistic graphical models. This approach allows rapid adaptation and integration of new probing data types. AVAILABILITY AND IMPLEMENTATION: ProbFold is implemented in C++. Models are specified using simple textual formats. Data reformatting is done using separate C++ programs. Source code, statically compiled binaries for x86 Linux machines, C++ programs, example datasets and a tutorial are available from http://moma.ki.au.dk/prj/probfold/ CONTACT: jakob.skou@clin.au.dk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
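
The general idea of treating probing data as probabilistic evidence can be shown with a one-position toy example. This is only a sketch of Bayes' rule, not ProbFold's SCFG and graphical-model machinery. The function name `posterior_paired` and all numbers are hypothetical: the prior stands in for a pairing probability from a structure model, and the two likelihoods stand in for trained distributions of a probing signal in paired and unpaired positions.

```python
def posterior_paired(prior_paired, lik_paired, lik_unpaired):
    """Posterior probability that a nucleotide is paired, by Bayes' rule.

    prior_paired : prior pairing probability (hypothetical model output)
    lik_paired   : likelihood of the observed probing signal if paired
    lik_unpaired : likelihood of the observed probing signal if unpaired
    """
    numerator = prior_paired * lik_paired
    evidence = numerator + (1.0 - prior_paired) * lik_unpaired
    return numerator / evidence


if __name__ == "__main__":
    # A high-reactivity reading that is four times more likely in
    # unpaired positions lowers the pairing probability below the prior.
    print(posterior_paired(prior_paired=0.5, lik_paired=0.2, lik_unpaired=0.8))
```

Because the likelihood terms are the only part that depends on the data type, swapping in a different trained distribution is enough to handle a new probing method in this simplified picture.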


Subjects
High-Throughput Nucleotide Sequencing , Statistical Models , RNA , Algorithms , Nucleic Acid Conformation
3.
Bioinformatics ; 32(9): 1353-65, 2016 05 01.
Article in English | MEDLINE | ID: mdl-26740525

ABSTRACT

MOTIVATION: Cancer development and progression are driven by a complex pattern of genomic and epigenomic perturbations. Both types of perturbations can affect gene expression levels and disease outcome. Integrative analysis of cancer genomics data may therefore improve detection of perturbed genes and prediction of disease state. As different data types are usually dependent, analysis based on independence assumptions will make inefficient use of the data and potentially lead to false conclusions. MODEL: Here, we present PINCAGE (Probabilistic INtegration of CAncer GEnomics data), a method that uses probabilistic integration of cancer genomics data for combined evaluation of RNA-seq gene expression and 450k array DNA methylation measurements of promoters as well as gene bodies. It models the dependence between expression and methylation using modular graphical models, which also allows future inclusion of additional data types. RESULTS: We apply our approach to a Breast Invasive Carcinoma dataset from The Cancer Genome Atlas consortium, which includes 82 adjacent normal and 730 cancer samples. We identify new biomarker candidates of breast cancer development (PTF1A, RABIF, RAG1AP1, TIMM17A, LOC148145) and progression (SERPINE3, ZNF706). PINCAGE discriminates better between normal and tumour tissue and between progressing and non-progressing tumours in comparison with established methods that assume independence between tested data types, especially when using evidence from multiple genes. Our method can be applied to any type of cancer or, more generally, to any genomic disease for which a sufficient amount of molecular data is available. AVAILABILITY AND IMPLEMENTATION: R scripts available at http://moma.ki.au.dk/prj/pincage/ CONTACT: michal.switnicki@clin.au.dk or jakob.skou@clin.au.dk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Breast Neoplasms , Neoplastic Gene Expression Regulation , Genomics , DNA Methylation , Epigenomics , Genomics/methods , Humans
4.
NPJ Genom Med ; 3: 1, 2018.
Article in English | MEDLINE | ID: mdl-29354286

ABSTRACT

Cancer develops by accumulation of somatic driver mutations, which impact cellular function. Mutations in non-coding regulatory regions can now be studied genome-wide and further characterized by correlation with gene expression and clinical outcome to identify driver candidates. Using a new two-stage procedure, called ncDriver, we first screened 507 ICGC whole-genomes from 10 cancer types for non-coding elements, in which mutations are both recurrent and have elevated conservation or cancer specificity. This identified 160 significant non-coding elements, including the TERT promoter, a well-known non-coding driver element, as well as elements associated with known cancer genes and regulatory genes (e.g., PAX5, TOX3, PCF11, MAPRE3). However, in some significant elements, mutations appear to stem from localized mutational processes rather than recurrent positive selection. To further characterize the driver potential of the identified elements and shortlist candidates, we identified elements where presence of mutations correlated significantly with expression levels (e.g., TERT and CDH10) and survival (e.g., CDH9 and CDH10) in an independent set of 505 TCGA whole-genome samples. In a larger pan-cancer set of 4128 TCGA exomes with expression profiling, we identified mutational correlation with expression for additional elements (e.g., near GATA3, CDC6, ZNF217, and CTCF transcription factor binding sites). Survival analysis further pointed to MIR122, a known marker of poor prognosis in liver cancer. In conclusion, the screen for significant mutation patterns coupled with correlative mutational analysis identified new individual driver candidates and suggests that some non-coding mutations recurrently affect expression and play a role in cancer development.

5.
Elife ; 6, 2017 03 31.
Article in English | MEDLINE | ID: mdl-28362259

ABSTRACT

Non-coding mutations may drive cancer development. Statistical detection of non-coding driver regions is challenged by a varying mutation rate and uncertainty of functional impact. Here, we develop a statistically founded non-coding driver-detection method, ncdDetect, which includes sample-specific mutational signatures, long-range mutation rate variation, and position-specific impact measures. Using ncdDetect, we screened non-coding regulatory regions of protein-coding genes across a pan-cancer set of whole-genomes (n = 505), which top-ranked known drivers and identified new candidates. For individual candidates, presence of non-coding mutations associates with altered expression or decreased patient survival across an independent pan-cancer sample set (n = 5454). This includes an antigen-presenting gene (CD1A), where 5'UTR mutations correlate significantly with decreased survival in melanoma. Additionally, mutations in a base-excision-repair gene (SMUG1) correlate with a C-to-T mutational-signature. Overall, we find that a rich model of mutational heterogeneity facilitates non-coding driver identification and integrative analysis points to candidates of potential clinical relevance.


Subjects
Carcinogenesis , Mutation Rate , Mutation , Neoplasms/pathology , Neoplasms/physiopathology , Biostatistics/methods , Gene Expression Profiling , Humans , Survival Analysis
6.
Oncotarget ; 8(4): 5774-5788, 2017 Jan 24.
Article in English | MEDLINE | ID: mdl-28052017

ABSTRACT

PURPOSE: The lack of biomarkers that can distinguish aggressive from indolent prostate cancer has caused substantial overtreatment of clinically insignificant disease. Here, by genome-wide DNA methylome profiling, we sought to identify new biomarkers to improve the accuracy of prostate cancer diagnosis and prognosis. EXPERIMENTAL DESIGN: Eight novel candidate markers, COL4A6, CYBA, TCAF1 (FAM115A), HLF, LINC01341 (LOC149134), LRRC4, PROM1, and RHCG, were selected from Illumina Infinium HumanMethylation450 BeadChip analysis of 21 tumor (T) and 21 non-malignant (NM) prostate specimens. Diagnostic potential was further investigated by methylation-specific qPCR analysis of 80 NM vs. 228 T tissue samples. Prognostic potential was assessed by Kaplan-Meier, uni- and multivariate Cox regression analysis in 203 Danish radical prostatectomy (RP) patients (cohort 1), and validated in an independent cohort of 286 RP patients from Switzerland and the U.S. (cohort 2). RESULTS: Hypermethylation of the 8 candidates was highly cancer-specific (area under the curves: 0.79-1.00). Furthermore, high methylation of the 2-gene panel RHCG-TCAF1 was predictive of biochemical recurrence (BCR) in cohort 1, independent of the established clinicopathological parameters Gleason score, pathological tumor stage, and pre-operative PSA (HR (95% confidence interval (CI)): 2.09 (1.26 - 3.46); P = 0.004), and this was successfully validated in cohort 2 (HR (95% CI): 1.81 (1.05 - 3.12); P = 0.032). CONCLUSION: Methylation of the RHCG-TCAF1 panel adds significant independent prognostic value to established prognostic parameters for prostate cancer and thus may help to guide treatment decisions in the future. Further investigation in large independent cohorts is necessary before translation into clinical utility.


Subjects
Tumor Biomarkers/genetics , Cation Transport Proteins/genetics , DNA Methylation , Membrane Glycoproteins/genetics , Membrane Proteins/genetics , Prostatic Neoplasms/surgery , Adult , Aged , Denmark , Genetic Epigenesis , Humans , Male , Middle Aged , Neoplasm Grading , Prognosis , Genetic Promoter Regions , Prostatectomy , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , Survival Analysis , Switzerland , United States
7.
PLoS One ; 9(2): e81186, 2014.
Article in English | MEDLINE | ID: mdl-24558356

ABSTRACT

Thyroid hormone (TH) receptors (TRs) play central roles in metabolism and are major targets for pharmaceutical intervention. Presently, however, there is limited information about the genome-wide localization of TR binding sites. Thus, complexities of TR genomic distribution and links between TRβ binding events and gene regulation are not fully appreciated. Here, we employ a BioChIP approach to capture TR genome-wide binding events in a liver cell line (HepG2). Like other NRs, TRβ appears widely distributed throughout the genome. Nevertheless, there is striking enrichment of TRβ binding sites immediately 5' and 3' of transcribed genes and TRβ can be detected near 50% of T3-induced genes. In contrast, no significant enrichment of TRβ is seen at negatively regulated genes or genes that respond to unliganded TRs in this system. Canonical TRE half-sites are present in more than 90% of TRβ peaks and classical TREs are also greatly enriched, but individual TRE organization appears highly variable with diverse half-site orientation and spacing. There is also significant enrichment of binding sites for TR-associated transcription factors, including AP-1 and CTCF, near TR peaks. We conclude that T3-dependent gene induction commonly involves proximal TRβ binding events but that far-distant binding events are needed for T3 induction of some genes and that distinct, indirect mechanisms are often at play in negative regulation and unliganded TR actions. Better understanding of the genomic context of TR binding sites will help us determine why TR regulates genes in different ways and determine possibilities for selective modulation of TR action.


Subjects
Binding Sites , Thyroid Hormone Receptors beta/metabolism , Animals , Cell Line , Gene Expression Regulation , Genome , Hep G2 Cells , Humans , Ligands , Liver/metabolism , Mice , Inbred C57BL Mice , Multigene Family , Oligonucleotide Array Sequence Analysis , Plasmids/metabolism , Protein Binding , RNA/chemistry , Response Elements , DNA Sequence Analysis
8.
Infect Genet Evol ; 12(8): 1911-6, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22986003

ABSTRACT

Pressure from the host immune system has been observed to lead to diversifying selection (which can be measured in terms of the pN/pS ratio). In this research we checked whether Plasmodium falciparum proteins containing experimentally evident epitopes from the IEDB database are subject to diversifying selection. We also investigated which life stage of this parasite and which proteins are subject to the strongest immune pressure. To answer these questions we used information about experimentally evident epitopes from P. falciparum that interact with the human immune system, and sequences of different isolates of P. falciparum obtained from PlasmoDB. We confirmed the expectation that proteins containing IEDB epitopes are subject to stronger diversifying selection, as evidenced by a higher pN/pS ratio. The stage with the highest average pN/pS ratio is the sporozoite. The greatest fraction of putative antigens is also present at this stage. We also found that the sporozoite stage is particularly interesting for further analysis as it potentially contains the highest number of unidentified epitopes.
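
As a rough illustration of the pN/pS ratio mentioned in the abstract, the sketch below computes it from already-counted polymorphisms and sites. It assumes the common definition (nonsynonymous polymorphisms per nonsynonymous site divided by synonymous polymorphisms per synonymous site); the abstract does not state how the authors counted sites or polymorphisms, and the function name `pn_ps_ratio` and the example counts are hypothetical.

```python
def pn_ps_ratio(nonsyn_polymorphisms, nonsyn_sites, syn_polymorphisms, syn_sites):
    """pN/pS from pre-computed counts for one gene.

    pN = nonsynonymous polymorphisms per nonsynonymous site
    pS = synonymous polymorphisms per synonymous site
    Raises ValueError when pS is zero, because the ratio is undefined.
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    p_n = nonsyn_polymorphisms / nonsyn_sites
    p_s = syn_polymorphisms / syn_sites
    if p_s == 0:
        raise ValueError("pS is zero; pN/pS is undefined")
    return p_n / p_s


if __name__ == "__main__":
    # Made-up counts: an excess of nonsynonymous polymorphism gives a
    # ratio above 1, the pattern read as diversifying selection.
    print(pn_ps_ratio(12, 300, 2, 100))
```

A ratio near 1 is usually read as neutral evolution and a ratio well below 1 as purifying selection, so comparing average ratios between protein groups or life stages is one way to rank them by apparent immune pressure.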


Subjects
Falciparum Malaria/parasitology , Plasmodium falciparum/genetics , Plasmodium falciparum/immunology , Protozoan Proteins/genetics , Protozoan Proteins/immunology , Animals , Factual Databases , Epitopes/genetics , Epitopes/immunology , Host-Parasite Interactions , Humans , Immunogenetic Phenomena , Life Cycle Stages , Falciparum Malaria/immunology , Molecular Models , Single Nucleotide Polymorphism/genetics , Proteome/analysis , Proteome/genetics , Proteome/immunology , Proteome/metabolism , Protozoan Proteins/metabolism , Genetic Selection