1.
PLoS Comput Biol ; 9(10): e1003234, 2013.
Article in English | MEDLINE | ID: mdl-24098098

ABSTRACT

Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by combining eight kinds of models: (i) independent and identically distributed process; (ii) variable-length Markov chain; (iii) inhomogeneous Markov chain; (iv) hidden Markov model; (v) profile hidden Markov model; (vi) pair hidden Markov model; (vii) generalized hidden Markov model (GHMM); and (viii) similarity-based sequence weighting. The framework includes functionality for training, simulation, and decoding of the models. Additionally, it provides two methods to help with parameter setting: the Akaike and Bayesian information criteria (AIC and BIC). The models can be used stand-alone, combined in Bayesian classifiers, or included in more complex, multi-model, probabilistic architectures using GHMMs. In particular, the framework provides a novel, flexible implementation of decoding in GHMMs that detects when the architecture can be traversed efficiently.


Subject(s)
Computational Biology/methods , Markov Chains , Sequence Analysis/methods , Bayes Theorem , CpG Islands/genetics
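
The abstract of record 1 mentions AIC and BIC as aids for parameter setting. As a rough illustration only, the sketch below picks the order of a fixed-order Markov chain over a DNA sequence using the textbook definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. It does not use or reproduce the ToPS API; the function names `markov_log_likelihood` and `select_order`, the 4-letter alphabet default, and the choice to score every order on the same positions are all assumptions made for this example.

```python
import math
from collections import Counter


def markov_log_likelihood(seq, order, start):
    """Maximized log-likelihood of seq[start:] under an order-`order` Markov chain.

    Transition probabilities are the maximum-likelihood estimates (relative
    counts) taken from the same sequence; no pseudocounts are applied.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        context = seq[i - order:i]  # empty string when order == 0 (i.i.d. model)
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1
    return sum(
        n * math.log(n / context_counts[context])
        for (context, _symbol), n in transition_counts.items()
    )


def select_order(seq, max_order, alphabet_size=4):
    """Return the orders in 0..max_order that minimize AIC and BIC.

    Hypothetical helper: every order is scored on the same positions
    (max_order onward) so that the likelihoods are comparable.
    """
    n_obs = len(seq) - max_order
    scores = {}
    for order in range(max_order + 1):
        # Free parameters: one distribution over the alphabet per context.
        k = alphabet_size ** order * (alphabet_size - 1)
        log_lik = markov_log_likelihood(seq, order, start=max_order)
        aic = 2 * k - 2 * log_lik
        bic = k * math.log(n_obs) - 2 * log_lik
        scores[order] = (aic, bic)
    return {
        "aic": min(scores, key=lambda o: scores[o][0]),
        "bic": min(scores, key=lambda o: scores[o][1]),
    }
```

For a strictly periodic toy sequence such as `"ACGT" * 50`, `select_order(seq, max_order=2)` selects order 1 under both criteria: a first-order chain already predicts each symbol exactly, and the second-order chain only adds parameters that both criteria penalize.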
2.
bioRxiv ; 2024 Jan 06.
Article in English | MEDLINE | ID: mdl-38260545

ABSTRACT

Research and medical genomics require comprehensive and scalable solutions to drive the discovery of novel disease targets, evolutionary drivers, and genetic markers with clinical significance. This necessitates a framework to identify all types of variants independent of their size (e.g., SNV/SV) or location (e.g., repeats). Here we present DRAGEN, which uses novel methods based on multigenomes, hardware acceleration, and machine-learning-based variant detection to provide novel insights into individual genomes with ~30 min computation time (from raw reads to variant detection). DRAGEN outperforms all other state-of-the-art methods in speed and accuracy across all variant types (SNV, indel, STR, SV, CNV) and further incorporates specialized methods to obtain key insights in medically relevant genes (e.g., HLA, SMN, GBA). We showcase DRAGEN across 3,202 genomes and demonstrate its scalability, accuracy, and innovations to further advance the integration of comprehensive genomics for research and medical applications.

3.
Genome Biol ; 23(1): 2, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34980216

ABSTRACT

BACKGROUND: Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. RESULTS: To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. CONCLUSIONS: Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.


Subject(s)
Human Genome , Single Nucleotide Polymorphism , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Reproducibility of Results , Whole Genome Sequencing
4.
Science ; 361(6409), 2018 09 28.
Article in English | MEDLINE | ID: mdl-30139913

ABSTRACT

To assess the impact of genetic variation in regulatory loci on human health, we constructed a high-resolution map of allelic imbalances in DNA methylation, histone marks, and gene transcription in 71 epigenomes from 36 distinct cell and tissue types from 13 donors. Deep whole-genome bisulfite sequencing of 49 methylomes revealed sequence-dependent CpG methylation imbalances at thousands of heterozygous regulatory loci. Such loci are enriched for stochastic switching, which is defined as random transitions between fully methylated and unmethylated states of DNA. The methylation imbalances at thousands of loci are explainable by different relative frequencies of the methylated and unmethylated states for the two alleles. Further analyses provided a unifying model that links sequence-dependent allelic imbalances of the epigenome, stochastic switching at gene regulatory loci, and disease-associated genetic variation.


Subject(s)
Allelic Imbalance , DNA Methylation , Disease/genetics , Genetic Epigenesis , Human Genome , Single Nucleotide Polymorphism , Alleles , Binding Sites , CpG Islands , Gene Regulatory Networks , Genetic Loci , Genome-Wide Association Study , Humans , DNA Sequence Analysis , Sulfites/chemistry , Transcription Factors/metabolism
5.
Cell Rep ; 17(8): 2075-2086, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27851969

ABSTRACT

Cancer progression depends on both cell-intrinsic processes and interactions between different cell types. However, large-scale assessment of cell type composition and molecular profiles of individual cell types within tumors remains challenging. To address this, we developed epigenomic deconvolution (EDec), an in silico method that infers cell type composition of complex tissues as well as DNA methylation and gene transcription profiles of constituent cell types. By applying EDec to The Cancer Genome Atlas (TCGA) breast tumors, we detect changes in immune cell infiltration related to patient prognosis, and a striking change in stromal fibroblast-to-adipocyte ratio across breast cancer subtypes. Furthermore, we show that a less adipose stroma tends to display lower levels of mitochondrial activity and to be associated with cancerous cells with higher levels of oxidative metabolism. These findings highlight the role of stromal composition in the metabolic coupling between distinct cell types within tumors.


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Epigenomics , Adipose Tissue/pathology , Breast Neoplasms/immunology , Breast Neoplasms/pathology , Carcinogenesis/genetics , Carcinogenesis/pathology , Tumor Cell Line , Computer Simulation , DNA Methylation/genetics , Disease Progression , Female , Neoplastic Gene Expression Regulation , Humans , Oxidation-Reduction , Phenotype , Reproducibility of Results , DNA Sequence Analysis , Stromal Cells/pathology , Tumor Microenvironment/genetics
6.
Nat Commun ; 6: 6370, 2015 Feb 18.
Article in English | MEDLINE | ID: mdl-25691256

ABSTRACT

Tissue-specific expression of lincRNAs suggests developmental and cell-type-specific functions, yet tissue specificity was established for only a small fraction of lincRNAs. Here, by analysing 111 reference epigenomes from the NIH Roadmap Epigenomics project, we determine tissue-specific epigenetic regulation for 3,753 (69% examined) lincRNAs, with 54% active in one of the 14 cell/tissue clusters and an additional 15% in two or three clusters. A larger fraction of lincRNA TSSs is marked in a tissue-specific manner by H3K4me1 than by H3K4me3. The tissue-specific lincRNAs are strongly linked to tissue-specific pathways and undergo distinct chromatin state transitions during cellular differentiation. Polycomb-regulated lincRNAs reside in the bivalent state in embryonic stem cells and many of them undergo H3K27me3-mediated silencing at early stages of differentiation. The exquisitely tissue-specific epigenetic regulation of lincRNAs and the assignment of a majority of them to specific tissue types will inform future studies of this newly discovered class of genes.


Subject(s)
Cell Differentiation , Genetic Epigenesis , Epigenomics , Long Noncoding RNA/metabolism , Transcriptional Regulatory Elements , Embryonic Stem Cells/physiology , Humans , Organ Specificity , Phenotype , Polycomb-Group Proteins/physiology