1.
Genome Biol ; 24(1): 240, 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37864197

ABSTRACT

Diversity-generating and mobile genetic elements are key to microbial and viral evolution and can result in evolutionary leaps. State-of-the-art algorithms to detect these elements have limitations. Here, we introduce DIVE, a new reference-free approach to overcome these limitations using information contained in sequencing reads alone. We show that DIVE has improved detection power compared to existing reference-based methods using simulations and real data. We use DIVE to rediscover and characterize the activity of known and novel elements and generate new biological hypotheses about the mobilome. Building on DIVE, we develop a reference-free framework capable of de novo discovery of mobile genetic elements.


Subject(s)
Gene Transfer, Horizontal; Interspersed Repetitive Sequences; DNA Transposable Elements
2.
Nucleic Acids Res ; 51(5): 2046-2065, 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36762477

ABSTRACT

Epigenetic information defines tissue identity and is largely inherited in development through DNA methylation. While studied mostly for mean differences, methylation also encodes stochastic change, defined as entropy in information theory. Analyzing allele-specific methylation in 49 human tissue sample datasets, we find that methylation entropy is associated with specific DNA binding motifs, regulatory DNA, and CpG density. Then applying information theory to 42 mouse embryo methylation datasets, we find that the contribution of methylation entropy to time- and tissue-specific patterns of development is comparable to the contribution of methylation mean, and methylation entropy is associated with sequence and chromatin features conserved with human. Moreover, methylation entropy is directly related to gene expression variability in development, suggesting a role for epigenetic entropy in developmental plasticity.


Subject(s)
DNA Methylation; Epigenesis, Genetic; Humans; Animals; Mice; DNA Methylation/genetics; Entropy; CpG Islands/genetics; DNA/genetics
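
The methylation entropy discussed in entry 2 can be illustrated with a small sketch. It computes the Shannon entropy of the empirical distribution of read-level methylation patterns in a region. The string encoding of patterns and both function names are assumptions made for illustration; the article's own estimator may be defined differently.

```python
import math
from collections import Counter


def methylation_entropy(patterns):
    """Shannon entropy (in bits) of the empirical distribution of patterns.

    `patterns` is a list of equal-length strings such as "101", one per read,
    where "1" marks a methylated and "0" an unmethylated CpG site.
    This encoding is a hypothetical choice for the example.
    """
    counts = Counter(patterns)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_entropy(patterns):
    """Entropy divided by its maximum (the number of CpG sites), in [0, 1]."""
    n_sites = len(patterns[0])
    return methylation_entropy(patterns) / n_sites


# Fully concordant reads carry no entropy; uniformly mixed reads carry the maximum.
ordered = ["11", "11", "11", "11"]
disordered = ["00", "01", "10", "11"]
print(methylation_entropy(ordered), normalized_entropy(disordered))
```

A region with identical reads scores 0, and a region where every pattern is equally likely scores 1 after normalization, which matches the reading of entropy as a measure of methylation disorder.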
3.
Sci Rep ; 11(1): 21619, 2021 Nov 03.
Article in English | MEDLINE | ID: mdl-34732768

ABSTRACT

High-throughput third-generation nanopore sequencing devices have enormous potential for simultaneously observing epigenetic modifications in human cells over large regions of the genome. However, signals generated by these devices are subject to considerable noise that can lead to unsatisfactory detection performance and hamper downstream analysis. Here we develop a statistical method, CpelNano, for the quantification and analysis of 5mC methylation landscapes using nanopore data. CpelNano takes into account nanopore noise by means of a hidden Markov model (HMM) in which the true but unknown ("hidden") methylation state is modeled through an Ising probability distribution that is consistent with methylation means and pairwise correlations, whereas nanopore current signals constitute the observed state. It then estimates the associated methylation potential energy function by employing the expectation-maximization (EM) algorithm and performs differential methylation analysis via permutation-based hypothesis testing. Using simulations and analysis of published data obtained from three human cell lines (GM12878, MCF-10A, and MDA-MB-231), we show that CpelNano can faithfully estimate DNA methylation potential energy landscapes, substantially improving current methods and leading to a powerful tool for the modeling and analysis of epigenetic landscapes using nanopore sequencing data.


Subject(s)
Algorithms; Breast Neoplasms/genetics; DNA Methylation; Epigenesis, Genetic; Lymphocytes/metabolism; Nanopore Sequencing/methods; Sequence Analysis, DNA/methods; Breast Neoplasms/pathology; Cells, Cultured; Female; Genome, Human; Humans
4.
Nat Biomed Eng ; 5(4): 360-376, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33859388

ABSTRACT

In cancer, linking epigenetic alterations to drivers of transformation has been difficult, in part because DNA methylation analyses must capture epigenetic variability, which is central to tumour heterogeneity and tumour plasticity. Here, by conducting a comprehensive analysis, based on information theory, of differences in methylation stochasticity in samples from patients with paediatric acute lymphoblastic leukaemia (ALL), we show that ALL epigenomes are stochastic and marked by increased methylation entropy at specific regulatory regions and genes. By integrating DNA methylation and single-cell gene-expression data, we arrived at a relationship between methylation entropy and gene-expression variability, and found that epigenetic changes in ALL converge on a shared set of genes that overlap with genetic drivers involved in chromosomal translocations across the disease spectrum. Our findings suggest that an epigenetically driven gene-regulation network, with UHRF1 (ubiquitin-like with PHD and RING finger domains 1) as a central node, links genetic drivers and epigenetic mediators in ALL.


Subject(s)
Epigenesis, Genetic; Models, Theoretical; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; CCAAT-Enhancer-Binding Proteins/genetics; Child; Core Binding Factor Alpha 2 Subunit/genetics; Cytogenetic Analysis; DNA Methylation; Entropy; Gene Editing; Gene Expression Regulation, Neoplastic; Humans; Oncogene Proteins, Fusion/genetics; Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology; RNA-Seq; Single-Cell Analysis; Stochastic Processes; Ubiquitin-Protein Ligases/genetics
5.
Epigenetics ; 15(8): 841-858, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32114880

ABSTRACT

Translocations of the KMT2A (MLL) gene define a biologically distinct and clinically aggressive subtype of acute myeloid leukaemia (AML), marked by a characteristic gene expression profile and few cooperating mutations. Although dysregulation of the epigenetic landscape in this leukaemia is particularly interesting given the low mutation frequency, its comprehensive analysis using whole genome bisulphite sequencing (WGBS) has not been previously performed. Here we investigated epigenetic dysregulation in nine MLL-rearranged (MLL-r) AML samples by comparing them to six normal myeloid controls, using a computational method that encapsulates mean DNA methylation measurements along with analyses of methylation stochasticity. We discovered a dramatically altered epigenetic profile in MLL-r AML, associated with genome-wide hypomethylation and a markedly increased DNA methylation entropy reflecting an increasingly disordered epigenome. Methylation discordance mapped to key genes and regulatory elements that included bivalent promoters and active enhancers. Genes associated with significant changes in methylation stochasticity recapitulated known MLL-r AML expression signatures, suggesting a role for the altered epigenetic landscape in the transcriptional programme initiated by MLL translocations. Accordingly, we established statistically significant associations between discordances in methylation stochasticity and gene expression in MLL-r AML, thus providing a link between the altered epigenetic landscape and the phenotype.


Subject(s)
DNA Methylation; Gene Expression Regulation, Neoplastic; Leukemia, Biphenotypic, Acute/genetics; Leukemia, Myeloid, Acute/genetics; Epigenesis, Genetic; Histone-Lysine N-Methyltransferase/genetics; Humans; Leukemia, Biphenotypic, Acute/metabolism; Leukemia, Myeloid, Acute/metabolism; Myeloid-Lymphoid Leukemia Protein/genetics; Transcriptome; Translocation, Genetic
6.
BMC Bioinformatics ; 20(1): 175, 2019 Apr 08.
Article in English | MEDLINE | ID: mdl-30961526

ABSTRACT

BACKGROUND: Establishment and maintenance of DNA methylation throughout the genome is an important epigenetic mechanism that regulates gene expression whose disruption has been implicated in human diseases like cancer. It is therefore crucial to know which genes, or other genomic features of interest, exhibit significant discordance in DNA methylation between two phenotypes. We have previously proposed an approach for ranking genes based on methylation discordance within their promoter regions, determined by centering a window of fixed size at their transcription start sites. However, we cannot use this method to identify statistically significant genomic features and handle features of variable length and with missing data. RESULTS: We present a new approach for computing the statistical significance of methylation discordance within genomic features of interest in single and multiple test/reference studies. We base the proposed method on a well-articulated hypothesis testing problem that produces p- and q-values for each genomic feature, which we then use to identify and rank features based on the statistical significance of their epigenetic dysregulation. We employ the information-theoretic concept of mutual information to derive a novel test statistic, which we can evaluate by computing Jensen-Shannon distances between the probability distributions of methylation in a test and a reference sample. We design the proposed methodology to simultaneously handle biological, statistical, and technical variability in the data, as well as variable feature lengths and missing data, thus enabling its widespread use on any list of genomic features. This is accomplished by estimating, from reference data, the null distribution of the test statistic as a function of feature length using generalized additive regression models. Differential assessment, using normal/cancer data from healthy fetal tissue and pediatric high-grade glioma patients, illustrates the potential of our approach to greatly facilitate the exploratory phases of clinically and biologically relevant methylation studies. CONCLUSIONS: The proposed approach provides the first computational tool for statistically testing and ranking genomic features of interest based on observed DNA methylation discordance in comparative studies that accounts, in a rigorous manner, for biological, statistical, and technical variability in methylation data, as well as for variability in feature length and for missing data.


Subject(s)
Epigenesis, Genetic; Epigenomics; Genomics; DNA Methylation; Genome, Human; Humans; Neoplasms/diagnosis; Neoplasms/genetics; Probability
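
Entry 6 evaluates its test statistic through Jensen-Shannon distances between the methylation distributions of a test and a reference sample. The sketch below shows only the standard definition of that distance for two discrete distributions over the same support. The function names and the example distributions are assumptions for illustration, and the sketch does not reproduce the article's test statistic, null-distribution estimation, or p- and q-value computation.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (base 2), a metric in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding error


# Hypothetical distributions of methylation level (low, medium, high) in one feature.
reference = [0.7, 0.2, 0.1]
test = [0.1, 0.2, 0.7]
print(jensen_shannon_distance(reference, test))
```

With base-2 logarithms the distance is 0 for identical distributions and 1 for distributions with disjoint support, which makes values comparable across features.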
7.
BMC Bioinformatics ; 19(1): 87, 2018 Mar 07.
Article in English | MEDLINE | ID: mdl-29514626

ABSTRACT

BACKGROUND: DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis are employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. RESULTS: We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied to single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrates a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data. CONCLUSIONS: This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of quantifying methylation stochasticity using concepts from information theory. By employing this methodology, substantial improvement of DNA methylation analysis can be achieved by effectively taking into account the massive amount of statistical information available in WGBS data, which is largely ignored by existing methods.


Subject(s)
Information Theory; Models, Theoretical; Statistics as Topic; Sulfites/chemistry; Whole Genome Sequencing/methods; Base Sequence; Computer Simulation; CpG Islands/genetics; DNA Methylation/genetics; Entropy; Epigenesis, Genetic; Gene Ontology; Genome, Human; Humans; Lung Neoplasms/genetics; Probability; Web Browser
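
Entry 7 models the joint distribution of methylation at neighboring sites with a 1D Ising model. As a rough illustration of why a joint model captures correlations that per-site (marginal) methods miss, the sketch below enumerates every state of a short chain. The parameter names `a` (per-site bias) and `c` (coupling between adjacent sites) and their values are assumptions for this example, not the article's parameterization or fitted values, and brute-force enumeration is only practical for a handful of sites.

```python
import itertools
import math


def ising_distribution(a, c):
    """Joint probability of every methylation state under a 1D Ising model.

    a[n] biases site n toward methylation; c[n] couples sites n and n + 1.
    States use x_n in {0, 1}, mapped to spins s_n = 2 * x_n - 1.
    """
    n = len(a)
    weights = {}
    for state in itertools.product((0, 1), repeat=n):
        spins = [2 * x - 1 for x in state]
        exponent = sum(a[i] * spins[i] for i in range(n))
        exponent += sum(c[i] * spins[i] * spins[i + 1] for i in range(n - 1))
        weights[state] = math.exp(exponent)
    z = sum(weights.values())  # partition function
    return {state: w / z for state, w in weights.items()}


def site_means(dist):
    """Marginal probability of methylation at each site."""
    n = len(next(iter(dist)))
    return [sum(p for s, p in dist.items() if s[i] == 1) for i in range(n)]


# With zero bias, every site has mean 0.5 regardless of coupling, yet strong
# coupling concentrates probability on the fully (un)methylated states.
independent = ising_distribution([0.0, 0.0, 0.0], [0.0, 0.0])
coupled = ising_distribution([0.0, 0.0, 0.0], [2.0, 2.0])
print(site_means(coupled), coupled[(0, 0, 0)] + coupled[(1, 1, 1)])
```

Both example distributions have identical site means, so a method that looks only at marginals cannot tell them apart, while the joint distribution separates them clearly.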
8.
BMC Genomics ; 18(1): 694, 2017 Sep 05.
Article in English | MEDLINE | ID: mdl-28874136

ABSTRACT

BACKGROUND: The information content of genomes plays a crucial role in the existence and proper development of living organisms. Thus, tremendous effort has been dedicated to developing DNA sequencing technologies that provide a better understanding of the underlying mechanisms of cellular processes. Advances in the development of sequencing technology have made it possible to sequence genomes in a relatively fast and inexpensive way. However, as with any measurement technology, there is noise involved and this needs to be addressed to reach conclusions based on the resulting data. In addition, there are multiple intermediate steps and degrees of freedom when constructing genome assemblies that lead to ambiguous and inconsistent results among assemblers. METHODS: Here we introduce HiMMe, an HMM-based tool that relies on genetic patterns to score genome assemblies. Through a Markov chain, the model is able to detect characteristic genetic patterns, while, by introducing emission probabilities, the noise involved in the process is taken into account. Prior knowledge can be used by training the model to fit a given organism or sequencing technology. RESULTS: Our results show that the method presented is able to recognize patterns even with relatively small k-mer size choices and limited computational resources. CONCLUSIONS: Our methodology provides an individual quality metric per contig in addition to an overall genome assembly score, with a time complexity well below that of an aligner. Ultimately, HiMMe provides meaningful statistical insights that can be leveraged by researchers to better select contigs and genome assemblies for downstream analysis.


Subject(s)
Genomics/methods; Markov Chains; Algorithms; Bayes Theorem; Reproducibility of Results
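
Entry 8 scores assemblies with a hidden Markov model in which a Markov chain captures sequence patterns and emission probabilities absorb measurement noise. The sketch below is a generic forward algorithm for a discrete HMM, computed in log space. It is not HiMMe's implementation: the two-state model, its probabilities, and the function names are all hypothetical, and dividing the log-likelihood by sequence length is just one plausible way to compare contigs of different sizes.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def safe_log(x):
    """log(x), with log(0) defined as negative infinity."""
    return math.log(x) if x > 0 else float("-inf")


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of `obs` under a discrete HMM via the forward algorithm.

    start[i]: P(state i at t = 0); trans[i][j]: P(next state j | state i);
    emit[i][o]: P(observing symbol o | state i).
    """
    n_states = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n_states)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n_states)])
            + safe_log(emit[j][o])
            for j in range(n_states)
        ]
    return log_sum_exp(alpha)


# Hypothetical two-state model over a two-letter alphabet.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
emit = [{"A": 0.9, "C": 0.1}, {"A": 0.2, "C": 0.8}]
contig = "AACCA"
print(forward_log_likelihood(contig, start, trans, emit) / len(contig))
```

The forward pass runs in time proportional to the sequence length times the square of the number of states.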