Results 1 - 6 of 6
1.
bioRxiv ; 2024 Jan 06.
Article in English | MEDLINE | ID: mdl-38260545

ABSTRACT

Research and medical genomics require comprehensive and scalable solutions to drive the discovery of novel disease targets, evolutionary drivers, and genetic markers with clinical significance. This necessitates a framework to identify all types of variants independent of their size (e.g., SNV/SV) or location (e.g., repeats). Here we present DRAGEN, which uses novel methods based on multigenomes, hardware acceleration, and machine-learning-based variant detection to provide novel insights into individual genomes with ~30 min of computation time (from raw reads to variant detection). DRAGEN outperforms all other state-of-the-art methods in speed and accuracy across all variant types (SNV, indel, STR, SV, CNV) and further incorporates specialized methods to obtain key insights in medically relevant genes (e.g., HLA, SMN, GBA). We showcase DRAGEN across 3,202 genomes and demonstrate its scalability, accuracy, and innovations to further advance the integration of comprehensive genomics for research and medical applications.

2.
Genome Biol ; 23(1): 2, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34980216

ABSTRACT

BACKGROUND: Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. RESULTS: To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. CONCLUSIONS: Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.


Subjects
Genome, Human , Polymorphism, Single Nucleotide , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Reproducibility of Results , Whole Genome Sequencing
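
The abstract above does not say how reproducibility between call sets was scored. As a rough illustration only, one common way to compare two variant call sets is the Jaccard index (shared calls divided by all calls). The function name and the `(chrom, pos, ref, alt)` tuple layout below are assumptions made for this sketch, not the paper's method.

```python
def jaccard_concordance(calls_a, calls_b):
    """Jaccard index between two variant call sets.

    Each variant is a hashable key, here assumed to be a
    (chrom, pos, ref, alt) tuple. Returns a value in [0, 1].
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    # Two empty call sets are treated as perfectly concordant.
    return len(a & b) / len(union) if union else 1.0


# Hypothetical calls from two replicates of the same sample.
replicate_1 = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 40, "G", "GA")]
replicate_2 = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 41, "G", "GA")]

concordance = jaccard_concordance(replicate_1, replicate_2)
```

Here two of the four distinct calls are shared, so the concordance is 0.5; the indel that differs by one position counts as two discordant calls, which hints at why indels can look less reproducible than SNVs under exact matching.
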
3.
Science ; 361(6409), 2018 09 28.
Article in English | MEDLINE | ID: mdl-30139913

ABSTRACT

To assess the impact of genetic variation in regulatory loci on human health, we constructed a high-resolution map of allelic imbalances in DNA methylation, histone marks, and gene transcription in 71 epigenomes from 36 distinct cell and tissue types from 13 donors. Deep whole-genome bisulfite sequencing of 49 methylomes revealed sequence-dependent CpG methylation imbalances at thousands of heterozygous regulatory loci. Such loci are enriched for stochastic switching, which is defined as random transitions between fully methylated and unmethylated states of DNA. The methylation imbalances at thousands of loci are explainable by different relative frequencies of the methylated and unmethylated states for the two alleles. Further analyses provided a unifying model that links sequence-dependent allelic imbalances of the epigenome, stochastic switching at gene regulatory loci, and disease-associated genetic variation.


Subjects
Allelic Imbalance , DNA Methylation , Disease/genetics , Epigenesis, Genetic , Genome, Human , Polymorphism, Single Nucleotide , Alleles , Binding Sites , CpG Islands , Gene Regulatory Networks , Genetic Loci , Genome-Wide Association Study , Humans , Sequence Analysis, DNA , Sulfites/chemistry , Transcription Factors/metabolism
4.
Cell Rep ; 17(8): 2075-2086, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27851969

ABSTRACT

Cancer progression depends on both cell-intrinsic processes and interactions between different cell types. However, large-scale assessment of cell type composition and molecular profiles of individual cell types within tumors remains challenging. To address this, we developed epigenomic deconvolution (EDec), an in silico method that infers cell type composition of complex tissues as well as DNA methylation and gene transcription profiles of constituent cell types. By applying EDec to The Cancer Genome Atlas (TCGA) breast tumors, we detect changes in immune cell infiltration related to patient prognosis, and a striking change in stromal fibroblast-to-adipocyte ratio across breast cancer subtypes. Furthermore, we show that a less adipose stroma tends to display lower levels of mitochondrial activity and to be associated with cancerous cells with higher levels of oxidative metabolism. These findings highlight the role of stromal composition in the metabolic coupling between distinct cell types within tumors.


Subjects
Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Epigenomics , Adipose Tissue/pathology , Breast Neoplasms/immunology , Breast Neoplasms/pathology , Carcinogenesis/genetics , Carcinogenesis/pathology , Cell Line, Tumor , Computer Simulation , DNA Methylation/genetics , Disease Progression , Female , Gene Expression Regulation, Neoplastic , Humans , Oxidation-Reduction , Phenotype , Reproducibility of Results , Sequence Analysis, DNA , Stromal Cells/pathology , Tumor Microenvironment/genetics
5.
Nat Commun ; 6: 6370, 2015 Feb 18.
Article in English | MEDLINE | ID: mdl-25691256

ABSTRACT

Tissue-specific expression of lincRNAs suggests developmental and cell-type-specific functions, yet tissue specificity was established for only a small fraction of lincRNAs. Here, by analysing 111 reference epigenomes from the NIH Roadmap Epigenomics project, we determine tissue-specific epigenetic regulation for 3,753 (69% examined) lincRNAs, with 54% active in one of the 14 cell/tissue clusters and an additional 15% in two or three clusters. A larger fraction of lincRNA TSSs is marked in a tissue-specific manner by H3K4me1 than by H3K4me3. The tissue-specific lincRNAs are strongly linked to tissue-specific pathways and undergo distinct chromatin state transitions during cellular differentiation. Polycomb-regulated lincRNAs reside in the bivalent state in embryonic stem cells and many of them undergo H3K27me3-mediated silencing at early stages of differentiation. The exquisitely tissue-specific epigenetic regulation of lincRNAs and the assignment of a majority of them to specific tissue types will inform future studies of this newly discovered class of genes.


Subjects
Cell Differentiation , Epigenesis, Genetic , Epigenomics , RNA, Long Noncoding/metabolism , Regulatory Elements, Transcriptional , Embryonic Stem Cells/physiology , Humans , Organ Specificity , Phenotype , Polycomb-Group Proteins/physiology
6.
PLoS Comput Biol ; 9(10): e1003234, 2013.
Article in English | MEDLINE | ID: mdl-24098098

ABSTRACT

Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by combining eight kinds of models: (i) independent and identically distributed process; (ii) variable-length Markov chain; (iii) inhomogeneous Markov chain; (iv) hidden Markov model; (v) profile hidden Markov model; (vi) pair hidden Markov model; (vii) generalized hidden Markov model; and (viii) similarity-based sequence weighting. The framework includes functionality for training, simulation, and decoding of the models. Additionally, it provides two methods to help with parameter setting: the Akaike and Bayesian information criteria (AIC and BIC). The models can be used stand-alone, combined in Bayesian classifiers, or included in more complex, multi-model, probabilistic architectures using GHMMs. In particular, the framework provides a novel, flexible implementation of decoding in GHMMs that detects when the architecture can be traversed efficiently.


Subjects
Computational Biology/methods , Markov Chains , Sequence Analysis/methods , Bayes Theorem , CpG Islands/genetics
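
The ToPS abstract above mentions AIC and BIC as aids for parameter setting. The sketch below shows the general idea for one simple case, choosing the order of a fixed-order Markov chain over DNA. It is not the ToPS API: the function names, the `ACGT` alphabet default, and the toy sequence are all assumptions made for illustration, and the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L are used.

```python
import math
from collections import Counter


def markov_log_likelihood(seq, order):
    """Log-likelihood of seq under a maximum-likelihood Markov chain of the given order."""
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(order, len(seq)):
        context = seq[i - order:i]
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1
    # Each observed transition contributes count * log(estimated probability).
    return sum(
        count * math.log(count / context_counts[context])
        for (context, _symbol), count in transition_counts.items()
    )


def information_criteria(seq, order, alphabet="ACGT"):
    """Return (AIC, BIC) for a Markov chain of the given order fitted to seq."""
    size = len(alphabet)
    k = (size ** order) * (size - 1)  # free transition probabilities
    n = len(seq) - order              # positions actually scored
    log_l = markov_log_likelihood(seq, order)
    aic = 2 * k - 2 * log_l
    bic = k * math.log(n) - 2 * log_l
    return aic, bic


# Toy sequence: strictly periodic, so one symbol of context predicts the next.
sequence = "ACGT" * 50
best_by_aic = min(range(3), key=lambda order: information_criteria(sequence, order)[0])
best_by_bic = min(range(3), key=lambda order: information_criteria(sequence, order)[1])
```

On this toy input both criteria select order 1: order 0 fits poorly, while order 2 fits no better than order 1 but pays for four times as many parameters. As a simplification, each order is scored on a slightly different number of positions (`len(seq) - order`), which is negligible for long sequences but matters for short ones.
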