ABSTRACT
Gut microbial dysbioses are linked to aberrant immune responses, which are often accompanied by abnormal production of inflammatory cytokines. As part of the Human Functional Genomics Project (HFGP), we investigate how differences in composition and function of gut microbial communities may contribute to inter-individual variation in cytokine responses to microbial stimulations in healthy humans. We observe microbiome-cytokine interaction patterns that are stimulus specific, cytokine specific, and cytokine and stimulus specific. Validation of two predicted host-microbial interactions reveals that TNFα and IFNγ production are associated with specific microbial metabolic pathways: palmitoleic acid metabolism and tryptophan degradation to tryptophol. Besides providing a resource of predicted microbially derived mediators that influence immune phenotypes in response to common microorganisms, these data can help to define principles for understanding disease susceptibility. The three HFGP studies presented in this issue lay the groundwork for further studies aimed at understanding the interplay between microbial, genetic, and environmental factors in the regulation of the immune response in humans.
Subjects
Cytokines/immunology , Gastrointestinal Microbiome , Inflammation/immunology , Microbiota , Adolescent , Adult , Aged , Bacteria/classification , Bacteria/immunology , Blood/immunology , Dysbiosis/immunology , Dysbiosis/microbiology , Feces/microbiology , Female , Fungi/classification , Fungi/immunology , Gene-Environment Interaction , Human Genome Project , Humans , Infections/immunology , Infections/microbiology , Mononuclear Leukocytes/immunology , Male , Middle Aged
ABSTRACT
The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date [1] and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes [2]. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established [1] boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.
Subjects
Human Y Chromosomes , Molecular Evolution , Humans , Male , Human Y Chromosomes/genetics , Human Genome/genetics , Genomics , Mutation Rate , Phenotype , Euchromatin/genetics , Pseudogenes , Genetic Variation/genetics , Human X Chromosomes/genetics , Pseudoautosomal Regions/genetics
ABSTRACT
Despite the overwhelming evidence that multiple sclerosis is an autoimmune disease, relatively little is known about the precise nature of the immune dysregulation underlying the development of the disease. Reasoning that the CSF from patients might be enriched for cells relevant in pathogenesis, we have completed a high-resolution single-cell analysis of 96 732 CSF cells collected from 33 patients with multiple sclerosis (n = 48 675) and 48 patients with other neurological diseases (n = 48 057). Completing comprehensive cell type annotation, we identified a rare population of CD8+ T cells, characterized by the upregulation of inhibitory receptors, that was increased in patients with multiple sclerosis. Applying a Multi-Omics Factor Analysis to these single-cell data further revealed that activity in pathways responsible for controlling inflammatory and type 1 interferon responses is altered in multiple sclerosis in both T cells and myeloid cells. We also undertook a systematic search for expression quantitative trait loci in the CSF cells. Of particular interest were two expression quantitative trait loci in CD8+ T cells that were fine mapped to multiple sclerosis susceptibility variants in the viral control genes ZC3HAV1 (rs10271373) and IFITM2 (rs1059091). Further analysis suggests that these associations likely reflect genetic effects on RNA splicing and cell-type-specific gene expression, respectively. Collectively, our study suggests that alterations in viral control mechanisms might be important in the development of multiple sclerosis.
Subjects
Multiple Sclerosis , Humans , CD8-Positive T-Lymphocytes , Up-Regulation , Antiviral Agents , Cerebrospinal Fluid/metabolism , Membrane Proteins/genetics
ABSTRACT
Bulk and single-cell DNA sequencing have enabled the reconstruction of clonal substructures of somatic tissues from frequency and co-occurrence patterns of somatic variants. However, approaches to characterize phenotypic variations between clones are not established. Here we present cardelino (https://github.com/single-cell-genetics/cardelino), a computational method for inferring the clonal tree configuration and the clone of origin of individual cells assayed using single-cell RNA-seq (scRNA-seq). Cardelino flexibly integrates information from imperfect clonal trees inferred from bulk exome-seq data and sparse variant alleles expressed in scRNA-seq data. We apply cardelino to a published cancer dataset and to newly generated matched scRNA-seq and exome-seq data from 32 human dermal fibroblast lines, identifying hundreds of differentially expressed genes between cells from different somatic clones. These genes are frequently enriched for cell cycle and proliferation pathways, indicating a role for cell division genes in somatic evolution in healthy skin.
Subjects
Fibroblasts/metabolism , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Software , Algorithms , Cell Cycle , Cell Proliferation , Humans , Melanoma , Mutation , Transcriptome
ABSTRACT
Approximately 1.5 billion people worldwide are overweight or affected by obesity, and are at risk of developing type 2 diabetes, cardiovascular disease and related metabolic and inflammatory disturbances. Although the mechanisms linking adiposity to associated clinical conditions are poorly understood, recent studies suggest that adiposity may influence DNA methylation, a key regulator of gene expression and molecular phenotype. Here we use epigenome-wide association to show that body mass index (BMI; a key measure of adiposity) is associated with widespread changes in DNA methylation (187 genetic loci with P < 1 × 10^-7, range P = 9.2 × 10^-8 to 6.0 × 10^-46; n = 10,261 samples). Genetic association analyses demonstrate that the alterations in DNA methylation are predominantly the consequence of adiposity, rather than the cause. We find that methylation loci are enriched for functional genomic features in multiple tissues (P < 0.05), and show that sentinel methylation markers identify gene expression signatures at 38 loci (P < 9.0 × 10^-6, range P = 5.5 × 10^-6 to 6.1 × 10^-35, n = 1,785 samples). The methylation loci identify genes involved in lipid and lipoprotein metabolism, substrate transport and inflammatory pathways. Finally, we show that the disturbances in DNA methylation predict future development of type 2 diabetes (relative risk per 1 standard deviation increase in methylation risk score: 2.3 (2.07-2.56); P = 1.1 × 10^-54). Our results provide new insights into the biologic pathways influenced by adiposity, and may enable development of new strategies for prediction and prevention of type 2 diabetes and other adverse clinical consequences of obesity.
Subjects
Adiposity/genetics , Body Mass Index , DNA Methylation/genetics , Type 2 Diabetes Mellitus/genetics , Genetic Epigenesis , Epigenomics , Genome-Wide Association Study , Obesity/genetics , Adipose Tissue/metabolism , Asian People/genetics , Blood/metabolism , Cohort Studies , Type 2 Diabetes Mellitus/complications , Europe/ethnology , Female , Genetic Markers , Genetic Predisposition to Disease , Humans , India/ethnology , Male , Obesity/blood , Obesity/complications , Overweight/blood , Overweight/complications , Overweight/genetics , White People/genetics
ABSTRACT
OBJECTIVE: Patients with IBD display substantial heterogeneity in clinical characteristics. We hypothesise that individual differences in the complex interaction of the host genome and the gut microbiota can explain the onset and the heterogeneous presentation of IBD. Therefore, we performed a case-control analysis of the gut microbiota, the host genome and the clinical phenotypes of IBD. DESIGN: Stool samples, peripheral blood and extensive phenotype data were collected from 313 patients with IBD and 582 truly healthy controls, selected from a population cohort. The gut microbiota composition was assessed by tag-sequencing the 16S rRNA gene. All participants were genotyped. We composed genetic risk scores from 11 functional genetic variants proven to be associated with IBD in genes that are directly involved in bacterial handling in the gut: NOD2, CARD9, ATG16L1, IRGM and FUT2. RESULTS: Strikingly, we observed significant alterations of the gut microbiota of healthy individuals with a high genetic risk for IBD: the IBD genetic risk score was significantly associated with a decrease in the genus Roseburia in healthy controls (false discovery rate 0.017). Moreover, disease location was a major determinant of the gut microbiota: the gut microbiota of patients with colonic Crohn's disease (CD) is different from that of patients with ileal CD, with a decrease in alpha diversity associated with ileal disease (p=3.28×10^-13). CONCLUSIONS: We show for the first time that genetic risk variants associated with IBD influence the gut microbiota in healthy individuals. Roseburia spp are acetate-to-butyrate converters, and a decrease has already been observed in patients with IBD.
Subjects
Gastrointestinal Microbiome/genetics , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/microbiology , Adult , Case-Control Studies , Ulcerative Colitis/genetics , Ulcerative Colitis/microbiology , Ulcerative Colitis/pathology , Crohn Disease/genetics , Crohn Disease/microbiology , Crohn Disease/pathology , Dysbiosis/complications , Dysbiosis/genetics , Dysbiosis/microbiology , Feces/microbiology , Female , Genetic Predisposition to Disease , Host-Pathogen Interactions/genetics , Humans , Inflammatory Bowel Diseases/pathology , Male , Middle Aged , Risk Assessment/methods , Severity of Illness Index
ABSTRACT
BACKGROUND: DNA methylation has been found to associate with disease, aging and environmental exposure, but it is unknown how genome, environment and disease influence DNA methylation dynamics in childhood. RESULTS: By analysing 538 paired DNA blood samples from children at birth and at 4-5 years old and 726 paired samples from children at 4 and 8 years old from four European birth cohorts using the Illumina Infinium Human Methylation 450k chip, we have identified 14,150 consistent age-differential methylation sites (a-DMSs) at epigenome-wide significance of p < 1.14 × 10^-7. Genes with an increase in age-differential methylation were enriched in pathways related to 'development', and were more often located in bivalent transcription start site (TSS) regions, which can silence or activate expression of developmental genes. Genes with a decrease in age-differential methylation were involved in cell signalling, and enriched on H3K27ac, which can predict developmental state. Maternal smoking tended to decrease methylation levels at the identified da-DMSs. We also found 101 a-DMSs (0.71%) that were regulated by genetic variants using cis-differential Methylation Quantitative Trait Locus (cis-dMeQTL) mapping. Moreover, a-DMS-associated genes during early development were significantly more likely to be linked with disease. CONCLUSION: Our study provides new insights into the dynamic epigenetic landscape of the first 8 years of life.
Subjects
Child Development , DNA Methylation , Genetic Epigenesis , Epigenomics , Child , Preschool Child , CpG Islands , Epigenomics/methods , Female , Genetic Predisposition to Disease , Genetic Variation , Genome-Wide Association Study , Humans , Infant , Newborn Infant , Maternal Exposure/adverse effects , Pregnancy , Prenatal Exposure Delayed Effects , Quantitative Trait Loci , Smoking/adverse effects
ABSTRACT
RATIONALE: Evidence suggests that the gut microbiome is involved in the development of cardiovascular disease, with the host-microbe interaction regulating immune and metabolic pathways. However, there was no firm evidence for associations between microbiota and metabolic risk factors for cardiovascular disease from large-scale studies in humans. In particular, there was no strong evidence for association between cardiovascular disease and aberrant blood lipid levels. OBJECTIVES: To identify intestinal bacterial taxa whose proportions correlate with body mass index and lipid levels, and to determine whether lipid variance can be explained by microbiota relative to age, sex, and host genetics. METHODS AND RESULTS: We studied 893 subjects from the Life-Lines-DEEP population cohort. After correcting for age and sex, we identified 34 bacterial taxa associated with body mass index and blood lipids; most are novel associations. Cross-validation analysis revealed that microbiota explain 4.5% of the variance in body mass index, 6% in triglycerides, and 4% in high-density lipoproteins, independent of age, sex, and genetic risk factors. A novel risk model that includes the gut microbiome explained up to 25.9% of high-density lipoprotein variance, significantly outperforming the risk model without the microbiome. Strikingly, the microbiome had little effect on low-density lipoproteins or total cholesterol. CONCLUSIONS: Our studies suggest that the gut microbiome may play an important role in the variation in body mass index and blood lipid levels, independent of age, sex, and host genetics. Our findings support the potential of therapies altering the gut microbiome to control body mass, triglycerides, and high-density lipoproteins.
Subjects
Body Mass Index , Gastrointestinal Microbiome/physiology , Lipids/blood , Single Nucleotide Polymorphism , Adolescent , Adult , Aged , Aged 80 and over , Algorithms , Bacteria/classification , Bacteria/genetics , Cardiovascular Diseases/blood , Cardiovascular Diseases/genetics , Cardiovascular Diseases/microbiology , Cholesterol/blood , HDL Cholesterol/blood , LDL Cholesterol/blood , Cohort Studies , Female , Gastrointestinal Microbiome/genetics , Host-Pathogen Interactions , Humans , Male , Middle Aged , 16S Ribosomal RNA/genetics , Risk Assessment/methods , Risk Assessment/statistics & numerical data , Risk Factors , Triglycerides/blood , Young Adult
ABSTRACT
BACKGROUND AND AIMS: Proton pump inhibitors (PPIs) are among the top 10 most widely used drugs in the world. PPI use has been associated with an increased risk of enteric infections, most notably Clostridium difficile. The gut microbiome plays an important role in enteric infections, by resisting or promoting colonisation by pathogens. In this study, we investigated the influence of PPI use on the gut microbiome. METHODS: The gut microbiome composition of 1815 individuals, spanning three cohorts, was assessed by tag sequencing of the 16S rRNA gene. The difference in microbiota composition in PPI users versus non-users was analysed separately in each cohort, followed by a meta-analysis. RESULTS: 211 of the participants were using PPIs at the moment of stool sampling. PPI use is associated with a significant decrease in Shannon's diversity and with changes in 20% of the bacterial taxa (false discovery rate <0.05). Multiple oral bacteria were over-represented in the faecal microbiome of PPI users, including the genus Rothia (p=9.8×10^-38). In PPI users we observed a significant increase in bacteria: genera Enterococcus, Streptococcus, Staphylococcus and the potentially pathogenic species Escherichia coli. CONCLUSIONS: The differences between PPI users and non-users observed in this study are consistently associated with changes towards a less healthy gut microbiome. These differences are in line with known changes that predispose to C. difficile infections and can potentially explain the increased risk of enteric infections in PPI users. On a population level, the effects of PPI are more prominent than the effects of antibiotics or other commonly used drugs.
Subjects
Gastrointestinal Microbiome/drug effects , Proton Pump Inhibitors/pharmacology , Adult , Female , Humans , Male , Middle Aged
ABSTRACT
BACKGROUND: The liver plays a central role in the maintenance of homeostasis and health in general. However, there is substantial inter-individual variation in hepatic gene expression, and although numerous genetic factors have been identified, less is known about the epigenetic factors. RESULTS: By analyzing the methylomes and transcriptomes of 14 fetal and 181 adult livers, we identified 657 differentially methylated genes with adult-specific expression; these genes were enriched for transcription factor binding sites of HNF1A and HNF4A. We also identified 1,000 genes specific to fetal liver, which were enriched for GATA1, STAT5A, STAT5B and YY1 binding sites. We saw strong liver-specific effects of single nucleotide polymorphisms on both methylation levels (28,447 unique CpG sites (meQTL)) and gene expression levels (526 unique genes (eQTL)), at a false discovery rate (FDR) < 0.05. Of the 526 unique eQTL-associated genes, 293 correlated significantly not only with genetic variation but also with methylation levels. The tissue specificities of these associations were analyzed in muscle, subcutaneous adipose tissue and visceral adipose tissue. We observed that meQTL were more stable between tissues than eQTL, and we found very strong tissue specificity for the identified associations between CpG methylation and gene expression. CONCLUSIONS: Our analyses generated a comprehensive resource of factors involved in the regulation of hepatic gene expression, and allowed us to estimate the proportion of variation in gene expression that could be attributed to genetic and epigenetic variation, both crucial to understanding differences in drug response and the etiology of liver diseases.
Subjects
Genetic Epigenesis , Epigenomics , Fetus/metabolism , Gene Expression Profiling , Liver/growth & development , Liver/metabolism , Adult , DNA Methylation , Fetus/embryology , Developmental Gene Expression Regulation , Humans , Organ Specificity , Single Nucleotide Polymorphism , Quantitative Trait Loci/genetics
ABSTRACT
Genetic variants have been found to be associated with many complex traits. However, it is still mostly unclear through which downstream mechanisms these variants cause these phenotypes. Knowledge of these intermediate steps is crucial to understand pathogenesis, while also providing leads for potential pharmacological intervention. Here we relied upon natural human genetic variation to identify effects of these variants on trans-gene expression (expression quantitative trait locus mapping, eQTL) in whole peripheral blood from 1,469 unrelated individuals. We looked at 1,167 published trait- or disease-associated SNPs and observed trans-eQTL effects on 113 different genes, of which we replicated 46 in monocytes of 1,490 different individuals and 18 in a smaller dataset that comprised subcutaneous adipose, visceral adipose, liver tissue, and muscle tissue. HLA single-nucleotide polymorphisms (SNPs) were 10-fold enriched for trans-eQTLs: 48% of the trans-acting SNPs map within the HLA, including ulcerative colitis susceptibility variants that affect plausible candidate genes AOAH and TRBV18 in trans. We identified 18 pairs of unlinked SNPs associated with the same phenotype and affecting expression of the same trans-gene (21 times more than expected, P < 10^-16). This was particularly pronounced for mean platelet volume (MPV): two independent SNPs significantly affect the well-known blood coagulation genes GP9 and F13A1 but also C19orf33, SAMD14, VCL, and GNG11. Several of these SNPs have a substantially higher effect on the downstream trans-genes than on the eventual phenotypes, supporting the concept that the effects of these SNPs on expression seem to be much less multifactorial. Therefore, these trans-eQTLs could well represent some of the intermediate genes that connect genetic variants with their eventual complex phenotypic outcomes.
Subjects
Chromosome Mapping , Gene Expression Regulation , Genetic Variation , HLA Antigens/genetics , Phenotype , Quantitative Trait Loci/genetics , Gene Expression Profiling , Genome-Wide Association Study , Genotype , Humans , Monocytes/metabolism , Single Nucleotide Polymorphism/genetics
ABSTRACT
Ageing is the accumulation of changes and decline of function of organisms over time. The concept and biomarkers of biological age have been established, notably DNA methylation-based clocks. The emergence of single-cell DNA methylation profiling methods opens the possibility of studying the biological age of individual cells. Here, we generate a large single-cell DNA methylation and transcriptome dataset from mouse peripheral blood samples, spanning a broad range of ages. The number of genes expressed increases with age, but gene-specific changes are small. We next develop scEpiAge, a single-cell DNA methylation age predictor, which can accurately predict age in (very sparse) publicly available datasets, and also in single cells. DNA methylation age distribution is wider than technically expected, indicating epigenetic age heterogeneity and functional differences. Our work provides a foundation for single-cell and sparse data epigenetic age predictors, validates their functionality and highlights epigenetic heterogeneity during ageing.
Subjects
Aging , DNA Methylation , Genetic Epigenesis , Single-Cell Analysis , Transcriptome , Animals , Single-Cell Analysis/methods , Aging/blood , Aging/genetics , Mice , Cellular Senescence/genetics , Male , Inbred C57BL Mice , Female , Gene Expression Profiling/methods , Epigenomics/methods
ABSTRACT
BACKGROUND: The plasma metabolome reflects the physiological state of various biological processes and can serve as a proxy for disease risk. Plasma metabolite variation, influenced by genetic and epigenetic mechanisms, can also affect the cellular microenvironment and blood cell epigenetics. The interplay between the plasma metabolome and the blood cell epigenome remains elusive. In this study, we performed an epigenome-wide association study (EWAS) of 1183 plasma metabolites in 693 participants from the LifeLines-DEEP cohort and investigated the causal relationships in DNA methylation-metabolite associations using bidirectional Mendelian randomization and mediation analysis. RESULTS: After rigorously adjusting for potential confounders, including genetics, we identified five robust associations between two plasma metabolites (L-serine and glycine) and three CpG sites located in two independent genomic regions (cg14476101 and cg16246545 in PHGDH and cg02711608 in SLC1A5) at a false discovery rate of less than 0.05. Further analysis revealed a complex bidirectional relationship between plasma glycine/serine levels and DNA methylation. Moreover, we observed a strong mediating role of DNA methylation in the effect of glycine/serine on the expression of their metabolism/transport genes, with the proportion of the mediated effect ranging from 11.8 to 54.3%. This result was also replicated in an independent population-based cohort, the Rotterdam Study. To validate our findings, we conducted in vitro cell studies which confirmed the mediating role of DNA methylation in the regulation of PHGDH gene expression. CONCLUSIONS: Our findings reveal a potential feedback mechanism in which glycine and serine regulate gene expression through DNA methylation.
Subjects
DNA Methylation , Genetic Epigenesis , Genome-Wide Association Study , Glycine , Metabolome , Serine , Humans , Glycine/blood , Serine/blood , Serine/genetics , DNA Methylation/genetics , Male , Female , Genome-Wide Association Study/methods , Metabolome/genetics , Genetic Epigenesis/genetics , Middle Aged , CpG Islands/genetics , Epigenome/genetics , Adult , Aged , Mendelian Randomization Analysis
ABSTRACT
Encounters with pathogens and other molecules can imprint long-lasting effects on our immune system, influencing future physiological outcomes. Given the wide range of microbes to which humans are exposed, their collective impact on health is not fully understood. To explore relations between exposures and biological aging and inflammation, we profiled an antibody-binding repertoire against 2,815 microbial, viral, and environmental peptides in a population cohort of 1,443 participants. Utilizing antibody binding as a proxy for past exposures, we investigated their impact on biological aging, cell composition, and inflammation. Immune response against cytomegalovirus (CMV), rhinovirus, and gut bacteria is related to telomere length. Single-cell expression measurements identified an effect of CMV infection on the transcriptional landscape of subpopulations of CD8 and CD4 T cells. This examination of the relationship between microbial exposures and biological aging and inflammation highlights a role for chronic infections (CMV and Epstein-Barr virus) and common pathogens (rhinoviruses and adenovirus C).
ABSTRACT
Diverse sets of complete human genomes are required to construct a pangenome reference and to understand the extent of complex structural variation. Here, we sequence 65 diverse human genomes and build 130 haplotype-resolved assemblies (130 Mbp median continuity), closing 92% of all previous assembly gaps [1,2] and reaching telomere-to-telomere (T2T) status for 39% of the chromosomes. We highlight complete sequence continuity of complex loci, including the major histocompatibility complex (MHC), SMN1/SMN2, NBPF8, and AMY1/AMY2, and fully resolve 1,852 complex structural variants (SVs). In addition, we completely assemble and validate 1,246 human centromeres. We find up to 30-fold variation in α-satellite high-order repeat (HOR) array length and characterize the pattern of mobile element insertions into α-satellite HOR arrays. While most centromeres predict a single site of kinetochore attachment, epigenetic analysis suggests the presence of two hypomethylated regions for 7% of centromeres. Combining our data with the draft pangenome reference [1] significantly enhances genotyping accuracy from short-read data, enabling whole-genome inference [3] to a median quality value (QV) of 45. Using this approach, 26,115 SVs per sample are detected, substantially increasing the number of SVs now amenable to downstream disease association studies.
ABSTRACT
We present pycoMeth, a toolbox to store, manage and analyze DNA methylation calls from long-read sequencing data obtained using the Oxford Nanopore Technologies sequencing platform. Building on a novel, rapid-access, read-level and reference-anchored methylation storage format MetH5, we propose efficient algorithms for haplotype aware, multi-sample consensus segmentation and differential methylation testing. We show that MetH5 is more efficient than existing solutions for storing Oxford Nanopore Technologies methylation calls, and carry out benchmarking for pycoMeth segmentation and differential methylation testing, demonstrating increased performance and sensitivity compared to existing solutions designed for short-read methylation data.
Subjects
Nanopores , DNA Sequence Analysis , DNA Methylation , Algorithms , High-Throughput Nucleotide Sequencing
ABSTRACT
We present novoRNABreak, a unified framework for cancer specific novel splice junction and fusion transcript detection in RNA-seq data obtained from human cancer samples. novoRNABreak is based on a local assembly model, which offers a tradeoff between the alignment-based and de novo whole transcriptome assembly (WTA) methods. This approach is accurate and sensitive in assembling novel junctions that are difficult to directly align or have multiple alignments. Additionally, it is more efficient due to the strategy that focuses on junctions rather than full length transcripts. The performance of novoRNABreak is demonstrated by a comprehensive set of experiments using synthetic data generated based on genome reference, as well as real RNA-seq data from breast cancer and prostate cancer samples. The results show that our tool has a better performance by fully utilizing unmapped reads and precisely identifying the junctions where short reads or small exons have multiple alignments. novoRNABreak is a fully-fledged program available on GitHub (https://github.com/KChen-lab/novoRNABreak).
ABSTRACT
Cancer genomes harbor a broad spectrum of structural variants (SVs) driving tumorigenesis, a relevant subset of which escape discovery using short-read sequencing. We employed Oxford Nanopore Technologies (ONT) long-read sequencing in a paired diagnostic and post-therapy medulloblastoma to unravel the haplotype-resolved somatic genetic and epigenetic landscape. We assembled complex rearrangements, including a 1.55-Mbp chromothripsis event, and we uncover a complex SV pattern termed templated insertion (TI) thread, characterized by short (mostly <1 kb) insertions showing prevalent self-concatenation into highly amplified structures of up to 50 kbp in size. TI threads occur in 3% of cancers, with a prevalence up to 74% in liposarcoma, and frequent colocalization with chromothripsis. We also perform long-read-based methylome profiling and discover allele-specific methylation (ASM) effects, complex rearrangements exhibiting differential methylation, and differential promoter methylation in cancer-driver genes. Our study shows the advantage of long-read sequencing in the discovery and characterization of complex somatic rearrangements.