ABSTRACT
Whole-genome bisulfite sequencing (BS-Seq) measures cytosine methylation changes at single-base resolution and can be used to profile cell-free DNA (cfDNA). In plasma, ultrashort single-stranded cfDNA (uscfDNA, ~50 nt) has been identified together with 167 bp double-stranded mononucleosomal cell-free DNA (mncfDNA). However, the methylation profile of uscfDNA has not been described. Conventional BS-Seq workflows may not be helpful because bisulfite conversion degrades larger DNA into smaller fragments, leading to erroneous categorization as uscfDNA. We describe the '5mCAdpBS-Seq' workflow in which pre-methylated 5mC (5-methylcytosine) single-stranded adapters are ligated to heat-denatured cfDNA before bisulfite conversion. This method retains only DNA fragments that are unaltered by bisulfite treatment, resulting in less biased uscfDNA methylation analysis. Using 5mCAdpBS-Seq, uscfDNA had lower levels of DNA methylation (~15%) compared to mncfDNA and was enriched in promoters and CpG islands. Hypomethylated uscfDNA fragments were enriched in upstream transcription start sites (TSSs), and the intensity of enrichment was correlated with expressed genes of hemopoietic cells. Using tissue-of-origin deconvolution, we inferred that uscfDNA is derived primarily from eosinophils, neutrophils, and monocytes. As proof-of-principle, we show that characteristics of the methylation profile of uscfDNA can distinguish non-small cell lung carcinoma from non-cancer samples. The 5mCAdpBS-Seq workflow is recommended for any cfDNA methylation-based investigations.
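The read-out that bisulfite sequencing relies on can be sketched in a few lines of Python. This is only a toy illustration under the standard assumption that unmethylated cytosines read as T after conversion while 5-methylcytosines still read as C. The function names, the toy reference, and the adapter sequence are hypothetical and are not taken from the 5mCAdpBS-Seq workflow, and a real pipeline would also handle both strands, alignment, and incomplete conversion.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate top-strand bisulfite conversion: unmethylated C reads as T."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def cpg_methylation(reference, reads):
    """Per-CpG methylation fraction from ungapped top-strand reads.

    reads is a list of (start, sequence) pairs, where start is the
    0-based reference position of the first base of the read.
    """
    # counts[pos] = [methylated calls, total informative calls]
    counts = {
        i: [0, 0]
        for i in range(len(reference) - 1)
        if reference[i:i + 2] == "CG"
    }
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos in counts and base in "CT":
                counts[pos][1] += 1
                if base == "C":
                    counts[pos][0] += 1
    return {pos: m / n for pos, (m, n) in counts.items() if n > 0}


# Toy reference with CpG cytosines at positions 1 and 5 (hypothetical).
reference = "ACGTTCGA"
reads = [
    (0, bisulfite_convert(reference, {1})),     # only position 1 methylated
    (0, bisulfite_convert(reference, {1, 5})),  # both CpGs methylated
]
print(cpg_methylation(reference, reads))  # {1: 1.0, 5: 0.5}

# A fully 5mC-methylated adapter (hypothetical sequence) is unchanged
# by conversion, which is why it can still be recognized afterwards.
adapter = "ACCGTC"
all_c = {i for i, b in enumerate(adapter) if b == "C"}
print(bisulfite_convert(adapter, all_c) == adapter)  # True
```

The last two lines illustrate why pre-methylated adapters are attractive in this setting: their sequence is the same before and after conversion.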
Subjects
5-Methylcytosine , Cell-Free Nucleic Acids , CpG Islands , DNA Methylation , DNA, Single-Stranded , Humans , Cell-Free Nucleic Acids/blood , Cell-Free Nucleic Acids/genetics , DNA, Single-Stranded/metabolism , DNA, Single-Stranded/genetics , DNA, Single-Stranded/blood , 5-Methylcytosine/metabolism , Lung Neoplasms/genetics , Lung Neoplasms/blood , Sulfites/chemistry , Promoter Regions, Genetic , Sequence Analysis, DNA/methods , Whole Genome Sequencing/methods
ABSTRACT
Few neuropsychiatric disorders have replicable biomarkers, prompting high-resolution and large-scale molecular studies. However, we still lack consensus on a more foundational question: whether quantitative shifts in cell types, the functional unit of life, contribute to neuropsychiatric disorders. Leveraging advances in human brain single-cell methylomics, we deconvolve seven major cell types using bulk DNA methylation profiling across 1270 postmortem brains, including from individuals diagnosed with Alzheimer's disease, schizophrenia, and autism. We observe and replicate cell-type compositional shifts for Alzheimer's disease (endothelial cell loss), autism (increased microglia), and schizophrenia (decreased oligodendrocytes), and find age- and sex-related changes. Multiple layers of evidence indicate that endothelial cell loss contributes to Alzheimer's disease, with comparable effect size to APOE genotype among older people. Genome-wide association identified five genetic loci related to cell-type composition, involving plausible genes for the neurovascular unit (P2RX5 and TRPV3) and excitatory neurons (DPY30 and MEMO1). These results implicate specific cell-type shifts in the pathophysiology of neuropsychiatric disorders.
Subjects
Alzheimer Disease , Autistic Disorder , Brain , DNA Methylation , Schizophrenia , Humans , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Alzheimer Disease/metabolism , Schizophrenia/genetics , Schizophrenia/pathology , Brain/metabolism , Brain/pathology , Autistic Disorder/genetics , Autistic Disorder/pathology , Male , Female , Genome-Wide Association Study , Aged , Endothelial Cells/metabolism , Endothelial Cells/pathology , Epigenomics/methods , Middle Aged , Aged, 80 and over
ABSTRACT
Genome-wide association studies (GWASs) have uncovered susceptibility loci associated with psychiatric disorders such as bipolar disorder (BP) and schizophrenia (SCZ). However, most of these loci are in non-coding regions of the genome, and the causal mechanisms linking genetic variation to disease risk are unknown. Expression quantitative trait locus (eQTL) analysis of bulk tissue is a common approach used for deciphering underlying mechanisms, although this can obscure cell-type-specific signals and thus mask trait-relevant mechanisms. Although single-cell sequencing can be prohibitively expensive in large cohorts, computationally inferred cell-type proportions and cell-type gene expression estimates have the potential to overcome these problems and advance mechanistic studies. Using bulk RNA-seq from 1,730 samples derived from whole blood in a cohort ascertained from individuals with BP and SCZ, this study estimated cell-type proportions and their relation with disease status and medication. For each cell type, we found between 2,875 and 4,629 eGenes (genes with an associated eQTL), including 1,211 that are not found on the basis of bulk expression alone. We performed a colocalization test between cell-type eQTLs and various traits and identified hundreds of associations that occur between cell-type eQTLs and GWASs but that are not detected in bulk eQTLs. Finally, we investigated the effects of lithium use on the regulation of cell-type expression loci and found examples of genes that are differentially regulated according to lithium use. Our study suggests that applying computational methods to large bulk RNA-seq datasets of non-brain tissue can identify disease-relevant, cell-type-specific biology of psychiatric disorders and psychiatric medication.
Subjects
Genome-Wide Association Study , Lithium , Humans , Genome-Wide Association Study/methods , RNA-Seq , Quantitative Trait Loci/genetics , Phenotype , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
ABSTRACT
An individual's disease risk is affected by the populations that they belong to, due to shared genetics and environmental factors. The study of fine-scale populations in clinical care is important for identifying and reducing health disparities and for developing personalized interventions. To assess patterns of clinical diagnoses and healthcare utilization by fine-scale populations, we leveraged genetic data and electronic medical records from 35,968 patients as part of the UCLA ATLAS Community Health Initiative. We defined clusters of individuals using identity by descent, a form of genetic relatedness that utilizes shared genomic segments arising due to a common ancestor. In total, we identified 376 clusters, including clusters with patients of Afro-Caribbean, Puerto Rican, Lebanese Christian, Iranian Jewish and Gujarati ancestry. Our analysis uncovered 1,218 significant associations between disease diagnoses and clusters and 124 significant associations with specialty visits. We also examined the distribution of pathogenic alleles and found 189 significant alleles at elevated frequency in particular clusters, including many that are not regularly included in population screening efforts. Overall, this work progresses the understanding of health in understudied communities and can provide the foundation for further study into health inequities.
Subjects
Delivery of Health Care , Patient Acceptance of Health Care , Humans , Los Angeles , Iran , Ethnicity
ABSTRACT
Genome-wide association studies (GWAS) have uncovered susceptibility loci associated with psychiatric disorders like bipolar disorder (BP) and schizophrenia (SCZ). However, most of these loci are in non-coding regions of the genome, and the causal mechanisms linking genetic variation to disease risk are unknown. Expression quantitative trait loci (eQTL) analysis of bulk tissue is a common approach to decipher underlying mechanisms, though this can obscure cell-type-specific signals, thus masking trait-relevant mechanisms. While single-cell sequencing can be prohibitively expensive in large cohorts, computationally inferred cell type proportions and cell type gene expression estimates have the potential to overcome these problems and advance mechanistic studies. Using bulk RNA-Seq from 1,730 samples derived from whole blood in a cohort ascertained for individuals with BP and SCZ, this study estimated cell type proportions and their relation with disease status and medication. We found between 2,875 and 4,629 eGenes for each cell type, including 1,211 eGenes that are not found using bulk expression alone. We performed a colocalization test between cell type eQTLs and various traits and identified hundreds of associations between cell type eQTLs and GWAS loci that are not detected in bulk eQTLs. Finally, we investigated the effects of lithium use on cell type expression regulation and found examples of genes that are differentially regulated depending on lithium use. Our study suggests that computational methods can be applied to large bulk RNA-Seq datasets of non-brain tissue to identify disease-relevant, cell-type-specific biology of psychiatric disorders and psychiatric medication.
ABSTRACT
BACKGROUND: Large medical centers in urban areas, like Los Angeles, care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative, an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N=36,736). METHODS: We quantify the extensive continental and subcontinental genetic diversity within the ATLAS data through principal component analysis, identity-by-descent, and genetic admixture. We assess the relationship between genetically inferred ancestry (GIA) and >1500 EHR-derived phenotypes (phecodes). Finally, we demonstrate the utility of genetic data linked with EHR to perform ancestry-specific and multi-ancestry genome and phenome-wide scans across a broad set of disease phenotypes. RESULTS: We identify 5 continental-scale GIA clusters including European American (EA), African American (AA), Hispanic Latino American (HL), South Asian American (SAA) and East Asian American (EAA) individuals and 7 subcontinental GIA clusters within the EAA GIA corresponding to Chinese American, Vietnamese American, and Japanese American individuals. Although we broadly find that self-identified race/ethnicity (SIRE) is highly correlated with GIA, we still observe marked differences between the two, emphasizing that the populations defined by these two criteria are not analogous. We find a total of 259 significant associations between continental GIA and phecodes even after accounting for individuals' SIRE, demonstrating that for some phenotypes, GIA provides information not already captured by SIRE. GWAS identifies significant associations for liver disease in the 22q13.31 locus across the HL and EAA GIA groups (HL p-value=2.32×10⁻¹⁶, EAA p-value=6.73×10⁻¹¹). A subsequent PheWAS at the top SNP reveals significant associations with neurologic and neoplastic phenotypes specifically within the HL GIA group. CONCLUSIONS: Overall, our results explore the interplay between SIRE and GIA within a disease context and underscore the utility of studying the genomes of diverse individuals through biobank-scale genotyping linked with EHR-based phenotyping.
Subjects
Electronic Health Records , Public Health , Asian People , Biological Specimen Banks , Genomics , Humans
ABSTRACT
Circulating cell-free DNA (cfDNA) in the bloodstream originates from dying cells and is a promising noninvasive biomarker for cell death. Here, we propose an algorithm, CelFiE, to accurately estimate the relative abundances of cell types and tissues contributing to cfDNA from epigenetic cfDNA sequencing. In contrast to previous work, CelFiE accommodates low coverage data, does not require CpG site curation, and estimates contributions from multiple unknown cell types that are not available in external reference data. In simulations, CelFiE accurately estimates known and unknown cell type proportions from low coverage and noisy cfDNA mixtures, including from cell types composing less than 1% of the total mixture. When used in two clinically-relevant situations, CelFiE correctly estimates a large placenta component in pregnant women, and an elevated skeletal muscle component in amyotrophic lateral sclerosis (ALS) patients, consistent with the occurrence of muscle wasting typical in these patients. Together, these results show how CelFiE could be a useful tool for biomarker discovery and monitoring the progression of degenerative disease.
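To make the idea of estimating cell-type contributions from methylation read counts more concrete, here is a minimal Python sketch of an expectation-maximization (EM) estimate of mixing proportions. It is a deliberately simplified stand-in and not the CelFiE algorithm itself: it assumes the reference methylation fractions are fixed and fully known and does not model unknown cell types. The function name, the three-type reference panel, and the simulated counts are all hypothetical.

```python
def estimate_proportions(meth, depth, ref, n_iter=2000):
    """EM estimate of cell-type proportions from per-CpG read counts.

    meth[i]   methylated read count at CpG i
    depth[i]  total read count at CpG i
    ref[t][i] known methylation fraction of cell type t at CpG i
    """
    n_types = len(ref)
    alpha = [1.0 / n_types] * n_types  # start from a uniform mixture
    for _ in range(n_iter):
        expected = [0.0] * n_types
        for i, (m, d) in enumerate(zip(meth, depth)):
            # E-step: posterior over the cell type of origin for a
            # methylated read and for an unmethylated read at this CpG.
            w_meth = [alpha[t] * ref[t][i] for t in range(n_types)]
            w_unmeth = [alpha[t] * (1.0 - ref[t][i]) for t in range(n_types)]
            s_meth, s_unmeth = sum(w_meth), sum(w_unmeth)
            for t in range(n_types):
                if s_meth > 0:
                    expected[t] += m * w_meth[t] / s_meth
                if s_unmeth > 0:
                    expected[t] += (d - m) * w_unmeth[t] / s_unmeth
        # M-step: proportions are the normalized expected read counts.
        total = sum(expected)
        alpha = [e / total for e in expected]
    return alpha


# Hypothetical reference panel: 3 cell types x 6 marker CpGs.
ref = [
    [0.9, 0.9, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.9, 0.9],
]
true_alpha = [0.7, 0.2, 0.1]  # simulated ground truth, not real data
depth = [1000] * 6
# Noise-free expected methylated counts under the simulated mixture.
meth = [
    d * sum(a * ref[t][i] for t, a in enumerate(true_alpha))
    for i, d in enumerate(depth)
]
print([round(a, 3) for a in estimate_proportions(meth, depth, ref)])
```

With these noise-free simulated counts the estimate should land close to the simulated 0.7, 0.2, and 0.1. With real low-coverage data the estimates would be noisier, which is the regime the abstract says CelFiE is designed for.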
Subjects
Algorithms , Amyotrophic Lateral Sclerosis/genetics , Cell-Free Nucleic Acids/genetics , DNA Methylation , Epigenesis, Genetic , Adult , Amyotrophic Lateral Sclerosis/blood , Amyotrophic Lateral Sclerosis/immunology , Amyotrophic Lateral Sclerosis/pathology , B-Lymphocytes/immunology , B-Lymphocytes/metabolism , Biomarkers/blood , Case-Control Studies , Cell-Free Nucleic Acids/blood , Cell-Free Nucleic Acids/classification , Female , Humans , Macrophages/immunology , Macrophages/metabolism , Male , Monocytes/immunology , Monocytes/metabolism , Muscle, Skeletal/immunology , Muscle, Skeletal/metabolism , Muscle, Skeletal/pathology , Neutrophils/immunology , Neutrophils/metabolism , Organ Specificity , Pregnancy , Pregnancy Trimesters/blood , Pregnancy Trimesters/genetics , T-Lymphocytes/immunology , T-Lymphocytes/metabolism
ABSTRACT
The methylation pattern of cfDNA, isolated from liquid biopsies, is gaining substantial interest for diagnosis and monitoring of diseases. We have evaluated the impact of type of blood collection tube and time delay between blood draw and plasma preparation on bisulphite-based cfDNA methylation profiling. Fifteen tubes of blood were drawn from three healthy volunteer subjects (BD Vacutainer K2E EDTA spray tubes, Streck Cell-Free DNA BCT tubes, PAXgene Blood ccfDNA tubes, Roche Cell-Free DNA Collection tubes and Biomatrica LBgard blood tubes in triplicate). Samples were either immediately processed or stored at room temperature for 24 or 72 hours before plasma preparation. DNA fragment size was evaluated by capillary electrophoresis. Reduced representation bisulphite sequencing was performed on the cell-free DNA isolated from these plasma samples. We evaluated the impact of blood tube and time delay on several quality control metrics. All preservation tubes performed similarly on the quality metrics that were evaluated. Furthermore, a considerable increase in cfDNA concentration and the fraction of it derived from NK cells was observed after a 72-hour time delay in EDTA tubes. The methylation pattern of cfDNA is robust and reproducible across the different preservation tubes. EDTA tubes processed as soon as possible, preferably within 24 hours, are the most cost-effective. If immediate processing is not possible, preservation tubes are valid alternatives.
Subjects
Cell-Free Nucleic Acids , Genome-Wide Association Study , Blood Specimen Collection , DNA Methylation , Epigenome , Humans , Liquid Biopsy
ABSTRACT
Protein conformations are shaped by cellular environments, but how environmental changes alter the conformational landscapes of specific proteins in vivo remains largely uncharacterized, in part due to the challenge of probing protein structures in living cells. Here, we use deep mutational scanning to investigate how a toxic conformation of α-synuclein, a dynamic protein linked to Parkinson's disease, responds to perturbations of cellular proteostasis. In the context of a course for graduate students in the UCSF Integrative Program in Quantitative Biology, we screened a comprehensive library of α-synuclein missense mutants in yeast cells treated with a variety of small molecules that perturb cellular processes linked to α-synuclein biology and pathobiology. We found that the conformation of α-synuclein previously shown to drive yeast toxicity (an extended, membrane-bound helix) is largely unaffected by these chemical perturbations, underscoring the importance of this conformational state as a driver of cellular toxicity. On the other hand, the chemical perturbations have a significant effect on the ability of mutations to suppress α-synuclein toxicity. Moreover, we find that sequence determinants of α-synuclein toxicity are well described by a simple structural model of the membrane-bound helix. This model predicts that α-synuclein penetrates the membrane to constant depth across its length but that membrane affinity decreases toward the C terminus, which is consistent with orthogonal biophysical measurements. Finally, we discuss how parallelized chemical genetics experiments can provide a robust framework for inquiry-based graduate coursework.