1.
Am J Hum Genet ; 111(2): 323-337, 2024 02 01.
Article in English | MEDLINE | ID: mdl-38306997

ABSTRACT

Genome-wide association studies (GWASs) have uncovered susceptibility loci associated with psychiatric disorders such as bipolar disorder (BP) and schizophrenia (SCZ). However, most of these loci are in non-coding regions of the genome, and the causal mechanisms linking genetic variation to disease risk are unknown. Expression quantitative trait locus (eQTL) analysis of bulk tissue is a common approach used for deciphering underlying mechanisms, although this can obscure cell-type-specific signals and thus mask trait-relevant mechanisms. Although single-cell sequencing can be prohibitively expensive in large cohorts, computationally inferred cell-type proportions and cell-type gene expression estimates have the potential to overcome these problems and advance mechanistic studies. Using bulk RNA-seq from 1,730 samples derived from whole blood in a cohort ascertained from individuals with BP and SCZ, this study estimated cell-type proportions and their relation with disease status and medication. For each cell type, we found between 2,875 and 4,629 eGenes (genes with an associated eQTL), including 1,211 that are not found on the basis of bulk expression alone. We performed a colocalization test between cell-type eQTLs and various traits and identified hundreds of associations that occur between cell-type eQTLs and GWASs but that are not detected in bulk eQTLs. Finally, we investigated the effects of lithium use on the regulation of cell-type expression loci and found examples of genes that are differentially regulated according to lithium use. Our study suggests that applying computational methods to large bulk RNA-seq datasets of non-brain tissue can identify disease-relevant, cell-type-specific biology of psychiatric disorders and psychiatric medication.


Subject(s)
Genome-Wide Association Study , Lithium , Humans , Genome-Wide Association Study/methods , RNA-Seq , Quantitative Trait Loci/genetics , Phenotype , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
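
As a rough illustration of what a single eQTL test in the abstract above involves, the sketch below regresses a gene's expression on genotype dosage (0, 1, or 2 copies of the alternate allele) by ordinary least squares. The data are invented toy values and the function name `eqtl_slope` is hypothetical; the study's actual pipeline, covariates, and cell-type deconvolution step are not described in the abstract and are not reproduced here.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Return (slope, intercept) of the OLS fit expression ~ dosage.

    A non-zero slope means expression changes with each extra copy of
    the alternate allele, which is the basic signal an eQTL test seeks.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have equal length")
    g_bar = mean(dosages)
    e_bar = mean(expression)
    # Sum of squares of genotype and genotype-expression cross-products.
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("no genotype variation; slope is undefined")
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(dosages, expression))
    slope = sxy / sxx
    intercept = e_bar - slope * g_bar
    return slope, intercept


# Toy data (assumed, not from the study): six individuals, one SNP, one gene.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]

slope, intercept = eqtl_slope(dosages, expression)
print(f"effect per allele: {slope:.3f}, baseline expression: {intercept:.3f}")
```

A real analysis would add a significance test, covariates such as ancestry components, and multiple-testing correction across genes and variants; this sketch shows only the core effect-size estimate.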
2.
Nucleic Acids Res ; 51(3): e18, 2023 02 22.
Article in English | MEDLINE | ID: mdl-36546757

ABSTRACT

The vast majority of disease-associated single nucleotide polymorphisms (SNPs) identified from genome-wide association studies (GWAS) are localized in non-coding regions. A significant fraction of these variants impact transcription factor binding to enhancer elements and alter gene expression. To functionally interrogate the activity of such variants we developed snpSTARRseq, a high-throughput experimental method that can interrogate the functional impact of hundreds to thousands of non-coding variants on enhancer activity. snpSTARRseq dramatically improves signal-to-noise by utilizing a novel sequencing and bioinformatic approach that increases both insert size and the number of variants tested per locus. Using this strategy, we interrogated known prostate cancer (PCa) risk-associated loci and demonstrated that 35% of them harbor SNPs that significantly altered enhancer activity. Combining these results with chromosomal looping data we could identify interacting genes and provide a mechanism of action for 20 PCa GWAS risk regions. When benchmarked to orthogonal methods, snpSTARRseq showed a strong correlation with in vivo experimental allelic-imbalance studies whereas there was no correlation with predictive in silico approaches. Overall, snpSTARRseq provides an integrated experimental and computational framework to functionally test non-coding genetic variants.


Subject(s)
Genome-Wide Association Study , Regulatory Sequences, Nucleic Acid , Humans , Male , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Transcription Factors/genetics
3.
Am J Hum Genet ; 108(12): 2284-2300, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34822763

ABSTRACT

Genome-wide association studies (GWASs) have identified more than 200 prostate cancer (PrCa) risk regions, which provide potential insights into causal mechanisms. Multiple lines of evidence show that a significant proportion of PrCa risk can be explained by germline causal variants that dysregulate nearby target genes in prostate-relevant tissues, thus altering disease risk. The traditional approach to explore this hypothesis has been correlating GWAS variants with steady-state transcript levels, referred to as expression quantitative trait loci (eQTLs). In this work, we assess the utility of chromosome conformation capture (3C) coupled with immunoprecipitation (HiChIP) to identify target genes for PrCa GWAS risk loci. We find that interactome data confirm previously reported PrCa target genes identified through GWAS/eQTL overlap (e.g., MLPH). Interestingly, HiChIP identifies links between PrCa GWAS variants and genes well-known to play a role in prostate cancer biology (e.g., AR) that are not detected by eQTL-based methods. HiChIP predicted enhancer elements at the AR and NKX3-1 prostate cancer risk loci, and both were experimentally confirmed to regulate expression of the corresponding genes through CRISPR interference (CRISPRi) perturbation in LNCaP cells. Our results demonstrate that looping data harbor additional information beyond eQTLs and expand the number of PrCa GWAS loci that can be linked to candidate susceptibility genes.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Genetic Predisposition to Disease , Genome-Wide Association Study , Histone Code/genetics , Prostatic Neoplasms/genetics , Cell Line, Tumor , Chromosomes, Human , Clustered Regularly Interspaced Short Palindromic Repeats , Genetic Techniques , Humans , Male , Quantitative Trait Loci
4.
PLoS Comput Biol ; 17(5): e1008915, 2021 05.
Article in English | MEDLINE | ID: mdl-34019542

ABSTRACT

Genetic predisposition for complex traits often acts through multiple tissues at different time points during development. As a simple example, the genetic predisposition for obesity could be manifested either through inherited variants that control metabolism through regulation of genes expressed in the brain, or that control fat storage through dysregulation of genes expressed in adipose tissue, or both. Here we describe a statistical approach that leverages tissue-specific expression quantitative trait loci (eQTLs) corresponding to tissue-specific genes to prioritize a relevant tissue underlying the genetic predisposition of a given individual for a complex trait. Unlike existing approaches that prioritize relevant tissues for the trait in the population, our approach probabilistically quantifies the tissue-wise genetic contribution to the trait for a given individual. We hypothesize that for a subgroup of individuals the genetic contribution to the trait can be mediated primarily through a specific tissue. Through simulations using the UK Biobank, we show that our approach can predict the relevant tissue accurately and can cluster individuals according to their tissue-specific genetic architecture. We analyze body mass index (BMI) and waist to hip ratio adjusted for BMI (WHRadjBMI) in the UK Biobank to identify subgroups of individuals whose genetic predisposition acts primarily through brain versus adipose tissue, and adipose versus muscle tissue, respectively. Notably, we find that these individuals have specific phenotypic features beyond BMI and WHRadjBMI that distinguish them from random individuals in the data, suggesting biological effects of tissue-specific genetic contribution for these traits.


Subject(s)
Multifactorial Inheritance , Quantitative Trait Loci , Adipose Tissue/metabolism , Algorithms , Bayes Theorem , Body Mass Index , Brain/metabolism , Computational Biology , Computer Simulation , Gene Expression , Genetic Predisposition to Disease , Humans , Models, Genetic , Obesity/genetics , Obesity/pathology , Organ Specificity , Phenotype , Polymorphism, Single Nucleotide , Software , Tissue Distribution
5.
Sci Transl Med ; 16(745): eade4510, 2024 May.
Article in English | MEDLINE | ID: mdl-38691621

ABSTRACT

Human inborn errors of immunity include rare disorders entailing functional and quantitative antibody deficiencies due to impaired B cells called the common variable immunodeficiency (CVID) phenotype. Patients with CVID face delayed diagnoses and treatments for 5 to 15 years after symptom onset because the disorders are rare (prevalence of ~1/25,000), and there is extensive heterogeneity in CVID phenotypes, ranging from infections to autoimmunity to inflammatory conditions, overlapping with other more common disorders. The prolonged diagnostic odyssey drives excessive system-wide costs before diagnosis. Because there is no single causal mechanism, there are no genetic tests to definitively diagnose CVID. Here, we present PheNet, a machine learning algorithm that identifies patients with CVID from their electronic health records (EHRs). PheNet learns phenotypic patterns from verified CVID cases and uses this knowledge to rank patients by likelihood of having CVID. PheNet could have diagnosed more than half of our patients with CVID 1 or more years earlier than they had been diagnosed. When applied to a large EHR dataset, followed by blinded chart review of the top 100 patients ranked by PheNet, we found that 74% were highly probable to have CVID. We externally validated PheNet using >6 million records from disparate medical systems in California and Tennessee. As artificial intelligence and machine learning make their way into health care, we show that algorithms such as PheNet can offer clinical benefits by expediting the diagnosis of rare diseases.


Subject(s)
Common Variable Immunodeficiency , Electronic Health Records , Humans , Common Variable Immunodeficiency/diagnosis , Machine Learning , Algorithms , Male , Female , Phenotype , Adult , Undiagnosed Diseases/diagnosis
6.
bioRxiv ; 2023 May 25.
Article in English | MEDLINE | ID: mdl-37293101

ABSTRACT

Genome-wide association studies (GWAS) have uncovered susceptibility loci associated with psychiatric disorders like bipolar disorder (BP) and schizophrenia (SCZ). However, most of these loci are in non-coding regions of the genome, and the causal mechanisms linking genetic variation to disease risk are unknown. Expression quantitative trait loci (eQTL) analysis of bulk tissue is a common approach to decipher underlying mechanisms, though this can obscure cell-type-specific signals, thus masking trait-relevant mechanisms. While single-cell sequencing can be prohibitively expensive in large cohorts, computationally inferred cell type proportions and cell type gene expression estimates have the potential to overcome these problems and advance mechanistic studies. Using bulk RNA-Seq from 1,730 samples derived from whole blood in a cohort ascertained for individuals with BP and SCZ, this study estimated cell type proportions and their relation with disease status and medication. We found between 2,875 and 4,629 eGenes for each cell type, including 1,211 eGenes that are not found using bulk expression alone. We performed a colocalization test between cell type eQTLs and various traits and identified hundreds of associations between cell type eQTLs and GWAS loci that are not detected in bulk eQTLs. Finally, we investigated the effects of lithium use on cell type expression regulation and found examples of genes that are differentially regulated depending on lithium use. Our study suggests that computational methods can be applied to large bulk RNA-Seq datasets of non-brain tissue to identify disease-relevant, cell-type-specific biology of psychiatric disorders and psychiatric medication.

7.
HGG Adv ; 3(3): 100103, 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35519825

ABSTRACT

Mapping genetic variants that regulate gene expression (eQTL mapping) in large-scale RNA sequencing (RNA-seq) studies is often employed to understand functional consequences of regulatory variants. However, the high cost of RNA-seq limits sample size, sequencing depth, and, therefore, discovery power in eQTL studies. In this work, we demonstrate that, given a fixed budget, eQTL discovery power can be increased by lowering the sequencing depth per sample and increasing the number of individuals sequenced in the assay. We perform RNA-seq of whole-blood tissue across 1,490 individuals at low coverage (5.9 million reads/sample) and show that the effective power is higher than that of an RNA-seq study of 570 individuals at moderate coverage (13.9 million reads/sample). Next, we leverage synthetic datasets derived from real RNA-seq data (50 million reads/sample) to explore the interplay of coverage and number of individuals in eQTL studies, and show that a 10-fold reduction in coverage leads to only a 2.5-fold reduction in statistical power to identify eQTLs. Our work suggests that lowering coverage while increasing the number of individuals in RNA-seq is an effective approach to increase discovery power in eQTL studies.
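
The trade-off in the abstract above can be made concrete with a little arithmetic. The sketch below compares the total sequencing output of the two designs using only the sample sizes and per-sample read counts quoted in the abstract. Treating total reads as a stand-in for budget is an assumption made here for illustration; the abstract does not give cost figures, and the helper name `total_reads_millions` is hypothetical.

```python
def total_reads_millions(n_individuals, reads_per_sample_millions):
    """Total sequencing output of a design, in millions of reads."""
    return n_individuals * reads_per_sample_millions


# Figures quoted in the abstract.
low_n, low_depth = 1490, 5.9            # low-coverage design
moderate_n, moderate_depth = 570, 13.9  # moderate-coverage design

low_total = total_reads_millions(low_n, low_depth)
moderate_total = total_reads_millions(moderate_n, moderate_depth)

# How many times more individuals, and how many times less depth per sample.
sample_ratio = low_n / moderate_n
coverage_ratio = moderate_depth / low_depth

print(f"low coverage:      {low_total:,.0f} M reads in total")
print(f"moderate coverage: {moderate_total:,.0f} M reads in total")
print(f"{sample_ratio:.2f}x more individuals at {coverage_ratio:.2f}x lower depth")
```

Under this reads-as-budget assumption the two designs are of similar overall size (roughly 8.8 versus 7.9 billion reads), while the low-coverage design includes about 2.6 times as many individuals.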

8.
Genome Med ; 14(1): 104, 2022 Sep 09.
Article in English | MEDLINE | ID: mdl-36085083

ABSTRACT

BACKGROUND: Large medical centers in urban areas, like Los Angeles, care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative, an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N=36,736). METHODS: We quantify the extensive continental and subcontinental genetic diversity within the ATLAS data through principal component analysis, identity-by-descent, and genetic admixture. We assess the relationship between genetically inferred ancestry (GIA) and >1500 EHR-derived phenotypes (phecodes). Finally, we demonstrate the utility of genetic data linked with EHR to perform ancestry-specific and multi-ancestry genome and phenome-wide scans across a broad set of disease phenotypes. RESULTS: We identify 5 continental-scale GIA clusters including European American (EA), African American (AA), Hispanic Latino American (HL), South Asian American (SAA) and East Asian American (EAA) individuals and 7 subcontinental GIA clusters within the EAA GIA corresponding to Chinese American, Vietnamese American, and Japanese American individuals. Although we broadly find that self-identified race/ethnicity (SIRE) is highly correlated with GIA, we still observe marked differences between the two, emphasizing that the populations defined by these two criteria are not analogous. We find a total of 259 significant associations between continental GIA and phecodes even after accounting for individuals' SIRE, demonstrating that for some phenotypes, GIA provides information not already captured by SIRE. GWAS identifies significant associations for liver disease in the 22q13.31 locus across the HL and EAA GIA groups (HL p-value=2.32×10^-16, EAA p-value=6.73×10^-11). A subsequent PheWAS at the top SNP reveals significant associations with neurologic and neoplastic phenotypes specifically within the HL GIA group. CONCLUSIONS: Overall, our results explore the interplay between SIRE and GIA within a disease context and underscore the utility of studying the genomes of diverse individuals through biobank-scale genotyping linked with EHR-based phenotyping.


Subject(s)
Electronic Health Records , Public Health , Asian People , Biological Specimen Banks , Genomics , Humans
9.
Nat Genet ; 54(9): 1364-1375, 2022 09.
Article in English | MEDLINE | ID: mdl-36071171

ABSTRACT

Many genetic variants affect disease risk by altering context-dependent gene regulation. Such variants are difficult to study mechanistically using current methods that link genetic variation to steady-state gene expression levels, such as expression quantitative trait loci (eQTLs). To address this challenge, we developed the cistrome-wide association study (CWAS), a framework for identifying genotypic and allele-specific effects on chromatin that are also associated with disease. In prostate cancer, CWAS identified regulatory elements and androgen receptor-binding sites that explained the association at 52 of 98 known prostate cancer risk loci and discovered 17 additional risk loci. CWAS implicated key developmental transcription factors in prostate cancer risk that are overlooked by eQTL-based approaches due to context-dependent gene regulation. We experimentally validated associations and demonstrated the extensibility of CWAS to additional epigenomic datasets and phenotypes, including response to prostate cancer treatment. CWAS is a powerful and biologically interpretable paradigm for studying variants that influence traits by affecting transcriptional regulation.


Subject(s)
Chromatin , Prostatic Neoplasms , Chromatin/genetics , Gene Expression Regulation , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Polymorphism, Single Nucleotide/genetics , Prostatic Neoplasms/genetics , Quantitative Trait Loci/genetics
10.
iScience ; 24(3): 102188, 2021 Mar 19.
Article in English | MEDLINE | ID: mdl-33615196

ABSTRACT

Coronavirus disease 2019 (COVID-19) has exposed health care disparities in minority groups including Hispanics/Latinxs (HL). Studies of COVID-19 risk factors for HL have relied on county-level data. We investigated COVID-19 risk factors in HL using individual-level, electronic health records in a Los Angeles health system between March 9, 2020, and August 31, 2020. Of 9,287 HL tested for SARS-CoV-2, 562 were positive. HL constituted an increasing percentage of all COVID-19 positive individuals as disease severity escalated. Multiple risk factors identified in Non-Hispanic/Latinx whites (NHL-W), like renal disease, also conveyed risk in HL. Pre-existing nonrheumatic mitral valve disorder was a risk factor for HL hospitalization but not for NHL-W COVID-19 or HL influenza hospitalization, suggesting it may be a specific HL COVID-19 risk. Admission laboratory values also suggested that HL presented with a greater inflammatory response. COVID-19 risk factors for HL can help guide equitable government policies and identify at-risk populations.

11.
Nat Commun ; 11(1): 5504, 2020 10 30.
Article in English | MEDLINE | ID: mdl-33127880

ABSTRACT

Single-cell RNA-sequencing (scRNA-Seq) is a compelling approach to directly and simultaneously measure cellular composition and state, which can otherwise only be estimated by applying deconvolution methods to bulk RNA-Seq estimates. However, it has not yet become a widely used tool in population-scale analyses, due to its prohibitively high cost. Here we show that given the same budget, the statistical power of cell-type-specific expression quantitative trait loci (eQTL) mapping can be increased through low-coverage per-cell sequencing of more samples rather than high-coverage sequencing of fewer samples. We use simulations starting from one of the largest available real single-cell RNA-Seq datasets, from 120 individuals, to also show that multiple experimental designs with different numbers of samples, cells per sample and reads per cell could have similar statistical power, and choosing an appropriate design can yield large cost savings, especially when multiplexed workflows are considered. Finally, we provide a practical approach for selecting cost-effective designs that maximize cell-type-specific eQTL power, which is available in the form of a web tool.


Subject(s)
Quantitative Trait Loci/genetics , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Base Sequence , Computational Biology , Gene Expression , Gene Expression Profiling/methods , Genomics , Humans
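
The design space described in the abstract above (number of samples, cells per sample, reads per cell) can be enumerated against a budget in a few lines. The sketch below is a toy illustration only: the cost model, the prices, and the names `design_cost` and `affordable_designs` are all assumptions invented here, and it does not reproduce the authors' power calculations or their web tool.

```python
from itertools import product


def design_cost(n_samples, cells_per_sample, reads_per_cell,
                prep_cost_per_sample, cost_per_million_reads):
    """Cost under an assumed model: fixed prep cost per sample plus
    a sequencing cost proportional to the total number of reads."""
    total_reads_millions = n_samples * cells_per_sample * reads_per_cell / 1e6
    return (n_samples * prep_cost_per_sample
            + total_reads_millions * cost_per_million_reads)


def affordable_designs(budget, sample_options, cell_options, read_options,
                       prep_cost_per_sample, cost_per_million_reads):
    """List every (samples, cells, reads, cost) combination within budget."""
    designs = []
    for n, cells, reads in product(sample_options, cell_options, read_options):
        cost = design_cost(n, cells, reads,
                           prep_cost_per_sample, cost_per_million_reads)
        if cost <= budget:
            designs.append((n, cells, reads, cost))
    return designs


# All numbers below are hypothetical placeholders, not values from the paper.
designs = affordable_designs(
    budget=50_000,
    sample_options=[40, 80, 120],
    cell_options=[1000, 2000],
    read_options=[5_000, 50_000],
    prep_cost_per_sample=200.0,
    cost_per_million_reads=5.0,
)

for n, cells, reads, cost in designs:
    print(f"{n:>4} samples x {cells:>5} cells x {reads:>6} reads/cell = {cost:>8,.0f}")
```

With these placeholder prices, the largest sample size (120) is affordable only at the lower read depth, which mirrors the kind of trade-off the abstract describes. Choosing among the affordable designs would still require a power estimate for each one.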
12.
medRxiv ; 2020 Jul 09.
Article in English | MEDLINE | ID: mdl-32637977

ABSTRACT

With the continuing coronavirus disease 2019 (COVID-19) pandemic coupled with phased reopening, it is critical to identify risk factors associated with susceptibility and severity of disease in a diverse population to help shape government policies, guide clinical decision making, and prioritize future COVID-19 research. In this retrospective case-control study, we used de-identified electronic health records (EHR) from the University of California Los Angeles (UCLA) Health System between March 9th, 2020 and June 14th, 2020 to identify risk factors for COVID-19 susceptibility (severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) PCR test positive), inpatient admission, and severe outcomes (treatment in an intensive care unit or intubation). Of the 26,602 individuals tested by PCR for SARS-CoV-2, 992 were COVID-19 positive (3.7% of Tested), 220 were admitted in the hospital (22% of COVID-19 positive), and 77 had a severe outcome (35% of Inpatient). Consistent with previous studies, males and individuals older than 65 years old had increased risk of inpatient admission. Notably, individuals self-identifying as Hispanic or Latino constituted an increasing percentage of COVID-19 patients as disease severity escalated, comprising 24% of those testing positive, but 40% of those with a severe outcome, a disparity that remained after correcting for medical comorbidities. Cardiovascular disease, hypertension, and renal disease were premorbid risk factors present before SARS-CoV-2 PCR testing associated with COVID-19 susceptibility. Less well-established risk factors for COVID-19 susceptibility included pre-existing dementia (odds ratio (OR) 5.2 [3.2-8.3], p=2.6 x 10^-10), mental health conditions (depression OR 2.1 [1.6-2.8], p=1.1 x 10^-6) and vitamin D deficiency (OR 1.8 [1.4-2.2], p=5.7 x 10^-6). Renal diseases including end-stage renal disease and anemia due to chronic renal disease were the predominant premorbid risk factors for COVID-19 inpatient admission. Other less established risk factors for COVID-19 inpatient admission included previous renal transplant (OR 9.7 [2.8-39], p=3.2 x 10^-4) and disorders of the immune system (OR 6.0 [2.3-16], p=2.7 x 10^-4). Prior use of oral steroid medications was associated with decreased COVID-19 positive testing risk (OR 0.61 [0.45-0.81], p=4.3 x 10^-4), but increased inpatient admission risk (OR 4.5 [2.3-8.9], p=1.8 x 10^-5). We did not observe that prior use of angiotensin converting enzyme inhibitors or angiotensin receptor blockers increased the risk of testing positive for SARS-CoV-2, being admitted to the hospital, or having a severe outcome. This study involving direct EHR extraction identified known and less well-established demographics, and prior diagnoses and medications as risk factors for COVID-19 susceptibility and inpatient admission. Knowledge of these risk factors including marked ethnic disparities observed in disease severity should guide government policies, identify at-risk populations, inform clinical decision making, and prioritize future COVID-19 research.
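
The three percentages in the abstract above each use a different denominator (tested, positive, admitted), which is easy to misread. The short sketch below recomputes them from the counts reported in the abstract; the helper name `percent` is hypothetical and nothing beyond the quoted counts is assumed.

```python
def percent(part, whole):
    """Share of `whole` represented by `part`, as a percentage."""
    return 100 * part / whole


# Counts quoted in the abstract.
tested = 26_602
positive = 992
admitted = 220
severe = 77

positivity = percent(positive, tested)        # share of tested who were positive
admission_rate = percent(admitted, positive)  # share of positives admitted
severe_rate = percent(severe, admitted)       # share of inpatients with severe outcome

print(f"positive among tested:  {positivity:.1f}%")
print(f"admitted among positive: {admission_rate:.0f}%")
print(f"severe among admitted:   {severe_rate:.0f}%")
```

The recomputed values round to the 3.7%, 22%, and 35% stated in the abstract, confirming that each stage is expressed relative to the stage before it.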
