Search | VHL Regional Portal

Dictionary learning for integrative, multimodal and scalable single-cell analysis.

Hao, Yuhan; Stuart, Tim; Kowalski, Madeline H; Choudhary, Saket; Hoffman, Paul; Hartman, Austin; Srivastava, Avi; Molla, Gesmira; Madad, Shaista; Fernandez-Granda, Carlos; Satija, Rahul.

Nat Biotechnol ; 42(2): 293-304, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37231261

ABSTRACT

Mapping single-cell sequencing profiles to comprehensive reference datasets provides a powerful alternative to unsupervised analysis. However, most reference datasets are constructed from single-cell RNA-sequencing data and cannot be used to annotate datasets that do not measure gene expression. Here we introduce 'bridge integration', a method to integrate single-cell datasets across modalities using a multiomic dataset as a molecular bridge. Each cell in the multiomic dataset constitutes an element in a 'dictionary', which is used to reconstruct unimodal datasets and transform them into a shared space. Our procedure accurately integrates transcriptomic data with independent single-cell measurements of chromatin accessibility, histone modifications, DNA methylation and protein levels. Moreover, we demonstrate how dictionary learning can be combined with sketching techniques to improve computational scalability and harmonize 8.6 million human immune cell profiles from sequencing and mass cytometry experiments. Our approach, implemented in version 5 of our Seurat toolkit ( http://www.satijalab.org/seurat ), broadens the utility of single-cell reference datasets and facilitates comparisons across diverse molecular modalities.

Subject(s)

Gene Expression Profiling , Software , Humans , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Transcriptome , Single-Cell Analysis/methods

Increased tryptophan, but not increased glucose metabolism, predict resistance of pembrolizumab in stage III/IV melanoma.

Oldan, Jorge D; Giglio, Benjamin C; Smith, Eric; Zhao, Weiling; Bouchard, Deeanna M; Ivanovic, Marija; Lee, Yueh Z; Collichio, Frances A; Meyers, Michael O; Wallack, Diana E; Abernethy-Leinwand, Amber; Long, Patricia K; Trembath, Dimitri G; Googe, Paul B; Kowalski, Madeline H; Ivanova, Anastasia; Ezzell, Jennifer A; Nikolaishvili-Feinberg, Nana; Thomas, Nancy E; Wong, Terence Z; Ollila, David W; Li, Zibo; Moschos, Stergios J.

Oncoimmunology ; 12(1): 2204753, 2023.

Article in English | MEDLINE | ID: mdl-37123046

ABSTRACT

Clinical trials of combined IDO/PD1 blockade in metastatic melanoma (MM) failed to show additional clinical benefit compared to PD1-alone inhibition. We reasoned that a tryptophan-metabolizing pathway other than the kynurenine one is essential. We immunohistochemically stained tissues along the nevus-to-MM progression pathway for tryptophan-metabolizing enzymes (TMEs; TPH1, TPH2, TDO2, IDO1) and the tryptophan transporter, LAT1. We assessed tryptophan and glucose metabolism by performing baseline C11-labeled α-methyl tryptophan (C11-AMT) and fluorodeoxyglucose (FDG) PET imaging of tumor lesions in a prospective clinical trial of pembrolizumab in MM (clinicaltrials.gov, NCT03089606). We found higher protein expression of all TMEs and LAT1 in melanoma cells than tumor-infiltrating lymphocytes (TILs) within MM tumors (n = 68). Melanoma cell-specific TPH1 and LAT1 expressions were significantly anti-correlated with TIL presence in MM. High melanoma cell-specific LAT1 and low IDO1 expression were associated with worse overall survival (OS) in MM. Exploratory optimal cutpoint survival analysis of pretreatment 'high' vs. 'low' C11-AMT SUVmax of the hottest tumor lesion per patient revealed that the 'low' C11-AMT SUVmax was associated with longer progression-free survival in our clinical trial (n = 26). We saw no such trends with pretreatment FDG PET SUVmax. Treatment of melanoma cell lines with telotristat, a TPH1 inhibitor, increased IDO expression and kynurenine production in addition to suppression of serotonin production. High melanoma tryptophan metabolism is a poor predictor of pembrolizumab response and an adverse prognostic factor. Serotoninergic but not kynurenine pathway activation may be significant. Melanoma cells outcompete adjacent TILs, eventually depriving the latter of an essential amino acid.

Subject(s)

Melanoma , Tryptophan , Humans , Tryptophan/metabolism , Tryptophan/pharmacology , Fluorodeoxyglucose F18 , Prospective Studies , Kynurenine/metabolism , Melanoma/diagnostic imaging , Melanoma/drug therapy , Glucose , Melanoma, Cutaneous Malignant

CPA-Perturb-seq: Multiplexed single-cell characterization of alternative polyadenylation regulators.

Kowalski, Madeline H; Wessels, Hans-Hermann; Linder, Johannes; Choudhary, Saket; Hartman, Austin; Hao, Yuhan; Mascio, Isabella; Dalgarno, Carol; Kundaje, Anshul; Satija, Rahul.

bioRxiv ; 2023 Feb 10.

Article in English | MEDLINE | ID: mdl-36798324

ABSTRACT

Most mammalian genes have multiple polyA sites, representing a substantial source of transcript diversity that is governed by the cleavage and polyadenylation (CPA) regulatory machinery. To better understand how these proteins govern polyA site choice we introduce CPA-Perturb-seq, a multiplexed perturbation screen dataset of 42 known CPA regulators with a 3' scRNA-seq readout that enables transcriptome-wide inference of polyA site usage. We develop a statistical framework to specifically identify perturbation-dependent changes in intronic and tandem polyadenylation, and discover modules of co-regulated polyA sites exhibiting distinct functional properties. By training a multi-task deep neural network (APARENT-Perturb) on our dataset, we delineate a cis-regulatory code that predicts responsiveness to perturbation and reveals interactions between distinct regulatory complexes. Finally, we leverage our framework to re-analyze published scRNA-seq datasets, identifying new regulators that affect the relative abundance of alternatively polyadenylated transcripts, and characterizing extensive cellular heterogeneity in 3' UTR length amongst antibody-producing cells. Our work highlights the potential for multiplexed single-cell perturbation screens to further our understanding of post-transcriptional regulation in vitro and in vivo.

Transcriptome-Wide Association Study of Blood Cell Traits in African Ancestry and Hispanic/Latino Populations.

Wen, Jia; Xie, Munan; Rowland, Bryce; Rosen, Jonathan D; Sun, Quan; Chen, Jiawen; Tapia, Amanda L; Qian, Huijun; Kowalski, Madeline H; Shan, Yue; Young, Kristin L; Graff, Marielisa; Argos, Maria; Avery, Christy L; Bien, Stephanie A; Buyske, Steve; Yin, Jie; Choquet, Hélène; Fornage, Myriam; Hodonsky, Chani J; Jorgenson, Eric; Kooperberg, Charles; Loos, Ruth J F; Liu, Yongmei; Moon, Jee-Young; North, Kari E; Rich, Stephen S; Rotter, Jerome I; Smith, Jennifer A; Zhao, Wei; Shang, Lulu; Wang, Tao; Zhou, Xiang; Reiner, Alexander P; Raffield, Laura M; Li, Yun.

Genes (Basel) ; 12(7), 2021 07 08.

Article in English | MEDLINE | ID: mdl-34356065

ABSTRACT

BACKGROUND: Thousands of genetic variants have been associated with hematological traits, though target genes remain unknown at most loci. Moreover, limited analyses have been conducted in African ancestry and Hispanic/Latino populations; hematological trait associated variants more common in these populations have likely been missed. METHODS: To derive gene expression prediction models, we used ancestry-stratified datasets from the Multi-Ethnic Study of Atherosclerosis (MESA, including n = 229 African American and n = 381 Hispanic/Latino participants, monocytes) and the Depression Genes and Networks study (DGN, n = 922 European ancestry participants, whole blood). We then performed a transcriptome-wide association study (TWAS) for platelet count, hemoglobin, hematocrit, and white blood cell count in African (n = 27,955) and Hispanic/Latino (n = 28,324) ancestry participants. RESULTS: Our results revealed 24 suggestive signals (p < 1 × 10⁻⁴) that were conditionally distinct from known GWAS identified variants and successfully replicated these signals in European ancestry subjects from UK Biobank. We found modestly improved correlation of predicted and measured gene expression in an independent African American cohort (the Genetic Epidemiology Network of Arteriopathy (GENOA) study (n = 802), lymphoblastoid cell lines) using the larger DGN reference panel; however, some genes were well predicted using MESA but not DGN. CONCLUSIONS: These analyses demonstrate the importance of performing TWAS and other genetic analyses across diverse populations and of balancing sample size and ancestry background matching when selecting a TWAS reference panel.

Subject(s)

Black or African American/genetics , Blood Cells/pathology , Genetic Predisposition to Disease , Hispanic or Latino/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Transcriptome , Blood Cells/metabolism , Cohort Studies , Genome-Wide Association Study , Humans , Phenotype , White People/genetics

Genome sequencing unveils a regulatory landscape of platelet reactivity.

Keramati, Ali R; Chen, Ming-Huei; Rodriguez, Benjamin A T; Yanek, Lisa R; Bhan, Arunoday; Gaynor, Brady J; Ryan, Kathleen; Brody, Jennifer A; Zhong, Xue; Wei, Qiang; Kammers, Kai; Kanchan, Kanika; Iyer, Kruthika; Kowalski, Madeline H; Pitsillides, Achilleas N; Cupples, L Adrienne; Li, Bingshan; Schlaeger, Thorsten M; Shuldiner, Alan R; O'Connell, Jeffrey R; Ruczinski, Ingo; Mitchell, Braxton D; Faraday, Nauder; Taub, Margaret A; Becker, Lewis C; Lewis, Joshua P; Mathias, Rasika A; Johnson, Andrew D.

Nat Commun ; 12(1): 3626, 2021 06 15.

Article in English | MEDLINE | ID: mdl-34131117

ABSTRACT

Platelet aggregation at the site of atherosclerotic vascular injury is the underlying pathophysiology of myocardial infarction and stroke. To build upon prior GWAS, here we report on 16 loci identified through a whole genome sequencing (WGS) approach in 3,855 NHLBI Trans-Omics for Precision Medicine (TOPMed) participants deeply phenotyped for platelet aggregation. We identify the RGS18 locus, which encodes a myeloerythroid lineage-specific regulator of G-protein signaling that co-localizes with expression quantitative trait loci (eQTL) signatures for RGS18 expression in platelets. Gene-based approaches implicate the SVEP1 gene, a known contributor of coronary artery disease risk. Sentinel variants at RGS18 and PEAR1 are associated with thrombosis risk and increased gastrointestinal bleeding risk, respectively. Our WGS findings add to previously identified GWAS loci, provide insights regarding the mechanism(s) by which genetics may influence cardiovascular disease risk, and underscore the importance of rare variant and regulatory approaches to identifying loci contributing to complex phenotypes.

Subject(s)

Blood Platelets/metabolism , Chromosome Mapping , Whole Genome Sequencing , Base Sequence , GTP-Binding Proteins , Genome-Wide Association Study , HEK293 Cells , Humans , K562 Cells , Phenotype , Platelet Aggregation , Platelet Function Tests , Polymorphism, Single Nucleotide , Quantitative Trait Loci , RGS Proteins/genetics , RGS Proteins/metabolism , Receptors, Cell Surface/genetics , Thrombosis/genetics

Genome-Wide Association of Kidney Traits in Hispanics/Latinos Using Dense Imputed Whole-Genome Sequencing Data: The Hispanic Community Health Study/Study of Latinos.

Qian, Huijun; Kowalski, Madeline H; Kramer, Holly J; Tao, Ran; Lash, James P; Stilp, Adrienne M; Cai, Jianwen; Li, Yun; Franceschini, Nora.

Circ Genom Precis Med ; 13(4): e002891, 2020 08.

Article in English | MEDLINE | ID: mdl-32600054

ABSTRACT

BACKGROUND: Genetic factors that influence kidney traits have been understudied for low-frequency and ancestry-specific variants. METHODS: This study used imputed whole-genome sequencing from the Trans-Omics for Precision Medicine project to identify novel loci for estimated glomerular filtration rate and urine albumin-to-creatinine ratio in up to 12 207 Hispanics/Latinos. Replication was performed in the Women's Health Initiative and the UK Biobank when variants were available. RESULTS: Two low-frequency intronic variants were associated with estimated glomerular filtration rate (rs58720902 at AQR, minor allele frequency=0.01, P=1.6×10⁻⁸) or urine albumin-to-creatinine ratio (rs527493184 at ZBTB16, minor allele frequency=0.002, P=1.1×10⁻⁸). An additional variant at PRNT (rs2422935, minor allele frequency=0.54, P=2.89×10⁻⁸) was significantly associated with estimated glomerular filtration rate in meta-analysis with replication samples. We also identified 2 known loci for urine albumin-to-creatinine ratio (BCL2L11 rs116907128, P=5.6×10⁻⁸ and HBB rs344, P=9.3×10⁻¹¹) and validated 8 loci for urine albumin-to-creatinine ratio previously identified in the UK Biobank. CONCLUSIONS: Our study shows gains in gene discovery when using dense imputation from multi-ethnic whole-genome sequencing data in admixed Hispanics/Latinos. It also highlights limitations in genetic research of kidney traits, including the lack of suitable replication samples for variants that are more common in non-European ancestry and those at low frequency in populations.

Subject(s)

Genome-Wide Association Study , Hispanic or Latino/genetics , Kidney Diseases/genetics , Adult , Alleles , Bcl-2-Like Protein 11/genetics , Female , Gene Frequency , Genetic Variation , Genotype , Glomerular Filtration Rate/genetics , Humans , Kidney Diseases/pathology , Male , Middle Aged , Promyelocytic Leukemia Zinc Finger Protein/genetics , RNA Helicases/genetics , Whole Genome Sequencing

Allelic Heterogeneity at the CRP Locus Identified by Whole-Genome Sequencing in Multi-ancestry Cohorts.

Raffield, Laura M; Iyengar, Apoorva K; Wang, Biqi; Gaynor, Sheila M; Spracklen, Cassandra N; Zhong, Xue; Kowalski, Madeline H; Salimi, Shabnam; Polfus, Linda M; Benjamin, Emelia J; Bis, Joshua C; Bowler, Russell; Cade, Brian E; Choi, Won Jung; Comellas, Alejandro P; Correa, Adolfo; Cruz, Pedro; Doddapaneni, Harsha; Durda, Peter; Gogarten, Stephanie M; Jain, Deepti; Kim, Ryan W; Kral, Brian G; Lange, Leslie A; Larson, Martin G; Laurie, Cecelia; Lee, Jiwon; Lee, Seonwook; Lewis, Joshua P; Metcalf, Ginger A; Mitchell, Braxton D; Momin, Zeineen; Muzny, Donna M; Pankratz, Nathan; Park, Cheol Joo; Rich, Stephen S; Rotter, Jerome I; Ryan, Kathleen; Seo, Daekwan; Tracy, Russell P; Viaud-Martinez, Karine A; Yanek, Lisa R; Zhao, Lue Ping; Lin, Xihong; Li, Bingshan; Li, Yun; Dupuis, Josée; Reiner, Alexander P; Mohlke, Karen L; Auer, Paul L.

Am J Hum Genet ; 106(1): 112-120, 2020 01 02.

Article in English | MEDLINE | ID: mdl-31883642

ABSTRACT

Whole-genome sequencing (WGS) can improve assessment of low-frequency and rare variants, particularly in non-European populations that have been underrepresented in existing genomic studies. The genetic determinants of C-reactive protein (CRP), a biomarker of chronic inflammation, have been extensively studied, with existing genome-wide association studies (GWASs) conducted in >200,000 individuals of European ancestry. In order to discover novel loci associated with CRP levels, we examined a multi-ancestry population (n = 23,279) with WGS (~38× coverage) from the Trans-Omics for Precision Medicine (TOPMed) program. We found evidence for eight distinct associations at the CRP locus, including two variants that have not been identified previously (rs11265259 and rs181704186), both of which are non-coding and more common in individuals of African ancestry (~10% and ~1% minor allele frequency, respectively, and rare or monomorphic in 1000 Genomes populations of East Asian, South Asian, and European ancestry). We show that the minor (G) allele of rs181704186 is associated with lower CRP levels and decreased transcriptional activity and protein binding in vitro, providing a plausible molecular mechanism for this African ancestry-specific signal. The individuals homozygous for rs181704186-G have a mean CRP level of 0.23 mg/L, in contrast to individuals heterozygous for rs181704186 with mean CRP of 2.97 mg/L and major allele homozygotes with mean CRP of 4.11 mg/L. This study demonstrates the utility of WGS in multi-ethnic populations to drive discovery of complex trait associations of large effect and to identify functional alleles in noncoding regulatory regions.

Subject(s)

Asian People/genetics , Black People/genetics , C-Reactive Protein/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , White People/genetics , Whole Genome Sequencing/methods , Cohort Studies , Gene Frequency , Genome-Wide Association Study , Humans , Linkage Disequilibrium

Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations.

Kowalski, Madeline H; Qian, Huijun; Hou, Ziyi; Rosen, Jonathan D; Tapia, Amanda L; Shan, Yue; Jain, Deepti; Argos, Maria; Arnett, Donna K; Avery, Christy; Barnes, Kathleen C; Becker, Lewis C; Bien, Stephanie A; Bis, Joshua C; Blangero, John; Boerwinkle, Eric; Bowden, Donald W; Buyske, Steve; Cai, Jianwen; Cho, Michael H; Choi, Seung Hoan; Choquet, Hélène; Cupples, L Adrienne; Cushman, Mary; Daya, Michelle; de Vries, Paul S; Ellinor, Patrick T; Faraday, Nauder; Fornage, Myriam; Gabriel, Stacey; Ganesh, Santhi K; Graff, Misa; Gupta, Namrata; He, Jiang; Heckbert, Susan R; Hidalgo, Bertha; Hodonsky, Chani J; Irvin, Marguerite R; Johnson, Andrew D; Jorgenson, Eric; Kaplan, Robert; Kardia, Sharon L R; Kelly, Tanika N; Kooperberg, Charles; Lasky-Su, Jessica A; Loos, Ruth J F; Lubitz, Steven A; Mathias, Rasika A; McHugh, Caitlin P; Montgomery, Courtney.

PLoS Genet ; 15(12): e1008500, 2019 12.

Article in English | MEDLINE | ID: mdl-31869403

ABSTRACT

Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are limited. In addition, these populations have more complex linkage disequilibrium structure. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with genome-wide genotyping array data. We demonstrated that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhanced gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11-34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels. Impressively, even for extremely rare variants with minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~21,600 African-ancestry and ~21,700 Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC [p = 8.8×10⁻¹⁵] in African populations, rs11549407 with lower HGB [p = 1.5×10⁻¹²] and HCT [p = 8.8×10⁻¹⁰] in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel rare variant associations not previously detected in similarly sized genome-wide studies of under-represented African and Hispanic/Latino populations.

Subject(s)

Black or African American/genetics , Hispanic or Latino/genetics , Precision Medicine/methods , Whole Genome Sequencing/methods , beta-Globins/genetics , Adult , Aged , Aged, 80 and over , Computational Biology/methods , Databases, Genetic , Female , Gene Frequency , Genetic Predisposition to Disease , Genetics, Population , Genome-Wide Association Study , Genotyping Techniques , Humans , Linkage Disequilibrium , Male , Middle Aged , United States
