Search | BVS Regional Portal

1.

Robust identification of perturbed cell types in single-cell RNA-seq data.

Nicol, Phillip B; Paulson, Danielle; Qian, Gege; Liu, X Shirley; Irizarry, Rafael; Sahu, Avinash D.

Nat Commun ; 15(1): 7610, 2024 Sep 01.

Article in English | MEDLINE | ID: mdl-39218971

ABSTRACT

Single-cell transcriptomics has emerged as a powerful tool for understanding how different cells contribute to disease progression by identifying cell types that change across diseases or conditions. However, detecting changing cell types is challenging due to individual-to-individual and cohort-to-cohort variability and naive approaches based on current computational tools lead to false positive findings. To address this, we propose a computational tool, scDist, based on a mixed-effects model that provides a statistically rigorous and computationally efficient approach for detecting transcriptomic differences. By accurately recapitulating known immune cell relationships and mitigating false positives induced by individual and cohort variation, we demonstrate that scDist outperforms current methods in both simulated and real datasets, even with limited sample sizes. Through the analysis of COVID-19 and immunotherapy datasets, scDist uncovers transcriptomic perturbations in dendritic cells, plasmacytoid dendritic cells, and FCER1G+NK cells, that provide new insights into disease mechanisms and treatment responses. As single-cell datasets continue to expand, our faster and statistically rigorous method offers a robust and versatile tool for a wide range of research and clinical applications, enabling the investigation of cellular perturbations with implications for human health and disease.

Subject(s)

COVID-19, Dendritic Cells, RNA-Seq, SARS-CoV-2, Single-Cell Analysis, Transcriptome, Single-Cell Analysis/methods, Humans, COVID-19/virology, COVID-19/genetics, RNA-Seq/methods, Dendritic Cells/metabolism, SARS-CoV-2/genetics, Computational Biology/methods, Gene Expression Profiling/methods, Natural Killer Cells/metabolism, Immunotherapy/methods, RNA Sequence Analysis/methods, Single-Cell Gene Expression Analysis

2.

Baseline Immune State and T-cell Clonal Kinetics are Associated with Durable Response to CAR-T Therapy in Large B-cell Lymphoma.

Maurer, Katie; Grabski, Isabella N; Houot, Roch; Gohil, Satyen H; Miura, Shogo; Redd, Robert A; Lyu, Haoxiang; Lu, Wesley; Arihara, Yohei; Budka, Justin; McDonough, Mikaela; Ansuinelli, Michela; Reynolds, Carol G; Jacene, Heather; Li, Shuqiang; Livak, Kenneth J; Ritz, Jerome; Miles, Brodie; Mattie, Mike; Neuberg, Donna S; Irizarry, Rafael A; Armand, Philippe; Wu, Catherine J; Jacobson, Caron A.

Blood ; 2024 Sep 06.

Article in English | MEDLINE | ID: mdl-39241199

ABSTRACT

Engineered cellular therapy with CD19-targeting chimeric antigen receptor T-cells (CAR-T) has revolutionized outcomes for patients with relapsed/refractory Large B-Cell Lymphoma (LBCL), but the cellular and molecular features associated with response remain largely unresolved. We analyzed serial peripheral blood samples ranging from day of apheresis (day -28/baseline) to 28 days after CAR-T infusion from 50 patients with LBCL treated with axicabtagene ciloleucel (axi-cel) by integrating single cell RNA and TCR sequencing (scRNA-seq/scTCR-seq), flow cytometry, and mass cytometry (CyTOF) to characterize features associated with response to CAR-T. Pretreatment patient characteristics associated with response included presence of B cells and increased lymphocyte-to-monocyte ratio (ALC/AMC). Infusion products from responders were enriched for clonally expanded, highly activated CD8+ T cells. We expanded these observations to 99 patients from the ZUMA-1 cohort and identified a subset of patients with elevated baseline B cells, 80% of whom were complete responders. We integrated B cell proportion ≥0.5% and ALC/AMC ≥1.2 into a two-factor predictive model and applied this model to the ZUMA-1 cohort. Estimated progression free survival (PFS) at 1 year in patients meeting one or both criteria was 65% versus 31% for patients meeting neither criterion. Our results suggest that patients' immunologic state at baseline affects likelihood of response to CAR-T through both modulation of the T cell apheresis product composition and promoting a more favorable circulating immune compartment prior to therapy. These baseline immunologic features, measured readily in the clinical setting prior to CAR-T, can be applied to predict response to therapy.
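The two-factor rule in this abstract can be illustrated with a minimal sketch. The two thresholds and the "one or both criteria" grouping come from the abstract. The `Baseline` class, its field names, and the function names are hypothetical, and the abstract does not say how the B cell proportion is measured, so this is an illustration only and not the authors' implementation or a clinical tool.

```python
from dataclasses import dataclass

# Thresholds as stated in the abstract.
B_CELL_THRESHOLD_PCT = 0.5  # B cell proportion, in percent
ALC_AMC_THRESHOLD = 1.2     # lymphocyte-to-monocyte ratio


@dataclass
class Baseline:
    """Hypothetical container for one patient's baseline measurements."""
    b_cell_pct: float  # B cells as a percentage (denominator is an assumption)
    alc: float         # absolute lymphocyte count
    amc: float         # absolute monocyte count


def criteria_met(patient: Baseline) -> int:
    """Count how many of the two baseline criteria the patient meets."""
    if patient.amc <= 0:
        raise ValueError("AMC must be positive to form the ALC/AMC ratio")
    ratio = patient.alc / patient.amc
    return int(patient.b_cell_pct >= B_CELL_THRESHOLD_PCT) + int(
        ratio >= ALC_AMC_THRESHOLD
    )


def meets_one_or_both(patient: Baseline) -> bool:
    """True for the group the abstract describes as meeting one or both criteria."""
    return criteria_met(patient) >= 1


# Made-up example values, not patient data.
print(meets_one_or_both(Baseline(b_cell_pct=0.8, alc=1.0, amc=0.5)))
print(meets_one_or_both(Baseline(b_cell_pct=0.1, alc=0.5, amc=1.0)))
```

The rule is a plain disjunction of two threshold checks, which is why it can be evaluated from routine baseline blood measurements.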

3.

Detection of allele-specific expression in spatial transcriptomics with spASE.

Zou, Luli S; Cable, Dylan M; Barrera-Lopez, Irving A; Zhao, Tongtong; Murray, Evan; Aryee, Martin J; Chen, Fei; Irizarry, Rafael A.

Genome Biol ; 25(1): 180, 2024 Jul 08.

Article in English | MEDLINE | ID: mdl-38978101

ABSTRACT

Spatial transcriptomics technologies permit the study of the spatial distribution of RNA at near-single-cell resolution genome-wide. However, the feasibility of studying spatial allele-specific expression (ASE) from these data remains uncharacterized. Here, we introduce spASE, a computational framework for detecting and estimating spatial ASE. To tackle the challenges presented by cell type mixtures and a low signal to noise ratio, we implement a hierarchical model involving additive mixtures of spatial smoothing splines. We apply our method to allele-resolved Visium and Slide-seq from the mouse cerebellum and hippocampus and report new insight into the landscape of spatial and cell type-specific ASE therein.

Subject(s)

Alleles, Cerebellum, Transcriptome, Animals, Mice, Cerebellum/metabolism, Hippocampus/metabolism, Gene Expression Profiling, Single-Cell Analysis

4.

DNA binding analysis of rare variants in homeodomains reveals homeodomain specificity-determining residues.

Kock, Kian Hong; Kimes, Patrick K; Gisselbrecht, Stephen S; Inukai, Sachi; Phanor, Sabrina K; Anderson, James T; Ramakrishnan, Gayatri; Lipper, Colin H; Song, Dongyuan; Kurland, Jesse V; Rogers, Julia M; Jeong, Raehoon; Blacklow, Stephen C; Irizarry, Rafael A; Bulyk, Martha L.

Nat Commun ; 15(1): 3110, 2024 Apr 10.

Article in English | MEDLINE | ID: mdl-38600112

ABSTRACT

Homeodomains (HDs) are the second largest class of DNA binding domains (DBDs) among eukaryotic sequence-specific transcription factors (TFs) and are the TF structural class with the largest number of disease-associated mutations in the Human Gene Mutation Database (HGMD). Despite numerous structural studies and large-scale analyses of HD DNA binding specificity, HD-DNA recognition is still not fully understood. Here, we analyze 92 human HD mutants, including disease-associated variants and variants of uncertain significance (VUS), for their effects on DNA binding activity. Many of the variants alter DNA binding affinity and/or specificity. Detailed biochemical analysis and structural modeling identifies 14 previously unknown specificity-determining positions, 5 of which do not contact DNA. The same missense substitution at analogous positions within different HDs often exhibits different effects on DNA binding activity. Variant effect prediction tools perform moderately well in distinguishing variants with altered DNA binding affinity, but poorly in identifying those with altered binding specificity. Our results highlight the need for biochemical assays of TF coding variants and prioritize dozens of variants for further investigations into their pathogenicity and the development of clinical diagnostics and precision therapies.

Subject(s)

Homeodomain Proteins, Transcription Factors, Humans, Homeodomain Proteins/metabolism, Transcription Factors/metabolism, DNA/metabolism, Mutation, Molecular Models

5.

Effects of KRAS Genetic Interactions on Outcomes in Cancers of the Lung, Pancreas, and Colorectum.

Grabski, Isabella N; Heymach, John V; Kehl, Kenneth L; Kopetz, Scott; Lau, Ken S; Riely, Gregory J; Schrag, Deborah; Yaeger, Rona; Irizarry, Rafael A; Haigis, Kevin M.

Cancer Epidemiol Biomarkers Prev ; 33(1): 158-169, 2024 01 09.

Article in English | MEDLINE | ID: mdl-37943166

ABSTRACT

BACKGROUND: KRAS is among the most commonly mutated oncogenes in cancer, and previous studies have shown associations with survival in many cancer contexts. Evidence from both clinical observations and mouse experiments further suggests that these associations are allele- and tissue-specific. These findings motivate using clinical data to understand gene interactions and clinical covariates within different alleles and tissues. METHODS: We analyze genomic and clinical data from the AACR Project GENIE Biopharma Collaborative for samples from lung, colorectal, and pancreatic cancers. For each of these cancer types, we report epidemiological associations for different KRAS alleles, apply principal component analysis (PCA) to discover groups of genes co-mutated with KRAS, and identify distinct clusters of patient profiles with implications for survival. RESULTS: KRAS mutations were associated with inferior survival in lung, colon, and pancreas, although the specific mutations implicated varied by disease. Tissue- and allele-specific associations with smoking, sex, age, and race were found. Tissue-specific genetic interactions with KRAS were identified by PCA, which were clustered to produce five, four, and two patient profiles in lung, colon, and pancreas. Membership in these profiles was associated with survival in all three cancer types. CONCLUSIONS: KRAS mutations have tissue- and allele-specific associations with inferior survival, clinical covariates, and genetic interactions. IMPACT: Our results provide greater insight into the tissue- and allele-specific associations with KRAS mutations and identify clusters of patients that are associated with survival and clinical attributes from combinations of genetic interactions with KRAS mutations.

Subject(s)

Lung Neoplasms, Pancreatic Neoplasms, Animals, Humans, Lung, Lung Neoplasms/genetics, Mutation, Pancreas, Pancreatic Neoplasms/genetics, Proto-Oncogene Proteins p21(ras)/genetics

6.

A consistent pattern of slide effects in Illumina DNA methylation BeadChip array data.

Hecker, Julian; Lee, Sanghun; Kachroo, Priyadarshini; Prokopenko, Dmitry; Maaser-Hecker, Anna; Lutz, Sharon M; Hahn, Georg; Irizarry, Rafael; Weiss, Scott T; DeMeo, Dawn L; Lange, Christoph.

Epigenetics ; 18(1): 2257437, 2023 12.

Article in English | MEDLINE | ID: mdl-37731367

ABSTRACT

Background: Recent studies have identified thousands of associations between DNA methylation CpGs and complex diseases/traits, emphasizing the critical role of epigenetics in understanding disease aetiology and identifying biomarkers. However, association analyses based on methylation array data are susceptible to batch/slide effects, which can lead to inflated false positive rates or reduced statistical power. Results: We use multiple DNA methylation datasets based on the popular Illumina Infinium MethylationEPIC BeadChip array to describe consistent patterns and the joint distribution of slide effects across CpGs, confirming and extending previous results. The susceptible CpGs overlap with the Illumina Infinium HumanMethylation450 BeadChip array content. Conclusions: Our findings reveal systematic patterns in slide effects. The observations provide further insights into the characteristics of these effects and can improve existing adjustment approaches.

Subject(s)

DNA Methylation, Genetic Epigenesis, Epigenomics, Multifactorial Inheritance

7.

UBR5 forms ligand-dependent complexes on chromatin to regulate nuclear hormone receptor stability.

Tsai, Jonathan M; Aguirre, Jacob D; Li, Yen-Der; Brown, Jared; Focht, Vivian; Kater, Lukas; Kempf, Georg; Sandoval, Brittany; Schmitt, Stefan; Rutter, Justine C; Galli, Pius; Sandate, Colby R; Cutler, Jevon A; Zou, Charles; Donovan, Katherine A; Lumpkin, Ryan J; Cavadini, Simone; Park, Paul M C; Sievers, Quinlan; Hatton, Charlie; Ener, Elizabeth; Regalado, Brandon D; Sperling, Micah T; Slabicki, Mikolaj; Kim, Jeonghyeon; Zon, Rebecca; Zhang, Zinan; Miller, Peter G; Belizaire, Roger; Sperling, Adam S; Fischer, Eric S; Irizarry, Rafael; Armstrong, Scott A; Thomä, Nicolas H; Ebert, Benjamin L.

Mol Cell ; 83(15): 2753-2767.e10, 2023 08 03.

Article in English | MEDLINE | ID: mdl-37478846

ABSTRACT

Nuclear hormone receptors (NRs) are ligand-binding transcription factors that are widely targeted therapeutically. Agonist binding triggers NR activation and subsequent degradation by unknown ligand-dependent ubiquitin ligase machinery. NR degradation is critical for therapeutic efficacy in malignancies that are driven by retinoic acid and estrogen receptors. Here, we demonstrate the ubiquitin ligase UBR5 drives degradation of multiple agonist-bound NRs, including the retinoic acid receptor alpha (RARA), retinoid x receptor alpha (RXRA), glucocorticoid, estrogen, liver-X, progesterone, and vitamin D receptors. We present the high-resolution cryo-EM structure of full-length human UBR5 and a negative stain model representing its interaction with RARA/RXRA. Agonist ligands induce sequential, mutually exclusive recruitment of nuclear coactivators (NCOAs) and UBR5 to chromatin to regulate transcriptional networks. Other pharmacological ligands such as selective estrogen receptor degraders (SERDs) degrade their receptors through differential recruitment of UBR5 or RNF111. We establish the UBR5 transcriptional regulatory hub as a common mediator and regulator of NR-induced transcription.

Subject(s)

Chromatin, Transcription Factors, Humans, Ligands, Chromatin/genetics, Transcription Factors/metabolism, Cytoplasmic and Nuclear Receptors/genetics, Ubiquitins, Ubiquitin-Protein Ligases/genetics

8.

Significance analysis for clustering with single-cell RNA-sequencing data.

Grabski, Isabella N; Street, Kelly; Irizarry, Rafael A.

Nat Methods ; 20(8): 1196-1202, 2023 08.

Article in English | MEDLINE | ID: mdl-37429993

ABSTRACT

Unsupervised clustering of single-cell RNA-sequencing data enables the identification of distinct cell populations. However, the most widely used clustering algorithms are heuristic and do not formally account for statistical uncertainty. We find that not addressing known sources of variability in a statistically rigorous manner can lead to overconfidence in the discovery of novel cell types. Here we extend a previous method, significance of hierarchical clustering, to propose a model-based hypothesis testing approach that incorporates significance analysis into the clustering algorithm and permits statistical evaluation of clusters as distinct cell populations. We also adapt this approach to permit statistical assessment on the clusters reported by any algorithm. Finally, we extend these approaches to account for batch structure. We benchmarked our approach against popular clustering workflows, demonstrating improved performance. To show practical utility, we applied our approach to the Human Lung Cell Atlas and an atlas of the mouse cerebellar cortex, identifying several cases of over-clustering and recapitulating experimentally validated cell type definitions.

Subject(s)

Algorithms, Benchmarking, Humans, Animals, Mice, Cluster Analysis, RNA, Single-Cell Analysis/methods, RNA Sequence Analysis/methods, Gene Expression Profiling/methods

9.

Reassessing pharmacogenomic cell sensitivity with multilevel statistical models.

Ploenzke, Matt; Irizarry, Rafael.

Biostatistics ; 24(4): 901-921, 2023 10 18.

Article in English | MEDLINE | ID: mdl-35277956

ABSTRACT

Pharmacogenomic experiments allow for the systematic testing of drugs, at varying dosage concentrations, to study how genomic markers correlate with cell sensitivity to treatment. The first step in the analysis is to quantify the response of cell lines to variable dosage concentrations of the drugs being tested. The signal to noise in these measurements can be low due to biological and experimental variability. However, the increasing availability of pharmacogenomic studies provides replicated data sets that can be leveraged to gain power. To do this, we formulate a hierarchical mixture model to estimate the drug-specific mixture distributions for estimating cell sensitivity and for assessing drug effect type as either broad or targeted effect. We use this formulation to propose a unified approach that can yield posterior probability of a cell being susceptible to a drug conditional on being a targeted effect or relative effect sizes conditioned on the cell being broad. We demonstrate the usefulness of our approach via case studies. First, we assess pairwise agreements for cell lines/drugs within the intersection of two data sets and confirm the moderate pairwise agreement between many publicly available pharmacogenomic data sets. We then present an analysis that identifies sensitivity to the drug crizotinib for cells harboring EML4-ALK or NPM1-ALK gene fusions, as well as significantly down-regulated cell-matrix pathways associated with crizotinib sensitivity.

Subject(s)

Non-Small-Cell Lung Carcinoma, Lung Neoplasms, Humans, Crizotinib/therapeutic use, Non-Small-Cell Lung Carcinoma/drug therapy, Non-Small-Cell Lung Carcinoma/genetics, Lung Neoplasms/genetics, Pharmacogenetics, Statistical Models, Receptor Protein-Tyrosine Kinases/genetics, Receptor Protein-Tyrosine Kinases/therapeutic use

10.

Discovery of antibodies and cognate surface targets for ovarian cancer by surface profiling.

Schröfelbauer, Bärbel; Kimes, Patrick K; Hauke, Paige; Reid, Charlotte E; Shao, Kevin; Hill, Sarah J; Irizarry, Rafael; Hahn, William C.

Proc Natl Acad Sci U S A ; 120(1): e2206751120, 2023 01 03.

Article in English | MEDLINE | ID: mdl-36574667

ABSTRACT

Although antibodies targeting specific tumor-expressed antigens are the standard of care for some cancers, the identification of cancer-specific targets amenable to antibody binding has remained a bottleneck in development of new therapeutics. To overcome this challenge, we developed a high-throughput platform that allows for the unbiased, simultaneous discovery of antibodies and targets based on phenotypic binding profiles. Applying this platform to ovarian cancer, we identified a wide diversity of cancer targets including receptor tyrosine kinases, adhesion and migration proteins, proteases and proteins regulating angiogenesis in a single round of screening using genomics, flow cytometry, and mass spectrometry. In particular, we identified BCAM as a promising candidate for targeted therapy in high-grade serous ovarian cancers. More generally, this approach provides a rapid and flexible framework to identify cancer targets and antibodies.

Subject(s)

Ovarian Neoplasms, Peptide Library, Humans, Female, Tumor Cell Line, Antibodies, Ovarian Neoplasms/genetics, Neoplasm Antigens

11.

Cell type-specific inference of differential expression in spatial transcriptomics.

Cable, Dylan M; Murray, Evan; Shanmugam, Vignesh; Zhang, Simon; Zou, Luli S; Diao, Michael; Chen, Haiqi; Macosko, Evan Z; Irizarry, Rafael A; Chen, Fei.

Nat Methods ; 19(9): 1076-1087, 2022 09.

Article in English | MEDLINE | ID: mdl-36050488

ABSTRACT

A central problem in spatial transcriptomics is detecting differentially expressed (DE) genes within cell types across tissue context. Challenges to learning DE include changing cell type composition across space and measurement pixels detecting transcripts from multiple cell types. Here, we introduce a statistical method, cell type-specific inference of differential expression (C-SIDE), that identifies cell type-specific DE in spatial transcriptomics, accounting for localization of other cell types. We model gene expression as an additive mixture across cell types of log-linear cell type-specific expression functions. C-SIDE's framework applies to many contexts: DE due to pathology, anatomical regions, cell-to-cell interactions and cellular microenvironment. Furthermore, C-SIDE enables statistical inference across multiple replicates. Simulations and validation experiments on Slide-seq, MERFISH and Visium datasets demonstrate that C-SIDE accurately identifies DE with valid uncertainty quantification. Last, we apply C-SIDE to identify plaque-dependent immune activity in Alzheimer's disease and cellular interactions between tumor and immune cells. We distribute C-SIDE within the R package https://github.com/dmcable/spacexr.
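The phrase "additive mixture across cell types of log-linear cell type-specific expression functions" can be made concrete with a small sketch. It assumes a single covariate `x` per pixel and a mean of the form sum over cell types of proportion times exp(intercept + slope * x). The function name, argument names, and numbers are hypothetical, and nothing here reproduces the spacexr API or the full C-SIDE likelihood.

```python
import math


def expected_expression(proportions, intercepts, slopes, x):
    """Mean expression of one gene at one pixel under an additive mixture.

    proportions: cell type proportions at the pixel
    intercepts, slopes: per-cell-type log-linear coefficients (hypothetical)
    x: covariate value at the pixel, e.g. a region indicator
    """
    if not (len(proportions) == len(intercepts) == len(slopes)):
        raise ValueError("one proportion, intercept and slope per cell type")
    return sum(
        b * math.exp(a + d * x)
        for b, a, d in zip(proportions, intercepts, slopes)
    )


# Two cell types; only the first responds to the covariate (slope 0.5).
props, alphas, deltas = [0.7, 0.3], [0.0, 1.0], [0.5, 0.0]
print(expected_expression(props, alphas, deltas, x=0.0))
print(expected_expression(props, alphas, deltas, x=1.0))
```

Because the mixture is additive on the expression scale while each component is log-linear, a nonzero slope changes only the contribution of its own cell type. This is what lets differential expression be attributed to a specific cell type within a mixed pixel.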

Subject(s)

Gene Expression Profiling, Transcriptome, Gene Expression Profiling/methods

12.

Differential richness inference for 16S rRNA marker gene surveys.

Kumar, M Senthil; Slud, Eric V; Hehnly, Christine; Zhang, Lijun; Broach, James; Irizarry, Rafael A; Schiff, Steven J; Paulson, Joseph N.

Genome Biol ; 23(1): 166, 2022 08 01.

Article in English | MEDLINE | ID: mdl-35915508

ABSTRACT

BACKGROUND: Individual and environmental health outcomes are frequently linked to changes in the diversity of associated microbial communities. Thus, deriving health indicators based on microbiome diversity measures is essential. While microbiome data generated using high-throughput 16S rRNA marker gene surveys are appealing for this purpose, 16S surveys also generate a plethora of spurious microbial taxa. RESULTS: When this artificial inflation in the observed number of taxa is ignored, we find that changes in the abundance of detected taxa confound current methods for inferring differences in richness. Experimental evidence, theory-guided exploratory data analyses, and existing literature support the conclusion that most sub-genus discoveries are spurious artifacts of clustering 16S sequencing reads. We proceed to model a 16S survey's systematic patterns of sub-genus taxa generation as a function of genus abundance to derive a robust control for false taxa accumulation. These controls unlock classical regression approaches for highly flexible differential richness inference at various levels of the surveyed microbial assemblage: from sample groups to specific taxa collections. The proposed methodology for differential richness inference is available through an R package, Prokounter. CONCLUSIONS: False species discoveries bias richness estimation and confound differential richness inference. In the case of 16S microbiome surveys, supporting evidence indicates that most sub-genus taxa are spurious. Based on this finding, a flexible method is proposed and is shown to overcome the confounding problem noted with current approaches for differential richness inference. Package availability: https://github.com/mskb01/prokounter.

Subject(s)

Bacteria, Microbiota, Artifacts, Bacteria/genetics, Cluster Analysis, Microbiota/genetics, 16S Ribosomal RNA/genetics

13.

mirTarRnaSeq: An R/Bioconductor Statistical Package for miRNA-mRNA Target Identification and Interaction Analysis.

Movassagh, Mercedeh; Morton, Sarah U; Hehnly, Christine; Smith, Jasmine; Doan, Trang T; Irizarry, Rafael; Broach, James R; Schiff, Steven J; Bailey, Jeffrey A; Paulson, Joseph N.

BMC Genomics ; 23(1): 439, 2022 Jun 13.

Article in English | MEDLINE | ID: mdl-35698050

ABSTRACT

We introduce mirTarRnaSeq, an R/Bioconductor package for quantitative assessment of miRNA-mRNA relationships within sample cohorts. mirTarRnaSeq is a statistical package to explore predicted or pre-hypothesized miRNA-mRNA relationships following target prediction. We present two use cases applying mirTarRnaSeq. First, to identify miRNA targets, we examined EBV miRNAs for interaction with human and virus transcriptomes of stomach adenocarcinoma. This revealed enrichment of mRNA targets highly expressed in CD105+ endothelial cells, monocytes, CD4+ T cells, NK cells, CD19+ B cells, and CD34 cells. Next, to investigate miRNA-mRNA relationships in SARS-CoV-2 (COVID-19) infection across time, we used paired miRNA and RNA sequenced datasets of SARS-CoV-2 infected lung epithelial cells across three time points (4, 12, and 24 hours post-infection). mirTarRnaSeq identified evidence for human miRNAs targeting cytokine signaling and neutrophil regulation immune pathways from 4 to 24 hours after SARS-CoV-2 infection. Confirming the clinical relevance of these predictions, three of the immune specific mRNA-miRNA relationships identified in human lung epithelial cells after SARS-CoV-2 infection were also observed to be differentially expressed in blood from patients with COVID-19. Overall, mirTarRnaSeq is a robust tool that can address a wide range of biological questions, providing improved prediction of miRNA-mRNA interactions.

Subject(s)

COVID-19, MicroRNAs, COVID-19/genetics, Endothelial Cells, Humans, MicroRNAs/genetics, MicroRNAs/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism, SARS-CoV-2

14.

A probabilistic gene expression barcode for annotation of cell types from single-cell RNA-seq data.

Grabski, Isabella N; Irizarry, Rafael A.

Biostatistics ; 23(4): 1150-1164, 2022 10 14.

Article in English | MEDLINE | ID: mdl-35770795

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) quantifies gene expression for individual cells in a sample, which allows distinct cell-type populations to be identified and characterized. An important step in many scRNA-seq analysis pipelines is the annotation of cells into known cell types. While this can be achieved using experimental techniques, such as fluorescence-activated cell sorting, these approaches are impractical for large numbers of cells. This motivates the development of data-driven cell-type annotation methods. We find limitations with current approaches due to the reliance on known marker genes or from overfitting because of systematic differences, or batch effects, between studies. Here, we present a statistical approach that leverages public data sets to combine information across thousands of genes, uses a latent variable model to define cell-type-specific barcodes and account for batch effect variation, and probabilistically annotates cell-type identity from a reference of known cell types. The barcoding approach also provides a new way to discover marker genes. Using a range of data sets, including those generated to represent imperfect real-world reference data, we demonstrate that our approach substantially outperforms current reference-based methods, particularly when predicting across studies.

Subject(s)

Gene Expression Profiling, Single-Cell Analysis, Gene Expression, Gene Expression Profiling/methods, Humans, RNA-Seq, RNA Sequence Analysis/methods, Software

15.

A Flexible Statistical Framework for Estimating Excess Mortality.

Acosta, Rolando J; Irizarry, Rafael A.

Epidemiology ; 33(3): 346-353, 2022 05 01.

Article in English | MEDLINE | ID: mdl-35383642

ABSTRACT

Quantifying the impact of natural disasters or epidemics is critical for guiding policy decisions and interventions. When the effects of an event are long-lasting and difficult to detect in the short term, the accumulated effects can be devastating. Mortality is one of the most reliably measured health outcomes, partly due to its unambiguous definition. As a result, excess mortality estimates are an increasingly effective approach for quantifying the effect of an event. However, the fact that indirect effects are often characterized by small, but enduring, increases in mortality rates presents a statistical challenge. This is compounded by sources of variability introduced by demographic changes, secular trends, seasonal and day of the week effects, and natural variation. Here, we present a model that accounts for these sources of variability and characterizes concerning increases in mortality rates with smooth functions of time that provide statistical power. The model permits discontinuities in the smooth functions to model sudden increases due to direct effects. We implement a flexible estimation approach that permits both surveillance of concerning increases in mortality rates and careful characterization of the effect of a past event. We demonstrate our tools' utility by estimating excess mortality after hurricanes in the United States and Puerto Rico. We use Hurricane Maria as a case study to show appealing properties that are unique to our method compared with current approaches. Finally, we show the flexibility of our approach by detecting and quantifying the 2014 Chikungunya outbreak in Puerto Rico and the COVID-19 pandemic in the United States. We make our tools available through the excessmort R package available from https://cran.r-project.org/web/packages/excessmort/.
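The basic quantity behind this abstract, observed deaths minus expected deaths, can be sketched in a few lines. The sketch assumes the expected count for each period is simply the average of the same period in reference years. That is far cruder than the model the abstract describes, which also handles demographic change, secular trends, seasonality, and day-of-week effects. The function names and counts are hypothetical, and this is not the excessmort API.

```python
def expected_from_baseline(baseline_years):
    """Average each period's count across reference years (a crude baseline)."""
    n_years = len(baseline_years)
    return [sum(period) / n_years for period in zip(*baseline_years)]


def excess_deaths(observed, expected):
    """Per-period excess: observed minus expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must cover the same periods")
    return [o - e for o, e in zip(observed, expected)]


# Made-up weekly counts for illustration only.
baseline = [[100, 110, 120], [102, 108, 118]]
observed = [105, 150, 119]

expected = expected_from_baseline(baseline)
excess = excess_deaths(observed, expected)
print(excess, sum(excess))
```

Summing the per-period excess gives a cumulative estimate. That cumulative figure is what reveals small but enduring increases that are hard to see in any single period.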

Subject(s)

COVID-19, Cyclonic Storms, Humans, Pandemics, Puerto Rico/epidemiology, United States/epidemiology

16.

Effectiveness estimates of three COVID-19 vaccines based on observational data from Puerto Rico.

Robles-Fontán, Mónica M; Nieves, Elvis G; Cardona-Gerena, Iris; Irizarry, Rafael A.

Lancet Reg Health Am ; 9: 100212, 2022 May.

Article in English | MEDLINE | ID: mdl-35229081

ABSTRACT

BACKGROUND: On July 15, 2021, with 58% of the population fully vaccinated, the start of a COVID-19 surge was observed in Puerto Rico. On July 22, 2021, the government of Puerto Rico started imposing a series of strict vaccine mandates. Two months later, over 70% of the population was vaccinated, more than in any US state, and laboratory-confirmed SARS-CoV-2 had dropped substantially. The decision to impose mandates, as well as current Department of Health recommendations related to boosters, were guided by the data and the effectiveness estimates presented here. METHODS: Between December 15, 2020, when the vaccination process began in Puerto Rico, and October 15, 2021, 2,276,966 individuals were fully vaccinated against COVID-19. During this period 112,726 laboratory-confirmed SARS-CoV-2 infections were reported. These data permitted us to quantify the outcomes of the immunization campaign and to compare effectiveness of the mRNA-1273 (Moderna), BNT162b2 (Pfizer), and Ad26.COV2.S (J&J) vaccines. We obtained vaccination status, SARS-CoV-2 test results, and COVID-19 hospitalizations and deaths, from the Department of Health. We fit statistical models that adjusted for time-varying incidence rates and age group to estimate vaccine effectiveness, since the time of vaccination, against lab-confirmed SARS-CoV-2 infection, and COVID-19 hospitalization and death. RESULTS: Two weeks after final dose, the mRNA-1273, BNT162b2, and Ad26.COV2.S vaccines had an effectiveness of 90% (95% CI: 88-91), 87% (85-88), and 64% (58-69), respectively. After five months, effectiveness waned to about 70%, 50%, and 40%, respectively. We found no evidence that effectiveness was different after the Delta variant became dominant. For those infected, the vaccines provided further protection against COVID-19 hospitalization and deaths across all age groups, and this conditional effect did not wane in time.
INTERPRETATION: The mRNA-1273 and BNT162b2 vaccines were highly effective across all age groups. They were still effective after five months although the protection against SARS-CoV-2 infection waned. The Ad26.COV2.S vaccine was effective but to a lesser degree compared to the mRNA vaccines. Although, conditional on infection, protection against adverse outcomes did not wane, the waning in effectiveness resulted in a decreased protection against serious COVID-19 outcomes across time. FUNDING: RAI's work was partly funded by NIH Grant R35GM131802.
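For readers unfamiliar with the effectiveness percentages quoted above, the standard definition is one minus the ratio of incidence rates in vaccinated and unvaccinated groups. The sketch below computes that unadjusted quantity from hypothetical counts. It does not reproduce the adjusted models described in the abstract, which account for time-varying incidence and age group, and the function and argument names are illustrative.

```python
def vaccine_effectiveness(cases_vax, person_days_vax,
                          cases_unvax, person_days_unvax):
    """Unadjusted effectiveness: 1 - (rate in vaccinated / rate in unvaccinated)."""
    if person_days_vax <= 0 or person_days_unvax <= 0:
        raise ValueError("person-time must be positive")
    if cases_unvax <= 0:
        raise ValueError("need at least one unvaccinated case for a rate ratio")
    rate_vax = cases_vax / person_days_vax
    rate_unvax = cases_unvax / person_days_unvax
    return 1.0 - rate_vax / rate_unvax


# Hypothetical counts, not data from the study.
ve = vaccine_effectiveness(
    cases_vax=10, person_days_vax=100_000,
    cases_unvax=100, person_days_unvax=100_000,
)
print(f"{ve:.0%}")
```

A crude ratio like this can mislead when incidence changes over the vaccination campaign or differs by age, which is the reason the abstract gives for adjusting for both.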

17.

Capturing discrete latent structures: choose LDs over PCs.

Alexander, Theresa A; Irizarry, Rafael A; Bravo, Héctor Corrada.

Biostatistics ; 24(1): 1-16, 2022 Dec 12.

Article in English | MEDLINE | ID: mdl-34467372

ABSTRACT

High-dimensional biological data collection across heterogeneous groups of samples has become increasingly common, creating high demand for dimensionality reduction techniques that capture the underlying structure of the data. Discovering low-dimensional embeddings that describe the separation of any underlying discrete latent structure in data is an important motivation for applying these techniques, since these latent classes can represent important sources of unwanted variability, such as batch effects, or interesting sources of signal, such as unknown cell types. The features that define this discrete latent structure are often hard to identify in high-dimensional data. Principal component analysis (PCA) is one of the most widely used methods as an unsupervised step for dimensionality reduction. This reduction technique finds linear transformations of the data which explain total variance. When the goal is detecting discrete structure, PCA is applied with the assumption that classes will be separated in directions of maximum variance. However, PCA will fail to accurately find discrete latent structure if this assumption does not hold. Visualization techniques, such as t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP), attempt to mitigate these problems with PCA by creating a low-dimensional space where similar objects are modeled by nearby points in the low-dimensional embedding and dissimilar objects are modeled by distant points with high probability. However, since t-SNE and UMAP are computationally expensive, a PCA reduction is often done before applying them, which makes them sensitive to PCA's shortcomings. Also, t-SNE is limited to only two or three dimensions as a visualization tool, which may not be adequate for retaining discriminatory information. The linear transformations of PCA are preferable to the non-linear transformations provided by methods like t-SNE and UMAP when interpretable feature weights are needed. Here, we propose iterative discriminant analysis (iDA), a dimensionality reduction technique designed to mitigate these limitations. iDA produces an embedding that carries discriminatory information which optimally separates latent clusters using linear transformations that permit post hoc analysis to determine features that define these latent structures.
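
The abstract's point that PCA can miss discrete structure when classes do not separate along directions of maximum variance can be illustrated with a small sketch. This is not the iDA algorithm: it is plain two-class Fisher linear discriminant analysis applied with known labels, whereas iDA has to discover the latent classes. The simulated data, the function names (`first_pc`, `fisher_direction`, `separation`), and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np

def first_pc(X):
    """Unit vector along the direction of maximum total variance."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

def fisher_direction(X, y):
    """Two-class Fisher discriminant direction (labels assumed known)."""
    X0, X1 = X[y == 0], X[y == 1]
    # Pooled within-class scatter matrix.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    w = np.linalg.solve(Sw, X1.mean(axis=0) - X0.mean(axis=0))
    return w / np.linalg.norm(w)

def separation(X, y, w):
    """Gap between projected class means, in pooled within-class SD units."""
    z = X @ w
    z0, z1 = z[y == 0], z[y == 1]
    pooled_sd = np.sqrt((z0.var(ddof=1) + z1.var(ddof=1)) / 2)
    return abs(z1.mean() - z0.mean()) / pooled_sd

# Hypothetical data: feature 0 has large variance but no class signal;
# feature 1 has small variance and carries the class shift.
rng = np.random.default_rng(0)
n = 500
y = np.repeat([0, 1], n)
X = np.column_stack([
    rng.normal(0.0, 10.0, 2 * n),
    rng.normal(0.0, 1.0, 2 * n) + 4.0 * y,
])

pc1 = first_pc(X)
ld1 = fisher_direction(X, y)
sep_pc = separation(X, y, pc1)
sep_ld = separation(X, y, ld1)
print(f"PC1 separation: {sep_pc:.2f}, LD1 separation: {sep_ld:.2f}")
```

In this toy setting the first principal component aligns with the noisy high-variance feature and barely separates the classes, while the discriminant direction aligns with the low-variance feature that carries the class difference.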

Subject(s)

Algorithms , Humans , Principal Component Analysis

18.

All-cause excess mortality across 90 municipalities in Gujarat, India, during the COVID-19 pandemic (March 2020-April 2021).

Acosta, Rolando J; Patnaik, Biraj; Buckee, Caroline; Kiang, Mathew V; Irizarry, Rafael A; Balsari, Satchit; Mahmud, Ayesha.

PLOS Glob Public Health ; 2(8): e0000824, 2022.

Article in English | MEDLINE | ID: mdl-36962751

ABSTRACT

Official COVID-19 mortality statistics are strongly influenced by local diagnostic capacity, strength of the healthcare and vital registration systems, and death certification criteria and capacity, often resulting in significant undercounting of COVID-19 attributable deaths. Excess mortality, which is defined as the increase in observed death counts compared to a baseline expectation, provides an alternate measure of the mortality shock, both direct and indirect, of the COVID-19 pandemic. Here, we use data from civil death registers from a convenience sample of 90 (of 162) municipalities across the state of Gujarat, India, to estimate the impact of the COVID-19 pandemic on all-cause mortality. Using a model fit to weekly data from January 2019 to February 2020, we estimated excess mortality over the course of the pandemic from March 2020 to April 2021. During this period, the official government data reported 10,098 deaths attributable to COVID-19 for the entire state of Gujarat. We estimated 21,300 [95% CI: 20,700, 22,000] excess deaths across these 90 municipalities in this period, representing a 44% [95% CI: 43%, 45%] increase over the expected baseline. The sharpest increase in deaths in our sample was observed in late April 2021, with an estimated 678% [95% CI: 649%, 707%] increase in mortality from expected counts. The 40 to 65 age group experienced the highest increase in mortality relative to the other age groups. We found substantial increases in mortality for males and females. Our excess mortality estimate for these 90 municipalities, representing at least about 8% of the population based on the 2011 census, exceeds the official COVID-19 death count for the entire state of Gujarat, even before the delta wave of the pandemic in India peaked in May 2021. Prior studies have concluded that true pandemic-related mortality in India greatly exceeds official counts. This study, using data directly from the first point of official death registration data recording, provides incontrovertible evidence of the high excess mortality in Gujarat from March 2020 to April 2021.
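
The definition of excess mortality quoted in the abstract (observed deaths minus a baseline expectation, also reported as a percent increase over that baseline) can be written out as a short sketch. The function names are hypothetical, and the implied baseline computed below is only back-of-envelope arithmetic on the rounded figures in the abstract (21,300 excess deaths, 44% increase); it is not a value reported by the study, whose baseline comes from a model fit to weekly data.

```python
def excess_deaths(observed: float, expected: float) -> float:
    """Excess mortality: observed deaths minus the baseline expectation."""
    return observed - expected

def percent_increase(observed: float, expected: float) -> float:
    """Excess deaths expressed as a percentage of the expected baseline."""
    return 100.0 * excess_deaths(observed, expected) / expected

def implied_baseline(excess: float, pct_increase: float) -> float:
    """Back out the expected count from an excess count and its percent increase."""
    return excess / (pct_increase / 100.0)

# Rounded figures taken from the abstract.
reported_excess = 21_300
reported_pct = 44.0

# Assumption: treat the rounded figures as exact to recover an approximate baseline.
expected = implied_baseline(reported_excess, reported_pct)
observed = expected + reported_excess
print(round(expected), round(observed))
```

Under these assumptions the expected baseline works out to roughly 48,400 deaths for the 90 municipalities over the study period, which gives a sense of scale for the 44% figure.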

19.

Robust decomposition of cell type mixtures in spatial transcriptomics.

Cable, Dylan M; Murray, Evan; Zou, Luli S; Goeva, Aleksandrina; Macosko, Evan Z; Chen, Fei; Irizarry, Rafael A.

Nat Biotechnol ; 40(4): 517-526, 2022 Apr.

Article in English | MEDLINE | ID: mdl-33603203

ABSTRACT

A limitation of spatial transcriptomics technologies is that individual measurements may contain contributions from multiple cells, hindering the discovery of cell-type-specific spatial patterns of localization and expression. Here, we develop robust cell type decomposition (RCTD), a computational method that leverages cell type profiles learned from single-cell RNA-seq to decompose cell type mixtures while correcting for differences across sequencing technologies. We demonstrate the ability of RCTD to detect mixtures and identify cell types on simulated datasets. Furthermore, RCTD accurately reproduces known cell type and subtype localization patterns in Slide-seq and Visium datasets of the mouse brain. Finally, we show how RCTD's recovery of cell type localization enables the discovery of genes within a cell type whose expression depends on spatial environment. Spatial mapping of cell types with RCTD enables the spatial components of cellular identity to be defined, uncovering new principles of cellular organization in biological tissue. RCTD is publicly available as an open-source R package at https://github.com/dmcable/RCTD.

Subject(s)

Single-Cell Analysis , Transcriptome , Animals , Mice , RNA Sequence Analysis , Software , Transcriptome/genetics , Exome Sequencing

20.

Stem-like intestinal Th17 cells give rise to pathogenic effector T cells during autoimmunity.

Schnell, Alexandra; Huang, Linglin; Singer, Meromit; Singaraju, Anvita; Barilla, Rocky M; Regan, Brianna M L; Bollhagen, Alina; Thakore, Pratiksha I; Dionne, Danielle; Delorey, Toni M; Pawlak, Mathias; Meyer Zu Horste, Gerd; Rozenblatt-Rosen, Orit; Irizarry, Rafael A; Regev, Aviv; Kuchroo, Vijay K.

Cell ; 184(26): 6281-6298.e23, 2021 Dec 22.

Article in English | MEDLINE | ID: mdl-34875227

ABSTRACT

While intestinal Th17 cells are critical for maintaining tissue homeostasis, recent studies have implicated their roles in the development of extra-intestinal autoimmune diseases including multiple sclerosis. However, the mechanisms by which tissue Th17 cells mediate these dichotomous functions remain unknown. Here, we characterized the heterogeneity, plasticity, and migratory phenotypes of tissue Th17 cells in vivo by combined fate mapping with profiling of the transcriptomes and TCR clonotypes of over 84,000 Th17 cells at homeostasis and during CNS autoimmune inflammation. Inter- and intra-organ single-cell analyses revealed a homeostatic, stem-like TCF1+ IL-17+ SLAMF6+ population that traffics to the intestine where it is maintained by the microbiota, providing a ready reservoir for the IL-23-driven generation of encephalitogenic GM-CSF+ IFN-γ+ CXCR6+ T cells. Our study defines a direct in vivo relationship between IL-17+ non-pathogenic and GM-CSF+ and IFN-γ+ pathogenic Th17 populations and provides a mechanism by which homeostatic intestinal Th17 cells direct extra-intestinal autoimmune disease.

Subject(s)

Autoimmunity , Intestines/immunology , Stem Cells/metabolism , Th17 Cells/immunology , Animals , Cell Movement , Clone Cells , Experimental Autoimmune Encephalomyelitis/immunology , Granulocyte-Macrophage Colony-Stimulating Factor/metabolism , Homeostasis , Humans , Interferon-gamma/metabolism , Interleukin-17/metabolism , C57BL Inbred Mice , Organ Specificity , RNA/metabolism , RNA-Seq , T-Cell Antigen Receptors/metabolism , CXCR6 Receptors/metabolism , Interleukin Receptors/metabolism , Reproducibility of Results , Signaling Lymphocytic Activation Molecule Family/metabolism , Single-Cell Analysis , Spleen/metabolism
