Results 1 - 20 of 43
1.
Cancer Res ; 84(9): 1517-1533, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38587552

ABSTRACT

Pancreatic ductal adenocarcinoma (PDAC) is an aggressive malignancy characterized by an immunosuppressive tumor microenvironment enriched with cancer-associated fibroblasts (CAF). This study used a convergence approach to identify tumor cell and CAF interactions through the integration of single-cell data from human tumors with human organoid coculture experiments. Analysis of a comprehensive atlas of PDAC single-cell RNA sequencing data indicated that CAF density is associated with increased inflammation and epithelial-mesenchymal transition (EMT) in epithelial cells. Transfer learning using transcriptional data from patient-derived organoid and CAF cocultures provided in silico validation of CAF induction of inflammatory and EMT epithelial cell states. Further experimental validation in cocultures demonstrated integrin beta 1 (ITGB1) and vascular endothelial growth factor A (VEGFA) interactions with neuropilin-1 mediating CAF-epithelial cell cross-talk. Together, this study introduces transfer learning from human single-cell data to organoid coculture analyses for experimental validation of discoveries of cell-cell cross-talk and identifies fibroblast-mediated regulation of EMT and inflammation. SIGNIFICANCE: Adaptation of transfer learning to relate human single-cell RNA sequencing data to organoid-CAF cocultures facilitates discovery of human pancreatic cancer intercellular interactions and uncovers cross-talk between CAFs and tumor cells through VEGFA and ITGB1.


Subject(s)
Cancer-Associated Fibroblasts , Carcinoma, Pancreatic Ductal , Coculture Techniques , Epithelial-Mesenchymal Transition , Inflammation , Integrin beta1 , Pancreatic Neoplasms , Single-Cell Analysis , Tumor Microenvironment , Humans , Carcinoma, Pancreatic Ductal/pathology , Carcinoma, Pancreatic Ductal/metabolism , Carcinoma, Pancreatic Ductal/genetics , Cancer-Associated Fibroblasts/metabolism , Cancer-Associated Fibroblasts/pathology , Pancreatic Neoplasms/pathology , Pancreatic Neoplasms/metabolism , Pancreatic Neoplasms/genetics , Inflammation/pathology , Inflammation/metabolism , Integrin beta1/metabolism , Integrin beta1/genetics , Organoids/pathology , Organoids/metabolism , Vascular Endothelial Growth Factor A/metabolism , Vascular Endothelial Growth Factor A/genetics , Neuropilin-1/metabolism , Neuropilin-1/genetics , Gene Expression Regulation, Neoplastic , Cell Line, Tumor , Cell Communication
2.
bioRxiv ; 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38464021

ABSTRACT

The rising quality and amount of multi-omic data across biomedical science demand that we build innovative solutions to harness their collective discovery potential. From publicly available repositories, we have assembled and curated a compendium of gene-level transcriptomic data focused on mammalian excitatory neurogenesis in the neocortex. This collection is open for exploration by both computational and cell biologists at nemoanalytics.org, and this report forms a demonstration of its utility. Applying our novel structured joint decomposition approach to mouse, macaque and human data from the collection, we define transcriptome dynamics that are conserved across mammalian excitatory neurogenesis and which map onto the genetics of human brain structure and disease. Leveraging additional data within NeMO Analytics via projection methods, we chart the dynamics of these fundamental molecular elements of neurogenesis across developmental time and space and into postnatal life. Reversing the direction of our investigation, we use transcriptomic data from laminar-specific dissection of adult human neocortex to define molecular signatures specific to excitatory neuronal cell types resident in individual layers of the mature neocortex, and trace their emergence across development. We show that while many lineage-defining transcription factors are most highly expressed at early fetal ages, the laminar neuronal identities which they drive take years to decades to reach full maturity. Finally, we interrogated data from stem cell-derived cerebral organoid systems, demonstrating that many fundamental elements of in vivo development are recapitulated with high fidelity in vitro, while specific transcriptomic programs in neuronal maturation are absent.
We propose these analyses as specific applications of the general approach of combining joint decomposition with large curated collections of analysis-ready multi-omics data matrices focused on particular cell and disease contexts. Importantly, these open environments are accessible to, and must be fueled with emerging data by, cell biologists with and without coding expertise.

3.
bioRxiv ; 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37961182

ABSTRACT

The mammalian neocortex differs vastly in size and complexity between mammalian species, yet the mechanisms that lead to an increase in brain size during evolution are not known. We show here that two transcription factors coordinate gene expression programs in progenitor cells of the neocortex to regulate their proliferative capacity and neuronal output in order to determine brain size. Comparative studies in mice, ferrets and macaques demonstrate an evolutionarily conserved function for these transcription factors to regulate progenitor behaviors across the mammalian clade. Strikingly, the two transcriptional regulators control the expression of large numbers of genes linked to microcephaly, suggesting that transcriptional deregulation is an important determinant of the molecular pathogenesis of microcephaly, which is consistent with the finding that genetic manipulation of the two transcription factors leads to severe microcephaly. Summary: The neocortex varies in size and complexity among mammals due to the tremendous variability in the number and diversity of neuronal subtypes across species [1,2]. The increased cellular diversity is paralleled by the expansion of the pool of neocortical progenitors [2-5] and the emergence of indirect neurogenesis [6] during brain evolution. The molecular pathways that control these biological processes and are disrupted in neurological and psychiatric disorders remain largely unknown. Here we show that the transcription factors BRN1 (POU3F3) and BRN2 (POU3F2) act as master regulators of the transcriptional programs in progenitors linked to neuronal specification and neocortex expansion. Using genetically modified lissencephalic and gyrencephalic animals, we found that BRN1/2 establish transcriptional programs in neocortical progenitors that control their proliferative capacity and the switch from direct to indirect neurogenesis.
Functional studies in genetically modified mice and ferrets show that BRN1/2 act in concert with NOTCH and primary microcephaly genes to regulate progenitor behavior. Analysis of transcriptomics data from genetically modified macaques provides evidence that these molecular pathways are conserved in non-human primates. Our findings thus establish a mechanistic link between BRN1/2 and genes linked to microcephaly and demonstrate that BRN1/2 are central regulators of gene expression programs in neocortical progenitors critical to determine brain size during evolution.

4.
Nat Protoc ; 18(12): 3690-3731, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37989764

ABSTRACT

Non-negative matrix factorization (NMF) is an unsupervised learning method well suited to high-throughput biology. However, inferring biological processes from an NMF result still requires additional post hoc statistics and annotation for interpretation of learned features. Here, we introduce a suite of computational tools that implement NMF and provide methods for accurate and clear biological interpretation and analysis. A generalized discussion of NMF covering its benefits, limitations and open questions is followed by four procedures for the Bayesian NMF algorithm Coordinated Gene Activity across Pattern Subsets (CoGAPS). Each procedure will demonstrate NMF analysis to quantify cell state transitions in a public domain single-cell RNA-sequencing dataset. The first demonstrates PyCoGAPS, our new Python implementation that enhances runtime for large datasets, and the second allows its deployment in Docker. The third procedure steps through the same single-cell NMF analysis using our R CoGAPS interface. The fourth introduces a beginner-friendly CoGAPS platform using GenePattern Notebook, aimed at users with a working conceptual knowledge of data analysis but without a basic proficiency in the R or Python programming language. We also constructed a user-facing website to serve as a central repository for information and instructional materials about CoGAPS and its application programming interfaces. Setting up the packages and conducting a test run is expected to take around 15 min, with an additional 30 min to conduct analyses on a precomputed result. The expected runtime on the user's desired dataset can vary from hours to days depending on factors such as dataset size or input parameters.


Subject(s)
Algorithms , Programming Languages , Bayes Theorem , Single-Cell Analysis
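
As a rough illustration of the factorization idea in the abstract above, the following minimal sketch factors a small non-negative matrix V into W and H using the classic multiplicative-update rule. This is an assumption made for illustration only: it is not CoGAPS, which the abstract describes as a Bayesian NMF algorithm, and the function names (`nmf`, `frobenius_error`) and the toy matrix are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH (the reconstruction error)."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF (illustrative, NOT the CoGAPS sampler)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Random strictly positive starting point; the seed makes it reproducible.
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


# Toy "genes x cells" matrix with two obvious blocks (made-up values).
toy = [[5, 5, 0, 0],
       [4, 4, 0, 0],
       [0, 0, 3, 3],
       [0, 0, 2, 2]]
w, h = nmf(toy, rank=2)
print(frobenius_error(toy, w, h))
```

In this sketch the rows of H play the role of learned patterns across cells and the columns of W the corresponding gene weights; interpreting them biologically would still need the post hoc statistics the abstract mentions.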
5.
Genome Biol ; 24(1): 246, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37885016

ABSTRACT

BACKGROUND: RNA velocity analysis of single cells offers the potential to predict temporal dynamics from gene expression. In many systems, RNA velocity has been observed to produce a vector field that qualitatively reflects known features of the system. However, the limitations of RNA velocity estimates are still not well understood. RESULTS: We analyze the impact of different steps in the RNA velocity workflow on direction and speed. We consider both high-dimensional velocity estimates and low-dimensional velocity vector fields mapped onto an embedding. We conclude the transition probability method for mapping velocity estimates onto an embedding is effectively interpolating in the embedding space. Our findings reveal a significant dependence of the RNA velocity workflow on smoothing via the k-nearest-neighbors (k-NN) graph of the observed data. This reliance results in considerable estimation errors for both direction and speed in both high- and low-dimensional settings when the k-NN graph fails to accurately represent the true data structure; this is an unknown feature of real data. RNA velocity performs poorly at estimating speed in both low- and high-dimensional spaces, except in very low noise settings. We introduce a novel quality measure that can identify when RNA velocity should not be used. CONCLUSIONS: Our findings emphasize the importance of choices in the RNA velocity workflow and highlight critical limitations of data analysis. We advise against over-interpreting expression dynamics using RNA velocity, particularly in terms of speed. Finally, we emphasize that the use of RNA velocity in assessing the correctness of a low-dimensional embedding is circular.


Subject(s)
Probability , Cluster Analysis
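
To make the k-NN smoothing dependence described in the abstract above concrete, here is a minimal sketch that averages each cell's values with those of its k nearest neighbours. It is an assumption for illustration, not the workflow of any particular RNA velocity package; `knn_indices`, `knn_smooth`, and the toy coordinates are hypothetical.

```python
import math


def knn_indices(points, k):
    """For each point, return the indices of its k nearest neighbours (self excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append([j for _, j in dists[:k]])
    return result


def knn_smooth(points, k):
    """Replace each point by the mean of itself and its k nearest neighbours."""
    neighbours = knn_indices(points, k)
    smoothed = []
    for i, p in enumerate(points):
        group = [p] + [points[j] for j in neighbours[i]]
        smoothed.append([sum(vals) / len(group) for vals in zip(*group)])
    return smoothed


# Two well-separated pairs of "cells" (made-up two-gene expression values).
cells = [[0, 0], [0, 2], [10, 10], [10, 12]]
print(knn_smooth(cells, k=1))  # each cell is averaged only with its own partner
print(knn_smooth(cells, k=2))  # each cell is now also averaged with a cell from the other pair
```

With k=2 the neighbourhood of every cell reaches into the other pair, so the smoothed values are pulled toward a group the cell does not belong to. This is a small-scale version of the situation the abstract warns about, where the k-NN graph fails to represent the true data structure.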
6.
bioRxiv ; 2023 Nov 05.
Article in English | MEDLINE | ID: mdl-37745323

ABSTRACT

Cells are fundamental units of life, constantly interacting and evolving as dynamical systems. While recent spatial multi-omics can quantitate individual cells' characteristics and regulatory programs, forecasting their evolution ultimately requires mathematical modeling. We develop a conceptual framework (a cell behavior hypothesis grammar) that uses natural language statements (cell rules) to create mathematical models. This allows us to systematically integrate biological knowledge and multi-omics data to make them computable. We can then perform virtual "thought experiments" that challenge and extend our understanding of multicellular systems, and ultimately generate new testable hypotheses. In this paper, we motivate and describe the grammar, provide a reference implementation, and demonstrate its potential through a series of examples in tumor biology and immunotherapy. Altogether, this approach provides a bridge between biological, clinical, and systems biology researchers for mathematical modeling of biological systems at scale, allowing the community to extrapolate from single-cell characterization to emergent multicellular behavior.

7.
medRxiv ; 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37745408

ABSTRACT

Background: Tau pathology is common in age-related neurodegenerative diseases. Tau pathology in primary age-related tauopathy (PART) and in Alzheimer's disease (AD) has a similar biochemical structure and anatomic distribution, which is distinct from tau pathology in other diseases. However, the molecular changes associated with intraneuronal tau pathology in PART and AD, and whether these changes are similar in the two diseases, are largely unexplored. Methods: Using GeoMx spatial transcriptomics, mRNA was quantified in CA1 pyramidal neurons with tau pathology and adjacent neurons without tau pathology in 6 cases of PART and 6 cases of AD, and compared to 4 control cases without pathology. Transcriptional changes were analyzed for differential gene expression and for coordinated patterns of gene expression associated with both disease state and intraneuronal tau pathology. Results: Synaptic gene changes and two novel gene expression signatures associated with intraneuronal tau were identified in PART and AD. Overall, gene expression changes associated with intraneuronal tau pathology were similar in PART and AD. Synaptic gene expression was decreased overall in neurons in AD and PART compared to control cases. However, this decrease was largely driven by neurons lacking tau pathology. Synaptic gene expression was increased in tau-positive neurons compared to tau-negative neurons in disease. Two novel gene expression signatures associated with intraneuronal tau were identified by examining coordinated patterns of gene expression. Genes in the up-regulated expression pattern were enriched in calcium regulation and synaptic function pathways, specifically in synaptic exocytosis. These synaptic gene changes and intraneuronal tau expression signatures were confirmed in a published transcriptional dataset of cortical neurons with tau pathology in AD.
Conclusions: PART and AD show similar transcriptional changes associated with intraneuronal tau pathology in CA1 pyramidal neurons, raising the possibility of a mechanistic relationship between the tau pathology in the two diseases. Intraneuronal tau pathology was also associated with increased expression of genes associated with synaptic function and calcium regulation compared to tau-negative disease neurons. The findings highlight the power of molecular analysis stratified by pathology in neurodegenerative disease and provide novel insight into common molecular pathways associated with intraneuronal tau in PART and AD.

8.
Patterns (N Y) ; 4(8): 100793, 2023 Aug 11.
Article in English | MEDLINE | ID: mdl-37602211

ABSTRACT

Single-cell transcriptomics technologies can uncover changes in the molecular states that underlie cellular phenotypes. However, understanding the dynamic cellular processes requires extending from inferring trajectories from snapshots of cellular states to estimating temporal changes in cellular gene expression. To address this challenge, we have developed a neural ordinary differential-equation-based method, RNAForecaster, for predicting gene expression states in single cells for multiple future time steps in an embedding-independent manner. We demonstrate that RNAForecaster can accurately predict future expression states in simulated single-cell transcriptomic data with cellular tracking over time. We then show that by using metabolic labeling single-cell RNA sequencing (scRNA-seq) data from constitutively dividing cells, RNAForecaster accurately recapitulates many of the expected changes in gene expression during progression through the cell cycle over a 3-day period. Thus, RNAForecaster enables short-term estimation of future expression states in biological systems from high-throughput datasets with temporal information.

10.
Nature ; 618(7966): 790-798, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37316665

ABSTRACT

Psychedelics are a broad class of drugs defined by their ability to induce an altered state of consciousness [1,2]. These drugs have been used for millennia in both spiritual and medicinal contexts, and a number of recent clinical successes have spurred a renewed interest in developing psychedelic therapies [3-9]. Nevertheless, a unifying mechanism that can account for these shared phenomenological and therapeutic properties remains unknown. Here we demonstrate in mice that the ability to reopen the social reward learning critical period is a shared property across psychedelic drugs. Notably, the time course of critical period reopening is proportional to the duration of acute subjective effects reported in humans. Furthermore, the ability to reinstate social reward learning in adulthood is paralleled by metaplastic restoration of oxytocin-mediated long-term depression in the nucleus accumbens. Finally, identification of differentially expressed genes in the 'open state' versus the 'closed state' provides evidence that reorganization of the extracellular matrix is a common downstream mechanism underlying psychedelic drug-mediated critical period reopening. Together these results have important implications for the implementation of psychedelics in clinical practice, as well as the design of novel compounds for the treatment of neuropsychiatric disease.


Subject(s)
Critical Period, Psychological , Hallucinogens , Learning , Reward , Animals , Humans , Mice , Consciousness/drug effects , Hallucinogens/pharmacology , Hallucinogens/therapeutic use , Learning/drug effects , Time Factors , Oxytocin/metabolism , Nucleus Accumbens/drug effects , Nucleus Accumbens/metabolism , Long-Term Synaptic Depression/drug effects , Extracellular Matrix/drug effects
11.
Cancer Discov ; 13(5): 1053-1057, 2023 05 04.
Article in English | MEDLINE | ID: mdl-37067199

ABSTRACT

SUMMARY: Convergence science teams integrating clinical, biological, engineering, and computational expertise are inventing new forecast systems to monitor and predict evolutionary changes in tumor and immune interactions during early cancer progression and therapeutic response. The resulting methods should inform a new predictive medicine paradigm to select adaptive immunotherapeutic regimens personalized to patients' tumors at a given time during their cancer progression for durable patient response.


Subject(s)
Immunotherapy , Neoplasms , Precision Medicine , Humans , Immunotherapy/methods , Immunotherapy/trends , Neoplasms/genetics , Neoplasms/immunology , Neoplasms/therapy , Precision Medicine/methods , Precision Medicine/trends , Drug Resistance , Tumor Microenvironment
12.
Cell Syst ; 14(4): 285-301.e4, 2023 04 19.
Article in English | MEDLINE | ID: mdl-37080163

ABSTRACT

Recent advances in spatial transcriptomics (ST) enable gene expression measurements from a tissue sample while retaining its spatial context. This technology enables unprecedented in situ resolution of the regulatory pathways that underlie the heterogeneity in the tumor as well as the tumor microenvironment (TME). The direct characterization of cellular co-localization with spatial technologies facilitates quantification of the molecular changes resulting from direct cell-cell interaction, as it occurs in tumor-immune interactions. We present SpaceMarkers, a bioinformatics algorithm to infer molecular changes from cell-cell interactions from latent space analysis of ST data. We apply this approach to infer the molecular changes from tumor-immune interactions in Visium spatial transcriptomics data of metastasis, invasive and precursor lesions, and immunotherapy treatment. Transfer learning in matched scRNA-seq data enabled further quantification of the specific cell types in which SpaceMarkers are enriched. Altogether, SpaceMarkers can identify the location and context-specific molecular interactions within the TME from ST data.


Subject(s)
Algorithms , Tumor Microenvironment , Cell Communication , Computational Biology , Gene Expression Profiling
13.
J Clin Invest ; 133(8), 2023 04 17.
Article in English | MEDLINE | ID: mdl-36881486

ABSTRACT

Pancreatic ductal adenocarcinoma (PDAC) frequently presents with metastasis, but the molecular programs in human PDAC cells that drive invasion are not well understood. Using an experimental pipeline enabling PDAC organoid isolation and collection based on invasive phenotype, we assessed the transcriptomic programs associated with invasion in our organoid model. We identified differentially expressed genes in invasive organoids compared with matched noninvasive organoids from the same patients, and we confirmed that the encoded proteins were enhanced in organoid invasive protrusions. We identified 3 distinct transcriptomic groups in invasive organoids, 2 of which correlated directly with the morphological invasion patterns and were characterized by distinct upregulated pathways. Leveraging publicly available single-cell RNA-sequencing data, we mapped our transcriptomic groups onto human PDAC tissue samples, highlighting differences in the tumor microenvironment between transcriptomic groups and suggesting that non-neoplastic cells in the tumor microenvironment can modulate tumor cell invasion. To further address this possibility, we performed computational ligand-receptor analysis and validated the impact of multiple ligands (TGF-β1, IL-6, CXCL12, MMP9) on invasion and gene expression in an independent cohort of fresh human PDAC organoids. Our results identify molecular programs driving morphologically defined invasion patterns and highlight the tumor microenvironment as a potential modulator of these programs.


Subject(s)
Carcinoma, Pancreatic Ductal , Pancreatic Neoplasms , Humans , Transcriptome , Pancreatic Neoplasms/pathology , Carcinoma, Pancreatic Ductal/metabolism , Organoids/metabolism , Gene Expression Regulation, Neoplastic , Cell Line, Tumor , Tumor Microenvironment/genetics
14.
JCI Insight ; 7(19), 2022 10 10.
Article in English | MEDLINE | ID: mdl-36214223

ABSTRACT

Mass cytometry, or cytometry by TOF (CyTOF), provides a robust means of determining protein-level measurements of more than 40 markers simultaneously. While the functional states of immune cells occur along continuous phenotypic transitions, cytometric studies surveying cell phenotypes often rely on static metrics, such as discrete cell-type abundances, based on canonical markers and/or restrictive gating strategies. To overcome this limitation, we applied single-cell trajectory inference and nonnegative matrix factorization methods to CyTOF data to trace the dynamics of T cell states. In the setting of cancer immunotherapy, we showed that patient-specific summaries of continuous phenotypic shifts in T cells could be inferred from peripheral blood-derived CyTOF mass cytometry data. We further illustrated that transfer learning enabled these T cell continuous metrics to be used to estimate patient-specific cell states in new sample cohorts from a reference patient data set. Our work establishes the utility of continuous metrics for CyTOF analysis as tools for translational discovery.


Subject(s)
Benchmarking , T-Lymphocytes , Biomarkers/analysis , Clinical Trials as Topic , Flow Cytometry/methods , Immunologic Factors , Immunotherapy , Monitoring, Immunologic
15.
Biostatistics ; 23(4): 1200-1217, 2022 10 14.
Article in English | MEDLINE | ID: mdl-35358296

ABSTRACT

Integrative analysis of multiple data sets has the potential of fully leveraging the vast amount of high-throughput biological data being generated. In particular, such analysis will be powerful in making inference from publicly available collections of genetic, transcriptomic and epigenetic data sets which are designed to study shared biological processes, but which vary in their target measurements, biological variation, unwanted noise, and batch variation. Thus, methods that enable the joint analysis of multiple data sets are needed to gain insights into shared biological processes that would otherwise be hidden by unwanted intra-data set variation. Here, we propose a method called two-stage linked component analysis (2s-LCA) to jointly decompose multiple biologically related experimental data sets with biological and technological relationships that can be structured into the decomposition. The consistency of the proposed method is established and its empirical performance is evaluated via simulation studies. We apply 2s-LCA to jointly analyze four data sets focused on human brain development and identify meaningful patterns of gene expression in human neurogenesis that have shared structure across these data sets.


Subject(s)
Transcriptome , Computer Simulation , Humans
16.
Genome Biol ; 23(1): 41, 2022 01 31.
Article in English | MEDLINE | ID: mdl-35101061

ABSTRACT

BACKGROUND: The cell cycle is a highly conserved, continuous process which controls faithful replication and division of cells. Single-cell technologies have enabled increasingly precise measurements of the cell cycle both as a biological process of interest and as a possible confounding factor. Despite its importance and conservation, there is no universally applicable approach to infer position in the cell cycle with high resolution from single-cell RNA-seq data. RESULTS: Here, we present tricycle, an R/Bioconductor package, to address this challenge by leveraging key features of the biology of the cell cycle, the mathematical properties of principal component analysis of periodic functions, and the use of transfer learning. We estimate a cell-cycle embedding using a fixed reference dataset and project new data into this reference embedding, an approach that overcomes key limitations of learning a dataset-dependent embedding. Tricycle then predicts a cell-specific position in the cell cycle based on the data projection. The accuracy of tricycle compares favorably to gold-standard experimental assays, which generally require specialized measurements in specifically constructed in vitro systems. Using internal controls which are available for any dataset, we show that tricycle predictions generalize to datasets with multiple cell types, across tissues, species, and even sequencing assays. CONCLUSIONS: Tricycle generalizes across datasets and is highly scalable and applicable to atlas-level single-cell RNA-seq data.


Subject(s)
Machine Learning , Single-Cell Analysis , Cell Cycle/genetics , Principal Component Analysis , Sequence Analysis, RNA , Exome Sequencing
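
The project-then-predict idea in the abstract above can be sketched as follows: new cells are centered with reference gene means, projected onto two fixed reference axes, and assigned a position on a circle. Reading the position off as the polar angle of the projection is an assumption made here for illustration, as are the names `loadings` and `gene_means` and the toy values. This is not the tricycle package's API, which is an R/Bioconductor interface.

```python
import math


def project(cell, loadings, gene_means):
    """Project one cell onto fixed reference axes (one loading vector per axis)."""
    centered = [x - m for x, m in zip(cell, gene_means)]
    return [sum(c * w for c, w in zip(centered, axis)) for axis in loadings]


def cycle_position(cell, loadings, gene_means):
    """Polar angle in [0, 2*pi) of the cell's projection onto the first two axes."""
    pc1, pc2 = project(cell, loadings, gene_means)
    return math.atan2(pc2, pc1) % (2 * math.pi)


# Hypothetical reference: three genes, two fixed axes learned elsewhere.
loadings = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0]]
gene_means = [0.0, 0.0, 0.0]

new_cells = [[1.0, 0.0, 5.0], [0.0, 1.0, 5.0], [-1.0, 0.0, 0.0]]
for cell in new_cells:
    print(round(cycle_position(cell, loadings, gene_means), 3))
```

Because the axes are fixed in advance rather than re-estimated per dataset, positions computed for different datasets live on the same circle, which is the property the abstract attributes to using a reference embedding.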
17.
Nat Cancer ; 2(9): 891-903, 2021 09.
Article in English | MEDLINE | ID: mdl-34796337

ABSTRACT

A potentially curative hepatic resection is the optimal treatment for hepatocellular carcinoma (HCC), but most patients are not candidates for resection and most resected HCCs eventually recur. Until recently, neoadjuvant systemic therapy for HCC has been limited by a lack of effective systemic agents. Here, in a single-arm phase 1b study, we evaluated the feasibility of neoadjuvant cabozantinib and nivolumab in patients with HCC, including patients outside of traditional resection criteria (NCT03299946). Of 15 patients enrolled, 12 (80%) underwent successful margin-negative resection, and 5/12 (42%) patients had major pathologic responses. In-depth biospecimen profiling demonstrated an enrichment in T effector cells, as well as tertiary lymphoid structures, CD138+ plasma cells, and a distinct spatial arrangement of B cells in responders as compared to non-responders, indicating an orchestrated B-cell contribution to antitumor immunity in HCC.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Anilides , Carcinoma, Hepatocellular/drug therapy , Humans , Liver Neoplasms/drug therapy , Neoadjuvant Therapy , Neoplasm Recurrence, Local , Nivolumab/therapeutic use , Pyridines
18.
Curr Opin Syst Biol ; 26: 24-32, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34660940

ABSTRACT

As the single-cell field races to characterize each cell type, state, and behavior, the complexity of the computational analysis approaches the complexity of the biological systems. Single-cell and imaging technologies now enable unprecedented measurements of state transitions in biological systems, providing high-throughput data that capture tens of thousands of measurements on hundreds of thousands of samples. Thus, the definition of cell type and state is evolving to encompass the broad range of biological questions now attainable. Answering these questions requires the development of computational tools for integrated multi-omics analysis. Merged with mathematical models, these algorithms will be able to forecast future states of biological systems, going from statistical inferences of phenotypes to time course predictions of the biological systems with dynamic maps analogous to weather systems. Thus, systems biology for forecasting biological system dynamics from multi-omic data represents the future of cell biology, empowering a new generation of technology-driven predictive medicine.
