Results 1 - 20 of 115
1.
Sci Data ; 11(1): 176, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38326333

ABSTRACT

Suncus etruscus is one of the world's smallest mammals, with an average body mass of about 2 grams. The Etruscan shrew's small body is accompanied by a very high energy demand and numerous metabolic adaptations. Here we report a chromosome-level genome assembly using PacBio long read sequencing, 10X Genomics linked short reads, optical mapping, and Hi-C linked reads. The assembly is partially phased, with the 2.472 Gbp primary pseudohaplotype and 1.515 Gbp alternate. We manually curated the primary assembly and identified 22 chromosomes, including X and Y sex chromosomes. The NCBI genome annotation pipeline identified 39,091 genes, 19,819 of them protein-coding. We also identified segmental duplications, inferred GO term annotations, and computed orthologs of human and mouse genes. This reference-quality genome will be an important resource for research on mammalian development, metabolism, and body size control.


Subject(s)
Chromosomes , Shrews , Animals , Mice , Chromosomes/genetics , Genome , Genomics , Molecular Sequence Annotation , Shrews/genetics
2.
Mol Biol Evol ; 41(3)2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38376487

ABSTRACT

The blue whale, Balaenoptera musculus, is the largest animal known to have ever existed, making it an important case study in longevity and resistance to cancer. To further this and other blue whale-related research, we report a reference-quality, long-read-based genome assembly of this fascinating species. We assembled the genome from PacBio long reads and utilized Illumina/10×, optical maps, and Hi-C data for scaffolding, polishing, and manual curation. We also provided long read RNA-seq data to facilitate the annotation of the assembly by NCBI and Ensembl. Additionally, we annotated both haplotypes using TOGA and measured the genome size by flow cytometry. We then compared the blue whale genome with other cetaceans and artiodactyls, including vaquita (Phocoena sinus), the world's smallest cetacean, to investigate blue whale's unique biological traits. We found a dramatic amplification of several genes in the blue whale genome resulting from a recent burst in segmental duplications, though the possible connection between this amplification and giant body size requires further study. We also discovered sites in the insulin-like growth factor-1 gene correlated with body size in cetaceans. Finally, using our assembly to examine the heterozygosity and historical demography of Pacific and Atlantic blue whale populations, we found that the genomes of both populations are highly heterozygous and that their genetic isolation dates to the last interglacial period. Taken together, these results indicate how a high-quality, annotated blue whale genome will serve as an important resource for biology, evolution, and conservation research.


Subject(s)
Balaenoptera , Neoplasms , Animals , Balaenoptera/genetics , Segmental Duplications, Genomic , Genome , Demography , Neoplasms/genetics
3.
Stem Cell Reports ; 18(12): 2328-2343, 2023 12 12.
Article in English | MEDLINE | ID: mdl-37949072

ABSTRACT

Sus scrofa domesticus (pig) has served as a superb large mammalian model for biomedical studies because of its comparable physiology and organ size to humans. The derivation of transgene-free porcine induced pluripotent stem cells (PiPSCs) will, therefore, benefit the development of porcine-specific models for regenerative biology and its medical applications. In the past, this effort has been hampered by a lack of understanding of the signaling milieu that stabilizes the porcine pluripotent state in vitro. Here, we report that transgene-free PiPSCs can be efficiently derived from porcine fibroblasts by episomal vectors along with microRNA-302/367 using optimized protocols tailored for this species. PiPSCs can be differentiated into derivatives representing the primary germ layers in vitro and can form teratomas in immunocompromised mice. Furthermore, the transgene-free PiPSCs preserve intrinsic species-specific developmental timing in culture, known as developmental allochrony. This is demonstrated by establishing a porcine in vitro segmentation clock model that, for the first time, displays a specific periodicity at ∼3.7 h, a timescale recapitulating in vivo porcine somitogenesis. We conclude that the transgene-free PiPSCs can serve as a powerful tool for modeling development and disease and developing transplantation strategies. We also anticipate that they will provide insights into conserved and unique features on the regulations of mammalian pluripotency and developmental timing mechanisms.


Subject(s)
Induced Pluripotent Stem Cells , Pluripotent Stem Cells , Humans , Animals , Mice , Swine , Cellular Reprogramming , Cell Differentiation , Transgenes , Mammals
4.
BMC Bioinformatics ; 24(1): 412, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37915001

ABSTRACT

BACKGROUND: The PubMed archive contains more than 34 million articles; consequently, it is becoming increasingly difficult for a biomedical researcher to keep up-to-date with different knowledge domains. Computationally efficient and interpretable tools are needed to help researchers find and understand associations between biomedical concepts. The goal of literature-based discovery (LBD) is to connect concepts in isolated literature domains that would normally go undiscovered. This usually takes the form of an A-B-C relationship, where A and C terms are linked through a B term intermediate. Here we describe Serial KinderMiner (SKiM), an LBD algorithm for finding statistically significant links between an A term and one or more C terms through some B term intermediate(s). The development of SKiM is motivated by the observation that there are only a few LBD tools that provide a functional web interface, and that the available tools are limited in one or more of the following ways: (1) they identify a relationship but not the type of relationship, (2) they do not allow the user to provide their own lists of B or C terms, hindering flexibility, (3) they do not allow for querying thousands of C terms (which is crucial if, for instance, the user wants to query connections between a disease and the thousands of available drugs), or (4) they are specific for a particular biomedical domain (such as cancer). We provide an open-source tool and web interface that improves on all of these issues. RESULTS: We demonstrate SKiM's ability to discover useful A-B-C linkages in three control experiments: classic LBD discoveries, drug repurposing, and finding associations related to cancer. Furthermore, we supplement SKiM with a knowledge graph built with transformer machine-learning models to aid in interpreting the relationships between terms found by SKiM. 
Finally, we provide a simple and intuitive open-source web interface ( https://skim.morgridge.org ) with comprehensive lists of drugs, diseases, phenotypes, and symptoms so that anyone can easily perform SKiM searches. CONCLUSIONS: SKiM is a simple algorithm that can perform LBD searches to discover relationships between arbitrary user-defined concepts. SKiM is generalized for any domain, can perform searches with many thousands of C term concepts, and moves beyond the simple identification of an existence of a relationship; many relationships are given relationship type labels from our knowledge graph.


Subject(s)
Algorithms , Neoplasms , Humans , PubMed , Knowledge , Knowledge Discovery
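The A-B-C search described in the SKiM abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy corpus, the function names, the 0.05 threshold, and the choice of a one-sided Fisher exact test on document co-occurrence counts. The abstract states only that links must be statistically significant, so this is not the SKiM implementation.

```python
from math import comb

def fisher_right_tail(both, only_x, only_y, neither):
    """One-sided Fisher exact p-value: chance of seeing at least `both`
    co-occurrences if terms x and y were independent (hypergeometric tail)."""
    n = both + only_x + only_y + neither
    row = both + only_x          # documents mentioning x
    col = both + only_y          # documents mentioning y
    total = comb(n, col)
    return sum(comb(row, k) * comb(n - row, col - k)
               for k in range(both, min(row, col) + 1)) / total

def cooccurrence_p(docs, x, y):
    """Build the 2x2 co-occurrence table for terms x and y and test it."""
    both = sum(1 for d in docs if x in d and y in d)
    only_x = sum(1 for d in docs if x in d and y not in d)
    only_y = sum(1 for d in docs if y in d and x not in d)
    neither = len(docs) - both - only_x - only_y
    return fisher_right_tail(both, only_x, only_y, neither)

def abc_links(docs, a, b_terms, c_terms, alpha=0.05):
    """Return (A, B, C) triples where A-B and B-C are both significant."""
    links = []
    for b in b_terms:
        if cooccurrence_p(docs, a, b) >= alpha:
            continue  # A and B are not linked, so skip this intermediate
        for c in c_terms:
            if cooccurrence_p(docs, b, c) < alpha:
                links.append((a, b, c))
    return links

# Hypothetical toy corpus: each document is the set of terms it mentions.
docs = (
    [{"fish oil", "blood viscosity"}] * 4
    + [{"blood viscosity", "raynaud"}] * 4
    + [{"headache", "aspirin"}] * 8
)

links = abc_links(docs, "fish oil",
                  b_terms=["blood viscosity", "headache"],
                  c_terms=["raynaud", "aspirin"])
print(links)
```

In this toy corpus the A term and the C term never appear in the same document, yet they are connected through the shared B term, which is the situation literature-based discovery is meant to surface.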
5.
Physiol Rep ; 11(17): e15814, 2023 09.
Article in English | MEDLINE | ID: mdl-37667413

ABSTRACT

Cartilage acidic protein-1 (CRTAC1) is produced by several cell types, including Type 2 alveolar epithelial (T2AE) cells that are targeted by SARS-CoV-2. Proteomic surveys have shown that plasma CRTAC1 is low in patients with severe COVID-19. Using an ELISA, we found that patients treated for COVID-19 in an ICU almost uniformly had plasma concentrations of CRTAC1 below those of healthy controls. The magnitude of the decrease in CRTAC1 distinguished COVID-19 from other causes of acute respiratory decompensation and correlated with established metrics of COVID-19 severity. CRTAC1 concentrations below those of controls were found in some patients a year after hospitalization with COVID-19, long COVID after less severe COVID-19, or chronic obstructive pulmonary disease. Decreases in CRTAC1 in severe COVID-19 correlated (r = 0.37, p = 0.0001) with decreases in CFP (properdin), which interacts with CRTAC1. Thus, decreases of CRTAC1 associated with severe COVID-19 may result from loss of production by T2AE cells or co-depletion with CFP. Determining the significance of, and reasons behind, decreased CRTAC1 concentration in a subset of patients with long COVID will require analysis of the roles of preexisting lung disease, impact of prior acute COVID-19, age, and other confounding variables in a larger number of patients.


Subject(s)
COVID-19 , Calcium-Binding Proteins , Humans , Calcium-Binding Proteins/blood , Post-Acute COVID-19 Syndrome , Proteomics , RNA, Viral , SARS-CoV-2
6.
Sci Rep ; 13(1): 12968, 2023 08 10.
Article in English | MEDLINE | ID: mdl-37563287

ABSTRACT

Diabetic retinopathy is a common complication of long-term diabetes that can lead to vision loss. Unfortunately, early diabetic retinopathy remains poorly understood. There is no effective way to prevent or treat early diabetic retinopathy until patients develop later stages of diabetic retinopathy. Elevated acellular capillary density is considered a reliable quantitative trait present in the early development of retinopathy. Hence, in this study, we interrogated whole retinal vascular transcriptomic changes via a Nile rat model to better understand the early pathogenesis of diabetic retinopathy. We uncovered the complexity of associations between acellular capillary density and the joint factors of blood glucose, diet, and sex, which was modeled through a Bayesian network. Using segmented regressions, we identified different gene expression patterns and enriched Gene Ontology (GO) terms associated with increasing acellular capillary density. We developed a random forest regression model based on expression patterns of 14 genes to predict the acellular capillary density. Since acellular capillary density is a reliable quantitative trait in early diabetic retinopathy, our model can be used as a transcriptomic clock to measure the severity of the progression of early retinopathy. We also identified NVP-TAE684, geldanamycin, and NVP-AUY922 as the top three candidate drugs that could potentially attenuate early DR. Although more in vivo studies are needed to support these repurposed drugs, we have provided a data-driven approach to drug discovery.


Subject(s)
Diabetes Mellitus , Diabetic Retinopathy , Animals , Diabetic Retinopathy/pathology , Retinal Vessels/pathology , Transcriptome , Bayes Theorem , Murinae , Diabetes Mellitus/pathology
7.
bioRxiv ; 2023 Jun 01.
Article in English | MEDLINE | ID: mdl-37397987

ABSTRACT

Background: The PubMed database contains more than 34 million articles; consequently, it is becoming increasingly difficult for a biomedical researcher to keep up-to-date with different knowledge domains. Computationally efficient and interpretable tools are needed to help researchers find and understand associations between biomedical concepts. The goal of literature-based discovery (LBD) is to connect concepts in isolated literature domains that would normally go undiscovered. This usually takes the form of an A-B-C relationship, where A and C terms are linked through a B term intermediate. Here we describe Serial KinderMiner (SKiM), an LBD algorithm for finding statistically significant links between an A term and one or more C terms through some B term intermediate(s). The development of SKiM is motivated by the observation that there are only a few LBD tools that provide a functional web interface, and that the available tools are limited in one or more of the following ways: 1) they identify a relationship but not the type of relationship, 2) they do not allow the user to provide their own lists of B or C terms, hindering flexibility, 3) they do not allow for querying thousands of C terms (which is crucial if, for instance, the user wants to query connections between a disease and the thousands of available drugs), or 4) they are specific for a particular biomedical domain (such as cancer). We provide an open-source tool and web interface that improves on all of these issues. Results: We demonstrate SKiM's ability to discover useful A-B-C linkages in three control experiments: classic LBD discoveries, drug repurposing, and finding associations related to cancer. Furthermore, we supplement SKiM with a knowledge graph built with transformer machine-learning models to aid in interpreting the relationships between terms found by SKiM.
Finally, we provide a simple and intuitive open-source web interface ( https://skim.morgridge.org ) with comprehensive lists of drugs, diseases, phenotypes, and symptoms so that anyone can easily perform SKiM searches. Conclusions: SKiM is a simple algorithm that can perform LBD searches to discover relationships between arbitrary user-defined concepts. SKiM is generalized for any domain, can perform searches with many thousands of C term concepts, and moves beyond the simple identification of an existence of a relationship; many relationships are given relationship type labels from our knowledge graph.

9.
Stem Cell Reports ; 18(2): 585-596, 2023 02 14.
Article in English | MEDLINE | ID: mdl-36638788

ABSTRACT

Macrophages armed with chimeric antigen receptors (CARs) provide a potent new option for treating solid tumors. However, genetic engineering and scalable production of somatic macrophages remain significant challenges. Here, we used CRISPR-Cas9 gene editing methods to integrate an anti-GD2 CAR into the AAVS1 locus of human pluripotent stem cells (hPSCs). We then established a serum- and feeder-free differentiation protocol for generating CAR macrophages (CAR-Ms) through arterial endothelial-to-hematopoietic transition (EHT). CAR-Ms produced by this method displayed potent cytotoxic activity against GD2-expressing neuroblastoma and melanoma in vitro and neuroblastoma in vivo. This study provides a new platform for the efficient generation of off-the-shelf CAR-Ms for antitumor immunotherapy.


Subject(s)
Melanoma , Neuroblastoma , Pluripotent Stem Cells , Receptors, Chimeric Antigen , Humans , Receptors, Chimeric Antigen/genetics , Receptors, Antigen, T-Cell/genetics , Immunotherapy/methods , Pluripotent Stem Cells/pathology , Melanoma/therapy , Neuroblastoma/therapy , Neuroblastoma/pathology , Macrophages/pathology
10.
Comput Biol Chem ; 102: 107795, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36436489

ABSTRACT

RNA sequencing (RNA-seq) has been a widely used high-throughput method to characterize transcriptomic dynamics spatiotemporally. However, RNA-seq data analysis pipelines typically depend on either a sequenced genome and/or corresponding reference transcripts. This limitation is a challenge for species lacking sequenced genomes and corresponding reference transcripts. The Nile rat (Arvicanthis niloticus) has two key features - it is daytime active, and it is prone to diet-induced diabetes, which makes it more similar to humans than regular laboratory rodents. However, at the time of this study, neither a Nile rat genome nor a reference transcript set were available, making it technically challenging to perform large-scale RNA-seq based transcriptomic studies. This genome-independent work progressed concurrently with our generation of a Nile rat genome. A well-annotated genome requires several iterations of manually reviewing curated transcripts and takes years to achieve. Here, we developed a Comparative RNA-Seq Pipeline (CRSP), integrating a comparative species strategy independent of a specific sequenced genome or species-matched reference transcripts. We performed benchmarking to validate that our CRSP tool can accurately quantify gene expression levels. In this study, we generated the first ultra-deep (2.3 billion × 2 paired-end) Nile rat RNA-seq data from 59 biopsy samples representing 22 major organs, providing a unique resource and spatial gene expression reference for Nile rat researchers. Importantly, CRSP is not limited to the Nile rat species and can be applied to any species without prior genomic knowledge. To facilitate a general use of CRSP, we also characterized the number of RNA-seq reads required for accurate estimation via simulation studies. CRSP and documents are available at: https://github.com/pjiang1105/CRSP.


Subject(s)
Murinae , Transcriptome , Humans , Animals , Transcriptome/genetics , RNA-Seq , Gene Expression Profiling , Genome , Sequence Analysis, RNA/methods , High-Throughput Nucleotide Sequencing
11.
bioRxiv ; 2023 Dec 27.
Article in English | MEDLINE | ID: mdl-38234794

ABSTRACT

During an immune response, macrophages systematically rewire their metabolism in specific ways to support their diverse functions. However, current knowledge of macrophage metabolism is largely concentrated on central carbon metabolism. Using multi-omics analysis, we identified nucleotide metabolism as one of the most significantly rewired pathways upon classical activation. Further isotopic tracing studies revealed several major changes underlying the substantial metabolomic alterations: 1) de novo synthesis of both purines and pyrimidines is shut down at several specific steps; 2) nucleotide degradation activity to nitrogenous bases is increased but complete oxidation of bases is reduced, causing a great accumulation of nucleosides and bases; and 3) cells gradually switch to primarily relying on salvaging the nucleosides and bases for maintaining most nucleotide pools. Mechanistically, the inhibition of purine nucleotide de novo synthesis is mainly caused by nitric oxide (NO)-driven inhibition of the IMP synthesis enzyme ATIC, with NO-independent transcriptional downregulation of purine synthesis genes augmenting the effect. The inhibition of pyrimidine nucleotide de novo synthesis is driven by NO-driven inhibition of CTP synthetase (CTPS) and transcriptional downregulation of thymidylate synthase (TYMS). For the rewiring of degradation, purine nucleoside phosphorylase (PNP) and uridine phosphorylase (UPP) are transcriptionally upregulated, increasing nucleoside degradation activity. However, complete degradation of purine bases by xanthine oxidoreductase (XOR) is inhibited by NO, diverting flux into nucleotide salvage. Inhibiting the activation-induced switch from nucleotide de novo synthesis to salvage by knocking out the purine salvage enzyme hypoxanthine-guanine phosphoribosyl transferase (Hprt) significantly alters the expression of genes important for activated macrophage functions, suppresses macrophage migration, and increases pyroptosis.
Furthermore, knocking out Hprt or Xor increases proliferation of the intracellular parasite Toxoplasma gondii in macrophages. Together, these studies comprehensively reveal the characteristics, the key regulatory mechanisms, and the functional importance of the dynamic rewiring of nucleotide metabolism in classically activated macrophages.

12.
Cell ; 185(25): 4717-4736.e25, 2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36493752

ABSTRACT

Adult mammalian skin wounds heal by forming fibrotic scars. We report that full-thickness injuries of reindeer antler skin (velvet) regenerate, whereas back skin forms fibrotic scar. Single-cell multi-omics reveal that uninjured velvet fibroblasts resemble human fetal fibroblasts, whereas back skin fibroblasts express inflammatory mediators mimicking pro-fibrotic adult human and rodent fibroblasts. Consequently, injury elicits site-specific immune responses: back skin fibroblasts amplify myeloid infiltration and maturation during repair, whereas velvet fibroblasts adopt an immunosuppressive phenotype that restricts leukocyte recruitment and hastens immune resolution. Ectopic transplantation of velvet to scar-forming back skin is initially regenerative, but progressively transitions to a fibrotic phenotype akin to the scarless fetal-to-scar-forming transition reported in humans. Skin regeneration is diminished by intensifying, or enhanced by neutralizing, these pathologic fibroblast-immune interactions. Reindeer represent a powerful comparative model for interrogating divergent wound healing outcomes, and our results nominate decoupling of fibroblast-immune interactions as a promising approach to mitigate scar.


Subject(s)
Reindeer , Wound Healing , Adult , Animals , Humans , Cicatrix/pathology , Fibroblasts/pathology , Skin Transplantation , Skin/pathology , Fetus/pathology
13.
BMC Biol ; 20(1): 245, 2022 11 08.
Article in English | MEDLINE | ID: mdl-36344967

ABSTRACT

BACKGROUND: The Nile rat (Arvicanthis niloticus) is an important animal model because of its robust diurnal rhythm, a cone-rich retina, and a propensity to develop diet-induced diabetes without chemical or genetic modifications. A closer similarity to humans in these aspects, compared to the widely used Mus musculus and Rattus norvegicus models, holds the promise of better translation of research findings to the clinic. RESULTS: We report a 2.5 Gb, chromosome-level reference genome assembly with fully resolved parental haplotypes, generated with the Vertebrate Genomes Project (VGP). The assembly is highly contiguous, with contig N50 of 11.1 Mb, scaffold N50 of 83 Mb, and 95.2% of the sequence assigned to chromosomes. We used a novel workflow to identify 3613 segmental duplications and quantify duplicated genes. Comparative analyses revealed unique genomic features of the Nile rat, including some that affect genes associated with type 2 diabetes and metabolic dysfunctions. We discuss 14 genes that are heterozygous in the Nile rat or highly diverged from the house mouse. CONCLUSIONS: Our findings reflect the exceptional level of genomic resolution present in this assembly, which will greatly expand the potential of the Nile rat as a model organism.


Subject(s)
Diabetes Mellitus, Type 2 , Humans , Animals , Haplotypes , Diabetes Mellitus, Type 2/genetics , Murinae , Genome , Genomics
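The contig and scaffold N50 values quoted in the abstract above are contiguity statistics: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of that standard definition follows; the function name and the example lengths are hypothetical and are not taken from the Nile rat assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths.

    Sort the lengths from longest to shortest and accumulate them; the
    N50 is the length at which the running total first reaches at least
    half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Hypothetical contig lengths (arbitrary units), total size 20.
example = [2, 3, 4, 5, 6]
print(n50(example))  # 6 + 5 = 11 covers at least half of 20, so N50 is 5
```

A larger N50 means that half of the assembled sequence sits in longer pieces, which is why the statistic is reported separately for contigs and for scaffolds.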
14.
Genome Res ; 32(7): 1367-1384, 2022 07.
Article in English | MEDLINE | ID: mdl-35705328

ABSTRACT

Changes in transcriptional regulatory networks can significantly alter cell fate. To gain insight into transcriptional dynamics, several studies have profiled bulk multi-omic data sets with parallel transcriptomic and epigenomic measurements at different stages of a developmental process. However, integrating these data to infer cell type-specific regulatory networks is a major challenge. We present dynamic regulatory module networks (DRMNs), a novel approach to infer cell type-specific cis-regulatory networks and their dynamics. DRMN integrates expression, chromatin state, and accessibility to predict cis-regulators of context-specific expression, where context can be cell type, developmental stage, or time point, and uses multitask learning to capture network dynamics across linearly and hierarchically related contexts. We applied DRMNs to study regulatory network dynamics in three developmental processes, each showing different temporal relationships and measuring a different combination of regulatory genomic data sets: cellular reprogramming, liver dedifferentiation, and forward differentiation. DRMN identified known and novel regulators driving cell type-specific expression patterns, showing its broad applicability to examine dynamics of gene regulatory networks from linearly and hierarchically related multi-omic data sets.


Subject(s)
Gene Regulatory Networks , Genome , Chromatin/genetics , Genomics , Transcriptome
15.
Genomics ; 114(3): 110330, 2022 05.
Article in English | MEDLINE | ID: mdl-35278615

ABSTRACT

Primary hepatocytes are widely used in the pharmaceutical industry to screen drug candidates for hepatotoxicity, but hepatocytes quickly dedifferentiate and lose their mature metabolic function in culture. Attempts have been made to better recapitulate the in vivo liver environment in culture, but the full spectrum of signals required to maintain hepatocyte function ex vivo remains elusive. To elucidate molecular changes that accompany, and may contribute to dedifferentiation of hepatocytes ex vivo, we performed lineage tracing and comprehensive profiling of alterations in their gene expression profiles and chromatin landscape during culture. First, using genetically tagged hepatocytes we demonstrate that expression of the fetal gene alpha-fetoprotein in cultured hepatocytes comes from cells that previously expressed the mature gene albumin, and not from a population of albumin-negative precursor cells, proving mature hepatocytes undergo true dedifferentiation in culture. Next we studied the dedifferentiation process in detail through bulk RNA-sequencing of hepatocytes cultured over an extended period. We identified three distinct phases of dedifferentiation: an early phase, where mature hepatocyte genes are rapidly downregulated in a matter of hours; a middle phase, where fetal genes are activated; and a late phase, where initially rare contaminating non-parenchymal cells proliferate, taking over the culture. Lastly, to better understand the signaling events that result in the rapid downregulation of mature genes in hepatocytes, we examined changes in chromatin accessibility in these cells during the first 24 h of culture using Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq). We find that drastic and rapid changes in chromatin accessibility occur immediately upon the start of culture. 
Using binding motif analysis of the areas of open chromatin sharing similar temporal profiles, we identify several candidate transcription factors potentially involved in the dedifferentiation of primary hepatocytes in culture.


Subject(s)
Hepatocytes , Liver , Cells, Cultured , Hepatocytes/metabolism , Albumins , Chromatin/genetics
16.
Cell Rep ; 38(6): 110333, 2022 02 08.
Article in English | MEDLINE | ID: mdl-35139376

ABSTRACT

Cellular gene expression changes throughout a dynamic biological process, such as differentiation. Pseudotimes estimate cells' progress along a dynamic process based on their individual gene expression states. Ordering the expression data by pseudotime provides information about the underlying regulator-gene interactions. Because the pseudotime distribution is not uniform, many standard mathematical methods are inapplicable for analyzing the ordered gene expression states. Here we present single-cell inference of networks using Granger ensembles (SINGE), an algorithm for gene regulatory network inference from ordered single-cell gene expression data. SINGE uses kernel-based Granger causality regression to smooth irregular pseudotimes and missing expression values. It aggregates predictions from an ensemble of regression analyses to compile a ranked list of candidate interactions between transcriptional regulators and target genes. In two mouse embryonic stem cell differentiation datasets, SINGE outperforms other contemporary algorithms. However, a more detailed examination reveals caveats about poor performance for individual regulators and uninformative pseudotimes.


Subject(s)
Cell Differentiation/physiology , Gene Expression Profiling , Gene Regulatory Networks/physiology , Transcriptome/physiology , Algorithms , Animals , Computational Biology/methods , Gene Expression Profiling/methods , Mice , Software
17.
Cell Rep Methods ; 2(12): 100369, 2022 12 19.
Article in English | MEDLINE | ID: mdl-36590683

ABSTRACT

Recent advances in spatially resolved transcriptomics technologies enable both the measurement of genome-wide gene expression profiles and their mapping to spatial locations within a tissue. A first step in spatial transcriptomics data analysis is identifying genes with expression that varies spatially, and robust statistical methods exist to address this challenge. While useful, these methods do not detect spatial changes in the coordinated expression within a group of genes. To this end, we present SpatialCorr, a method for identifying sets of genes with spatially varying correlation structure. Given a collection of gene sets pre-defined by a user, SpatialCorr tests for spatially induced differences in the correlation of each gene set within tissue regions, as well as between and among regions. An application to cutaneous squamous cell carcinoma demonstrates the power of the approach for revealing biological insights not identified using existing methods.


Subject(s)
Carcinoma, Squamous Cell , Skin Neoplasms , Humans , Carcinoma, Squamous Cell/genetics , Skin Neoplasms/genetics , Gene Expression Profiling/methods , Transcriptome/genetics
18.
Nucleic Acids Res ; 50(2): e12, 2022 01 25.
Article in English | MEDLINE | ID: mdl-34850101

ABSTRACT

Considerable effort has been devoted to refining experimental protocols to reduce levels of technical variability and artifacts in single-cell RNA-sequencing data (scRNA-seq). We here present evidence that equalizing the concentration of cDNA libraries prior to pooling, a step not consistently performed in single-cell experiments, improves gene detection rates, enhances biological signals, and reduces technical artifacts in scRNA-seq data. To evaluate the effect of equalization on various protocols, we developed Scaffold, a simulation framework that models each step of an scRNA-seq experiment. Numerical experiments demonstrate that equalization reduces variation in sequencing depth and gene-specific expression variability. We then performed a set of experiments in vitro with and without the equalization step and found that equalization increases the number of genes that are detected in every cell by 17-31%, improves discovery of biologically relevant genes, and reduces nuisance signals associated with cell cycle. Further support is provided in an analysis of publicly available data.


Subject(s)
Gene Library , RNA-Seq/methods , Single-Cell Analysis/methods , Algorithms , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling/methods , Humans , RNA-Seq/standards , Sequence Analysis, RNA/methods , Single-Cell Analysis/standards , Software
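The intuition behind the equalization step in the abstract above can be shown with a deliberately simplified, deterministic sketch. It assumes that each cell's share of the sequencing reads is proportional to its library's share of the pooled cDNA. This is a toy model with hypothetical names and numbers, not the Scaffold simulation framework.

```python
from statistics import mean, pstdev

def reads_per_cell(concentrations, total_reads):
    """Split a fixed read budget across cells in proportion to the
    concentration of each cell's library in the pool (toy assumption)."""
    pool = sum(concentrations)
    return [total_reads * c / pool for c in concentrations]

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)

# Hypothetical library concentrations for four cells (arbitrary units).
raw = [1.0, 2.0, 4.0, 8.0]
# Equalization: dilute every library to the same concentration before pooling.
equalized = [min(raw)] * len(raw)

depth_raw = reads_per_cell(raw, total_reads=1_500_000)
depth_eq = reads_per_cell(equalized, total_reads=1_500_000)

print(depth_raw, coefficient_of_variation(depth_raw))
print(depth_eq, coefficient_of_variation(depth_eq))
```

Under this assumption, unequal libraries give the most concentrated cell eight times the reads of the least concentrated one, while equalized libraries receive identical depth. Real experiments add sampling noise on top, so the reduction in depth variation is partial rather than complete.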
19.
Genet Med ; 23(7): 1273-1280, 2021 07.
Article in English | MEDLINE | ID: mdl-33772223

ABSTRACT

PURPOSE: Fragile X syndrome (FXS), the most prevalent inherited cause of intellectual disability, remains underdiagnosed in the general population. Clinical studies have shown that individuals with FXS have a complex health profile leading to unique clinical needs. However, the full impact of this X-linked disorder on the health of affected individuals is unclear and the prevalence of co-occurring conditions is unknown. METHODS: We mined the longitudinal electronic health records from more than one million individuals to investigate the health characteristics of patients who have been clinically diagnosed with FXS. Additionally, using machine-learning approaches, we created predictive models to identify individuals with FXS in the general population. RESULTS: Our discovery-oriented approach identified the associations of FXS with a wide range of medical conditions including circulatory, endocrine, digestive, and genitourinary, in addition to mental and neurological disorders. We successfully created predictive models to identify cases five years prior to clinical diagnosis of FXS without relying on any genetic or familial data. CONCLUSION: Although FXS is often thought of primarily as a neurological disorder, it is in fact a multisystem syndrome involving many co-occurring conditions, some primary and some secondary, and they are associated with a considerable burden on patients and their families.


Subject(s)
Fragile X Syndrome , Intellectual Disability , Artificial Intelligence , Fragile X Syndrome/diagnosis , Fragile X Syndrome/epidemiology , Fragile X Syndrome/genetics , Humans , Intellectual Disability/diagnosis , Intellectual Disability/epidemiology , Intellectual Disability/genetics , Machine Learning , Phenotype
20.
PLoS Comput Biol ; 17(3): e1008778, 2021 03.
Article in English | MEDLINE | ID: mdl-33647016

ABSTRACT

Human pluripotent stem cells hold significant promise for regenerative medicine. However, long differentiation protocols and immature characteristics of stem cell-derived cell types remain challenges to the development of many therapeutic applications. In contrast to the slow differentiation of human stem cells in vitro that mirrors a nine-month gestation period, mouse stem cells develop according to a much faster three-week gestation timeline. Here, we tested if co-differentiation with mouse pluripotent stem cells could accelerate the differentiation speed of human embryonic stem cells. Following a six-week RNA-sequencing time course of neural differentiation, we identified 929 human genes that were upregulated earlier and 535 genes that exhibited earlier peaked expression profiles in chimeric cell cultures than in human cell cultures alone. Genes with accelerated upregulation were significantly enriched in Gene Ontology terms associated with neurogenesis, neuron differentiation and maturation, and synapse signaling. Moreover, chimeric mixed samples correlated with in utero human embryonic samples earlier than human cells alone, and acceleration was dose-dependent on human-mouse co-culture ratios. The altered gene expression patterns and developmental rates described in this report have implications for accelerating human stem cell differentiation and the use of interspecies chimeric embryos in developing human organs for transplantation.


Subject(s)
Chimerism , Human Embryonic Stem Cells , Neurogenesis , Pluripotent Stem Cells , Animals , Cells, Cultured , Computational Biology , Human Embryonic Stem Cells/cytology , Human Embryonic Stem Cells/physiology , Humans , Mice , Neurogenesis/genetics , Neurogenesis/physiology , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/physiology , Species Specificity , Transcriptome/genetics