ABSTRACT
Dynamic compartmentalization of eukaryotic DNA into active and repressed states enables diverse transcriptional programs to arise from a single genetic blueprint, whereas its dysregulation can be strongly linked to a broad spectrum of diseases. While single-cell Hi-C experiments allow for chromosome conformation profiling across many cells, they are still expensive and not widely available to most labs. Here, we propose an alternative approach, scENCORE, to computationally reconstruct chromatin compartments from the more affordable and widely accessible single-cell epigenetic data. First, scENCORE constructs a long-range epigenetic correlation graph to mimic chromatin interaction frequencies, where nodes represent genome bins and edges represent their correlations. Then, it learns the node embeddings to cluster genome regions into A/B compartments and aligns different graphs to quantify chromatin conformation changes across conditions. Benchmarking using cell-type-matched Hi-C experiments demonstrates that scENCORE can robustly reconstruct A/B compartments in a cell-type-specific manner. Furthermore, our chromatin conformation switching studies highlight substantial compartment-switching events that may introduce marked regulatory and transcriptional changes in psychiatric disease. In summary, scENCORE allows accurate and cost-effective A/B compartment reconstruction to delineate higher-order chromatin structure heterogeneity in complex tissues.
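As a rough illustration of the two steps described above (a correlation graph over genome bins, then a split of bins into two compartments), the following minimal sketch may help. It is not scENCORE's implementation: it skips the long-range filtering, and it swaps the learned node embeddings for the sign of the leading eigenvector of the correlation matrix, a classic heuristic from bulk Hi-C analysis. All function names and the toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def correlation_graph(signal):
    """signal[i] holds the per-cell epigenetic signal of genome bin i.
    Returns a dense adjacency matrix whose edge weights are bin-bin correlations."""
    n = len(signal)
    return [[pearson(signal[i], signal[j]) for j in range(n)] for i in range(n)]

def leading_eigenvector(matrix, iterations=200):
    """Power iteration; adequate here because a correlation matrix is positive semidefinite."""
    n = len(matrix)
    vec = [1.0 / (k + 1) for k in range(n)]  # deterministic, non-uniform start
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0:
            break
        vec = [v / norm for v in new]
    return vec

def split_compartments(signal):
    """Label each bin +1 or -1 by the sign of its leading-eigenvector entry.
    The orientation is arbitrary, so labels are flipped to make bin 0 equal +1."""
    vec = leading_eigenvector(correlation_graph(signal))
    labels = [1 if v >= 0 else -1 for v in vec]
    if labels and labels[0] < 0:
        labels = [-label for label in labels]
    return labels

# Toy input (made up): 4 bins x 5 cells; bins 0-1 co-vary, bins 2-3 vary oppositely.
toy_signal = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 4, 3, 2, 1],
    [10, 8, 6, 4, 2],
]
labels = split_compartments(toy_signal)
```

Deciding which of the two groups corresponds to the active (A) compartment needs external information, such as gene density or overall accessibility, which this sketch does not attempt.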
Subjects
Chromatin , Chromosomes , Chromatin/genetics , DNA , Molecular Conformation , Epigenesis, Genetic
ABSTRACT
BACKGROUND: Alzheimer's disease (AD) is a devastating neurodegenerative disorder affecting 44 million people worldwide, leading to cognitive decline, memory loss, and significant impairment in daily functioning. Recent single-cell sequencing technology has revolutionized genetic and genomic resolution by enabling scientists to explore the diversity of gene expression patterns at the finest resolution. Most existing studies have focused solely on molecular perturbations within each cell, but cells live in microenvironments rather than as isolated entities. Here, we leveraged large-scale, publicly available single-nucleus RNA sequencing (snRNA-seq) data from the human prefrontal cortex to investigate cell-to-cell communication in healthy brains and its perturbation in AD. We uniformly processed the snRNA-seq data with strict QC and labeled canonical cell types consistent with the definitions from the BRAIN Initiative Cell Census Network. From ligand and receptor gene expression, we built a high-confidence cell-to-cell communication network to investigate signaling differences between AD and healthy brains. RESULTS: Specifically, we first performed broad communication pattern analyses to highlight that biologically related cell types in normal brains rely on largely overlapping signaling networks and that the AD brain exhibits irregular inter-mixing of cell types and signaling pathways. Secondly, we performed a more focused cell-type-centric analysis and found that excitatory neurons in AD have significantly increased their communications to inhibitory neurons, while inhibitory neurons and other non-neuronal cells globally decreased theirs to all cells. Then, we delved deeper with a signaling-centric view, showing that the canonical signaling pathways CSF, TGFβ, and CX3C are significantly dysregulated in their signaling to microglia/PVM, as is the WNT pathway in its signaling from endothelial to neuronal cells. Finally, after extracting 23 known AD risk genes, our intracellular communication analysis revealed a strong connection of extracellular ligand genes APP, APOE, and PSEN1 to intracellular AD risk genes TREM2, ABCA1, and APP in the communication from astrocytes and microglia to neurons. CONCLUSIONS: In summary, with the novel advances in single-cell sequencing technologies, we show that cellular signaling is regulated in a cell-type-specific manner and that improper regulation of extracellular signaling genes is linked to intracellular risk genes, giving a mechanistic intra- and inter-cellular picture of AD.
Subjects
Alzheimer Disease , Cell Communication , Single-Cell Analysis , Transcriptome , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Humans , Cell Communication/physiology , Single-Cell Analysis/methods , Brain/metabolism , Brain/pathology , Prefrontal Cortex/metabolism , Neurons/metabolism , Signal Transduction/physiology , Signal Transduction/genetics
ABSTRACT
Early and accurate detection of viruses in clinical and environmental samples is essential for effective public healthcare, treatment, and therapeutics. While PCR detects potential pathogens with high sensitivity, it is difficult to scale and requires knowledge of the exact sequence of the pathogen. With the advent of next-gen single-cell sequencing, it is now possible to scrutinize viral transcriptomics at the finest possible resolution: individual cells. This newfound ability opens new avenues to understand viral pathophysiology with unprecedented resolution. To leverage it, we propose an efficient and accurate computational pipeline, named Venus, for virus detection and integration site discovery in both single-cell and bulk-tissue RNA-seq data. Specifically, Venus addresses two main questions: is a tissue or cell type infected by any virus, or by a virus of interest? And, if infected, has the virus inserted itself into the human genome, and where? Our analysis can be broken into two parts: validation and discovery. Firstly, for validation, we applied Venus to well-studied viral datasets, such as HBV-associated hepatocellular carcinoma and HIV infection treated with antiretroviral therapy. Secondly, for discovery, we analyzed datasets such as HIV-infected neurological patients and deeply sequenced T-cells. We detected viral transcripts in the novel target of the brain and high-confidence integration sites in immune cells. In conclusion, here we describe Venus, publicly available software that we believe will be a valuable virus investigation tool for the scientific community at large.
Subjects
HIV Infections , Liver Neoplasms , Viruses , Humans , RNA-Seq , Sequence Analysis, RNA , Software
ABSTRACT
Next-generation sequencing (NGS) is an incredibly useful tool for genetic disease diagnosis. However, the most commonly used bioinformatics methods for analyzing sequence reads discriminate poorly between genomic regions with extensive sequence identity, such as gene families and pseudogenes, complicating diagnostics. This problem has been recognized for specific genes, including many involved in human disease, and diagnostic labs must perform additional costly steps to guarantee accurate diagnosis in these cases. Here we report a new data analysis method based on the comparison of read depth between highly homologous regions to identify misalignment. Analyzing six clinically important genes (CYP21A2, GBA, HBA1/2, PMS2, and SMN1), each exhibiting misalignment issues related to homology, we show that our technique can correctly identify potential misalignment events and be used to make appropriate calls. Combined with long-range PCR and/or MLPA orthogonal testing, our clinical laboratory can improve variant calling with minimal additional cost. We propose an accurate and cost-efficient NGS testing procedure that will benefit disease diagnostics, carrier screening, and research-based population studies.
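The idea of comparing read depth between homologous regions can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the function names, the expected depth share of 0.5 (equal copy number in a gene and its homolog), and the tolerance of 0.15 are all hypothetical choices.

```python
def depth_fraction(gene_depth, homolog_depth):
    """Share of the combined read depth assigned to the gene; None if no reads at all."""
    total = gene_depth + homolog_depth
    if total == 0:
        return None
    return gene_depth / total

def flag_possible_misalignment(gene_depths, homolog_depths, expected=0.5, tolerance=0.15):
    """Return indices of paired windows whose depth share deviates from the expected
    value by more than the tolerance, or that have no coverage in either region.
    Both the expected value and the tolerance are illustrative assumptions."""
    flagged = []
    for index, (gene, homolog) in enumerate(zip(gene_depths, homolog_depths)):
        fraction = depth_fraction(gene, homolog)
        if fraction is None or abs(fraction - expected) > tolerance:
            flagged.append(index)
    return flagged

# Made-up depths for four paired windows in a gene and its homologous region.
gene_depths = [100, 98, 40, 0]
homolog_depths = [100, 102, 160, 0]
flagged = flag_possible_misalignment(gene_depths, homolog_depths)
```

A flagged window is only a candidate: a skewed depth share does not by itself prove misalignment, which is why the abstract pairs the depth comparison with orthogonal testing.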
Subjects
Genetic Diseases, Inborn/diagnosis , High-Throughput Nucleotide Sequencing/methods , Algorithms , Humans , Polymerase Chain Reaction , Polymorphism, Single Nucleotide , Pseudogenes
ABSTRACT
MOTIVATION: In addition to alternative splicing, alternative polyadenylation has been identified as a critical and prevalent regulatory mechanism in human gene expression. However, the mechanism of alternative polyadenylation site selection and the factors involved are still largely unknown. RESULTS: We use the ENCODE data to scan DNA functional elements, including chromatin accessibility and histone modification, around transcript cleavage sites. Our results demonstrate that polyadenylation sites tend to be less sensitive to DNase I. However, these polyadenylation sites preferentially occur in nucleosome-depleted regions, indicating the involvement of chromatin higher-order structure rather than nucleosomes in the resultant lower chromatin accessibility. More interestingly, for genes using two polyadenylation sites, the distal sites show even lower chromatin accessibility compared with the proximal sites or the unique sites of genes using only one polyadenylation site. We also observe that the histone modification mark, histone H3 lysine 36 tri-methylation (H3K36Me3), exhibits different patterns around the cleavage sites of genes using multiple polyadenylation sites from those of genes using a single polyadenylation site. Surprisingly, the H3K36Me3 levels are comparable among the alternative polyadenylation sites themselves. In summary, polyadenylation and alternative polyadenylation are closely related to functional elements on the DNA level. CONTACT: liang.chen@usc.edu.
Subjects
Chromatin/chemistry , Histones/metabolism , Polyadenylation , Cell Line , Deoxyribonuclease I , Humans , K562 Cells , Nucleosomes/chemistry
ABSTRACT
Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multiomics datasets into a resource comprising >2.8 million nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550,000 cell type-specific regulatory elements and >1.4 million single-cell expression quantitative trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.
Subjects
Brain , Gene Regulatory Networks , Mental Disorders , Single-Cell Analysis , Humans , Aging/genetics , Brain/metabolism , Cell Communication/genetics , Chromatin/metabolism , Chromatin/genetics , Genomics , Mental Disorders/genetics , Prefrontal Cortex/metabolism , Prefrontal Cortex/physiology , Quantitative Trait Loci
ABSTRACT
Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet, little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multi-omics datasets into a resource comprising >2.8M nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550K cell-type-specific regulatory elements and >1.4M single-cell expression-quantitative-trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.
ABSTRACT
The purpose of this study was to compare the effects of nutritional supplement drinks (NSDs) and nutritional education (NE) on the nutritional status and physical performance of older nursing home residents who were at risk of malnutrition. This study was a clustered, randomized, parallel, multi-center clinical trial, with 107 participants more than 65 years old and at risk of malnutrition recruited from several nursing homes. Participants were divided into two groups: an NE group (n = 50) and an NSD group (n = 57). The NE group was given NE by a dietitian, while the NSD group, in addition to receiving NE, was provided with two packs of NSD (Mei Balance, Meiji Holdings, Tokyo, Japan) per day as a snack between meals and before bed. Anthropometric data, blood pressure, nutritional status, blood biochemical biomarkers, and physical performance were measured before and after the 12-week interventions. After 12 weeks of NE combined with NSD intervention, body weight, body-mass index, the mini nutritional assessment-short form (MNA-SF) score, walking speed, and SF-36 questionnaire score were improved in older nursing home residents at risk of malnutrition.
Subjects
Malnutrition , Nutritional Status , Humans , Aged , Nutrition Assessment , Malnutrition/prevention & control , Nursing Homes , Physical Functional Performance , Geriatric Assessment
ABSTRACT
Preterm birth (PTB) is the leading cause of infant deaths globally. Current clinical measures often fail to identify women who may deliver preterm. Therefore, accurate screening tools are imperative for early prediction of PTB. Here, we show that Raman spectroscopy is a promising tool for studying biological interfaces, and we examine differences in the maternal metabolome of the first trimester plasma of PTB patients and those who delivered at term (healthy). We identified fifteen statistically significant metabolites that are predictive of the onset of PTB. Mass spectrometry metabolomics validates the Raman findings, identifying key metabolic pathways that are enriched in PTB. We also show that patient clinical information alone and protein quantification of standard inflammatory cytokines both fail to identify PTB patients. We show for the first time that synergistic integration of Raman and clinical data guided by machine learning results in an unprecedented 85.1% accuracy of risk stratification of PTB in the first trimester, which is currently not possible clinically. Correlations between metabolites and clinical features highlight the body mass index and maternal age as contributors to metabolic rewiring. Our findings show that Raman spectral screening may complement current prenatal care for early prediction of PTB, and our approach can be translated to other patient-specific biological interfaces.
Subjects
Premature Birth , Pregnancy , Humans , Female , Infant, Newborn , Premature Birth/diagnosis , Premature Birth/prevention & control , Pregnancy Trimester, First , Spectrum Analysis, Raman , Metabolomics
ABSTRACT
iSARST is a web server for efficient protein structural similarity searches. It is a multi-processor, batch-processing and integrated implementation of several structural comparison tools and two database searching methods: SARST for common structural homologs and CPSARST for homologs with circular permutations. iSARST allows users to submit multiple PDB/SCOP entry IDs or an archive file containing many structures. After the target database is scanned using SARST/CPSARST, the ordering of hits is refined with conventional structure alignment tools such as FAST, TM-align and SAMO, which are run on a PC cluster. In this way, iSARST achieves a high running speed while preserving the high precision of the refinement engines. The final outputs include tables listing co-linear or circularly permuted homologs of the query proteins and a functional summary of the best hits. Superimposed structures can be examined through an interactive and informative visualization tool. iSARST provides the first batch mode structural comparison web service for both co-linear homologs and circular permutants. It can serve as a rapid annotation system for functionally unknown or hypothetical proteins, which are increasing rapidly in this post-genomics era. The server can be accessed at http://sarst.life.nthu.edu.tw/iSARST/.
Subjects
Software , Structural Homology, Protein , Databases, Protein , Internet , Systems Integration , User-Computer Interface
ABSTRACT
Circular permutation (CP) in a protein can be thought of as circularizing its sequence and then creating termini at a new location. Since the first observation of CP in 1979, a substantial number of studies have concluded that circular permutants (CPs) usually retain native structures and functions, sometimes with increased stability or functional diversity. Although this interesting property has made CP useful in much protein engineering and folding research, large-scale collections of CP-related information were not available until this study. Here we describe CPDB, the first CP DataBase. The organizational principle of CPDB is a hierarchical categorization in which pairs of circular permutants are grouped into CP clusters, which are further grouped into folds and in turn classes. Additions to CPDB include a useful set of tools and resources for the identification, characterization, comparison and visualization of CP. In addition, several viable CP site prediction methods are implemented and assessed in CPDB. This database can be useful in protein folding and evolution studies, the discovery of novel protein structural and functional relationships, and facilitating the production of new CPs with unique biotechnical or industrial interests. The CPDB database can be accessed at http://sarst.life.nthu.edu.tw/cpdb.
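The definition of circular permutation given above can be made concrete with a small sequence-level sketch. It is a toy illustration only: the function names are hypothetical, and the check requires exact sequence identity, whereas real circular permutants are generally recognized through structural similarity rather than identical sequences.

```python
def circular_permute(sequence, new_start):
    """Join the original termini and open new ones so that the residue
    at index new_start becomes the first residue of the permuted sequence."""
    if not sequence:
        return sequence
    cut = new_start % len(sequence)
    return sequence[cut:] + sequence[:cut]

def is_exact_circular_permutant(first, second):
    """True if second is a rotation of first (exact match only)."""
    return len(first) == len(second) and second in first + first

# Toy sequence; the letters stand for residues and are not a real protein.
original = "ABCDEFG"
permuted = circular_permute(original, 3)
```

Here `permuted` is `"DEFGABC"`: the segment that used to follow the new cut point now leads, and the original N-terminal segment follows it.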
Subjects
Databases, Protein , Proteins/chemistry , Computer Graphics , Internet , Protein Folding , Structural Homology, Protein , User-Computer Interface
ABSTRACT
When a potential disease-causing variant is detected in a proband, parental testing is used to determine the mode of inheritance. This study demonstrates that next-generation sequencing (NGS) is uniquely well suited for parental testing, in particular because of its ability to detect clinically relevant germline mosaicism. Parental variant testing by NGS was performed in a clinical laboratory for 1 year. The detection of mosaicism by NGS was compared with its detection by Sanger sequencing. Eight cases of previously unrevealed mosaicism were detected by NGS across eight different genes. Mosaic variants were differentiated from sequencing noise using custom bioinformatics analyses in combination with familial inheritance data and complementary Sanger sequencing. Sanger sequencing detected mosaic variants with allele fractions ≥8% by NGS, but could not detect mosaic variants below that level. Detection of germline mosaicism by NGS is invaluable to parents, providing a more accurate recurrence risk that can alter decisions on family planning and pregnancy management. Because NGS can also confirm parentage and increase scalability, it simultaneously streamlines and strengthens the variant curation process. These features make NGS the ideal method for parental testing, superior even to Sanger sequencing for most genomic loci.
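To make the allele-fraction comparison concrete, the sketch below computes a variant allele fraction from read counts and checks it against the 8% level at which, per the abstract, Sanger sequencing could still detect mosaic variants. The function names are hypothetical, the read counts are made up, and treating 8% as a hard cutoff is a simplification of what the study reports.

```python
SANGER_DETECTION_LIMIT = 0.08  # allele fraction reported in the abstract as Sanger's lower bound

def allele_fraction(ref_reads, alt_reads):
    """Fraction of reads supporting the variant allele; None if the site has no coverage."""
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return alt_reads / total

def likely_sanger_detectable(fraction, limit=SANGER_DETECTION_LIMIT):
    """True if the allele fraction is at or above the assumed Sanger detection limit."""
    return fraction is not None and fraction >= limit

# Made-up read counts for two hypothetical parental samples.
low_level = allele_fraction(ref_reads=95, alt_reads=5)    # 5% allele fraction
higher_level = allele_fraction(ref_reads=90, alt_reads=10)  # 10% allele fraction
```

Under these assumptions the 5% variant would be visible only by NGS, while the 10% variant could be followed up by either method. Separating a true low-level variant from sequencing noise requires the additional evidence the abstract describes, such as familial inheritance data.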