ABSTRACT
The growth of omic data presents evolving challenges in data manipulation, analysis and integration. Addressing these challenges, Bioconductor provides an extensive community-driven biological data analysis platform. Meanwhile, tidy R programming offers a revolutionary data organization and manipulation standard. Here we present the tidyomics software ecosystem, bridging Bioconductor to the tidy R paradigm. This ecosystem aims to streamline omic analysis, ease learning and encourage cross-disciplinary collaborations. We demonstrate the effectiveness of tidyomics by analyzing 7.5 million peripheral blood mononuclear cells from the Human Cell Atlas, spanning six data frameworks and ten analysis tools.
Subjects
Software, Humans, Computational Biology/methods, Mononuclear Leukocytes/metabolism, Mononuclear Leukocytes/cytology, Genomics/methods, Data Analysis
ABSTRACT
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.
Subjects
Benchmarking, RNA, RNA/genetics, RNA-Seq, Polyadenylation, RNA Sequence Analysis/methods
ABSTRACT
SUMMARY: RNA isoforms contribute to the diverse functionality of the proteins they encode within the cell. Visualizing how isoform expression differs across cell types and brain regions can inform our understanding of disease and of the potentially harmful gain or loss of functionality caused by alternative splicing. However, the extent to which this occurs in specific cell types and brain regions is largely unknown. This is the kind of information that ScisorWiz plots can provide in an informative and easily communicable manner. ScisorWiz lets users visualize specific genes across any number of cell types and provides various sorting options that offer different views of the data. ScisorWiz provides a clear picture of differential isoform expression through various clustering methods and highlights features such as alternative exons and single-nucleotide variants. Tools like ScisorWiz are key for interpreting single-cell isoform sequencing data. This tool applies to any single-cell long-read RNA sequencing data in any cell type, tissue or species. AVAILABILITY AND IMPLEMENTATION: Source code is available at http://github.com/ans4013/ScisorWiz. No new data were generated for this publication. Data used to generate figures were sourced from GEO accession GSE158450 and are available on GitHub as example data.
Subjects
Alternative Splicing, Software, Protein Isoforms/genetics, Protein Isoforms/metabolism, RNA Isoforms/metabolism, Exons, RNA Sequence Analysis
ABSTRACT
Endoplasmic reticulum (ER) stress response is an adaptive program to cope with cellular stress that disturbs the function and homeostasis of the ER, which commonly occurs during cancer progression to late stage. Late-stage cancers, mostly requiring chemotherapy, often develop treatment resistance. Chemoresistance has been linked to ER stress response; however, most of the evidence has come from studies that correlate the expression of stress markers with poor prognosis or demonstrate pro-apoptotic effects upon knockdown of stress-responsive genes. Since ER stress in cancers usually persists and is essentially not induced by genetic manipulations, we used low doses of ER stress inducers at levels that allowed cell adaptation to occur in order to investigate the effect of stress response on chemoresistance. We found that prolonged tolerable ER stress promotes mesenchymal-epithelial transition, slows cell-cycle progression, and delays the S-phase exit. Consequently, cisplatin-induced apoptosis was significantly decreased in stress-adapted cells, implying their acquisition of cisplatin resistance. Molecularly, we found that proliferating cell nuclear antigen (PCNA) ubiquitination and the expression of polymerase η, the main polymerase responsible for translesion synthesis across cisplatin-DNA damage, were up-regulated in ER stress-adaptive cells, and their enhanced cisplatin resistance was abrogated by the knockout of polymerase η. We also found that a fraction of p53 in stress-adapted cells was translocated to the nucleus, and that these cells exhibited a significant decline in the level of cisplatin-DNA damage. Consistently, we showed that the nuclear p53 coincided with strong positivity of glucose-regulated protein 78 (GRP78) on immunostaining of clinical biopsies, and the cisplatin-based chemotherapy was less effective for patients with high levels of ER stress. Taken together, this study shows that adaptation to ER stress enhances DNA repair and damage tolerance, through which stressed cells gain resistance to chemotherapeutics.
Subjects
Physiological Adaptation, Cisplatin/pharmacology, DNA Repair, DNA-Directed DNA Polymerase/metabolism, Antineoplastic Drug Resistance, Endoplasmic Reticulum Stress, Mouth Neoplasms/drug therapy, Antineoplastic Agents/pharmacology, Apoptosis, Cell Proliferation, DNA Damage, DNA Replication, DNA-Directed DNA Polymerase/genetics, Endoplasmic Reticulum Chaperone BiP, Humans, Mouth Neoplasms/metabolism, Mouth Neoplasms/pathology, Cultured Tumor Cells
ABSTRACT
The growth of omic data presents evolving challenges in data manipulation, analysis, and integration. Addressing these challenges, Bioconductor [1] provides an extensive community-driven biological data analysis platform. Meanwhile, tidy R programming [2] offers a revolutionary standard for data organisation and manipulation. Here, we present the tidyomics software ecosystem, bridging Bioconductor to the tidy R paradigm. This ecosystem aims to streamline omic analysis, ease learning, and encourage cross-disciplinary collaborations. We demonstrate the effectiveness of tidyomics by analysing 7.5 million peripheral blood mononuclear cells from the Human Cell Atlas [3], spanning six data frameworks and ten analysis tools.
ABSTRACT
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, and limitations, and consequently of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for seamless extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies. Furthermore, the containers and reproducible workflows generated in the course of this project can be seamlessly deployed and extended in the future to evaluate new methods or datasets.
ABSTRACT
In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNANexus Structural Variation hackathon. The goal of the hackathon was to advance research on structural variants (SVs) by prototyping and iterating on open-source software. This led to nine hackathon projects focused on diverse genomics research interests, including various SV discovery and genotyping methods, SV sequence reconstruction, and clinically relevant structural variation, including SARS-CoV-2 variants. Repositories for the projects that participated in the hackathon are available at https://github.com/collaborativebioinformatics.
Subjects
COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, Genomics, Software
ABSTRACT
BACKGROUND: Complex diseases develop through the combination of multiple factors and complicated interactions between them. Inflammation has recently been associated with many complex diseases and may cause long-term damage to the human body. In this study, we examined whether two types of complex disease, cerebrovascular disease (CVD) or major depression (MD), systematically altered the transcriptomes of non-diseased human tissues and whether inflammation is linked to identifiable molecular signatures, using post-mortem samples from the Genotype-Tissue Expression (GTEx) project. RESULTS: Following a series of differential expression analyses, dozens to hundreds of differentially expressed genes (DEGs) were identified in multiple tissues between subjects with and without a history of CVD or MD. DEGs from these disease-associated tissues (the visceral adipose, tibial artery, caudate, and spinal cord for CVD; and the hypothalamus, putamen, and spinal cord for MD) were further analyzed for functional enrichment. Many pathways associated with immunological events were enriched in the upregulated DEGs of the CVD-associated tissues, as were the neurological and metabolic pathways in DEGs of the MD-associated tissues. Eight gene-tissue pairs were found to overlap with those prioritized by our transcriptome-wide association studies, indicating a potential genetic effect on gene expression for circulating cytokine phenotypes. CONCLUSION: Cerebrovascular disease and major depression cause detectable changes in the gene expression of non-diseased tissues, suggesting that the long-term impact of diseases, lifestyles and environmental factors may together contribute to the appearance of "transcriptomic scars" on the human body. Furthermore, inflammation is probably one of the systemic and long-lasting effects of cerebrovascular events.