Results 1 - 8 of 8
1.
Bioinformatics ; 38(20): 4727-4734, 2022 10 14.
Article in English | MEDLINE | ID: mdl-36018233

ABSTRACT

MOTIVATION: Transcriptome-based gene co-expression analysis has become a standard procedure for structured and contextualized understanding and comparison of different conditions and phenotypes. Since large study designs with a broad variety of conditions are costly and laborious, extensive comparisons are hindered when utilizing only a single dataset. Thus, there is an increased need for tools that allow the integration of multiple transcriptomic datasets with subsequent joint analysis, which can provide a more systematic understanding of gene co-expression and co-functionality within and across conditions. To make such an integrative analysis accessible to a wide spectrum of users with differing levels of programming expertise, it is essential to provide user-friendliness and customizability as well as thorough documentation. RESULTS: This article introduces horizontal CoCena (hCoCena: horizontal construction of co-expression networks and analysis), an R-package for network-based co-expression analysis that allows the analysis of a single transcriptomic dataset as well as the joint analysis of multiple datasets. With hCoCena, we provide a freely available, user-friendly and adaptable tool for integrative multi-study or single-study transcriptomics analyses alongside extensive comparisons to other existing tools. AVAILABILITY AND IMPLEMENTATION: The hCoCena R-package is provided together with R Markdowns that implement an exemplary analysis workflow including extensive documentation and detailed descriptions of data structures and objects. Such efforts not only make the tool easy to use but also enable the seamless integration of user-written scripts and functions into the workflow, creating a tool that provides a clear design while remaining flexible and highly customizable. The package and additional information including an extensive Wiki are freely available on GitHub: https://github.com/MarieOestreich/hCoCena. The version at the time of writing has been added to Zenodo under the following link: https://doi.org/10.5281/zenodo.6911782. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Transcriptome , Gene Expression Profiling , Phenotype , Workflow
2.
J Cheminform ; 16(1): 26, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38444032

ABSTRACT

Autoencoders are frequently used to embed molecules for training of downstream deep learning models. However, evaluation of the chemical information quality in the latent spaces is lacking and the model architectures are often arbitrarily chosen. Unoptimized architectures may not only negatively affect latent space quality but also increase energy consumption during training, making the models unsustainable. We conducted systematic experiments to better understand how the autoencoder architecture affects the reconstruction and latent space quality and how it can be optimized towards the encoding task as well as energy consumption. We show that optimizing the architecture allows us to maintain the quality of a generic architecture while using 97% less data and reducing energy consumption by around 36%. We additionally observed that representing the molecules as SELFIES reduced the reconstruction performance compared to SMILES and that training with enumerated SMILES drastically improved latent space quality. Scientific Contribution: This work provides the first comprehensive systematic analysis of how choosing the autoencoder architecture affects the reconstruction performance of small molecules, the chemical information content of the latent space as well as the energy required for training. Demonstrated on the MOSES benchmarking dataset, it provides first valuable insights into how autoencoders for the embedding of small molecules can be designed to optimize their utility and simultaneously become more sustainable, both in terms of energy consumption as well as the required amount of training data. All code, data and model checkpoints are made available on Zenodo (Oestreich et al. Small molecule autoencoders: architecture engineering to optimize latent space utility and sustainability. Zenodo, 2024). Furthermore, the top models can be found on GitHub with scripts to encode custom molecules: https://github.com/MarieOestreich/small-molecule-autoencoders.

3.
STAR Protoc ; 5(1): 102922, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38427570

ABSTRACT

As the number and complexity of transcriptomic datasets increase, there is a rising demand for accessible and user-friendly analysis tools. Here, we present hCoCena (horizontal construction of co-expression networks and analysis), a toolbox facilitating the analysis of a single dataset, as well as the joint analysis of multiple datasets. We describe steps for workspace setup, formatting tables, data processing, and network integration. We then detail procedures for gene clustering, gene set enrichment analysis, and transcription factor enrichment analysis. For complete details on the use and execution of this protocol, please refer to Oestreich et al. (1).


Subject(s)
Gene Expression Profiling , Transcriptome , Transcriptome/genetics , Cluster Analysis , Transcription Factors
4.
Elife ; 11, 2022 08 31.
Article in English | MEDLINE | ID: mdl-36043458

ABSTRACT

Omics-based technologies are driving major advances in precision medicine, but efforts are still required to consolidate their use in drug discovery. In this work, we exemplify the use of multi-omics to support the development of 3-chloropiperidines, a new class of candidate anticancer agents. Combined analyses of transcriptome and chromatin accessibility elucidated the mechanisms underlying sensitivity to test agents. Furthermore, we implemented a new versatile strategy for the integration of RNA- and ATAC-seq (Assay for Transposase-Accessible Chromatin) data, able to accelerate and extend the standalone analyses of distinct omic layers. This platform guided the construction of a perturbation-informed basal signature that predicts cancer cell lines' sensitivity and can further direct compound development against specific tumor types. Overall, this approach offers a scalable pipeline to support the early phases of drug discovery, aid the understanding of mechanisms, and potentially inform the positioning of therapeutics in the clinic.


Subject(s)
Chromatin , Transcriptome , Precision Medicine , RNA , Transposases/metabolism
5.
EXCLI J ; 20: 1243-1260, 2021.
Article in English | MEDLINE | ID: mdl-34345236

ABSTRACT

An increasing amount of attention has been geared towards understanding the privacy risks that arise from sharing genomic data of human origin. Most of these efforts have focused on issues in the context of genomic sequence data, but the popularity of techniques for collecting other types of genome-related data has prompted researchers to investigate privacy concerns in a broader genomic context. In this review, we give an overview of different types of genome-associated data, their individual ways of revealing sensitive information, the motivation to share them as well as established and upcoming methods to minimize information leakage. We further discuss the specific threats that are being posed, who is at risk, and how the risk level compares to potential benefits, all while addressing the topic in the context of modern technology, methodology, and information sharing culture. Additionally, we discuss the current legal situation regarding the sharing of genomic data in a selection of countries, evaluating the scope of their applicability as well as their limitations. We conclude this review by evaluating the developments required in the scientific field in the near future to improve and develop privacy-preserving data sharing techniques for the genomic context.

6.
ERJ Open Res ; 7(3), 2021 Jul.
Article in English | MEDLINE | ID: mdl-34527724

ABSTRACT

BACKGROUND: Immune cells play a major role in the pathogenesis of COPD. Changes in the distribution and cellular functions of major immune cells, such as alveolar macrophages (AMs) and neutrophils, are well known; however, their transcriptional reprogramming and contribution to the pathophysiology of COPD are still not fully understood. METHOD: To determine changes in transcriptional reprogramming and lipid metabolism in the major immune cell type within bronchoalveolar lavage fluid, we analysed whole transcriptomes and lipidomes of sorted CD45⁺Lin⁻HLA-DR⁺CD66b⁻Autofluorescenceʰⁱ AMs from controls and COPD patients. RESULTS: We observed global transcriptional reprogramming featuring a spectrum of activation states, including pro- and anti-inflammatory signatures. We further detected significant changes between COPD patients and controls in genes involved in lipid metabolism, such as fatty acid biosynthesis in GOLD2 patients. Based on these findings, assessment of a total of 202 lipid species in sorted AMs revealed changes of cholesteryl esters, monoacylglycerols and phospholipids in a disease grade-dependent manner. CONCLUSIONS: Transcriptome and lipidome profiling of COPD AMs revealed GOLD grade-dependent changes, such as in cholesterol metabolism and interferon-α and -γ responses.

7.
Genome Med ; 13(1): 7, 2021 01 13.
Article in English | MEDLINE | ID: mdl-33441124

ABSTRACT

BACKGROUND: The SARS-CoV-2 pandemic is currently leading to increasing numbers of COVID-19 patients all over the world. Clinical presentations range from asymptomatic, mild respiratory tract infection, to severe cases with acute respiratory distress syndrome, respiratory failure, and death. Reports on a dysregulated immune system in the severe cases call for a better characterization and understanding of the changes in the immune system. METHODS: In order to dissect COVID-19-driven immune host responses, we performed RNA-seq of whole blood cell transcriptomes and granulocyte preparations from mild and severe COVID-19 patients and analyzed the data using a combination of conventional and data-driven co-expression analysis. Additionally, publicly available data were used to distinguish COVID-19 from other diseases. Reverse drug target prediction was used to identify known or novel drug candidates based on the data-driven findings. RESULTS: Here, we profiled whole blood transcriptomes of 39 COVID-19 patients and 10 control donors enabling a data-driven stratification based on molecular phenotype. Neutrophil activation-associated signatures were prominently enriched in severe patient groups, which was corroborated in whole blood transcriptomes from an independent second cohort of 30 as well as in granulocyte samples from a third cohort of 16 COVID-19 patients (44 samples). Comparison of COVID-19 blood transcriptomes with those of a collection of over 3100 samples derived from 12 different viral infections, inflammatory diseases, and independent control samples revealed highly specific transcriptome signatures for COVID-19. Further, stratified transcriptomes predicted patient subgroup-specific drug candidates targeting the dysregulated systemic immune response of the host. CONCLUSIONS: Our study provides novel insights into the distinct molecular subgroups or phenotypes that are not simply explained by clinical parameters. We show that whole blood transcriptomes are extremely informative for COVID-19 since they capture granulocytes, which are major drivers of disease severity.


Subject(s)
COVID-19/pathology , Neutrophils/metabolism , Transcriptome , Antiviral Agents/therapeutic use , COVID-19/virology , Case-Control Studies , Down-Regulation , Drug Repositioning , Humans , Neutrophils/cytology , Neutrophils/immunology , Phenotype , Principal Component Analysis , RNA/blood , RNA/chemistry , RNA/metabolism , Sequence Analysis, RNA , Severity of Illness Index , Up-Regulation , COVID-19 Drug Treatment
8.
Cell Rep ; 29(5): 1221-1235.e5, 2019 10 29.
Article in English | MEDLINE | ID: mdl-31665635

ABSTRACT

Tumor-associated macrophages (TAMs) are frequently the most abundant immune cells in cancers and are associated with poor survival. Here, we generated TAM molecular signatures from K14cre;Cdh1flox/flox;Trp53flox/flox (KEP) and MMTV-NeuT (NeuT) transgenic mice that resemble human invasive lobular carcinoma (ILC) and HER2+ tumors, respectively. Determination of TAM-specific signatures requires comparison with healthy mammary tissue macrophages to avoid overestimation of gene expression differences. TAMs from the two models feature a distinct transcriptomic profile, suggesting that the cancer subtype dictates their phenotype. The KEP-derived signature reliably correlates with poor overall survival in ILC but not in triple-negative breast cancer patients, indicating that translation of murine TAM signatures to patients is cancer subtype dependent. Collectively, we show that a transgenic mouse tumor model can yield a TAM signature relevant for human breast cancer outcome prognosis and provide a generalizable strategy for determining and applying immune cell signatures provided the murine model reflects the human disease.


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Gene Expression Profiling , Macrophages/metabolism , Mammary Neoplasms, Animal/pathology , Transcription, Genetic , Animals , Carcinogenesis/genetics , Carcinogenesis/pathology , Disease Models, Animal , Female , Gene Expression Regulation, Neoplastic , Humans , Mammary Neoplasms, Animal/genetics , Mice, Inbred BALB C , Mice, Transgenic , Phenotype , Prognosis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Survival Analysis , Transcriptome/genetics , Treatment Outcome