Search | VHL Regional Portal

Spatiotemporal, optogenetic control of gene expression in organoids.

Legnini, Ivano; Emmenegger, Lisa; Zappulo, Alessandra; Rybak-Wolf, Agnieszka; Wurmus, Ricardo; Martinez, Anna Oliveras; Jara, Cledi Cerda; Boltengagen, Anastasiya; Hessler, Talé; Mastrobuoni, Guido; Kempa, Stefan; Zinzen, Robert; Woehler, Andrew; Rajewsky, Nikolaus.

Nat Methods ; 20(10): 1544-1552, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37735569

ABSTRACT

Organoids derived from stem cells have become an increasingly important tool for studying human development and modeling disease. However, methods are still needed to control and study spatiotemporal patterns of gene expression in organoids. Here we combined optogenetics and gene perturbation technologies to activate or knock-down RNA of target genes in programmable spatiotemporal patterns. To illustrate the usefulness of our approach, we locally activated Sonic Hedgehog (SHH) signaling in an organoid model for human neurodevelopment. Spatial and single-cell transcriptomic analyses showed that this local induction was sufficient to generate stereotypically patterned organoids and revealed new insights into SHH's contribution to gene regulation in neurodevelopment. With this study, we propose optogenetic perturbations in combination with spatial transcriptomics as a powerful technology to reprogram and study cell fates and tissue patterning in organoids.

Subject(s)

Hedgehog Proteins , Optogenetics , Humans , Hedgehog Proteins/metabolism , Organoids/metabolism , Cell Differentiation , Gene Expression

SARS-CoV-2 infection dynamics revealed by wastewater sequencing analysis and deconvolution.

Schumann, Vic-Fabienne; de Castro Cuadrat, Rafael Ricardo; Wyler, Emanuel; Wurmus, Ricardo; Deter, Aylina; Quedenau, Claudia; Dohmen, Jan; Faxel, Miriam; Borodina, Tatiana; Blume, Alexander; Freimuth, Jonas; Meixner, Martin; Grau, José Horacio; Liere, Karsten; Hackenbeck, Thomas; Zietzschmann, Frederik; Gnirss, Regina; Böckelmann, Uta; Uyar, Bora; Franke, Vedran; Barke, Niclas; Altmüller, Janine; Rajewsky, Nikolaus; Landthaler, Markus; Akalin, Altuna.

Sci Total Environ ; 853: 158931, 2022 Dec 20.

Article in English | MEDLINE | ID: mdl-36228784

ABSTRACT

The use of RNA sequencing from wastewater samples is a valuable way to estimate infection dynamics and circulating lineages of SARS-CoV-2. This approach is independent from testing individuals and can therefore become the key tool to monitor this and potentially other viruses. However, it is equally important to develop easily accessible and scalable tools which can highlight critical changes in infection rates and dynamics over time across different locations given sequencing data from wastewater. Here, we provide an analysis of lineage dynamics in Berlin and New York City using wastewater sequencing and present PiGx SARS-CoV-2, a highly reproducible computational analysis pipeline with comprehensive reports. This end-to-end pipeline includes all steps from raw data to shareable reports, additional taxonomic analysis, deconvolution and geospatial time series analyses. Using simulated datasets (in silico generated and spiked-in samples) we could demonstrate the accuracy of our pipeline calculating proportions of Variants of Concern (VOC) from environmental as well as pre-mixed samples (spiked-in). By applying our pipeline to a dataset of wastewater samples from Berlin between February 2021 and January 2022, we could reconstruct the emergence of B.1.1.7 (alpha) in February/March 2021 and the replacement dynamics from B.1.617.2 (delta) to BA.1 and BA.2 (omicron) during the winter of 2021/2022. Using data from very short reads generated in an industrial scale setting, we could see even higher accuracy in our deconvolution. Lastly, using a targeted sequencing dataset from New York City (receptor-binding-domain (RBD) only), we could reproduce the results recovering the proportions of the so-called cryptic lineages shown in the original study. Overall, our study provides an in-depth analysis reconstructing virus lineage dynamics from wastewater. While applying our tool to a wide range of different datasets (from different types of wastewater sample locations and sequenced with different methods), we show that PiGx SARS-CoV-2 can be used to identify new mutations and detect any emerging new lineages in a highly automated and scalable way. Our approach can support efforts to establish continuous monitoring and early-warning projects for detecting SARS-CoV-2 or any other pathogen.

Subject(s)

COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Wastewater , New York City , Mannosyltransferases

Scalable Workflows and Reproducible Data Analysis for Genomics.

Strozzi, Francesco; Janssen, Roel; Wurmus, Ricardo; Crusoe, Michael R; Githinji, George; Di Tommaso, Paolo; Belhachemi, Dominique; Möller, Steffen; Smant, Geert; de Ligt, Joep; Prins, Pjotr.

Methods Mol Biol ; 1910: 723-745, 2019.

Article in English | MEDLINE | ID: mdl-31278683

ABSTRACT

Biological, clinical, and pharmacological research now often involves analyses of genomes, transcriptomes, proteomes, and interactomes, within and between individuals and across species. Due to large volumes, the analysis and integration of data generated by such high-throughput technologies have become computationally intensive, and analysis can no longer happen on a typical desktop computer. In this chapter we show how to describe and execute the same analysis using a number of workflow systems and how these follow different approaches to tackle execution and reproducibility issues. We show how any researcher can create a reusable and reproducible bioinformatics pipeline that can be deployed and run anywhere. We show how to create a scalable, reusable, and shareable workflow using four different workflow engines: the Common Workflow Language (CWL), Guix Workflow Language (GWL), Snakemake, and Nextflow, each of which can be run in parallel. We show how to bundle a number of tools used in evolutionary biology by using Debian, GNU Guix, and Bioconda software distributions, along with the use of container systems, such as Docker, GNU Guix, and Singularity. Together these distributions represent the overall majority of software packages relevant for biology, including PAML, Muscle, MAFFT, MrBayes, and BLAST. By bundling software in lightweight containers, they can be deployed on a desktop, in the cloud, and, increasingly, on compute clusters. By bundling software through these public software distributions, and by creating reproducible and shareable pipelines using these workflow engines, not only do bioinformaticians have to spend less time reinventing the wheel, but we also get closer to the ideal of making science reproducible. The examples in this chapter allow a quick comparison of different solutions.

Subject(s)

Computational Biology , Genomics , Big Data , Biological Evolution , Cloud Computing , Computational Biology/methods , Data Analysis , Genomics/methods , Humans , Reproducibility of Results , Software , Workflow

HOT or not: examining the basis of high-occupancy target regions.

Wreczycka, Katarzyna; Franke, Vedran; Uyar, Bora; Wurmus, Ricardo; Bulut, Selman; Tursun, Baris; Akalin, Altuna.

Nucleic Acids Res ; 47(11): 5735-5745, 2019 06 20.

Article in English | MEDLINE | ID: mdl-31114922

ABSTRACT

High-occupancy target (HOT) regions are segments of the genome with an unusually high number of transcription factor binding sites. These regions are observed in multiple species and thought to have biological importance due to high transcription factor occupancy. Furthermore, they coincide with house-keeping gene promoters and consequently associated genes are stably expressed across multiple cell types. Despite these features, HOT regions are solely defined using ChIP-seq experiments and shown to lack canonical motifs for transcription factors that are thought to be bound there. Although ChIP-seq experiments are the gold standard for finding genome-wide binding sites of a protein, they are not noise free. Here, we show that HOT regions are likely to be ChIP-seq artifacts and they are similar to previously proposed 'hyper-ChIPable' regions. Using ChIP-seq data sets for knocked-out transcription factors, we demonstrate the presence of false positive signals on HOT regions. We observe sequence characteristics and genomic features that are discriminatory of HOT regions, such as GC/CpG-rich k-mers, enrichment of RNA-DNA hybrids (R-loops) and DNA tertiary structures (G-quadruplex DNA). The artificial ChIP-seq enrichment on HOT regions could be associated with these discriminatory features. Furthermore, we propose strategies to deal with such artifacts for future ChIP-seq studies.

Subject(s)

Binding Sites , Chromatin Immunoprecipitation/methods , Promoter Regions, Genetic , Transcription Factors/chemistry , Amino Acid Motifs , Animals , Artifacts , Caenorhabditis elegans , DNA/chemistry , Drosophila melanogaster , False Positive Reactions , G-Quadruplexes , Genome , Genome, Human , Genomics , Humans , Mice , Protein Binding , Protein Domains , RNA/chemistry , Sequence Analysis, DNA

Global identification of functional microRNA-mRNA interactions in Drosophila.

Wessels, Hans-Hermann; Lebedeva, Svetlana; Hirsekorn, Antje; Wurmus, Ricardo; Akalin, Altuna; Mukherjee, Neelanjan; Ohler, Uwe.

Nat Commun ; 10(1): 1626, 2019 04 09.

Article in English | MEDLINE | ID: mdl-30967537

ABSTRACT

MicroRNAs (miRNAs) are key mediators of post-transcriptional gene expression silencing. So far, no comprehensive experimental annotation of functional miRNA target sites exists in Drosophila. Here, we generated a transcriptome-wide in vivo map of miRNA-mRNA interactions in Drosophila melanogaster, making use of single nucleotide resolution in Argonaute1 (AGO1) crosslinking and immunoprecipitation (CLIP) data. Absolute quantification of cellular miRNA levels presents the miRNA pool in Drosophila cell lines to be more diverse than previously reported. Benchmarking two CLIP approaches, we identify a similar predictive potential to unambiguously assign thousands of miRNA-mRNA pairs from AGO1 interaction data at unprecedented depth, achieving higher signal-to-noise ratios than with computational methods alone. Quantitative RNA-seq and sub-codon resolution ribosomal footprinting data upon AGO1 depletion enabled the determination of miRNA-mediated effects on target expression and translation. We thus provide the first comprehensive resource of miRNA target sites and their quantitative functional impact in Drosophila.

Subject(s)

Argonaute Proteins/genetics , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Gene Expression Regulation , MicroRNAs/metabolism , RNA, Messenger/metabolism , Animals , MicroRNAs/genetics , MicroRNAs/isolation & purification , RNA, Messenger/genetics , RNA, Messenger/isolation & purification , Sequence Analysis, RNA , Transcriptome/genetics

Reproducible inference of transcription factor footprints in ATAC-seq and DNase-seq datasets using protocol-specific bias modeling.

Karabacak Calviello, Aslihan; Hirsekorn, Antje; Wurmus, Ricardo; Yusuf, Dilmurat; Ohler, Uwe.

Genome Biol ; 20(1): 42, 2019 02 21.

Article in English | MEDLINE | ID: mdl-30791920

ABSTRACT

BACKGROUND: DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq. RESULTS: Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite the differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases in conjunction with the sequence content of TFBSs impact the discrimination of footprint from the background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints. CONCLUSIONS: We demonstrate that the impact of bias correction on footprinting performance is greater for DNase-seq than for ATAC-seq and that DNase-seq footprinting leads to better performance. It is possible to infer concordant footprints by using replicates, highlighting the importance of reproducibility assessment. The results presented here provide an overview of the advantages and limitations of footprinting analyses using ATAC-seq and DNase-seq.

Subject(s)

DNA Footprinting , Genomics/methods , Sequence Analysis, DNA/methods , Transcription Factors/metabolism , Gene Library , HEK293 Cells , Humans , K562 Cells

PiGx: reproducible genomics analysis pipelines with GNU Guix.

Wurmus, Ricardo; Uyar, Bora; Osberg, Brendan; Franke, Vedran; Gosdschan, Alexander; Wreczycka, Katarzyna; Ronen, Jonathan; Akalin, Altuna.

Gigascience ; 7(12), 2018 12 01.

Article in English | MEDLINE | ID: mdl-30277498

ABSTRACT

In bioinformatics, as well as other computationally intensive research fields, there is a need for workflows that can reliably produce consistent output, from known sources, independent of the software environment or configuration settings of the machine on which they are executed. Indeed, this is essential for controlled comparison between different observations and for the wider dissemination of workflows. However, providing this type of reproducibility and traceability is often complicated by the need to accommodate the myriad dependencies included in a larger body of software, each of which generally comes in various versions. Moreover, in many fields (bioinformatics being a prime example), these versions are subject to continual change due to rapidly evolving technologies, further complicating problems related to reproducibility. Here, we propose a principled approach for building analysis pipelines and managing their dependencies with GNU Guix. As a case study to demonstrate the utility of our approach, we present a set of highly reproducible pipelines called PiGx for the analysis of RNA sequencing, chromatin immunoprecipitation sequencing, bisulfite-treated DNA sequencing, and single-cell resolution RNA sequencing. All pipelines process raw experimental data and generate reports containing publication-ready plots and figures, with interactive report elements and standard observables. Users may install these highly reproducible packages and apply them to their own datasets without any special computational expertise beyond the use of the command line. We hope such a toolkit will provide immediate benefit to laboratory workers wishing to process their own datasets or bioinformaticians seeking to automate all, or parts of, their analyses. In the long term, we hope our approach to reproducibility will serve as a blueprint for reproducible workflows in other areas. Our pipelines, along with their corresponding documentation and sample reports, are available at http://bioinformatics.mdc-berlin.de/pigx.

Subject(s)

Genomics , User-Computer Interface , Chromatin Immunoprecipitation , Computational Biology , DNA Methylation , Promoter Regions, Genetic , Reproducibility of Results , Sequence Analysis, RNA , Single-Cell Analysis

RCAS: an RNA centric annotation system for transcriptome-wide regions of interest.

Uyar, Bora; Yusuf, Dilmurat; Wurmus, Ricardo; Rajewsky, Nikolaus; Ohler, Uwe; Akalin, Altuna.

Nucleic Acids Res ; 45(10): e91, 2017 Jun 02.

Article in English | MEDLINE | ID: mdl-28334930

ABSTRACT

In the field of RNA, the technologies for studying the transcriptome have created a tremendous potential for deciphering the puzzles of RNA biology. Along with the excitement, the unprecedented volume of RNA-related omics data is creating great challenges in bioinformatics analyses. Here, we present the RNA Centric Annotation System (RCAS), an R package, which is designed to ease the process of creating gene-centric annotations and analysis for the genomic regions of interest obtained from various RNA-based omics technologies. The design of RCAS is modular, which enables flexible usage and convenient integration with other bioinformatics workflows. RCAS is an R/Bioconductor package but we also created graphical user interfaces including a Galaxy wrapper and a stand-alone web service. The application of RCAS on published datasets shows that RCAS is not only able to reproduce published findings but also helps generate novel knowledge and hypotheses. The meta-gene profiles, gene-centric annotation, motif analysis and gene-set analysis provided by RCAS provide contextual knowledge which is necessary for understanding the functional aspects of different biological events that involve RNAs. In addition, the array of different interfaces and deployment options adds convenience of use for different levels of users. RCAS is available at http://bioconductor.org/packages/release/bioc/html/RCAS.html and http://rcas.mdc-berlin.de.

Subject(s)

Genome , Molecular Sequence Annotation/methods , RNA, Messenger/genetics , RNA-Binding Proteins/genetics , Transcriptome , User-Computer Interface , Animals , Base Sequence , Binding Sites , Chickens/genetics , Chickens/metabolism , Computational Biology/methods , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Humans , Protein Binding , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism

DoRiNA 2.0--upgrading the doRiNA database of RNA interactions in post-transcriptional regulation.

Blin, Kai; Dieterich, Christoph; Wurmus, Ricardo; Rajewsky, Nikolaus; Landthaler, Markus; Akalin, Altuna.

Nucleic Acids Res ; 43(Database issue): D160-7, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25416797

ABSTRACT

The expression of almost all genes in animals is subject to post-transcriptional regulation by RNA binding proteins (RBPs) and microRNAs (miRNAs). The interactions of both RBPs and miRNAs with mRNA can be mapped on a whole-transcriptome level using experimental and computational techniques established in the past years. The combined action of RBPs and miRNAs is thought to form a post-transcriptional regulatory code. Here we present doRiNA 2.0, available at http://dorina.mdc-berlin.de. In this highly improved new version, we have completely reworked the user interface and expanded the database to improve the usability of the website. Taking into account user feedback over the past years, the input forms for both the simple and the combinatorial search function have been streamlined and combined into a single web page that will also display the search results. In particular, custom uploads are one of the key new features in doRiNA 2.0. To enable the inclusion of doRiNA into third-party analysis pipelines, all operations are accessible via a REST API. Alternatively, local installations can be queried using a Python API. Both the web application and the APIs are available under an OSI-approved Open Source license that allows research and commercial access and re-use.

Subject(s)

Databases, Genetic , MicroRNAs/metabolism , RNA Processing, Post-Transcriptional , RNA-Binding Proteins/metabolism , Animals , Argonaute Proteins/metabolism , Exons , HEK293 Cells , Humans , Internet , Mice , Nuclear Proteins/metabolism , RNA/metabolism , RNA, Circular , Serine-Arginine Splicing Factors , Software
