ABSTRACT
The molecular causes and mechanisms of neurodegenerative diseases remain poorly understood. A growing number of single-cell studies have implicated various neural, glial, and immune cell subtypes in affecting the mammalian central nervous system in many age-related disorders. Integrating this body of transcriptomic evidence into a comprehensive and reproducible framework poses several computational challenges. Here, we introduce ZEBRA, a large single-cell and single-nucleus RNA-seq database. ZEBRA integrates and normalizes gene expression and metadata from 33 studies, encompassing 4.2 million human and mouse brain cells sampled from 39 brain regions. It incorporates samples from patients with neurodegenerative diseases such as Alzheimer's disease, Parkinson's disease, and multiple sclerosis, as well as samples from relevant mouse models. We employed scVI, a deep probabilistic auto-encoder model, to integrate the samples and curated both cell and sample metadata for downstream analysis. ZEBRA supports the exploration and comparison of cell-type- and disease-specific markers across sample conditions and brain regions, cell composition analysis, and gene-wise feature mappings. Our comprehensive molecular database facilitates the generation of data-driven hypotheses, enhancing our understanding of mammalian brain function during aging and disease. The data sets, along with an interactive database, are freely available at https://www.ccb.uni-saarland.de/zebra.
Subjects
Neurodegenerative Diseases; Single-Cell Analysis; Animals; Humans; Mice; Alzheimer Disease/metabolism; Brain/metabolism; Neurodegenerative Diseases/genetics; Parkinson Disease/metabolism; Transcriptome; Gene Expression
ABSTRACT
Single-cell RNA sequencing (RNA-seq) has revolutionized our understanding of cell biology and of developmental and pathophysiological molecular processes, paving the way toward novel diagnostic and therapeutic approaches. However, most gene regulatory processes at the single-cell level are still unknown, including post-transcriptional control conferred by microRNAs (miRNAs). As with established single-cell gene expression analysis, advanced computational expertise is required to comprehensively process newly emerging single-cell miRNA-seq datasets. A web server providing a workflow tailored for single-cell miRNA-seq data with a self-explanatory interface is currently not available. Here, we present SingmiR, enabling the rapid (pre-)processing and quantification of human miRNAs from noncoding single-cell samples. It performs read trimming for different library preparation protocols, generates automated quality control reports and provides feature-normalized count files. Numerous standard and advanced analyses such as dimension reduction, clustered feature heatmaps, sample correlation heatmaps and differential expression statistics are implemented. We aim to speed up the prototyping pipeline for biologists developing single-cell miRNA-seq protocols on small to medium-sized datasets. SingmiR is freely available to all users without the need for a login at https://www.ccb.uni-saarland.de/singmir.
Subjects
MicroRNAs; Sequence Analysis, RNA; Single-Cell Analysis; Software; MicroRNAs/genetics; MicroRNAs/metabolism; Single-Cell Analysis/methods; Humans; Sequence Analysis, RNA/methods; Gene Expression Profiling/methods; Sequence Alignment
ABSTRACT
The human microbiome has emerged as a rich source of diverse and bioactive natural products, harboring immense potential for therapeutic applications. To facilitate systematic exploration and analysis of its biosynthetic landscape, we present ABC-HuMi: the Atlas of Biosynthetic Gene Clusters (BGCs) in the Human Microbiome. ABC-HuMi integrates data from major human microbiome sequence databases and provides an expansive repository of BGCs compared to the limited coverage offered by existing resources. Employing state-of-the-art BGC prediction and analysis tools, our database ensures accurate annotation and enhanced prediction capabilities. ABC-HuMi empowers researchers with advanced browsing, filtering, and search functionality, enabling efficient exploration of the resource. At present, ABC-HuMi boasts a catalog of 19 218 representative BGCs derived from the human gut, oral, skin, respiratory and urogenital systems. By capturing the intricate biosynthetic potential across diverse human body sites, our database fosters profound insights into the molecular repertoire encoded within the human microbiome and offers a comprehensive resource for the discovery and characterization of novel bioactive compounds. The database is freely accessible at https://www.ccb.uni-saarland.de/abc_humi/.
Subjects
Biosynthetic Pathways; Databases, Genetic; Microbiota; Multigene Family; Humans; Biosynthetic Pathways/genetics; Computational Biology/instrumentation; Internet; Microbiota/genetics; Multigene Family/genetics; Metagenome/genetics
ABSTRACT
Quantifying microbiome species and composition from metagenomic assays is often challenging due to its time-consuming nature and computational complexity. In bioinformatics, k-mer-based approaches have long been established to expedite the analysis of large sequencing data and are now widely used to annotate metagenomic data. We make use of k-mer counting techniques for efficient and accurate compositional analysis of microbiota from whole metagenome sequencing. Mibianto addresses this challenge by operating directly on read files, without manual preprocessing or complete data exchange. It handles diverse sequencing platforms, including short single-end, paired-end, and long read technologies. Our sketch-based workflow significantly reduces the data volume transferred from the user to the server (up to 99.59% size reduction) to subsequently perform taxonomic profiling with enhanced efficiency and privacy. Mibianto offers functionality beyond k-mer quantification; it supports advanced community composition estimation, including diversity, ordination, and differential abundance analysis. Our tool aids in the standardization of computational workflows, thus supporting reproducibility of scientific sequencing studies. It is adaptable to small- and large-scale experimental designs and offers a user-friendly interface, making it an invaluable tool for both clinical and research-oriented metagenomic studies. Mibianto is freely available without the need for a login at https://www.ccb.uni-saarland.de/mibianto.
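The idea of transferring a compact k-mer sketch instead of complete read files can be illustrated with a minimal Python sketch. It is not Mibianto's implementation: the function names `count_kmers` and `bottom_sketch`, the hash function, the value of k, and the sketch size are all assumptions chosen for illustration. The block counts k-mers in a few toy reads and then keeps only the smallest hash values of the distinct k-mers (a bottom-s MinHash-style sketch), which is far smaller than the input.

```python
import hashlib
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def bottom_sketch(kmers, size):
    """Keep the `size` smallest 64-bit hash values of the distinct k-mers."""
    hashes = {
        int.from_bytes(hashlib.blake2b(kmer.encode(), digest_size=8).digest(), "big")
        for kmer in kmers
    }
    return sorted(hashes)[:size]


# Toy reads (hypothetical data); the second read contains an ambiguous base.
reads = ["ACGTACGTAC", "CGTACGTNAC"]
counts = count_kmers(reads, k=4)
sketch = bottom_sketch(counts, size=3)
print(counts)
print(sketch)
```

Only the sketch would need to leave the user's machine in such a design; how a real service derives taxonomic profiles from it is outside the scope of this example.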
Subjects
Metagenomics; Microbiota; Software; Metagenomics/methods; Microbiota/genetics; Humans; Metagenome; High-Throughput Nucleotide Sequencing/methods; Internet; Workflow; Sequence Analysis, DNA/methods; Computational Biology/methods
ABSTRACT
Selecting a proper genome assembly is key for downstream analysis in genomics studies. However, the availability of many genome assembly tools and the huge variety of their running parameters make this task challenging. The existing online evaluation tools are limited to specific taxa or provide just a one-sided view of the assembly quality. We present WebQUAST, a web server for multifaceted quality assessment and comparison of genome assemblies based on the state-of-the-art QUAST tool. The server is freely available at https://www.ccb.uni-saarland.de/quast/. WebQUAST can handle an unlimited number of genome assemblies and evaluate them against a user-provided or pre-loaded reference genome or in a completely reference-free fashion. We demonstrate key WebQUAST features in three common evaluation scenarios: assembly of an unknown species, a model organism, and a close variant of it.
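As a small illustration of reference-free assembly comparison, the following Python sketch computes N50, a standard contiguity metric that tools such as QUAST report alongside many others. The abstract does not name specific metrics, so the choice of N50, the function name `n50`, and the contig lengths are assumptions made purely for illustration; this is not QUAST code.

```python
def n50(contig_lengths):
    """Return the largest length L such that contigs of length >= L cover at least half of the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


# Two hypothetical assemblies of the same sample, given as contig lengths.
assembly_a = [80, 70, 50, 40, 30, 20]
assembly_b = [100, 20, 20, 10]
n50_a = n50(assembly_a)
n50_b = n50(assembly_b)
print(n50_a, n50_b)
```

A higher N50 alone does not make an assembly better, since misassemblies can inflate it, which is one reason a multifaceted report is preferable to a single number.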
Subjects
Genomics; Software; Genome; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Internet
ABSTRACT
MicroRNAs (miRNAs) are small non-coding RNAs that play a critical role in regulating diverse biological processes. Extracting functional insights from a list of miRNAs is challenging, as each miRNA can potentially interact with hundreds of genes. To address this challenge, we developed miEAA, a flexible and comprehensive miRNA enrichment analysis tool based on direct and indirect miRNA annotation. The latest release of miEAA includes a data warehouse of 19 miRNA repositories, covering 10 different organisms and 139 399 functional categories. We have added information on the cellular context of miRNAs, isomiRs, and high-confidence miRNAs to improve the accuracy of the results. We have also improved the representation of aggregated results, including interactive UpSet plots to aid users in understanding the interaction among enriched terms or categories. Finally, we demonstrate the functionality of miEAA in the context of ageing and highlight the importance of carefully considering the miRNA input list. miEAA is free to use and publicly available at https://www.ccb.uni-saarland.de/mieaa/.
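One common form of enrichment analysis is over-representation analysis, which asks whether a category contains more of the input miRNAs than expected by chance. The Python sketch below computes the corresponding one-sided hypergeometric p-value. It is a generic illustration, not the statistic or code used by miEAA; the function name `hypergeom_pvalue` and all numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(universe, category, selected, overlap):
    """P(X >= overlap) when `selected` items are drawn without replacement
    from `universe` items, of which `category` belong to the category."""
    upper = min(category, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(category, x) * comb(universe - category, selected - x)
        for x in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 miRNAs in the background, 5 annotated to a category,
# 4 miRNAs in the input list, 2 of which fall into the category.
p_value = hypergeom_pvalue(universe=20, category=5, selected=4, overlap=2)
print(p_value)
```

The choice of `universe` matters: using all known miRNAs rather than only those detectable in the experiment changes the p-value, which mirrors the abstract's point about carefully considering the input list.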
Subjects
MicroRNAs; Software; MicroRNAs/genetics; Databases, Nucleic Acid
ABSTRACT
A significant fraction of mature miRNA transcripts carries sequence and/or length variations, termed isomiRs. IsomiRs are differentially abundant in cell types, tissues, body fluids or patients' samples. Not surprisingly, multiple studies describe physiological and pathophysiological roles for them. Despite their importance, systematically collected and annotated isomiR information available in databases remains limited. We thus developed isomiRdb, a comprehensive resource that compiles miRNA expression data at isomiR resolution from various sources. We processed 42 499 human miRNA-seq datasets (5.9 × 10^11 sequencing reads) and consistently analyzed them using miRMaster and sRNAbench. Our database provides online access to the 90 483 most abundant isomiRs (>1 RPM in at least 1% of the samples) from 52 tissues and 188 cell types. Additionally, the full set of over 3 million detected isomiRs is available for download. Our resource can be queried at the sample, miRNA or isomiR level so users can quickly answer common questions about the presence/absence of a particular miRNA/isomiR in tissues of interest. Further, the database helps to identify whether a potentially interesting new isoform has been detected before and at what frequency. In addition to expression tables, isomiRdb can generate multiple interactive visualisations including violin plots and heatmaps. isomiRdb is free to use and publicly available at https://www.ccb.uni-saarland.de/isomirdb.
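The abundance criterion quoted above (>1 RPM in at least 1% of the samples) can be sketched in a few lines of Python. This is an illustrative reading of the criterion, not the isomiRdb pipeline; the function names, the per-sample normalization by total counts, and the toy data are assumptions.

```python
def reads_per_million(counts):
    """Scale the raw read counts of one sample so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: value * 1_000_000 / total for name, value in counts.items()}


def abundant_features(samples, min_rpm=1.0, min_fraction=0.01):
    """Return features exceeding `min_rpm` in at least `min_fraction` of the samples."""
    hits = {}
    for counts in samples:
        for name, rpm in reads_per_million(counts).items():
            if rpm > min_rpm:
                hits[name] = hits.get(name, 0) + 1
    needed = min_fraction * len(samples)
    return sorted(name for name, n in hits.items() if n >= needed)


# Four hypothetical samples with raw read counts per isomiR.
samples = [
    {"iso_a": 900_000, "iso_b": 99_999, "iso_c": 1},
    {"iso_a": 500_000, "iso_b": 0, "iso_c": 1_500_000},
    {"iso_a": 10, "iso_b": 0, "iso_c": 0},
    {"iso_a": 5, "iso_b": 0, "iso_c": 5},
]
# With only four samples, a 50% threshold is used here to make the filter visible.
kept = abundant_features(samples, min_rpm=1.0, min_fraction=0.5)
print(kept)
```

With tens of thousands of samples, a 1% threshold retains isomiRs that are abundant in only a few hundred samples, for example in a single tissue, while still removing sporadic sequencing noise.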
Subjects
Databases, Genetic; MicroRNAs; Humans; High-Throughput Nucleotide Sequencing; MicroRNAs/genetics; MicroRNAs/metabolism; Protein Isoforms/genetics; Sequence Analysis, RNA
ABSTRACT
Despite recent methodology and reference database improvements for taxonomic profiling tools, metagenomic assembly and genomic binning remain important pillars of metagenomic analysis workflows. When reference information is lacking, genomic binning is considered to be a state-of-the-art method in mixed culture metagenomic data analysis. In this light, our previously published tool BusyBee Web implements a composition-based binning method efficient enough to function as a rapid online utility. Handling assembled contigs and long nanopore-generated reads alike, the webserver provides a wide range of supplementary annotations and visualizations. Half a decade after the initial publication, we revisited existing functionality, added comprehensive visualizations, and increased the number of data analysis customization options for further experimentation. The webserver now allows for visualization-supported differential analysis of samples, which is computationally expensive and typically only performed in coverage-based binning methods. Further, users may now optionally check their uploaded samples for plasmid sequences using PLSDB as a reference database. Lastly, a new application programming interface with a supporting Python package was implemented to give power users fully automated access to the resource and to allow integration into existing workflows. The webserver is freely available at https://www.ccb.uni-saarland.de/busybee.
Subjects
Algorithms; Metagenome; Software; Metagenomics/methods; Workflow; Sequence Analysis, DNA
ABSTRACT
Plasmids are known to contain genes encoding virulence factors and antibiotic resistance mechanisms. Their relevance in metagenomic data processing is steadily growing. However, with the increasing popularity and scale of metagenomics experiments, the number of reported plasmids is rapidly growing as well, amassing a considerable number of false positives due to undetected misassemblies. Here, our previously published database PLSDB provides a reliable resource for researchers to quickly compare their sequences against selected and annotated previous findings. Within two years, the size of this resource has more than doubled from the initial 13,789 to now 34,513 entries over the course of eight regular data updates. For this update, we aggregated community feedback for major changes to the database featuring new analysis functionality as well as performance, quality, and accessibility improvements. New filtering steps, annotations, and preprocessing of existing records improve the quality of the provided data. Additionally, new features implemented in the web server ease user interaction and allow for a deeper understanding of custom uploaded sequences by visualizing similarity information. Lastly, an application programming interface was implemented along with a Python library to allow remote database queries in automated workflows. The latest release of PLSDB is freely accessible at https://www.ccb.uni-saarland.de/plsdb.
Subjects
Bacteria/genetics; Databases, Genetic; Plasmids/chemistry; User-Computer Interface; Actinobacteria/genetics; Actinobacteria/pathogenicity; Bacteria/classification; Bacteria/pathogenicity; Bacteroidetes/genetics; Bacteroidetes/pathogenicity; Drug Resistance, Microbial/genetics; Firmicutes/genetics; Firmicutes/pathogenicity; Internet; Metagenomics/methods; Molecular Sequence Annotation; Plasmids/classification; Plasmids/metabolism; Proteobacteria/genetics; Proteobacteria/pathogenicity; Spirochaetales/genetics; Spirochaetales/pathogenicity; Tenericutes/genetics; Tenericutes/pathogenicity; Virulence/genetics
ABSTRACT
With Aviator, we present a web service and repository that facilitates surveillance of online tools. Aviator consists of a user-friendly website and two modules: a general module based on literature mining and a manually curated module. The general module currently checks 9417 websites twice a day with respect to their availability and stores many features (frontend and backend response time, required RAM and size of the web page, security certificates, analytic tools and trackers embedded in the webpage, and others) in a data warehouse. Aviator is also equipped with analysis functionality; for example, authors can check and evaluate the availability of their own tools or those of their peers. Likewise, users can check the availability of a certain tool they intend to use in research or teaching to avoid including unstable tools. The curated section of Aviator offers additional services. We provide API snippets for common programming languages (Perl, PHP, Python, JavaScript) as well as OpenAPI documentation that authors can embed in the backend of their own web services to test their function automatically. We query the respective APIs twice a day and send automated notifications in case of an unexpected result. Naturally, the same analysis functionality as for the literature-based module is available for the curated section. Aviator can be used freely at https://www.ccb.uni-saarland.de/aviator.
Subjects
Computer Graphics; Software; Drug Repositioning; Humans; Internet; Melanoma/metabolism; Receptors, Odorant/metabolism; Signal Transduction; COVID-19 Drug Treatment
ABSTRACT
Analyzing all features of small non-coding RNA sequencing data can be demanding and challenging. To facilitate this process, we developed miRMaster. After the analysis of over 125 000 human samples and 1.5 trillion human small RNA reads over 4 years, we present miRMaster 2 with a wide range of updates and new features. We extended our reference data sets so that miRMaster 2 now supports the analysis of eight species (e.g. human, mouse, chicken, dog, cow) and 10 non-coding RNA classes (e.g. microRNAs, piRNAs, tRNAs, rRNAs, circRNAs). We also incorporated new downstream analysis modules such as batch effect analysis or sample embeddings using UMAP, and updated the annotation databases included by default (miRBase, Ensembl, GtRNAdb). To accommodate the increasing popularity of single-cell small RNA sequencing data, we incorporated a module for unique molecular identifier (UMI) processing. Further, the output tables and graphics have been improved based on user feedback, and new output formats that emerged in the community are now supported (e.g. miRGFF3). Finally, we integrated differential expression analysis with the miRNA enrichment analysis tool miEAA. miRMaster is freely available at https://www.ccb.uni-saarland.de/mirmaster2.
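The purpose of UMI processing can be shown with a minimal Python sketch: reads that share both the UMI and the insert sequence are treated as PCR duplicates of one molecule and counted once. This is a simplified illustration under the assumption of exact matching, not the miRMaster 2 module; the function name `collapse_umis` and the toy reads are hypothetical, and real pipelines often also tolerate sequencing errors within the UMI.

```python
def collapse_umis(reads):
    """Count unique (UMI, sequence) pairs per sequence so that PCR duplicates are counted once."""
    seen = set()
    counts = {}
    for umi, sequence in reads:
        key = (umi, sequence)
        if key in seen:
            continue  # same molecule amplified again
        seen.add(key)
        counts[sequence] = counts.get(sequence, 0) + 1
    return counts


# Hypothetical (UMI, insert sequence) pairs after adapter trimming.
reads = [
    ("AAAC", "TGAGGTAG"),
    ("AAAC", "TGAGGTAG"),  # PCR duplicate of the first read
    ("GGTA", "TGAGGTAG"),  # same sequence, different molecule
    ("AAAC", "TAGCTTAT"),  # same UMI, different sequence
]
molecule_counts = collapse_umis(reads)
print(molecule_counts)
```

Counting molecules instead of reads matters most in single-cell libraries, where few input molecules are amplified heavily and raw read counts would overstate expression.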
Subjects
RNA, Small Untranslated/chemistry; Sequence Analysis, RNA/methods; Animals; Cattle; Dementia/genetics; Dogs; Humans; Mice; MicroRNAs; RNA, Small Untranslated/metabolism; Rats; Software
ABSTRACT
MOTIVATION: Since the initial discovery of microRNAs as key post-transcriptional regulators in the 1990s, a total of 2656 mature microRNAs have been publicly described for Homo sapiens. As the discovery of new miRNAs is still ongoing, target identification remains an essential and challenging step preceding functional annotation analysis. One key challenge for researchers seems to be the selection of the most appropriate tool out of the large multiverse of published solutions for a given research study set-up. RESULTS: In this review, we collectively describe the field of in silico target prediction over the course of time and point out long-standing principles as well as recent developments. By compiling a catalog of characteristics of the 98 prediction methods and identifying common and exclusive traits, we signpost a simplified mechanism to address the problem of application selection. Going further, we devised interpretation strategies for common types of output as generated by frequently used computational methods. To this end, our work specifically aims to make prospective users aware of common mistakes and practical questions that arise during the application of target prediction tools. AVAILABILITY: An interactive implementation of our recommendations, including materials shown in the manuscript, is freely available at https://www.ccb.uni-saarland.de/mtguide.
Subjects
Computational Biology; Computer Simulation; Gene Expression Regulation; MicroRNAs; Computational Biology/methods; Prospective Studies; Software
ABSTRACT
This paper introduces both a hardware and a software system designed to allow low-cost electronic monitoring of social insects using RFID tags. Data formats for individual insect identification and the associated experiments are proposed to facilitate sharing of data from experiments conducted with this system. The antennas' configuration and their duty cycle ensure high detection rates. Other advantages and limitations of this system are discussed in detail in the paper.
Subjects
Animal Identification Systems/economics; Bees; Radio Frequency Identification Device/economics; Software/economics; Animals; Bees/classification
ABSTRACT
The identification of targetomes remains a challenge given the pleiotropic effect of miRNAs, the limited effects of miRNAs on individual targets, and the sheer number of estimated miRNA-target gene interactions (MTIs), which is around 44,571,700. Currently, targetome identification for single miRNAs relies on computational evidence and functional studies covering smaller numbers of targets. To ensure that the targetome analysis could be experimentally verified by functional assays, we employed a systematic approach and explored the targetomes of four miRNAs (miR-129-5p, miR-129-1-3p, miR-133b, and miR-873-5p) by analyzing 410 predicted target genes; both the miRNAs and the target genes were previously associated with Parkinson's disease (PD). After performing 13,536 transfections, we validated 442 of the 705 putative MTIs (62.7%) through dual luciferase reporter assays. These analyses increased the number of validated MTIs by at least 2.1-fold for miR-133b and by a maximum of 24.3-fold for miR-873-5p. Our study contributes to the experimental capture of miRNA targetomes by addressing i) the ratio of experimentally verified MTIs to predicted MTIs, ii) the sizes of disease-related miRNA targetomes, and iii) the density of MTI networks. A web service to support the analyses on the MTI level is available online (https://ccb-web.cs.uni-saarland.de/utr-seremato), and all the data have been added to the miRATBase database (https://ccb-web.cs.uni-saarland.de/miratbase).