Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 12 de 12
Filtrar
1.
Nucleic Acids Res ; 51(D1): D337-D344, 2023 01 06.
Artículo en Inglés | MEDLINE | ID: mdl-36399486

RESUMEN

The 5' and 3' untranslated regions of eukaryotic mRNAs (UTRs) play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization, and message stability. Since 1996, we have developed and maintained UTRdb, a specialized database of UTR sequences. Here we present UTRdb 2.0, a major update of UTRdb featuring an extensive collection of eukaryotic 5' and 3' UTR sequences, including over 26 million entries from over 6 million genes and 573 species, enriched with a curated set of functional annotations. Annotations include CAGE tags and polyA signals to label the completeness of 5' and 3'UTRs, respectively. In addition, uORFs and IRES are annotated in 5'UTRs as well as experimentally validated miRNA targets in 3'UTRs. Further annotations include evolutionarily conserved blocks, Rfam motifs, ADAR-mediated RNA editing events, and m6A modifications. A web interface allowing a flexible selection and retrieval of specific subsets of UTRs, selected according to a combination of criteria, has been implemented which also provides comprehensive download facilities. UTRdb 2.0 is accessible at http://utrdb.cloud.ba.infn.it/utrdb/.


Asunto(s)
Bases de Datos de Ácidos Nucleicos , Eucariontes , ARN Mensajero , Regiones no Traducidas , Regiones no Traducidas 3'/genética , Regiones no Traducidas 5' , Eucariontes/genética , Células Eucariotas/metabolismo , ARN Mensajero/genética , ARN Mensajero/metabolismo
2.
Nucleic Acids Res ; 49(D1): D1012-D1019, 2021 01 08.
Artículo en Inglés | MEDLINE | ID: mdl-33104797

RESUMEN

RNA editing is a relevant epitranscriptome phenomenon able to increase the transcriptome and proteome diversity of eukaryotic organisms. ADAR mediated RNA editing is widespread in humans in which millions of A-to-I changes modify thousands of primary transcripts. RNA editing has pivotal roles in the regulation of gene expression or modulation of the innate immune response or functioning of several neurotransmitter receptors. Massive transcriptome sequencing has fostered the research in this field. Nonetheless, different aspects of the RNA editing biology are still unknown and need to be elucidated. To support the study of A-to-I RNA editing we have updated our REDIportal catalogue raising its content to about 16 millions of events detected in 9642 human RNAseq samples from the GTEx project by using a dedicated pipeline based on the HPC version of the REDItools software. REDIportal now allows searches at sample level, provides overviews of RNA editing profiles per each RNAseq experiment, implements a Gene View module to look at individual events in their genic context and hosts the CLAIRE database. Starting from this novel version, REDIportal will start collecting non-human RNA editing changes for comparative genomics investigations. The database is freely available at http://srv00.recas.ba.infn.it/atlas/index.html.


Asunto(s)
Biología Computacional/métodos , Bases de Datos Genéticas , Regulación de la Expresión Génica , Proteoma/genética , Edición de ARN/genética , Transcriptoma/genética , Secuencia de Bases/genética , Curaduría de Datos/métodos , Minería de Datos/métodos , Perfilación de la Expresión Génica/métodos , Genómica/métodos , Humanos , Internet , Proteómica/métodos
3.
Bioinformatics ; 36(22-23): 5522-5523, 2021 Apr 01.
Artículo en Inglés | MEDLINE | ID: mdl-33346830

RESUMEN

SUMMARY: While over 200 000 genomic sequences are currently available through dedicated repositories, ad hoc methods for the functional annotation of SARS-CoV-2 genomes do not harness all currently available resources for the annotation of functionally relevant genomic sites. Here, we present CorGAT, a novel tool for the functional annotation of SARS-CoV-2 genomic variants. By comparisons with other state of the art methods we demonstrate that, by providing a more comprehensive and rich annotation, our method can facilitate the identification of evolutionary patterns in the genome of SARS-CoV-2. AVAILABILITYAND IMPLEMENTATION: Galaxy.http://corgat.cloud.ba.infn.it/galaxy; software: https://github.com/matteo14c/CorGAT/tree/Revision_V1; docker: https://hub.docker.com/r/laniakeacloud/galaxy_corgat. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

4.
Bioinformatics ; 36(24): 5590-5599, 2021 Apr 05.
Artículo en Inglés | MEDLINE | ID: mdl-33367501

RESUMEN

MOTIVATION: Clinical applications of genome re-sequencing technologies typically generate large amounts of data that need to be carefully annotated and interpreted to identify genetic variants potentially associated with pathological conditions. In this context, accurate and reproducible methods for the functional annotation and prioritization of genetic variants are of fundamental importance. RESULTS: In this article, we present VINYL, a flexible and fully automated system for the functional annotation and prioritization of genetic variants. Extensive analyses of both real and simulated datasets suggest that VINYL can identify clinically relevant genetic variants in a more accurate manner compared to equivalent state of the art methods, allowing a more rapid and effective prioritization of genetic variants in different experimental settings. As such we believe that VINYL can establish itself as a valuable tool to assist healthcare operators and researchers in clinical genomics investigations. AVAILABILITY AND IMPLEMENTATION: VINYL is available at http://beaconlab.it/VINYL and https://github.com/matteo14c/VINYL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

5.
BMC Bioinformatics ; 22(Suppl 15): 544, 2021 Nov 08.
Artículo en Inglés | MEDLINE | ID: mdl-34749633

RESUMEN

BACKGROUND: Improving the availability and usability of data and analytical tools is a critical precondition for further advancing modern biological and biomedical research. For instance, one of the many ramifications of the COVID-19 global pandemic has been to make even more evident the importance of having bioinformatics tools and data readily actionable by researchers through convenient access points and supported by adequate IT infrastructures. One of the most successful efforts in improving the availability and usability of bioinformatics tools and data is represented by the Galaxy workflow manager and its thriving community. In 2020 we introduced Laniakea, a software platform conceived to streamline the configuration and deployment of "on-demand" Galaxy instances over the cloud. By facilitating the set-up and configuration of Galaxy web servers, Laniakea provides researchers with a powerful and highly customisable platform for executing complex bioinformatics analyses. The system can be accessed through a dedicated and user-friendly web interface that allows the Galaxy web server's initial configuration and deployment. RESULTS: "Laniakea@ReCaS", the first instance of a Laniakea-based service, is managed by ELIXIR-IT and was officially launched in February 2020, after about one year of development and testing that involved several users. Researchers can request access to Laniakea@ReCaS through an open-ended call for use-cases. Ten project proposals have been accepted since then, totalling 18 Galaxy on-demand virtual servers that employ ~ 100 CPUs, ~ 250 GB of RAM and ~ 5 TB of storage and serve several different communities and purposes. Herein, we present eight use cases demonstrating the versatility of the platform. CONCLUSIONS: During this first year of activity, the Laniakea-based service emerged as a flexible platform that facilitated the rapid development of bioinformatics tools, the efficient delivery of training activities, and the provision of public bioinformatics services in different settings, including food safety and clinical research. Laniakea@ReCaS provides a proof of concept of how enabling access to appropriate, reliable IT resources and ready-to-use bioinformatics tools can considerably streamline researchers' work.


Asunto(s)
COVID-19 , Nube Computacional , Biología Computacional , Humanos , SARS-CoV-2 , Programas Informáticos
6.
BMC Bioinformatics ; 21(Suppl 10): 352, 2020 Aug 21.
Artículo en Inglés | MEDLINE | ID: mdl-32838759

RESUMEN

BACKGROUND: The advent of Next Generation Sequencing (NGS) technologies and the concomitant reduction in sequencing costs allows unprecedented high throughput profiling of biological systems in a cost-efficient manner. Modern biological experiments are increasingly becoming both data and computationally intensive and the wealth of publicly available biological data is introducing bioinformatics into the "Big Data" era. For these reasons, the effective application of High Performance Computing (HPC) architectures is becoming progressively more recognized also by bioinformaticians. Here we describe HPC resources provisioning pilot programs dedicated to bioinformaticians, run by the Italian Node of ELIXIR (ELIXIR-IT) in collaboration with CINECA, the main Italian supercomputing center. RESULTS: Starting from April 2016, CINECA and ELIXIR-IT launched the pilot Call "ELIXIR-IT HPC@CINECA", offering streamlined access to HPC resources for bioinformatics. Resources are made available either through web front-ends to dedicated workflows developed at CINECA or by providing direct access to the High Performance Computing systems through a standard command-line interface tailored for bioinformatics data analysis. This allows to offer to the biomedical research community a production scale environment, continuously updated with the latest available versions of publicly available reference datasets and bioinformatic tools. Currently, 63 research projects have gained access to the HPC@CINECA program, for a total handout of ~ 8 Millions of CPU/hours and, for data storage, ~ 100 TB of permanent and ~ 300 TB of temporary space. CONCLUSIONS: Three years after the beginning of the ELIXIR-IT HPC@CINECA program, we can appreciate its impact over the Italian bioinformatics community and draw some considerations. Several Italian researchers who applied to the program have gained access to one of the top-ranking public scientific supercomputing facilities in Europe. Those investigators had the opportunity to sensibly reduce computational turnaround times in their research projects and to process massive amounts of data, pursuing research approaches that would have been otherwise difficult or impossible to undertake. Moreover, by taking advantage of the wealth of documentation and training material provided by CINECA, participants had the opportunity to improve their skills in the usage of HPC systems and be better positioned to apply to similar EU programs of greater scale, such as PRACE. To illustrate the effective usage and impact of the resources awarded by the program - in different research applications - we report five successful use cases, which have already published their findings in peer-reviewed journals.


Asunto(s)
Biología Computacional , Metodologías Computacionales , Programas Informáticos , Algoritmos , Animales , Línea Celular , Bases de Datos Genéticas , Fusión Génica , Genoma , Humanos , Prunus persica/genética , Edición de ARN , Golondrinas/genética
7.
NAR Genom Bioinform ; 6(4): lqae140, 2024 Sep.
Artículo en Inglés | MEDLINE | ID: mdl-39391613

RESUMEN

Technological advances in high-throughput technologies improve our ability to explore the molecular mechanisms of life. Computational infrastructures for scientific applications fulfil a critical role in harnessing this potential. However, there is an ongoing need to improve accessibility and implement robust data security technologies to allow the processing of sensitive data, particularly human genetic data. Scientific clouds have emerged as a promising solution to meet these needs. We present three components of the Laniakea software stack, initially developed to support the provision of private on-demand Galaxy instances. These components can be adopted by providers of scientific cloud services built on the INDIGO PaaS layer. The Dashboard translates configuration template files into user-friendly web interfaces, enabling the easy configuration and launch of on-demand applications. The secret management and the encryption components, integrated within the Dashboard, support the secure handling of passphrases and credentials and the deployment of block-level encrypted storage volumes for managing sensitive data in the cloud environment. By adopting these software components, scientific cloud providers can develop convenient, secure and efficient on-demand services for their users.

8.
Methods Mol Biol ; 2584: 311-335, 2023.
Artículo en Inglés | MEDLINE | ID: mdl-36495458

RESUMEN

rCASC is a modular workflow providing an integrated environment for single-cell RNA-seq (scRNA-Seq) data analysis exploiting Docker containers to achieve functional and computational reproducibility. It was initially developed as an R package usable also through a Java GUI. However, the Java frontend cannot be employed when running rCASC on a remote server, a typical setup due to the significant computational resources commonly needed to analyze scRNA-Seq data.To allow the use of rCASC through a graphical user interface on the client side and to harness the many advantages provided by the Galaxy platform, we have made rCASC available as a Galaxy set of tools, also providing a dedicated public instance of Galaxy named "Galaxy-rCASC." To integrate rCASC into Galaxy, all its functions, originally implemented as a set of Docker containers to maximize reproducibility, have been extensively reworked to become independent from the R package functions that launch them in the original implementation. Furthermore, suitable Galaxy wrappers have been developed for most functions of rCASC. We provide a detailed reference document to the use of Galaxy-rCASC with insights and explanations on the platform functionalities, parameters, and output while guiding the reader through the typical rCASC analysis workflow of a scRNA-Seq dataset.


Asunto(s)
Análisis de Expresión Génica de una Sola Célula , Programas Informáticos , Humanos , Reproducibilidad de los Resultados , Análisis de Datos , Flujo de Trabajo , Análisis de la Célula Individual , Biología Computacional
9.
J Mol Biol ; 433(11): 166829, 2021 05 28.
Artículo en Inglés | MEDLINE | ID: mdl-33508309

RESUMEN

In diploid organisms, two copies of each allele are normally inherited from parents. Paternal and maternal alleles can be regulated and expressed unequally, which is referred to as allele-specific expression (ASE). In this work, we present aScan, a novel method for the identification of ASE from the analysis of matched individual genomic and RNA sequencing data. By performing extensive analyses of both real and simulated data, we demonstrate that aScan can correctly identify ASE with high accuracy and sensitivity in different experimental settings. Additionally, by applying our method to a small cohort of individuals that are not included in publicly available databases of human genetic variation, we outline the value of possible applications of ASE analysis in single individuals for deriving a more accurate annotation of "private" low-frequency genetic variants associated with regulatory effects on transcription. All in all, we believe that aScan will represent a beneficial addition to the set of bioinformatics tools for the analysis of ASE. Finally, while our method was initially conceived for the analysis of RNA-seq data, it can in principle be applied to any quantitative NGS assay for which matched genotypic and expression data are available. AVAILABILITY: aScan is currently available in the form of an open source standalone software package at: https://github.com/Federico77z/aScan/. aScan version 1.0.3, available at https://github.com/Federico77z/aScan/releases/tag/1.0.3, has been used for all the analyses included in this manuscript. A Docker image of the tool has also been made available at https://github.com/pmandreoli/aScanDocker.


Asunto(s)
Alelos , Regulación de la Expresión Génica , Programas Informáticos , Simulación por Computador , Bases de Datos Genéticas , Genes , Humanos , Especificidad de Órganos/genética , Polimorfismo Genético , Regiones Promotoras Genéticas/genética
10.
Nat Protoc ; 15(3): 1098-1131, 2020 03.
Artículo en Inglés | MEDLINE | ID: mdl-31996844

RESUMEN

RNA editing is a widespread post-transcriptional mechanism able to modify transcripts through insertions/deletions or base substitutions. It is prominent in mammals, in which millions of adenosines are deaminated to inosines by members of the ADAR family of enzymes. A-to-I RNA editing has a plethora of biological functions, but its detection in large-scale transcriptome datasets is still an unsolved computational task. To this aim, we developed REDItools, the first software package devoted to the RNA editing profiling in RNA-sequencing (RNAseq) data. It has been successfully used in human transcriptomes, proving the tissue and cell type specificity of RNA editing as well as its pervasive nature. Outcomes from large-scale REDItools analyses on human RNAseq data have been collected in our specialized REDIportal database, containing more than 4.5 million events. Here we describe in detail two bioinformatic procedures based on our computational resources, REDItools and REDIportal. In the first procedure, we outline a workflow to detect RNA editing in the human cell line NA12878, for which transcriptome and whole genome data are available. In the second procedure, we show how to identify dysregulated editing at specific recoding sites in post-mortem brain samples of Huntington disease donors. On a 64-bit computer running Linux with ≥32 GB of random-access memory (RAM), both procedures should take ~76 h, using 4 to 24 cores. Our protocols have been designed to investigate RNA editing in different organisms with available transcriptomic and/or genomic reads. Scripts to complete both procedures and a docker image are available at https://github.com/BioinfoUNIBA/REDItools.


Asunto(s)
Bases de Datos Genéticas , Edición de ARN , Programas Informáticos , Secuencia de Bases , Encéfalo , Biología Computacional/métodos , Genoma Humano , Secuenciación de Nucleótidos de Alto Rendimiento/métodos , Humanos , Enfermedad de Huntington/genética , Enfermedad de Huntington/patología , Análisis de Secuencia de ARN
11.
Gigascience ; 9(4)2020 04 01.
Artículo en Inglés | MEDLINE | ID: mdl-32252069

RESUMEN

BACKGROUND: While the popular workflow manager Galaxy is currently made available through several publicly accessible servers, there are scenarios where users can be better served by full administrative control over a private Galaxy instance, including, but not limited to, concerns about data privacy, customisation needs, prioritisation of particular job types, tools development, and training activities. In such cases, a cloud-based Galaxy virtual instance represents an alternative that equips the user with complete control over the Galaxy instance itself without the burden of the hardware and software infrastructure involved in running and maintaining a Galaxy server. RESULTS: We present Laniakea, a complete software solution to set up a "Galaxy on-demand" platform as a service. Building on the INDIGO-DataCloud software stack, Laniakea can be deployed over common cloud architectures usually supported both by public and private e-infrastructures. The user interacts with a Laniakea-based service through a simple front-end that allows a general setup of a Galaxy instance, and then Laniakea takes care of the automatic deployment of the virtual hardware and the software components. At the end of the process, the user gains access with full administrative privileges to a private, production-grade, fully customisable, Galaxy virtual instance and to the underlying virtual machine (VM). Laniakea features deployment of single-server or cluster-backed Galaxy instances, sharing of reference data across multiple instances, data volume encryption, and support for VM image-based, Docker-based, and Ansible recipe-based Galaxy deployments. A Laniakea-based Galaxy on-demand service, named Laniakea@ReCaS, is currently hosted at the ELIXIR-IT ReCaS cloud facility. CONCLUSIONS: Laniakea offers to scientific e-infrastructures a complete and easy-to-use software solution to provide a Galaxy on-demand service to their users. Laniakea-based cloud services will help in making Galaxy more accessible to a broader user base by removing most of the burdens involved in deploying and running a Galaxy service. In turn, this will facilitate the adoption of Galaxy in scenarios where classic public instances do not represent an optimal solution. Finally, the implementation of Laniakea can be easily adapted and expanded to support different services and platforms beyond Galaxy.


Asunto(s)
Biología Computacional/tendencias , Programas Informáticos , Flujo de Trabajo , Nube Computacional , Interfaz Usuario-Computador
12.
Sci Rep ; 9(1): 17550, 2019 11 26.
Artículo en Inglés | MEDLINE | ID: mdl-31772190

RESUMEN

Reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) is an accurate and fast method to measure gene expression. Reproducibility of the analyses is the main limitation of RT-qPCR experiments. Galaxy is an open, web-based, genomic workbench for a reproducible, transparent, and accessible science. Our aim was developing a new Galaxy tool for the analysis of RT-qPCR expression data. Our tool was developed using Galaxy workbench version 19.01 and functions implemented in several R packages. We developed PIPE-T, a new Galaxy tool implementing a workflow, which offers several options for parsing, filtering, normalizing, imputing, and analyzing RT-qPCR data. PIPE-T requires two input files and returns seven output files. We tested the ability of PIPE-T to analyze RT-qPCR data on two example datasets available in the gene expression omnibus repository. In both cases, our tool successfully completed execution returning expected results. PIPE-T can be easily installed from the Galaxy main tool shed or from Docker. Source code, step-by-step instructions, and example files are available on GitHub to assist new users to install, execute, and test PIPE-T. PIPE-T is a new tool suitable for the reproducible, transparent, and accessible analysis of RT-qPCR expression data.


Asunto(s)
Interpretación Estadística de Datos , Expresión Génica , Reacción en Cadena en Tiempo Real de la Polimerasa , Programas Informáticos , Biología Computacional/métodos , Genómica/métodos , Reacción en Cadena en Tiempo Real de la Polimerasa/métodos , Transcriptoma
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA