1.
Nucleic Acids Res ; 49(D1): D1012-D1019, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33104797

ABSTRACT

RNA editing is a relevant epitranscriptome phenomenon able to increase the transcriptome and proteome diversity of eukaryotic organisms. ADAR-mediated RNA editing is widespread in humans, in which millions of A-to-I changes modify thousands of primary transcripts. RNA editing has pivotal roles in the regulation of gene expression, the modulation of the innate immune response and the functioning of several neurotransmitter receptors. Massive transcriptome sequencing has fostered research in this field. Nonetheless, several aspects of RNA editing biology are still unknown and need to be elucidated. To support the study of A-to-I RNA editing, we have updated our REDIportal catalogue, raising its content to about 16 million events detected in 9642 human RNAseq samples from the GTEx project by using a dedicated pipeline based on the HPC version of the REDItools software. REDIportal now allows searches at the sample level, provides overviews of RNA editing profiles for each RNAseq experiment, implements a Gene View module to look at individual events in their genic context and hosts the CLAIRE database. Starting from this new version, REDIportal will begin collecting non-human RNA editing changes for comparative genomics investigations. The database is freely available at http://srv00.recas.ba.infn.it/atlas/index.html.


Subjects
Computational Biology/methods; Databases, Genetic; Gene Expression Regulation; Proteome/genetics; RNA Editing/genetics; Transcriptome/genetics; Base Sequence/genetics; Data Curation/methods; Data Mining/methods; Gene Expression Profiling/methods; Genomics/methods; Humans; Internet; Proteomics/methods
2.
BMC Bioinformatics ; 21(Suppl 10): 353, 2020 Aug 21.
Article in English | MEDLINE | ID: mdl-32838738

ABSTRACT

BACKGROUND: RNA editing is a widespread co-/post-transcriptional mechanism that alters primary RNA sequences through the modification of specific nucleotides, and it can increase both transcriptome and proteome diversity. The automatic detection of RNA editing from RNA-seq data is computationally intensive and limited to small data sets, which prevents a reliable genome-wide characterisation of this process. RESULTS: In this work we introduce HPC-REDItools, an upgraded tool for the accurate discovery of RNA-editing events in large dataset repositories. AVAILABILITY: https://github.com/BioinfoUNIBA/REDItools2 . CONCLUSIONS: HPC-REDItools is dramatically faster than the previous version, REDItools, enabling big-data analysis by means of an MPI-based implementation and scaling almost linearly with the number of available cores.
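
Near-linear scaling with core count generally depends on dividing the genome into independent chunks that workers can process in parallel. The following is a minimal sketch of that idea, assuming fixed-size windows handed out round-robin. It is not the partitioning scheme used by HPC-REDItools, which the abstract does not describe, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end), half-open coordinates
Interval = Tuple[str, int, int]


def split_into_intervals(chrom_sizes: Dict[str, int], chunk: int) -> List[Interval]:
    """Cut every chromosome into consecutive windows of at most `chunk` bases."""
    intervals: List[Interval] = []
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, chunk):
            intervals.append((chrom, start, min(start + chunk, size)))
    return intervals


def assign_round_robin(intervals: List[Interval], n_workers: int) -> List[List[Interval]]:
    """Give interval i to worker i mod n_workers (hypothetical assignment rule)."""
    buckets: List[List[Interval]] = [[] for _ in range(n_workers)]
    for i, interval in enumerate(intervals):
        buckets[i % n_workers].append(interval)
    return buckets


if __name__ == "__main__":
    # Toy chromosome sizes, for illustration only.
    windows = split_into_intervals({"chr1": 250, "chr2": 100}, chunk=100)
    for worker, bucket in enumerate(assign_round_robin(windows, n_workers=2)):
        print(worker, bucket)
```

In an MPI program each rank would process only its own bucket; a static split like this balances well only when windows cost roughly the same to process, which may not hold for regions of very uneven coverage.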


Subjects
Computing Methodologies; RNA Editing/genetics; Software; Algorithms; Base Sequence; Genome; Transcriptome/genetics
3.
BMC Bioinformatics ; 21(Suppl 10): 352, 2020 Aug 21.
Article in English | MEDLINE | ID: mdl-32838759

ABSTRACT

BACKGROUND: The advent of Next Generation Sequencing (NGS) technologies and the concomitant reduction in sequencing costs allow unprecedented high-throughput profiling of biological systems in a cost-efficient manner. Modern biological experiments are becoming increasingly data and computationally intensive, and the wealth of publicly available biological data is bringing bioinformatics into the "Big Data" era. For these reasons, the effective application of High Performance Computing (HPC) architectures is increasingly recognized by bioinformaticians as well. Here we describe pilot programs for provisioning HPC resources to bioinformaticians, run by the Italian Node of ELIXIR (ELIXIR-IT) in collaboration with CINECA, the main Italian supercomputing center. RESULTS: Starting in April 2016, CINECA and ELIXIR-IT launched the pilot Call "ELIXIR-IT HPC@CINECA", offering streamlined access to HPC resources for bioinformatics. Resources are made available either through web front-ends to dedicated workflows developed at CINECA or by providing direct access to the High Performance Computing systems through a standard command-line interface tailored for bioinformatics data analysis. This makes it possible to offer the biomedical research community a production-scale environment, continuously updated with the latest versions of publicly available reference datasets and bioinformatic tools. Currently, 63 research projects have gained access to the HPC@CINECA program, for a total allocation of ~8 million CPU hours and, for data storage, ~100 TB of permanent and ~300 TB of temporary space. CONCLUSIONS: Three years after the beginning of the ELIXIR-IT HPC@CINECA program, we can appreciate its impact on the Italian bioinformatics community and draw some conclusions. Several Italian researchers who applied to the program have gained access to one of the top-ranking public scientific supercomputing facilities in Europe. Those investigators had the opportunity to considerably reduce computational turnaround times in their research projects and to process massive amounts of data, pursuing research approaches that would otherwise have been difficult or impossible to undertake. Moreover, by taking advantage of the wealth of documentation and training material provided by CINECA, participants had the opportunity to improve their skills in the use of HPC systems and to be better positioned to apply to similar EU programs of greater scale, such as PRACE. To illustrate the effective use and impact of the resources awarded by the program in different research applications, we report five successful use cases whose findings have already been published in peer-reviewed journals.


Subjects
Computational Biology; Computing Methodologies; Software; Algorithms; Animals; Cell Line; Databases, Genetic; Gene Fusion; Genome; Humans; Prunus persica/genetics; RNA Editing; Swallows/genetics
4.
BMC Bioinformatics ; 20(1): 414, 2019 Aug 06.
Article in English | MEDLINE | ID: mdl-31387525

ABSTRACT

BACKGROUND: R-loops are three-stranded nucleic acid structures that usually form during transcription and that may lead to gene regulation or genome instability. DRIP (DNA:RNA Immunoprecipitation)-seq techniques are widely used to map R-loops genome-wide, providing insights into R-loop biology. However, annotating DRIP-seq peaks to genes can be a tricky step because the common basic DRIP technique provides no strand information. RESULTS: Here, we introduce DRIP-seq Optimized Peak Annotator (DROPA), a new tool for gene annotation of R-loop peaks based on gene expression information. DROPA allows full customization of annotation options, ranging from the choice of reference datasets to gene feature definitions. DROPA assigns R-loop peaks to the DNA template strand in the gene body with a false positive rate of less than 7%. A comparison of DROPA's performance with three widely used annotation tools shows that it produces fewer false positive annotations than the others. CONCLUSIONS: DROPA is a fully customizable peak-annotation tool optimized for co-transcriptional DRIP-seq peaks, which allows finer gene annotation based on gene expression information. Its output can easily be integrated into pipelines to perform downstream analyses, and useful and informative summary plots and statistical enrichment tests can be produced.
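
The idea of using expression to resolve a peak that overlaps genes on both strands can be illustrated with a small sketch. It assumes, as a simplification, that the peak is assigned to the most highly expressed overlapping gene above a cutoff; the abstract does not state DROPA's actual rule, and the `Gene` record, `annotate_peak` function and `min_expression` cutoff are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gene:
    name: str
    start: int        # half-open interval [start, end)
    end: int
    strand: str       # "+" or "-"
    expression: float # e.g. FPKM or TPM from a matched RNA-seq experiment


def annotate_peak(peak_start: int, peak_end: int, genes: List[Gene],
                  min_expression: float = 1.0) -> Optional[Gene]:
    """Return the most expressed gene overlapping the peak, or None.

    Genes below `min_expression` are ignored, so a peak that only
    overlaps silent genes stays unannotated.
    """
    candidates = [
        g for g in genes
        if g.start < peak_end and peak_start < g.end  # intervals overlap
        and g.expression >= min_expression
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda g: g.expression)


if __name__ == "__main__":
    # Toy annotation with made-up coordinates and expression values.
    genes = [
        Gene("geneA", 100, 500, "+", 12.0),
        Gene("geneB", 300, 800, "-", 0.5),
    ]
    hit = annotate_peak(350, 450, genes)
    print(hit.name, hit.strand)
```

The strand of the selected gene then stands in for the strand information that the basic DRIP protocol does not provide.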


Subjects
DNA/metabolism; Immunoprecipitation; Molecular Sequence Annotation; RNA/metabolism; Software; Base Sequence; DNA/genetics; Gene Expression Regulation; RNA/genetics
5.
BMC Genomics ; 19(1): 120, 2018 02 05.
Article in English | MEDLINE | ID: mdl-29402227

ABSTRACT

BACKGROUND: The advent and ongoing development of next generation sequencing (NGS) technologies have led to a rapid increase in the production of human genome resequencing data, paving the way for personalized genomics and precision medicine. The body of genome resequencing data is steadily growing, underlining the need for accurate and time-effective bioinformatics systems for genotyping, a crucial prerequisite for the identification of candidate causal mutations in diagnostic screens. RESULTS: Here we present CoVaCS, a fully automated, highly accurate system with a web-based graphical interface for genotyping and variant annotation. Extensive tests on a gold standard benchmark data set (the NA12878 Illumina platinum genome) confirm that call-sets based on our consensus strategy are completely in line with those attained by similar command-line-based approaches, and far more accurate than call-sets from any individual tool. Importantly, our system exhibits better sensitivity and higher specificity than equivalent commercial software. CONCLUSIONS: CoVaCS offers optimized pipelines integrating state-of-the-art tools for variant calling and annotation for whole-genome sequencing (WGS), whole-exome sequencing (WES) and target-gene sequencing (TGS) data. The system is currently hosted at Cineca and offers the speed of an HPC facility, a crucial consideration when large numbers of samples must be analysed. Importantly, all the analyses are performed automatically, allowing high reproducibility of the results. As such, we believe that CoVaCS can be a valuable tool for the analysis of human genome resequencing studies. CoVaCS is available at: https://bioinformatics.cineca.it/covacs .
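
A consensus call-set combines the output of several variant callers and keeps only the calls that enough of them agree on. The sketch below assumes a simple minimum-support vote; the abstract does not state the rule CoVaCS applies, so the `min_support` threshold and the `consensus_calls` name are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# A variant is identified by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def consensus_calls(call_sets: Iterable[Set[Variant]], min_support: int = 2) -> Set[Variant]:
    """Keep variants reported by at least `min_support` callers."""
    votes: Counter = Counter()
    for calls in call_sets:
        # Each caller's output is a set, so it votes at most once per variant.
        votes.update(calls)
    return {variant for variant, n in votes.items() if n >= min_support}


if __name__ == "__main__":
    # Made-up variants from three hypothetical callers.
    v1 = ("chr1", 1000, "A", "G")
    v2 = ("chr1", 2000, "C", "T")
    v3 = ("chr2", 3000, "G", "GA")
    print(sorted(consensus_calls([{v1, v2}, {v1, v3}, {v1, v2}])))
```

Raising `min_support` trades sensitivity for specificity: requiring all callers to agree removes more false positives but also drops true variants that only some tools detect.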


Subjects
Computational Biology/methods; Consensus Sequence; Sequence Analysis, DNA/methods; Software; Algorithms; Databases, Genetic; INDEL Mutation; Molecular Sequence Annotation; Polymorphism, Single Nucleotide; Sensitivity and Specificity; User-Computer Interface; Web Browser; Workflow
6.
Plant Cell Physiol ; 59(1): e2, 2018 Jan 01.
Article in English | MEDLINE | ID: mdl-29216377

ABSTRACT

Applying next-generation sequencing (NGS) technologies to species of agricultural interest has the potential to accelerate the understanding and exploration of genetic resources. The storage, availability and maintenance of huge quantities of NGS-generated data remain a major challenge. The PeachVar-DB portal, available at http://hpc-bioinformatics.cineca.it/peach, is an open-source catalog of genetic variants present in peach (Prunus persica L. Batsch) and wild related species of the genus Prunus, annotated from 146 samples publicly released on the Sequence Read Archive (SRA). We designed a user-friendly web-based interface to the database, providing search tools to retrieve single nucleotide polymorphism (SNP) and InDel variants, along with useful statistics and information. PeachVar-DB results are linked to the Genome Database for Rosaceae (GDR) and the Phytozome database to allow easy access to other useful plant-oriented external resources. To further extend the genetic diversity covered by PeachVar-DB, and to allow increasingly powerful comparative analysis, we will progressively integrate newly released data.


Subjects
Computational Biology/methods; Genetic Variation; Genome, Plant/genetics; Prunus persica/genetics; Data Mining/methods; Databases, Genetic; High-Throughput Nucleotide Sequencing/methods; Internet; Phylogeny; Polymorphism, Single Nucleotide; Prunus persica/classification; Rosaceae/classification; Rosaceae/genetics
7.
Methods Mol Biol ; 2284: 253-270, 2021.
Article in English | MEDLINE | ID: mdl-33835447

ABSTRACT

RNA editing by A-to-I deamination is a relevant co-/post-transcriptional modification carried out by ADAR enzymes. In humans, it has pivotal cellular effects, and its deregulation has been linked to a variety of human disorders, including neurological and neurodegenerative diseases and cancer. Despite its biological relevance, the detection of RNA editing variants in large transcriptome sequencing experiments (RNAseq) is still a challenging computational task. To drastically reduce computing times, we have developed a novel REDItools version able to identify A-to-I events in huge amounts of RNAseq data by employing High Performance Computing (HPC) infrastructures. Here we show how to use REDItools v2 on HPC systems.


Subjects
Computing Methodologies; RNA Editing/physiology; Sequence Analysis, RNA/methods; Animals; Computational Biology/methods; Databases, Genetic; Datasets as Topic; Genomics; High-Throughput Nucleotide Sequencing; Humans; Neoplasms/genetics; Nervous System Diseases/genetics; Neurodegenerative Diseases/genetics; Software; Transcriptome
8.
Sci Data ; 7(1): 437, 2020 12 16.
Article in English | MEDLINE | ID: mdl-33328476

ABSTRACT

Stressful experiences are part of everyday life and animals have evolved physiological and behavioral responses aimed at coping with stress and maintaining homeostasis. However, repeated or intense stress can induce maladaptive reactions leading to behavioral disorders. Adaptations in the brain, mediated by changes in gene expression, have a crucial role in the stress response. Recent years have seen a tremendous increase in studies on the transcriptional effects of stress. The input raw data are freely available from public repositories and represent a wealth of information for further global and integrative retrospective analyses. We downloaded from the Sequence Read Archive 751 samples (SRA-experiments), from 18 independent BioProjects studying the effects of different stressors on the brain transcriptome in mice. We performed a massive bioinformatics re-analysis applying a single, standardized pipeline for computing differential gene expression. This data mining allowed the identification of novel candidate stress-related genes and specific signatures associated with different stress conditions. The large amount of computational results produced was systematized in the interactive "Stress Mice Portal".


Subjects
Brain/physiology; Gene Expression; Stress, Physiological; Stress, Psychological; Transcriptome; Animals; Computational Biology; Data Mining; Datasets as Topic; Female; Male; Mice
9.
Mol Neurobiol ; 57(5): 2301-2313, 2020 May.
Article in English | MEDLINE | ID: mdl-32020500

ABSTRACT

Autism spectrum disorder (ASD) is a heterogeneous neurodevelopmental condition with unknown etiology. Recent experimental evidence suggests that non-coding RNAs (ncRNAs) contribute to the pathophysiology of ASD. In this work, we aimed to investigate the expression profile of the ncRNA class of circular RNAs (circRNAs) in the hippocampus of the BTBR T+ tf/J (BTBR) mouse model and age-matched C57BL/6J (B6) mice. In parallel, we analyzed the BTBR hippocampal gene expression profile to evaluate possible correlations between the differential abundance of circular and linear gene products. From RNA sequencing data, we identified circRNAs highly modulated in BTBR mice. Thirteen circRNAs and their corresponding linear isoforms were validated by RT-qPCR analysis. The BTBR-regulated circCdh9 was further characterized in terms of molecular structure and expression, highlighting altered levels not only in the hippocampus but also in the cerebellum, prefrontal cortex, and amygdala. Finally, gene expression analysis of the BTBR hippocampus pinpointed altered biological and molecular pathways relevant to the ASD phenotype. By comparing circRNA and gene expression profiles, we identified 6 genes significantly regulated at the level of either circRNA or mRNA gene products, suggesting low overall correlation between circRNA and host gene expression. In conclusion, our results indicate a consistent deregulation of circRNA expression in the hippocampus of BTBR mice. ASD-related circRNAs should be considered in functional studies to identify their contribution to the etiology of the disorder. In addition, as abundant and highly stable molecules, circRNAs represent interesting potential biomarkers for autism.


Subjects
Autism Spectrum Disorder/metabolism; Disease Models, Animal; Hippocampus/metabolism; Mice, Inbred Strains/metabolism; Mice, Mutant Strains/metabolism; RNA, Circular/biosynthesis; RNA, Messenger/biosynthesis; Animals; Autism Spectrum Disorder/genetics; Brain Chemistry; Gene Expression Profiling; Gene Ontology; Humans; Male; Mice, Inbred C57BL; Mice, Inbred Strains/genetics; Mice, Mutant Strains/genetics; Reverse Transcriptase Polymerase Chain Reaction; Species Specificity
10.
Gigascience ; 7(10), 2018 10 01.
Article in English | MEDLINE | ID: mdl-29860514

ABSTRACT

Background: Gene fusions derive from chromosomal rearrangements. The resulting chimeric transcripts are often endowed with oncogenic potential. Furthermore, they serve as diagnostic tools for the clinical classification of cancer subgroups with different prognoses and, in some cases, they can provide specific drug targets. To date, considerable effort has been devoted to studying gene fusion events occurring in tumor samples. In recent years, the availability of a comprehensive next-generation sequencing dataset for all existing human tumor cell lines has provided the opportunity to further investigate these data in order to identify novel and still uncharacterized gene fusion events. Results: In our work, we have extensively reanalyzed 935 paired-end RNA-sequencing experiments downloaded from the Cancer Cell Line Encyclopedia repository, aiming to identify novel putative cell-line-specific gene fusion events in human malignancies. The bioinformatics analysis was performed by running four gene fusion detection algorithms. The results were further prioritized by running a Bayesian classifier that performs an in silico validation. The collection of fusion events supported by all of the predictive software results in a robust set of ∼1,700 in silico predicted novel candidates suitable for downstream analyses. Given the huge amount of data and information produced, computational results have been systematized in a database named LiGeA. The database can be browsed through a dynamic and interactive web portal, further integrated with validated data from other well-known repositories. Using the intuitive query forms, users can easily access, navigate, filter, and select the putative gene fusions for further validation and study. They can also find suitable experimental models for a given fusion of interest. Conclusions: We believe that the LiGeA resource not only represents the first compendium of both known and putative novel gene fusion events in the catalog of all human malignant cell lines but can also become a handy starting point for wet-lab biologists who wish to investigate novel cancer biomarkers and specific drug targets.
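
Selecting the fusion events supported by all detection tools, as the Results section describes, amounts to a set intersection over the per-tool predictions. A minimal sketch, assuming each fusion is keyed by its 5' and 3' partner gene names; the tool labels, gene names and `supported_by_all` function are hypothetical and do not reflect the LiGeA implementation.

```python
from typing import Dict, Set, Tuple

# A fusion is keyed by (5' partner gene, 3' partner gene).
Fusion = Tuple[str, str]


def supported_by_all(predictions: Dict[str, Set[Fusion]]) -> Set[Fusion]:
    """Return the fusions predicted by every tool in `predictions`."""
    if not predictions:
        return set()
    return set.intersection(*predictions.values())


if __name__ == "__main__":
    # Placeholder tool labels and gene names, for illustration only.
    predictions = {
        "tool_a": {("GENE1", "GENE2"), ("GENE3", "GENE4")},
        "tool_b": {("GENE1", "GENE2"), ("GENE5", "GENE6")},
        "tool_c": {("GENE1", "GENE2"), ("GENE3", "GENE4")},
        "tool_d": {("GENE1", "GENE2")},
    }
    print(supported_by_all(predictions))
```

Keying on an ordered gene pair treats GENE1-GENE2 and GENE2-GENE1 as different events; whether reciprocal fusions should be merged is a design choice the abstract does not address.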


Subjects
Data Analysis; Data Mining; Gene Fusion; High-Throughput Nucleotide Sequencing; Cell Line; Cell Line, Tumor; Computational Biology/methods; Databases, Genetic; Gene Rearrangement; Genome, Human; Genomics/methods; Humans; Oncogene Fusion; Translocation, Genetic; Web Browser