Results 1 - 14 of 14
1.
J Biomed Inform ; 142: 104394, 2023 06.
Article in English | MEDLINE | ID: mdl-37209976

ABSTRACT

Biomedical research is moving toward clinical trials and translational projects based on Real World Evidence. To make this transition feasible, clinical centers need to work toward data accessibility and interoperability. This task is particularly challenging for genomics, which has entered routine screening in recent years, mostly via amplicon-based Next-Generation Sequencing panels. These experiments produce up to hundreds of features per patient, and their summarized results are often stored in static clinical reports, making critical information inaccessible to automated access and Federated Search consortia. In this study, we present a reanalysis of 4620 solid tumor sequencing samples in five different histology settings. Furthermore, we describe all the bioinformatics and data engineering processes put in place to create a Somatic Variant Registry able to deal with the large biotechnological variability of routine genomic profiling.


Subject(s)
Biomedical Research , Neoplasms , Humans , Genomics , Computational Biology/methods , Registries , Neoplasms/diagnosis , Neoplasms/genetics
2.
BMC Genomics ; 19(1): 120, 2018 02 05.
Article in English | MEDLINE | ID: mdl-29402227

ABSTRACT

BACKGROUND: The advent and ongoing development of next-generation sequencing (NGS) technologies has led to a rapid increase in the production of human genome resequencing data, paving the way for personalized genomics and precision medicine. The body of genome resequencing data is progressively increasing, underlining the need for accurate and time-effective bioinformatics systems for genotyping, a crucial prerequisite for the identification of candidate causal mutations in diagnostic screens. RESULTS: Here we present CoVaCS, a fully automated, highly accurate system with a web-based graphical interface for genotyping and variant annotation. Extensive tests on a gold-standard benchmark dataset (the NA12878 Illumina platinum genome) confirm that call-sets based on our consensus strategy are completely in line with those attained by similar command-line-based approaches, and far more accurate than call-sets from any individual tool. Importantly, our system exhibits better sensitivity and higher specificity than equivalent commercial software. CONCLUSIONS: CoVaCS offers optimized pipelines integrating state-of-the-art tools for variant calling and annotation for whole-genome sequencing (WGS), whole-exome sequencing (WES) and target-gene sequencing (TGS) data. The system is currently hosted at Cineca and offers the speed of an HPC facility, a crucial consideration when large numbers of samples must be analysed. Importantly, all the analyses are performed automatically, allowing high reproducibility of the results. As such, we believe that CoVaCS can be a valuable tool for the analysis of human genome resequencing studies. CoVaCS is available at: https://bioinformatics.cineca.it/covacs.
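The abstract does not say how the consensus call-set is assembled. Purely as an illustration of the general idea, the sketch below assumes a simple majority vote across callers; the caller names, the 2-of-3 threshold and the function `consensus_calls` are hypothetical and are not taken from CoVaCS.

```python
from collections import Counter

# A variant is identified by (chromosome, position, reference allele, alternate allele).
Variant = tuple[str, int, str, str]


def consensus_calls(call_sets: dict[str, set[Variant]], min_support: int = 2) -> set[Variant]:
    """Keep variants reported by at least `min_support` callers (assumed rule)."""
    support = Counter(v for calls in call_sets.values() for v in calls)
    return {v for v, n in support.items() if n >= min_support}


# Made-up call-sets from three hypothetical callers.
calls = {
    "caller_a": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")},
    "caller_b": {("chr1", 100, "A", "G"), ("chr2", 50, "G", "GA")},
    "caller_c": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")},
}
consensus = consensus_calls(calls)
```

Raising `min_support` trades sensitivity for specificity; the variant seen by only one caller is dropped under the assumed 2-of-3 rule.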


Subject(s)
Computational Biology/methods , Consensus Sequence , Sequence Analysis, DNA/methods , Software , Algorithms , Databases, Genetic , INDEL Mutation , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Sensitivity and Specificity , User-Computer Interface , Web Browser , Workflow
3.
BMC Genomics ; 16: S3, 2015.
Article in English | MEDLINE | ID: mdl-26046471

ABSTRACT

BACKGROUND: The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms, which allow massive and cheap sequencing of selected RNA fractions and also provide information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulatory pathways makes RNA-Seq one of the most complex fields of NGS application, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). METHODS: In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), and detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). RESULTS: Through a user-friendly web interface, the RAP workflow can be suitably customized by the user and is automatically executed on our cloud computing environment. This strategy gives access to bioinformatics tools and computational resources without requiring specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user's needs.
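To make the idea of a "complete but modular" workflow concrete, here is a minimal sketch of how a user selection might be turned into an ordered plan. Only the tool names come from the abstract; the module labels, their ordering and the function `plan_workflow` are assumptions made for illustration and do not describe RAP's actual implementation.

```python
# Module label -> tools named in the abstract. The labels are hypothetical.
MODULES = {
    "quality_check": ["FastQC", "NGS QC Toolkit"],
    "expression": ["Tophat", "Cufflinks", "HTSeq"],
    "alternative_splicing": ["SpliceTrap"],
    "chimeric_transcripts": ["ChimeraScan"],
    "differential_analysis": ["Cuffdiff2", "DESeq"],
}


def plan_workflow(selected: list[str]) -> list[tuple[str, list[str]]]:
    """Return the selected modules in the fixed order defined by MODULES."""
    unknown = set(selected) - set(MODULES)
    if unknown:
        raise ValueError(f"unknown modules: {sorted(unknown)}")
    return [(name, tools) for name, tools in MODULES.items() if name in selected]


# The user's selection order does not matter; the plan follows pipeline order.
plan = plan_workflow(["differential_analysis", "quality_check"])
```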


Subject(s)
RNA/analysis , Sequence Analysis, RNA/methods , User-Computer Interface , High-Throughput Nucleotide Sequencing , Internet , Polyadenylation
4.
Nucleic Acids Res ; 41(Database issue): D125-31, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23118479

ABSTRACT

A comprehensive knowledge of all the factors involved in splicing, both proteins and RNAs, and of their interaction network is crucial for reaching a better understanding of this process and its functions. A large part of relevant information is buried in the literature or collected in various different databases. By hand-curated screenings of literature and databases, we retrieved experimentally validated data on 71 human RNA-binding splicing regulatory proteins and organized them into a database called 'SpliceAid-F' (http://www.caspur.it/SpliceAidF/). For each splicing factor (SF), the database reports its functional domains, its protein and chemical interactors and its expression data. Furthermore, we collected experimentally validated RNA-SF interactions, including relevant information on the RNA-binding sites, such as the genes where these sites lie, their genomic coordinates, the splicing effects, the experimental procedures used, as well as the corresponding bibliographic references. We also collected information from experiments showing no RNA-SF binding, at least in the assayed conditions. In total, SpliceAid-F contains 4227 interactions, 2590 RNA-binding sites and 1141 'no-binding' sites, including information on cellular contexts and conditions where binding was tested. The data collected in SpliceAid-F can provide significant information to explain an observed splicing pattern as well as the effect of mutations in functional regulatory elements.


Subject(s)
Databases, Protein , RNA Splicing , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Binding Sites , Humans , Internet , Molecular Sequence Annotation , Protein Structure, Tertiary , RNA Precursors/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics , User-Computer Interface
5.
Nucleic Acids Res ; 40(Database issue): D1168-72, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22123747

ABSTRACT

The MITOchondrial genome database of metaZOAns (MitoZoa) is a public resource for comparative analyses of metazoan mitochondrial genomes (mtDNA) at both the sequence and genomic organizational levels. The main characteristics of the MitoZoa database are the careful revision of mtDNA entry annotations and the possibility of retrieving gene order and non-coding region (NCR) data in appropriate formats. The MitoZoa retrieval system enables basic and complex queries at various taxonomic levels using different search menus. MitoZoa 2.0 has been enhanced in several aspects, including: a re-annotation pipeline to check the correctness of protein-coding gene predictions; a standardized annotation of introns and of precursor ORFs whose functionality is post-transcriptionally recovered by RNA editing or programmed translational frameshifting; updates of taxon-related fields and a BLAST sequence similarity search tool. Database novelties and the definition of standard mtDNA annotation rules, together with the user-friendly retrieval system and the BLAST service, make MitoZoa a valuable resource for comparative and evolutionary analyses as well as a reference database to assist in the annotation of novel mtDNA sequences. MitoZoa is freely accessible at http://www.caspur.it/mitozoa.


Subject(s)
DNA, Mitochondrial/chemistry , Databases, Nucleic Acid , Evolution, Molecular , Genome, Mitochondrial , Frameshifting, Ribosomal , Genes, Mitochondrial , Introns , Mitochondrial Proteins/genetics , Molecular Sequence Annotation , Software
6.
BMC Bioinformatics ; 14 Suppl 7: S11, 2013.
Article in English | MEDLINE | ID: mdl-23815231

ABSTRACT

BACKGROUND: The advent of massively parallel sequencing technologies (Next Generation Sequencing, NGS) profoundly modified the landscape of human genetics. In particular, Whole Exome Sequencing (WES) is the NGS branch that focuses on the exonic regions of eukaryotic genomes; exomes are ideal to help us understand high-penetrance allelic variation and its relationship to phenotype. A complete WES analysis involves several steps which need to be suitably designed and arranged into an efficient pipeline. Managing an NGS analysis pipeline and the huge amount of data it produces requires non-trivial IT skills and computational power. RESULTS: Our web resource WEP (Whole-Exome sequencing Pipeline web tool) performs a complete WES pipeline and provides easy access through a web interface to intermediate and final results. The WEP pipeline is composed of several steps: 1) verification of input integrity and quality checks, read trimming and filtering; 2) gapped alignment; 3) BAM conversion, sorting and indexing; 4) duplicates removal; 5) alignment optimization around insertion/deletion (indel) positions; 6) recalibration of quality scores; 7) single nucleotide and deletion/insertion polymorphism (SNP and DIP) variant calling; 8) variant annotation; 9) result storage into custom databases to allow cross-linking and intersections, statistics and much more. In order to overcome the challenge of managing large amounts of data and to maximize the biological information extracted from them, our tool restricts the number of final results by filtering data with customizable thresholds, facilitating the identification of functionally significant variants. Default threshold values, tuned on the most common literature published in recent years, are also provided at the completion of the analysis. CONCLUSIONS: Through our tool a user can perform the whole analysis without knowing the underlying hardware and software architecture, working with both paired-end and single-end data. The interface provides easy and intuitive access for data submission and a user-friendly web interface for annotated variant visualization. Users without IT expertise can access, through WEP, the most up-to-date and tested WES algorithms, tuned to maximize the quality of called variants while minimizing artifacts and false positives. The web tool is available at the following web address: http://www.caspur.it/wep.
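The threshold-based filtering mentioned in the RESULTS section can be pictured roughly as follows. This is only a sketch under stated assumptions: the abstract does not name the fields or the default values, so `quality`, `depth`, `DEFAULT_THRESHOLDS` and `filter_variants` are hypothetical, and the example records are made up.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedVariant:
    gene: str
    quality: float  # variant call quality score (assumed field)
    depth: int      # read depth at the site (assumed field)


# Hypothetical defaults, chosen only for illustration.
DEFAULT_THRESHOLDS = {"min_quality": 30.0, "min_depth": 10}


def filter_variants(variants, thresholds=None):
    """Keep variants meeting every threshold; user values override defaults."""
    t = {**DEFAULT_THRESHOLDS, **(thresholds or {})}
    return [
        v for v in variants
        if v.quality >= t["min_quality"] and v.depth >= t["min_depth"]
    ]


# Made-up example records.
variants = [
    AnnotatedVariant("GENE_A", 55.0, 40),
    AnnotatedVariant("GENE_B", 12.0, 80),
    AnnotatedVariant("GENE_C", 45.0, 6),
]
kept_default = filter_variants(variants)
kept_relaxed = filter_variants(variants, {"min_depth": 5})
```

Merging user-supplied values over a default dictionary keeps the defaults usable out of the box while still letting each threshold be customized independently.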


Subject(s)
Exome , High-Throughput Nucleotide Sequencing , Software , Algorithms , Humans , INDEL Mutation , Internet , Polymorphism, Single Nucleotide , User-Computer Interface
7.
Nucleic Acids Res ; 39(Database issue): D80-5, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21051348

ABSTRACT

Alternative splicing is emerging as a major mechanism for the expansion of transcriptome and proteome diversity, particularly in humans and other vertebrates. However, the proportion of alternative transcripts and proteins actually endowed with functional activity is currently highly debated. We present here a new release of ASPicDB, which now provides a unique annotation resource of human protein variants generated by alternative splicing. A total of 256,939 protein variants from 17,191 multi-exon genes have been extensively annotated through state-of-the-art machine learning tools providing information on the protein type (globular and transmembrane), localization, presence of PFAM domains, signal peptides, GPI-anchor propeptides, transmembrane and coiled-coil segments. Furthermore, full-length variants can now be specifically selected based on the annotation of CAGE-tags and polyA signal and/or polyA sites, marking transcription initiation and termination sites, respectively. The retrieval can be carried out at gene, transcript, exon, protein or splice site level, allowing the selection of data sets fulfilling one or more features set by the user. The retrieval interface also enables the selection of protein variants showing specific differences in the annotated features. ASPicDB is available at http://www.caspur.it/ASPicDB/.


Subject(s)
Alternative Splicing , Databases, Genetic , Proteins/chemistry , Proteins/genetics , Exons , Genetic Variation , Humans , Protein Isoforms/chemistry , Protein Isoforms/genetics , RNA, Messenger/chemistry , Sequence Analysis, Protein , User-Computer Interface
8.
Bioinformatics ; 27(9): 1311-2, 2011 May 01.
Article in English | MEDLINE | ID: mdl-21427194

ABSTRACT

ExpEdit is a web application for assessing RNA editing in humans at known or user-specified sites supported by transcript data obtained from RNA-Seq experiments. Mapping data (in SAM/BAM format) or sequence reads directly [in FASTQ/short read archive (SRA) format] can be provided as input to carry out a comparative analysis against a large collection of known editing sites collected in the DARNED database as well as other user-provided potentially edited positions. Results are shown as dynamic tables containing University of California, Santa Cruz (UCSC) links for a quick examination of the genomic context. AVAILABILITY: ExpEdit is freely available on the web at http://www.caspur.it/ExpEdit/.
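The abstract does not describe how editing is quantified at each site. As a rough illustration of comparing read evidence against a list of known sites, the sketch below assumes A-to-I editing read as A-to-G on the plus strand and reports the fraction of G-supporting reads; the site coordinates, the counts and the function `editing_level` are hypothetical and do not reflect ExpEdit's actual method.

```python
from typing import Optional


def editing_level(base_counts: dict[str, int]) -> Optional[float]:
    """Fraction of reads supporting the edited base (G) at an A reference site.

    Assumes A-to-I editing observed as A-to-G on the plus strand.
    Returns None when no A or G reads cover the site.
    """
    a = base_counts.get("A", 0)
    g = base_counts.get("G", 0)
    total = a + g
    return g / total if total else None


# Hypothetical known sites and made-up per-site base counts.
known_sites = {("chr1", 1000), ("chr2", 500)}
observed = {
    ("chr1", 1000): {"A": 30, "G": 10},
    ("chr3", 42): {"A": 5, "G": 5},
}

# Only positions present in the known-site list are reported.
levels = {
    site: editing_level(counts)
    for site, counts in observed.items()
    if site in known_sites
}
```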


Subject(s)
Computational Biology/methods , Genomics/methods , Internet , RNA Editing , Sequence Analysis, RNA/methods , Humans , Software
9.
Methods Mol Biol ; 2284: 393-415, 2021.
Article in English | MEDLINE | ID: mdl-33835454

ABSTRACT

Since 1950, the main studies of RNA have concerned its role in protein synthesis. Later insights showed that only a small portion of RNA codes for proteins, while the rest could have different functional roles. With the advent of Next Generation Sequencing (NGS), and in particular of RNA-seq technology, the cost of sequencing dropped. Among the NGS application areas, transcriptome analysis, that is, the analysis of transcripts in a cell and their quantification for a specific developmental stage or treatment condition, has been increasingly adopted in laboratories. As a consequence, in the last decade new insights were gained into both transcriptome complexity and the involvement of RNA molecules in cellular processes. On the computational side, bioinformatics research developed new methods for analyzing RNA-seq data. The comparison among transcriptome profiles from several samples is often a difficult task for nonexpert programmers. In this chapter, we introduce RAP (RNA-Seq Analysis Pipeline), a completely automated web tool for transcriptome analysis. It is a user-friendly web tool implementing a detailed transcriptome workflow to detect differentially expressed genes and transcripts, identify spliced junctions and constitutive or alternative polyadenylation sites and predict gene fusion events. Through the web interface, researchers can get all this information without any knowledge of the underlying High Performance Computing infrastructure.


Subject(s)
Internet , RNA-Seq/methods , Software , Animals , Computational Biology/methods , Data Analysis , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Polyadenylation , RNA-Seq/statistics & numerical data , Sequence Analysis, RNA/methods , Transcriptome , Exome Sequencing
10.
Methods Mol Biol ; 1269: 327-38, 2015.
Article in English | MEDLINE | ID: mdl-25577388

ABSTRACT

Revealing the impact of A-to-I RNA editing in RNA-Seq experiments is relevant in humans because RNA editing can influence gene expression. In addition, its deregulation has been linked to a variety of human diseases. Exploiting the RNA editing potential in complete RNA-Seq datasets, however, is a challenging task. Indeed, no dedicated software is available, and sometimes deep computational skills and appropriate hardware resources are required. To explore the impact of known RNA editing events in massive transcriptome sequencing experiments, we developed the ExpEdit web service application. In the present work, we provide an overview of ExpEdit as well as methodologies to investigate known RNA editing in human RNA-Seq datasets.


Subject(s)
RNA Editing/genetics , Sequence Analysis, RNA/methods , Computational Biology , Genomics
11.
Methods Mol Biol ; 1269: 365-78, 2015.
Article in English | MEDLINE | ID: mdl-25577391

ABSTRACT

Alternative splicing (AS) is a basic molecular phenomenon that increases the functional complexity of higher eukaryotic transcriptomes. Indeed, through AS individual gene loci can generate multiple RNAs from the same pre-mRNA. AS has been investigated in a variety of clinical and pathological studies, such as those on transcriptome regulation in cancer. In humans, recent studies based on massive RNA sequencing indicate that >95% of pre-mRNAs are processed to yield multiple transcripts. Given the biological relevance of AS, several computational efforts have been made, leading to the implementation of novel algorithms and specialized databases. Here we describe the web application ASPicDB, which allows the recovery of detailed biological information about the splicing mechanism. ASPicDB provides powerful querying systems to interrogate AS events at gene, transcript, and protein levels. Finally, ASPicDB includes web visualization instruments to browse and export results for further off-line analyses.


Subject(s)
Alternative Splicing/genetics , Computational Biology/methods , Databases, Genetic , Algorithms , Internet
12.
Int J Radiat Biol ; 88(11): 822-9, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22420862

ABSTRACT

PURPOSE: Our goal was to identify genes showing a general transcriptional response to irradiation in mammalian cells and to analyze their response as a function of dose, time and quality of irradiation and of cell type. MATERIALS AND METHODS: We used a modified MIAME (Minimal Information About Microarray Experiments) protocol to import microarray data from 177 different irradiation conditions into the Radiation Genes database and performed cut-off-based selections and hierarchical gene clustering. RESULTS: We identified a set of 29 genes which respond to a wide range of irradiation conditions in different cell types and tissues. Functional analysis of the negatively modulated genes revealed a dominant signature of mitotic cell cycle regulation which appears to be both dose- and time-dependent. This signature is prominent in cancer cells and highly proliferating tissues but is strongly attenuated in non-cancer cells. CONCLUSIONS: The transcriptional response of mammalian cancer cells to irradiation is dominated by a mitotic cell cycle signature that is both dose- and time-dependent. This core response, which is present in cancer cells and highly proliferating tissues such as skin, blood and lymph node, is weaker or absent in non-cancer cells and in liver and spleen. CDKN1A (cyclin-dependent kinase inhibitor 1A) appears as the most generally induced mammalian gene and its response (mostly dose- and time-independent) seems to go beyond the typical DNA damage response.
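The "cut-off-based selections" in the methods are not specified further in the abstract. One plausible reading, sketched here purely as an assumption, is to keep genes whose absolute log2 fold change reaches a cut-off in at least a given fraction of irradiation conditions. The cut-off, the fraction, the gene labels, the values and the function `responsive_genes` are all hypothetical and are not the authors' procedure or data.

```python
def responsive_genes(log2fc, cutoff=1.0, min_fraction=0.5):
    """Select genes with |log2 fold change| >= cutoff in enough conditions.

    `log2fc` maps a gene label to one value per irradiation condition.
    Both `cutoff` and `min_fraction` are illustrative assumptions.
    """
    selected = []
    for gene, values in log2fc.items():
        hits = sum(1 for v in values if abs(v) >= cutoff)
        if values and hits / len(values) >= min_fraction:
            selected.append(gene)
    return sorted(selected)


# Made-up values for three placeholder genes across four conditions.
example = {
    "GENE_A": [2.1, 1.8, 1.2, 0.4],     # induced in 3 of 4 conditions
    "GENE_B": [0.2, -0.1, 1.5, 0.3],    # responds in 1 of 4 conditions
    "GENE_C": [-1.4, -1.1, -0.2, -2.0], # repressed in 3 of 4 conditions
}
general_responders = responsive_genes(example)
```

Using the absolute value captures both induced and repressed genes; a separate pass on the sign would be needed to isolate the negatively modulated set discussed in the RESULTS.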


Subject(s)
Cell Cycle Proteins/metabolism , Cell Cycle/radiation effects , Neoplasms/metabolism , Neoplasms/pathology , Transcription Factors/metabolism , Transcriptional Activation/radiation effects , Animals , Data Mining , Databases, Genetic , Databases, Protein , Dose-Response Relationship, Radiation , Humans , Mice , Radiation Dosage
13.
Mitochondrion ; 10(2): 192-9, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20080208

ABSTRACT

MitoZoa is a relational database collecting curated metazoan entries of complete or nearly complete mitochondrial genomes (mtDNA), specifically designed to assist comparative studies of mitochondrial genome-level features in a given taxon or in congeneric species of Metazoa. The principal novelties of MitoZoa are extensive corrections/improvements of the mtDNA annotations and the possibility of easily searching for data on: (1) gene order, a genomic feature useful as phylogenetic marker; (2) sequence, size and location of non-coding regions, likely containing the regulatory signals for mtDNA replication and transcription; (3) mt features/sequences of congeneric species, where saturation phenomena in nucleotide substitutions and gene order changes are expected to be absent or at least minimal. In addition, MitoZoa allows the exploration of basic mt features such as molecule topology, genetic code, gene content, and compositional parameters of the entire genome. Finally, in order to facilitate downstream analyses of retrieved data, MitoZoa entry lists can be visualized and downloaded in a tabular format, while sequences and gene order data are provided in FASTA and FASTA-like formats, respectively. The MitoZoa database is available at http://www.caspur.it/mitozoa.


Subject(s)
Databases, Genetic , Genome, Mitochondrial , Animals
14.
Database (Oxford) ; 2009: bap007, 2009.
Article in English | MEDLINE | ID: mdl-20157480

ABSTRACT

The analysis of the large amount of data generated by DNA microarray technologies has shown that the transcriptional response to radiation can be considerably different depending on the quality, the dose range and dose rate of radiation, as well as the timing selected for the analysis. At present, it is very difficult to integrate data obtained under several experimental conditions in different biological systems to reach overall conclusions or build regulatory models which may be tested and validated. In fact, most available data is buried in different websites, public or private, in general or local repositories or in files included in published papers; it is often in various formats, which makes a wide comparison even more difficult. The Radiation Genes Database (http://www.caspur.it/RadiationGenes) collects microarray data from various local and public repositories or from published papers and supplementary materials. The database classifies the data in terms of significant variables, such as radiation quality, dose, dose rate and sampling timing, so as to provide user-friendly tools to facilitate data integration and comparison.
