1.
Nat Commun ; 14(1): 5102, 2023 09 04.
Article in English | MEDLINE | ID: mdl-37666818

ABSTRACT

Flow cytometry (FCM) can investigate dozens of parameters from millions of cells and hundreds of specimens in a short time and at a reasonable cost, but the amount of data that is generated is considerable. Computational approaches are useful to identify novel subpopulations and molecular biomarkers, but generally require deep expertise in bioinformatics and the use of different platforms. To overcome these limitations, we introduce CRUSTY, an interactive, user-friendly webtool incorporating the most popular algorithms for FCM data analysis, and capable of visualizing graphical and tabular results and automatically generating publication-quality figures within minutes. CRUSTY also hosts an interactive interface for the exploration of results in real time. Thus, CRUSTY enables a large number of users to mine complex datasets and reduce the time required for data exploration and interpretation. CRUSTY is accessible at https://crusty.humanitas.it/.


Subject(s)
Algorithms , Computational Biology , Flow Cytometry , Data Analysis
2.
PLoS One ; 18(5): e0286104, 2023.
Article in English | MEDLINE | ID: mdl-37252915

ABSTRACT

Long non-coding RNAs (lncRNAs) have emerged as key regulators of cellular senescence by transcriptionally and post-transcriptionally modulating the expression of many important genes involved in senescence-associated pathways and processes. Among the different lncRNAs associated with senescence, Senescence Associated Long Non-coding RNA (SALNR) was found to be down-regulated in different cellular models of senescence. Since its release in 2015, SALNR has not been annotated in any database or public repository, and no other experimental data have been published. The SALNR sequence is located on the long arm of chromosome 10, at band 10q23.33, and it overlaps the 3' end of the HELLS gene. This investigation helped to unravel the mystery of the existence of SALNR by analyzing publicly available short- and long-read RNA sequencing data sets and by RT-PCR analysis in human tissues and cell lines. Additionally, the expression of HELLS has been studied in cellular models of replicative senescence, both in silico and in vitro. Our findings, while not supporting the actual existence of SALNR as an independent transcript in the analyzed experimental models, demonstrate the expression of a predicted HELLS isoform entirely covering the SALNR genomic region. Furthermore, we observed a strong down-regulation of HELLS in senescent cells versus proliferating cells, supporting its role in the senescence and aging process.


Subject(s)
RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , Cellular Senescence/genetics , Down-Regulation , Cell Line , Fibroblasts/physiology , DNA Helicases/genetics
3.
Front Genet ; 11: 552490, 2020.
Article in English | MEDLINE | ID: mdl-33193626

ABSTRACT

MicroRNAs (miRNAs) are ubiquitous regulators of gene expression, evolutionarily conserved in plants and mammals. In recent years, although a growing number of papers debate the role of plant miRNAs in human gene expression, the molecular mechanisms through which this effect is achieved are still not completely elucidated. Some evidence suggests that this interaction might be sequence specific, and in this work, we investigated this possibility by transcriptomic and bioinformatics approaches. Plant and human miRNA sequences from primary databases were collected and compared for their similarities (global or local alignments). Out of 2,588 human miRNAs, 1,606 showed a perfect match of their seed sequence with the 5' end of 3,172 plant miRNAs. Further selections were applied based on the role of the human target genes or of the miRNA in cell cycle regulation (as an oncogene, tumor suppressor, or a biomarker for prognosis or diagnosis in cancer). Based on these criteria, 20 human miRNAs were selected as potential functional analogs of 7 plant miRNAs, which were in turn transfected in different cell lines to evaluate their effect on cell proliferation. A significant decrease was observed in the colorectal carcinoma HCT116 cell line. RNA-Seq demonstrated that 446 genes were differentially expressed 72 h after transfection. Notably, we demonstrated that the plant mtr-miR-5754 and gma-miR4995 directly target the tumor-associated long non-coding RNA metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) and nuclear paraspeckle assembly transcript 1 (NEAT1) in a sequence-specific manner. In conclusion, in line with other recent discoveries, our study strengthens and expands the hypothesis that plant miRNAs can have a regulatory effect in mammals by targeting both protein-coding and non-coding RNA, thus suggesting new biotechnological applications.
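
The seed comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the seed is nucleotides 2-8 of the miRNA (a common convention that the abstract does not state) and that a "perfect match" means the same 7-mer sits at the same position in the plant miRNA. The function names and the toy sequences are hypothetical.

```python
def seed(mirna: str, start: int = 1, end: int = 8) -> str:
    """Return the assumed seed region (nucleotides 2-8, 1-based) as RNA."""
    # Normalise case and DNA-style input (T) to RNA (U) before slicing.
    return mirna.upper().replace("T", "U")[start:end]


def seed_matches(human: dict, plant: dict) -> dict:
    """Map each human miRNA name to the plant miRNAs sharing its seed."""
    # Index plant miRNAs by seed so each human miRNA needs one lookup.
    plant_by_seed = {}
    for name, seq in plant.items():
        plant_by_seed.setdefault(seed(seq), []).append(name)

    matches = {}
    for name, seq in human.items():
        hits = plant_by_seed.get(seed(seq))
        if hits:
            matches[name] = sorted(hits)
    return matches


# Toy sequences, invented for illustration only (not real miRNAs).
TOY_HUMAN = {
    "hs_toy_1": "UAGCAGCACGUAAAUAUUGGCG",
    "hs_toy_2": "CCCCCCCCCCCCCCCCCCCCCC",
}
TOY_PLANT = {
    "pl_toy_1": "AAGCAGCACCCCCCCCCCCCC",
    "pl_toy_2": "GGGGGGGGGGGGGGGGGGGGG",
}

if __name__ == "__main__":
    print(seed_matches(TOY_HUMAN, TOY_PLANT))
```

Indexing the plant sequences by seed keeps the comparison linear in the number of sequences, which matters at the scale reported here (thousands of miRNAs on each side).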

4.
Nucleic Acids Res ; 48(W1): W200-W207, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32402076

ABSTRACT

High-Throughput Sequencing technologies are transforming many research fields, including the analysis of phage display libraries. The phage display technology coupled with deep sequencing was introduced more than a decade ago and holds the potential to circumvent the traditional laborious picking and testing of individual phage rescued clones. However, from a bioinformatics point of view, the analysis of this kind of data was always performed by adapting tools designed for other purposes, thus not considering the noise background typical of the 'interactome sequencing' approach and the heterogeneity of the data. InteractomeSeq is a web server allowing data analysis of protein domains ('domainome') or epitopes ('epitome') from either Eukaryotic or Prokaryotic genomic phage libraries generated and selected by following an Interactome sequencing approach. InteractomeSeq allows users to upload raw sequencing data and to obtain an accurate characterization of domainome/epitome profiles after setting the parameters required to tune the analysis. The release of this tool is relevant for the scientific and clinical community, because InteractomeSeq will fill an existing gap in the field of large-scale biomarkers profiling, reverse vaccinology, and structural/functional studies, thus contributing essential information for gene annotation or antigen identification. InteractomeSeq is freely available at https://InteractomeSeq.ba.itb.cnr.it/.


Subject(s)
Cell Surface Display Techniques , Epitopes , High-Throughput Nucleotide Sequencing , Protein Domains , Software , Bacteriophages/genetics , Internet
5.
Brain Sci ; 9(10)2019 Oct 22.
Article in English | MEDLINE | ID: mdl-31652596

ABSTRACT

Attention Deficit Hyperactivity Disorder (ADHD) is a childhood-onset neurodevelopmental disorder, whose etiology and pathogenesis are still largely unknown. In order to uncover novel regulatory networks and molecular pathways possibly related to ADHD, we performed an integrated miRNA and mRNA expression profiling analysis in peripheral blood samples of children with ADHD and age-matched typically developing (TD) children. The expression levels of 13 miRNAs were evaluated with microfluidic qPCR, and differentially expressed (DE) mRNAs were detected on an Illumina HiSeq 2500 genome analyzer. The miRNA targetome was identified using an integrated approach of validated and predicted interaction data extracted from seven different bioinformatic tools. Gene Ontology (GO) and pathway enrichment analyses were carried out. Results showed that six miRNAs (miR-652-3p, miR-942-5p, let-7b-5p, miR-181a-5p, miR-320a, and miR-148b-3p) and 560 genes were significantly DE in children with ADHD compared to TD subjects. After correction for multiple testing, only three miRNAs (miR-652-3p, miR-148b-3p, and miR-942-5p) remained significant. Genes known to be associated with ADHD (e.g., B4GALT2, SLC6A9, TLE1, ANK3, TRIO, TAF1, and SYNE1) were confirmed to be significantly DE in our study. Integrated miRNA and mRNA expression data identified key hubs involved in ADHD. Finally, the GO and pathway enrichment analyses of all DE genes showed their deep involvement in immune functions, reinforcing the hypothesis that an immune imbalance might contribute to the ADHD etiology. Despite the relatively small sample size, in this study we were able to build a complex miRNA-target interaction network in children with ADHD that might help in deciphering the disease pathogenesis. Validation in larger samples should be performed in order to possibly suggest novel therapeutic strategies for treating this complex disease.

6.
BMC Bioinformatics ; 19(Suppl 10): 350, 2018 Oct 15.
Article in English | MEDLINE | ID: mdl-30367585

ABSTRACT

BACKGROUND: High throughput technologies have provided the scientific community an unprecedented opportunity for large-scale analysis of genomes. Non-coding RNAs (ncRNAs), for a long time believed to be non-functional, are emerging as one of the largest and most important families of gene regulators and as key elements for genome maintenance. Functional studies have been able to assign to ncRNAs a wide spectrum of functions in primary biological processes, and for this reason they are assuming a growing importance as a potential new family of cancer therapeutic targets. Nevertheless, the number of functionally characterized ncRNAs is still too small compared with the number of newly discovered ncRNAs. Thus, platforms able to merge information from available resources while addressing data integration issues are necessary, and those available are still insufficient to elucidate the biological roles of ncRNAs. RESULTS: In this paper, we describe a platform called Arena-Idb for the retrieval of comprehensive and non-redundant annotated ncRNA interactions. Arena-Idb provides a framework for network reconstruction of ncRNA heterogeneous interactions (i.e., with other types of molecules) and relationships with human diseases, which guides the integration of data, extracted from different sources, via mapping of entities and minimization of ambiguity. CONCLUSIONS: Arena-Idb provides a schema and a visualization system to integrate ncRNA interactions that assists in discovering ncRNA functions through the extraction of heterogeneous interaction networks. Arena-Idb is available at http://arenaidb.ba.itb.cnr.it.


Subject(s)
Gene Regulatory Networks , RNA, Untranslated/genetics , Software , Databases, Genetic , Humans , User-Computer Interface
7.
Front Mol Neurosci ; 11: 288, 2018.
Article in English | MEDLINE | ID: mdl-30210287

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a progressive and fatal neurodegenerative disease. While genetics and other factors contribute to ALS pathogenesis, critical knowledge is still missing and validated biomarkers for monitoring the disease activity have not yet been identified. To address those aspects we carried out this study with the primary aim of identifying possible miRNA/mRNA dysregulation associated with the sporadic form of the disease (sALS). Additionally, we explored miRNAs as modulating factors of the observed clinical features. The study included 56 sALS patients and 20 healthy controls (HCs). We analyzed the peripheral blood samples of sALS patients and HCs with high-throughput next-generation sequencing followed by an integrated bioinformatics/biostatistics analysis. Results showed that 38 miRNAs (let-7a-5p, let-7d-5p, let-7f-5p, let-7g-5p, let-7i-5p, miR-103a-3p, miR-106b-3p, miR-128-3p, miR-130a-3p, miR-130b-3p, miR-144-5p, miR-148a-3p, miR-148b-3p, miR-15a-5p, miR-15b-5p, miR-151a-5p, miR-151b, miR-16-5p, miR-182-5p, miR-183-5p, miR-186-5p, miR-22-3p, miR-221-3p, miR-223-3p, miR-23a-3p, miR-26a-5p, miR-26b-5p, miR-27b-3p, miR-28-3p, miR-30b-5p, miR-30c-5p, miR-342-3p, miR-425-5p, miR-451a, miR-532-5p, miR-550a-3p, miR-584-5p, miR-93-5p) were significantly downregulated in sALS. We also found that different miRNA profiles characterized the bulbar/spinal onset and the progression rate. This observation supports the hypothesis that miRNAs may impact the phenotypic expression of the disease. Genes known to be associated with ALS (e.g., PARK7, C9orf72, ALS2, MATR3, SPG11, ATXN2) were confirmed to be dysregulated in our study. We also identified other potential candidate genes like LGALS3 (implicated in neuroinflammation) and PRKCD (activated in mitochondrial-induced apoptosis). Some of the downregulated genes are involved in molecular binding to ions (i.e., metals, zinc, magnesium) and in ion-related functions. The genes that we found upregulated were involved in the immune response, oxidation-reduction, and apoptosis. These findings may have important implications for monitoring, e.g., sALS progression and therefore represent a significant advance in the elucidation of the disease's underlying molecular mechanisms. The extensive multidisciplinary approach we applied in this study was critically important for its success, especially in complex disorders such as sALS, wherein access to genetic background is a major limitation.

8.
PeerJ ; 6: e4845, 2018.
Article in English | MEDLINE | ID: mdl-29915686

ABSTRACT

Nowadays DNA meta-barcoding is a powerful instrument capable of quickly discovering the biodiversity of an environmental sample by integrating the DNA barcoding approach with High Throughput Sequencing technologies. It mainly consists of the parallel reading of one or more informative genomic fragments able to discriminate living entities. Although this approach has been widely studied, several of its essential steps still need optimization. A fundamental element concerns the standardization of bioinformatic analysis pipelines. The aim of the present study was to highlight a number of critical parameters of laboratory material preparation and taxonomic assignment pipelines in DNA meta-barcoding experiments using the cytochrome oxidase subunit-I (coxI) barcode region, known as a suitable molecular marker for animal species identification. We compared nine taxonomic assignment pipelines, including a custom in-house method, based on Hidden Markov Models. Moreover, we evaluated the potential influence of universal primer amplification bias in qPCR, as well as the correlation between GC content and taxonomic assignment results. The pipelines were tested on a community of known terrestrial invertebrates collected by pitfall traps from a chestnut forest in Italy. Although the present analysis was not exhaustive and needs additional investigation, our results suggest some potential improvements in laboratory material preparation and the introduction of additional parameters in taxonomic assignment pipelines. These include the correct setup of the OTU clustering threshold, the calibration of GC content affecting sequencing quality and taxonomic classification, as well as the evaluation of PCR primer amplification bias on the final biodiversity pattern. Thus, careful attention and further validation/optimization of the above-mentioned variables would be required in a DNA meta-barcoding experimental routine.
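
GC content, one of the variables examined in this abstract, is the fraction of G and C bases in a sequence. The sketch below shows one simple way to compute it. It is not the authors' code: the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption about how they should be handled.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G/C among unambiguous bases (A, C, G, T)."""
    # Assumption: ambiguous symbols (e.g. N) are excluded from the denominator.
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        return 0.0  # Avoid division by zero for empty or fully ambiguous input.
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)


if __name__ == "__main__":
    # Toy barcode fragment, invented for illustration only.
    print(gc_content("ATGCGCNNTA"))
```

Computed per read or per OTU, this value could then be compared against assignment outcomes, which is the kind of correlation the abstract describes.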

9.
Nucleic Acids Res ; 46(D1): D127-D132, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29036529

ABSTRACT

A holistic understanding of environmental communities is the new challenge of metagenomics. Accordingly, the amplicon-based or metabarcoding approach, largely applied to investigate bacterial microbiomes, is moving to the eukaryotic world too. Indeed, the analysis of metabarcoding data may provide a comprehensive assessment of both bacterial and eukaryotic composition in a variety of environments, including the human body. In this respect, whereas hypervariable regions of the 16S rRNA are the de facto standard barcode for bacteria, the Internal Transcribed Spacer 1 (ITS1) of the ribosomal RNA gene cluster has shown a high potential in discriminating eukaryotes at deep taxonomic levels. As metabarcoding data analysis relies on the availability of a well-curated barcode reference resource, a comprehensive collection of ITS1 sequences supplied with robust taxonomies is highly needed. To address this issue, we created ITSoneDB (available at http://itsonedb.cloud.ba.infn.it/) which in its current version hosts 985 240 ITS1 sequences spanning over 134 000 eukaryotic species. Each ITS1 is mapped on the NCBI reference taxonomy with its start and end positions precisely annotated. ITSoneDB has been developed in accordance with the FAIR guidelines by enabling the users to query and download its content through a simple web-interface and access relevant metadata by cross-linking to the European Nucleotide Archive.


Subject(s)
DNA, Ribosomal Spacer/genetics , Databases, Nucleic Acid , RNA, Ribosomal/genetics , Animals , DNA Barcoding, Taxonomic , Eukaryota/genetics , Humans , Internet , Metagenomics/methods , Metagenomics/trends , Molecular Sequence Annotation , Multigene Family , User-Computer Interface
10.
Hum Mol Genet ; 27(1): 66-79, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29087462

ABSTRACT

Multiple sclerosis (MS) is a complex disease of the CNS that usually affects young adults, although 3-5% of cases are diagnosed in childhood and adolescence (hence called pediatric MS, PedMS). Genetic predisposition, among other factors, seems to contribute to the risk of onset, in pediatric as in adult ages, but few studies have investigated the genetic 'environmentally naïve' load of PedMS. The main goal of this study was to identify circulating markers (miRNAs), target genes (mRNAs) and functional pathways associated with PedMS; we also verified the impact of miRNAs on clinical features, i.e. disability and cognitive performance. The investigation was performed in 19 PedMS patients and 20 pediatric controls (PCs) using a High-Throughput Next-generation Sequencing (HT-NGS) approach followed by an integrated bioinformatics/biostatistics analysis. Twelve miRNAs were significantly upregulated (let-7a-5p, let-7b-5p, miR-25-3p, miR-125a-5p, miR-942-5p, miR-221-3p, miR-652-3p, miR-182-5p, miR-185-5p, miR-181a-5p, miR-320a, miR-99b-5p) and 1 miRNA was downregulated (miR-148b-3p) in PedMS compared with PCs. The interactions between the significant miRNAs and their targets uncovered predicted genes (i.e. TNFSF13B, TLR2, BACH2, KLF4) related to immunological functions, as well as genes involved in autophagy-related processes (i.e. ATG16L1, SORT1, LAMP2) and ATPase activity (i.e. ABCA1, GPX3). No significant molecular profiles were associated with any PedMS demographic/clinical features. Both miRNA and mRNA expression predicted the phenotypes (PedMS-PC) with an accuracy of 92% and 91%, respectively. In our view, this original strategy of concurrent miRNA/mRNA analysis may help to shed light on the genetic background of the disease, suggesting further molecular investigations into novel pathogenic mechanisms.


Subject(s)
Multiple Sclerosis/genetics , Sequence Analysis, RNA/methods , Adolescent , Biomarkers , Child , Child, Preschool , Computational Biology , Female , Gene Expression Regulation/genetics , Genetic Predisposition to Disease/genetics , Humans , Kruppel-Like Factor 4 , Male , MicroRNAs/genetics , RNA, Messenger/genetics , Transcriptome/genetics
12.
Med Sci (Basel) ; 5(3)2017 Sep 18.
Article in English | MEDLINE | ID: mdl-29099035

ABSTRACT

Extracellular vesicles (EVs), nanoparticles originating from different cell types, seem to be implicated in several cellular activities. In the Central Nervous System (CNS), glia and neurons secrete EVs, and recent studies have demonstrated that the intercellular communication mediated by EVs has a versatile functional impact on cerebral homeostasis. This essential role may be due to their protein and RNA cargo, which possibly modifies the phenotypes of the targeted cells. Despite the increasing importance of EVs, little is known about their fluctuations in physiological as well as in pathological conditions. Furthermore, only a few studies have investigated the contents of both EV subgroups (microvesicles, MVs, and exosomes, EXOs) at the same time with the purpose of discriminating between their features and functional roles. In order to possibly shed light on these issues, we performed a pilot study in which MVs and EXOs extracted from serum samples of a small cohort of subjects (patients with the first clinical evidence of CNS demyelination, also known as Clinically Isolated Syndrome, and Healthy Controls) were submitted to deep small-RNA sequencing. Data were analysed by an in-house bioinformatics platform. In line with previous reports, distinct classes of non-coding RNAs have been detected in both EV subsets, offering interesting suggestions on their origins and functions. We also verified the feasibility of this extensive molecular approach, thus supporting its valuable use for the analysis of circulating biomarkers (e.g., microRNAs) in order to investigate and monitor specific diseases.

13.
Nucleic Acids Res ; 45(W1): W109-W115, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28460063

ABSTRACT

The structural and conformational organization of chromosomes is crucial for gene expression regulation in both eukaryotes and prokaryotes. To date, gene expression data generated using either microarray or RNA-sequencing are available for many bacterial genomes. However, differential gene expression is usually investigated with methods considering each gene independently, thus not taking into account the physical localization of genes along a bacterial chromosome. Here, we present WoPPER, a web tool integrating gene expression and genomic annotations to identify differentially expressed chromosomal regions in bacteria. RNA-sequencing or microarray-based gene expression data are provided as input, along with gene annotations. The user can select genomic annotations from an internal database including 2780 bacterial strains, or provide custom genomic annotations. The analysis produces as output the lists of positionally related genes showing a coordinated trend of differential expression. Graphical representations, including a circular plot of the analyzed chromosome, allow intuitive browsing of the results. The analysis procedure is based on our previously published R-package PREDA. The release of this tool is timely and relevant for the scientific community, as WoPPER will fill an existing gap in prokaryotic gene expression data analysis and visualization tools. WoPPER is open to all users and can be reached at the following URL: https://WoPPER.ba.itb.cnr.it.


Subject(s)
Bacteria/genetics , Gene Expression Profiling , Genome, Bacterial , Software , Bacteria/metabolism , Chromosomes, Bacterial , Gene Expression , Genomics , Internet , Molecular Sequence Annotation
14.
BMC Genomics ; 14: 855, 2013 Dec 05.
Article in English | MEDLINE | ID: mdl-24308330

ABSTRACT

BACKGROUND: Recent studies have demonstrated an unexpected complexity of transcription in eukaryotes. The majority of the genome is transcribed and only a small fraction of these transcripts is annotated as protein coding genes and their splice variants. Indeed, most transcripts are the result of antisense, overlapping and non-coding RNA expression. In this frame, one of the key aims of high throughput transcriptome sequencing is the detection of all RNA species present in the cell, and the first crucial step for RNA-seq users is the choice of the strategy for cDNA library construction. The protocols developed so far involve using the entire library for a single sequencing run with a specific platform. RESULTS: We set up a unique protocol to generate and amplify a strand-specific cDNA library representative of all RNA species that may be implemented with all major platforms currently available on the market (Roche 454, Illumina, ABI/SOLiD). Our method is reproducible, fast, easy to perform and even allows starting from low-input total RNA. Furthermore, we provide a suitable bioinformatics tool for the analysis of the sequences produced following this protocol. CONCLUSION: We tested the efficiency of our strategy, showing that our method is platform-independent, thus allowing the simultaneous analysis of the same sample with different NGS technologies, and providing an accurate quantitative and qualitative portrait of complex whole transcriptomes.


Subject(s)
Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA/methods , Transcriptome , Animals , Cell Line, Tumor , Chromosome Mapping , Expressed Sequence Tags , Gene Expression Regulation , Heterografts , Humans , Mice , Molecular Sequence Annotation
15.
Brief Bioinform ; 13(6): 682-95, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22786784

ABSTRACT

Metagenomics is providing an unprecedented access to the environmental microbial diversity. The amplicon-based metagenomics approach involves the PCR-targeted sequencing of a genetic locus fitting different features. Namely, it must be ubiquitous in the taxonomic range of interest, variable enough to discriminate between different species but flanked by highly conserved sequences, and of suitable size to be sequenced through next-generation platforms. The internal transcribed spacers 1 and 2 (ITS1 and ITS2) of the ribosomal DNA operon and one or more hyper-variable regions of 16S ribosomal RNA gene are typically used to identify fungal and bacterial species, respectively. In this context, reliable reference databases and taxonomies are crucial to assign amplicon sequence reads to the correct phylogenetic ranks. Several resources provide consistent phylogenetic classification of publicly available 16S ribosomal DNA sequences, whereas the state of ribosomal internal transcribed spacers reference databases is notably less advanced. In this review, we aim to give an overview of existing reference resources for both types of markers, highlighting strengths and possible shortcomings of their use for metagenomics purposes. Moreover, we present a new database, ITSoneDB, of well annotated and phylogenetically classified ITS1 sequences to be used as a reference collection in metagenomic studies of environmental fungal communities. ITSoneDB is available for download and browsing at http://itsonedb.ba.itb.cnr.it/.


Subject(s)
Databases, Genetic , Metagenomics/methods , Algorithms , Fungi/classification , Fungi/genetics , Genes, rRNA , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 16S/metabolism
16.
BMC Bioinformatics ; 13 Suppl 4: S21, 2012 Mar 28.
Article in English | MEDLINE | ID: mdl-22536968

ABSTRACT

BACKGROUND: It is known from recent studies that more than 90% of human multi-exon genes are subject to Alternative Splicing (AS), a key molecular mechanism in which multiple transcripts may be generated from a single gene. It is widely recognized that a breakdown in AS mechanisms plays an important role in cellular differentiation and pathologies. Polymerase Chain Reactions, microarrays and sequencing technologies have been applied to the study of transcript diversity arising from alternative expression. Latest-generation Affymetrix GeneChip Human Exon 1.0 ST Arrays offer a more detailed view of the gene expression profile, providing information on the AS patterns. The exon array technology, with more than five million data points, can detect approximately one million exons, and it allows analyses at both gene and exon level. In this paper we describe BEAT, an integrated user-friendly bioinformatics framework to store, analyze and visualize exon array datasets. It combines a data warehouse approach with some rigorous statistical methods for assessing the AS of genes involved in diseases. Meta statistics are proposed as a novel approach to explore the analysis results. BEAT is available at http://beat.ba.itb.cnr.it. RESULTS: BEAT is a web tool which allows uploading and analyzing exon array datasets using standard statistical methods and an easy-to-use graphical web front-end. BEAT has been tested on a dataset with 173 samples and tuned using new datasets of exon array experiments from 28 colorectal cancer and 26 renal cell cancer samples produced at the Medical Genetics Unit of IRCCS Casa Sollievo della Sofferenza. To highlight all possible AS events, alternative names, accession IDs, Gene Ontology terms and biochemical pathway annotations are integrated with exon and gene level expression plots. The user can customize the results by choosing custom thresholds for the statistical parameters and exploiting the available clinical data of the samples for a multivariate AS analysis. CONCLUSIONS: Despite exon array chips being widely used for transcriptomics studies, there is a lack of analysis tools offering advanced statistical features and requiring no programming knowledge. BEAT provides a user-friendly platform for a comprehensive study of AS events in human diseases, displaying the analysis results with easily interpretable and interactive tables and graphics.


Subject(s)
Databases, Genetic , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Alternative Splicing , Carcinoma, Renal Cell/genetics , Colorectal Neoplasms/genetics , Humans , Internet , Kidney Neoplasms/genetics
17.
BMC Bioinformatics ; 13 Suppl 4: S4, 2012 Mar 28.
Article in English | MEDLINE | ID: mdl-22536971

ABSTRACT

BACKGROUND: In the scientific biodiversity community, the need to build a bridge between molecular and traditional biodiversity studies is increasingly perceived. We believe that information technology could have a preeminent role in integrating the information generated by these studies with the large amount of molecular data we can find in bioinformatics public databases. This work is primarily aimed at building a bioinformatic infrastructure for the integration of public and private biodiversity data through the development of GIDL, an Intelligent Data Loader coupled with the Molecular Biodiversity Database. The system presented here organizes in an ontological way and locally stores the sequence and annotation data contained in the GenBank primary database. METHODS: The GIDL architecture consists of a relational database and of an intelligent data loader software. The relational database schema is designed to manage biodiversity information (Molecular Biodiversity Database) and it is organized in four areas: MolecularData, Experiment, Collection and Taxonomy. The MolecularData area is inspired by an established standard in Generic Model Organism Databases, the Chado relational schema. The peculiarity of Chado, and also its strength, is the adoption of an ontological schema which makes use of the Sequence Ontology. The Intelligent Data Loader (IDL) component of GIDL is an Extract, Transform and Load software able to parse data, to discover hidden information in the GenBank entries and to populate the Molecular Biodiversity Database. The IDL is composed of three main modules: the Parser, able to parse GenBank flat files; the Reasoner, which automatically builds CLIPS facts mapping the biological knowledge expressed by the Sequence Ontology; the DBFiller, which translates the CLIPS facts into ordered SQL statements used to populate the database. In GIDL, Semantic Web technologies have been adopted due to their advantages in data representation, integration and processing. RESULTS AND CONCLUSIONS: Entries coming from the Virus (814,122), Plant (1,365,360) and Invertebrate (959,065) divisions of GenBank rel.180 have been loaded into the Molecular Biodiversity Database by GIDL. Our system, combining the Sequence Ontology and the Chado schema, allows a more powerful query expressiveness compared with the most commonly used sequence retrieval systems like Entrez or SRS.
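
The three-stage loader described in this abstract (Parser, Reasoner, DBFiller) follows a general Extract, Transform and Load pattern, which can be sketched in a few lines. The code below is only an illustration of that pattern and makes several assumptions: the input is a simplified toy format rather than real GenBank flat files, the "facts" are plain Python tuples rather than CLIPS facts, and the single `fact` table is a hypothetical stand-in for the Chado-inspired schema. All function, table and record names are invented.

```python
import sqlite3

# Toy input, invented for illustration; NOT the real GenBank flat-file format.
TOY_FLAT_FILE = """\
ACCESSION X00001
ORGANISM Toy virus
DIVISION VRL
//
ACCESSION X00002
ORGANISM Toy plant
DIVISION PLN
//
"""


def parse(flat_text):
    """Parser stage: split '//'-terminated blocks into key/value records."""
    records = []
    for block in flat_text.split("//"):
        record = {}
        for line in block.strip().splitlines():
            parts = line.split(None, 1)  # keyword, then the rest of the line
            if len(parts) == 2:
                record[parts[0]] = parts[1].strip()
        if record:
            records.append(record)
    return records


def reason(records):
    """Reasoner stage (stand-in): turn records into (subject, predicate, object) facts."""
    facts = []
    for record in records:
        accession = record["ACCESSION"]
        for key, value in record.items():
            if key != "ACCESSION":
                facts.append((accession, key.lower(), value))
    return facts


def fill(conn, facts):
    """DBFiller stage: translate facts into parameterised SQL inserts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO fact VALUES (?, ?, ?)", facts)
    conn.commit()


def run_pipeline(flat_text):
    """Run the three stages against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    fill(conn, reason(parse(flat_text)))
    return conn


if __name__ == "__main__":
    db = run_pipeline(TOY_FLAT_FILE)
    for row in db.execute("SELECT * FROM fact ORDER BY subject, predicate"):
        print(row)
```

Keeping the stages as separate functions mirrors the modular design the abstract describes, so that any one stage could be replaced (for example, by a real flat-file parser) without touching the others.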


Subject(s)
Biodiversity , Computational Biology/methods , Databases, Nucleic Acid , Expert Systems , Animals , Internet , Software
18.
Curr Protein Pept Sci ; 12(5): 448-54, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21418024

ABSTRACT

PlantPIs is a web querying system for a database collection of plant protease inhibitor data. Protease inhibitors in plants are naturally occurring proteins that inhibit the function of endogenous and exogenous proteases. In this paper, the design and development of a web framework providing a clear and very flexible way of querying plant protease inhibitor data is reported. The web resource is based on a relational database containing publicly accessible data on plant protease inhibitors, and a graphical user interface providing all the necessary browsing tools, including a data export function. PlantPIs contains information extracted principally from the MEROPS database, filtered, annotated and compared with data stored in other public protein and gene databases, using both automated techniques and domain expert evaluations. The data are organized to allow flexible and easy access to the stored information. The database is accessible at http://www.plantpis.ba.itb.cnr.it/.


Subject(s)
Databases, Factual , Internet , Plant Proteins , Plants , Protease Inhibitors/metabolism , Health Resources , Plant Proteins/genetics , Plant Proteins/metabolism , Plants/enzymology , Plants/genetics , User-Computer Interface
19.
Nucleic Acids Res ; 39(Database issue): D80-5, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21051348

ABSTRACT

Alternative splicing is emerging as a major mechanism for the expansion of transcriptome and proteome diversity, particularly in humans and other vertebrates. However, the proportion of alternative transcripts and proteins actually endowed with functional activity is currently highly debated. We present here a new release of ASPicDB, which now provides a unique annotation resource of human protein variants generated by alternative splicing. A total of 256,939 protein variants from 17,191 multi-exon genes have been extensively annotated through state-of-the-art machine learning tools providing information on the protein type (globular and transmembrane), localization, presence of PFAM domains, signal peptides, GPI-anchor propeptides, transmembrane and coiled-coil segments. Furthermore, full-length variants can now be specifically selected based on the annotation of CAGE-tags and polyA signals and/or polyA sites, marking transcription initiation and termination sites, respectively. The retrieval can be carried out at gene, transcript, exon, protein or splice site level, allowing the selection of data sets fulfilling one or more features set by the user. The retrieval interface also enables the selection of protein variants showing specific differences in the annotated features. ASPicDB is available at http://www.caspur.it/ASPicDB/.


Subject(s)
Alternative Splicing , Databases, Genetic , Proteins/chemistry , Proteins/genetics , Exons , Genetic Variation , Humans , Protein Isoforms/chemistry , Protein Isoforms/genetics , RNA, Messenger/chemistry , Sequence Analysis, Protein , User-Computer Interface
20.
Nucleic Acids Res ; 38(Database issue): D75-80, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19880380

ABSTRACT

The 5' and 3' untranslated regions of eukaryotic mRNAs (UTRs) play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5' and 3' untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated and also collated in the UTRsite database, where more specific information on the functional motifs and cross-links to interacting regulatory proteins are provided. In the current update, the UTR entries have been organized in a gene-centric structure to better visualize and retrieve 5' and 3' UTR variants generated by alternative initiation and termination of transcription and by alternative splicing. Experimentally validated miRNA targets and conserved sequence elements are also annotated. The integration of UTRdb with genomic data has allowed the implementation of an efficient annotation system and a powerful retrieval resource for the selection and extraction of specific UTR subsets. All internet resources implemented for retrieval and functional analysis of 5' and 3' untranslated regions of eukaryotic mRNAs are accessible at http://utrdb.ba.itb.cnr.it/.


Subject(s)
3' Untranslated Regions , 5' Untranslated Regions , Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Algorithms , Animals , Computational Biology/trends , Databases, Protein , Genome, Plant , Humans , Information Storage and Retrieval/methods , Internet , Protein Isoforms , Software , User-Computer Interface