Search | BVS Search Portal

1.

UTRdb 2.0: a comprehensive, expert curated catalog of eukaryotic mRNAs untranslated regions.

Lo Giudice, Claudio; Zambelli, Federico; Chiara, Matteo; Pavesi, Giulio; Tangaro, Marco Antonio; Picardi, Ernesto; Pesole, Graziano.

Nucleic Acids Res ; 51(D1): D337-D344, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36399486

ABSTRACT

The 5' and 3' untranslated regions of eukaryotic mRNAs (UTRs) play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization, and message stability. Since 1996, we have developed and maintained UTRdb, a specialized database of UTR sequences. Here we present UTRdb 2.0, a major update of UTRdb featuring an extensive collection of eukaryotic 5' and 3' UTR sequences, including over 26 million entries from over 6 million genes and 573 species, enriched with a curated set of functional annotations. Annotations include CAGE tags and polyA signals to label the completeness of 5' and 3'UTRs, respectively. In addition, uORFs and IRES are annotated in 5'UTRs as well as experimentally validated miRNA targets in 3'UTRs. Further annotations include evolutionarily conserved blocks, Rfam motifs, ADAR-mediated RNA editing events, and m6A modifications. A web interface allowing a flexible selection and retrieval of specific subsets of UTRs, selected according to a combination of criteria, has been implemented which also provides comprehensive download facilities. UTRdb 2.0 is accessible at http://utrdb.cloud.ba.infn.it/utrdb/.

Subjects

Nucleic Acid Databases, Eukaryota, Messenger RNA, Untranslated Regions, 3' Untranslated Regions/genetics, 5' Untranslated Regions, Eukaryota/genetics, Eukaryotic Cells/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism
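
The abstract above mentions that uORFs are annotated in 5'UTRs. As a minimal sketch of what such an annotation involves (not the UTRdb pipeline, whose method is not described here), the following scans a 5'UTR for ATG-initiated open reading frames that end at an in-frame stop codon inside the UTR. The function name `find_uorfs` and the `min_codons` parameter are hypothetical, chosen for illustration.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5, min_codons=2):
    """Return (start, end) pairs, 0-based and half-open, for every
    ATG...stop ORF that lies entirely inside the given 5'UTR.

    Illustrative assumption: only uORFs that terminate within the UTR
    are reported; ORFs running into the main coding sequence are ignored.
    """
    seq = utr5.upper().replace("U", "T")  # accept RNA or DNA alphabets
    uorfs = []
    # A zero-width lookahead finds every ATG, including overlapping ones.
    for match in re.finditer(r"(?=ATG)", seq):
        start = match.start()
        # Walk codon by codon, in frame with the ATG, until a stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos + 3 - start) // 3  # includes ATG and stop
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3))
                break
    return uorfs
```

For example, `find_uorfs("GGATGAAATAGCC")` returns `[(2, 11)]`: the ATG at position 2 is followed in frame by AAA and the stop codon TAG, while the out-of-frame TGA at position 3 is ignored.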

2.

BALB/c and C57BL/6 Mice Differ in Polyreactive IgA Abundance, which Impacts the Generation of Antigen-Specific IgA and Microbiota Diversity.

Fransen, Floris; Zagato, Elena; Mazzini, Elisa; Fosso, Bruno; Manzari, Caterina; El Aidy, Sahar; Chiavelli, Andrea; D'Erchia, Anna Maria; Sethi, Maya K; Pabst, Oliver; Marzano, Marinella; Moretti, Silvia; Romani, Luigina; Penna, Giuseppe; Pesole, Graziano; Rescigno, Maria.

Immunity ; 43(3): 527-40, 2015 Sep 15.

Article in English | MEDLINE | ID: mdl-26362264

ABSTRACT

The interrelationship between IgAs and microbiota diversity is still unclear. Here we show that BALB/c mice had higher abundance and diversity of IgAs than C57BL/6 mice and that this correlated with increased microbiota diversity. We show that polyreactive IgAs mediated the entrance of non-invasive bacteria to Peyer's patches, independently of CX3CR1(+) phagocytes. This allowed the induction of bacteria-specific IgA and the establishment of a positive feedback loop of IgA production. Cohousing of mice or fecal transplantation had little or no influence on IgA production and had only partial impact on microbiota composition. Germ-free BALB/c, but not C57BL/6, mice already had polyreactive IgAs that influenced microbiota diversity and selection after colonization. Together, these data suggest that genetic predisposition to produce polyreactive IgAs has a strong impact on the generation of antigen-specific IgAs and the selection and maintenance of microbiota diversity.

Subjects

Bacterial Antigens/immunology, Genetic Variation/immunology, Immunoglobulin A/immunology, Microbiota/immunology, Animals, Bacteria/classification, Bacteria/genetics, Bacteria/immunology, Bacterial DNA/chemistry, Bacterial DNA/genetics, Feces/microbiology, Flow Cytometry, Host-Pathogen Interactions/immunology, Immunization, Immunoglobulin A/blood, Immunoglobulin A/metabolism, Metagenomics/methods, Inbred BALB C Mice, Inbred C57BL Mice, Microbiota/genetics, Peyer's Patches/immunology, Peyer's Patches/metabolism, Peyer's Patches/microbiology, Phylogeny, 16S Ribosomal RNA/genetics, Salmonella typhimurium/genetics, Salmonella typhimurium/immunology, Salmonella typhimurium/physiology, Species Specificity

3.

Unraveling C-to-U RNA editing events from direct RNA sequencing.

Fonzino, Adriano; Manzari, Caterina; Spadavecchia, Paola; Munagala, Uday; Torrini, Serena; Conticello, Silvestro; Pesole, Graziano; Picardi, Ernesto.

RNA Biol ; 21(1): 1-14, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38090878

ABSTRACT

In mammals, RNA editing events involve the conversion of adenosine (A) to inosine (I) by ADAR enzymes or the hydrolytic deamination of cytosine (C) to uracil (U) by the APOBEC family of enzymes, mostly APOBEC1. RNA editing has a plethora of biological functions, and its deregulation has been associated with various human disorders. While the large-scale detection of A-to-I is quite straightforward using the Illumina RNAseq technology, the identification of C-to-U events is a non-trivial task. This difficulty arises from the rarity of such events in eukaryotic genomes and the challenge of distinguishing them from background noise. Direct RNA sequencing by Oxford Nanopore Technology (ONT) permits the direct detection of Us on sequenced RNA reads. Surprisingly, using ONT reads from wild-type (WT) and APOBEC1-knock-out (KO) murine cell lines as well as in vitro synthesized RNA without any modification, we identified a systematic error affecting the accuracy of C base calls, thereby leading to incorrect identifications of C-to-U events. To overcome this issue in direct RNA reads, here we introduce a novel machine learning strategy based on the Isolation Forest (iForest) algorithm in which C-to-U editing events are considered as sequencing anomalies. Using in vitro synthesized and human ONT reads, our model optimizes the signal-to-noise ratio, improving the detection of C-to-U editing sites with accuracy above 90% in all samples tested. Our results suggest that iForest, known for its rapid implementation and minimal memory requirements, is a promising tool to denoise ONT reads and reliably identify RNA modifications.

Subjects

RNA Editing, RNA, Mice, Animals, Humans, RNA/genetics, Base Sequence, APOBEC Deaminases/genetics, Mammals/genetics, RNA Sequence Analysis
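
The abstract above treats C-to-U editing events as anomalies to be found with an Isolation Forest. The sketch below illustrates only the general idea of that algorithm in pure Python: points that random axis-aligned splits isolate in few steps receive a high anomaly score. It is not the authors' model. The per-site feature vectors are hypothetical, and for brevity the random partitioning is evaluated lazily for each scored point instead of building and reusing explicit trees, which is a simplification of the standard algorithm.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def avg_path_length(n):
    """Expected path length c(n) of an unsuccessful binary-tree search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def path_length(x, sample, rng, depth, limit):
    """Number of random splits needed to isolate x within sample (capped at limit)."""
    if depth >= limit or len(sample) <= 1:
        return depth + avg_path_length(len(sample))
    feature = rng.randrange(len(x))
    lo = min(p[feature] for p in sample)
    hi = max(p[feature] for p in sample)
    if lo == hi:  # cannot split on a constant feature; stop here (simplification)
        return depth + avg_path_length(len(sample))
    split = rng.uniform(lo, hi)
    # Keep only the points that fall on the same side of the split as x.
    same_side = [p for p in sample if (p[feature] < split) == (x[feature] < split)]
    return path_length(x, same_side, rng, depth + 1, limit)


def anomaly_scores(points, n_trees=200, sample_size=64, seed=0):
    """Isolation-Forest-style scores in (0, 1]; higher means more anomalous."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    psi = min(sample_size, len(points))
    limit = math.ceil(math.log2(psi))
    scores = []
    for x in points:
        total = 0.0
        for _ in range(n_trees):
            sample = rng.sample(points, psi)
            total += path_length(x, sample, rng, 0, limit)
        scores.append(2.0 ** (-(total / n_trees) / avg_path_length(psi)))
    return scores


# Hypothetical per-site features (two arbitrary numeric descriptors per site);
# the last site is deliberately placed far from the other twenty.
sites = [(0.02 + 0.001 * i, 0.30 + 0.002 * i) for i in range(20)] + [(0.60, 0.05)]
scores = anomaly_scores(sites)
most_anomalous = scores.index(max(scores))
```

In this toy input the isolated last site is the one flagged as most anomalous. A real application would need features derived from the sequencing data and a score threshold, neither of which is specified in the abstract.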

4.

Next generation sequencing of SARS-CoV-2 genomes: challenges, applications and opportunities.

Chiara, Matteo; D'Erchia, Anna Maria; Gissi, Carmela; Manzari, Caterina; Parisi, Antonio; Resta, Nicoletta; Zambelli, Federico; Picardi, Ernesto; Pavesi, Giulio; Horner, David S; Pesole, Graziano.

Brief Bioinform ; 22(2): 616-630, 2021 03 22.

Article in English | MEDLINE | ID: mdl-33279989

ABSTRACT

Various next generation sequencing (NGS) based strategies have been successfully used in the recent past for tracing origins and understanding the evolution of infectious agents, investigating the spread and transmission chains of outbreaks, as well as facilitating the development of effective and rapid molecular diagnostic tests and contributing to the hunt for treatments and vaccines. The ongoing COVID-19 pandemic poses one of the greatest global threats in modern history and has already caused severe social and economic costs. The development of efficient and rapid sequencing methods to reconstruct the genomic sequence of SARS-CoV-2, the etiological agent of COVID-19, has been fundamental for the design of diagnostic molecular tests and to devise effective measures and strategies to mitigate the diffusion of the pandemic. Diverse approaches and sequencing methods can, as testified by the number of available sequences, be applied to SARS-CoV-2 genomes. However, each technology and sequencing approach has its own advantages and limitations. In the current review, we will provide a brief, but hopefully comprehensive, account of currently available platforms and methodological approaches for the sequencing of SARS-CoV-2 genomes. We also present an outline of current repositories and databases that provide access to SARS-CoV-2 genomic data and associated metadata. Finally, we offer general advice and guidelines for the appropriate sharing and deposition of SARS-CoV-2 data and metadata, and suggest that more efficient and standardized integration of current and future SARS-CoV-2-related data would greatly facilitate the struggle against this new pathogen. We hope that our 'vademecum' for the production and handling of SARS-CoV-2-related sequencing data, will contribute to this objective.

Subjects

COVID-19/virology, Viral Genome, High-Throughput Nucleotide Sequencing/methods, SARS-CoV-2/genetics, COVID-19/epidemiology, Humans, Pandemics

5.

REDIportal: millions of novel A-to-I RNA editing events from thousands of RNAseq experiments.

Mansi, Luigi; Tangaro, Marco Antonio; Lo Giudice, Claudio; Flati, Tiziano; Kopel, Eli; Schaffer, Amos Avraham; Castrignanò, Tiziana; Chillemi, Giovanni; Pesole, Graziano; Picardi, Ernesto.

Nucleic Acids Res ; 49(D1): D1012-D1019, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33104797

ABSTRACT

RNA editing is a relevant epitranscriptome phenomenon able to increase the transcriptome and proteome diversity of eukaryotic organisms. ADAR-mediated RNA editing is widespread in humans, in which millions of A-to-I changes modify thousands of primary transcripts. RNA editing has pivotal roles in the regulation of gene expression, the modulation of the innate immune response and the functioning of several neurotransmitter receptors. Massive transcriptome sequencing has fostered research in this field. Nonetheless, different aspects of RNA editing biology are still unknown and need to be elucidated. To support the study of A-to-I RNA editing we have updated our REDIportal catalogue, raising its content to about 16 million events detected in 9642 human RNAseq samples from the GTEx project by using a dedicated pipeline based on the HPC version of the REDItools software. REDIportal now allows searches at sample level, provides overviews of RNA editing profiles for each RNAseq experiment, implements a Gene View module to look at individual events in their genic context and hosts the CLAIRE database. Starting with this new version, REDIportal will begin collecting non-human RNA editing changes for comparative genomics investigations. The database is freely available at http://srv00.recas.ba.infn.it/atlas/index.html.

Subjects

Computational Biology/methods, Genetic Databases, Gene Expression Regulation, Proteome/genetics, RNA Editing/genetics, Transcriptome/genetics, Base Sequence/genetics, Data Curation/methods, Data Mining/methods, Gene Expression Profiling/methods, Genomics/methods, Humans, Internet, Proteomics/methods

6.

VID22 counteracts G-quadruplex-induced genome instability.

Galati, Elena; Bosio, Maria C; Novarina, Daniele; Chiara, Matteo; Bernini, Giulia M; Mozzarelli, Alessandro M; García-Rubio, Maria L; Gómez-González, Belén; Aguilera, Andrés; Carzaniga, Thomas; Todisco, Marco; Bellini, Tommaso; Nava, Giulia M; Frigè, Gianmaria; Sertic, Sarah; Horner, David S; Baryshnikova, Anastasia; Manzari, Caterina; D'Erchia, Anna M; Pesole, Graziano; Brown, Grant W; Muzi-Falconi, Marco; Lazzaro, Federico.

Nucleic Acids Res ; 49(22): 12785-12804, 2021 12 16.

Article in English | MEDLINE | ID: mdl-34871443

ABSTRACT

Genome instability is a condition characterized by the accumulation of genetic alterations and is a hallmark of cancer cells. To uncover new genes and cellular pathways affecting endogenous DNA damage and genome integrity, we exploited a Synthetic Genetic Array (SGA)-based screen in yeast. Among the positive genes, we identified VID22, reported to be involved in DNA double-strand break repair. vid22Δ cells exhibit increased levels of endogenous DNA damage, chronic DNA damage response activation and accumulate DNA aberrations in sequences displaying high probabilities of forming G-quadruplexes (G4-DNA). If not resolved, these DNA secondary structures can block the progression of both DNA and RNA polymerases and correlate with chromosome fragile sites. Vid22 binds to and protects DNA at G4-containing regions both in vitro and in vivo. Loss of VID22 causes an increase in gross chromosomal rearrangement (GCR) events dependent on G-quadruplex forming sequences. Moreover, the absence of Vid22 causes defects in the correct maintenance of G4-DNA rich elements, such as telomeres and mtDNA, and hypersensitivity to the G4-stabilizing ligand TMPyP4. We thus propose that Vid22 is directly involved in genome integrity maintenance as a novel regulator of G4 metabolism.

Subjects

G-Quadruplexes, Genomic Instability, Membrane Proteins/physiology, Saccharomyces cerevisiae Proteins/physiology, Chromosome Aberrations, DNA Damage, Fungal Genome, Membrane Proteins/genetics, Membrane Proteins/metabolism, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae Proteins/metabolism, Telomere Homeostasis
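
The abstract above refers to sequences with a high probability of forming G-quadruplexes. It does not say how those sequences were predicted, so the following is only a sketch of one widely used heuristic, the consensus motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The function name `find_g4_motifs` is hypothetical; C-rich matches are reported as candidates on the opposite strand.

```python
import re

# Consensus pattern G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ and its C-rich mirror,
# which marks a candidate G4 on the reverse strand.
G4_PLUS = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
G4_MINUS = re.compile(r"C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}")


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping
    putative G-quadruplex motifs; coordinates are 0-based, half-open."""
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in G4_PLUS.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in G4_MINUS.finditer(seq)]
    return sorted(hits)
```

For example, `find_g4_motifs("TTGGGAGGGTGGGAGGGTT")` returns `[(2, 17, "+")]`. A motif match only indicates potential to fold; it is not evidence that a quadruplex forms in vivo.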

7.

Hmga2 protein loss alters nuclear envelope and 3D chromatin structure.

Divisato, Giuseppina; Chiariello, Andrea M; Esposito, Andrea; Zoppoli, Pietro; Zambelli, Federico; Elia, Maria Antonietta; Pesole, Graziano; Incarnato, Danny; Passaro, Fabiana; Piscitelli, Silvia; Oliviero, Salvatore; Nicodemi, Mario; Parisi, Silvia; Russo, Tommaso.

BMC Biol ; 20(1): 171, 2022 08 02.

Article in English | MEDLINE | ID: mdl-35918713

ABSTRACT

BACKGROUND: The high-mobility group Hmga family of proteins are non-histone chromatin-interacting proteins which have been associated with a number of nuclear functions, including heterochromatin formation, replication, recombination, DNA repair, transcription, and formation of enhanceosomes. Due to its role based on dynamic interaction with chromatin, Hmga2 has a pathogenic role in diverse tumors and has been mainly studied in a cancer context; however, whether Hmga2 has similar physiological functions in normal cells remains less explored. Hmga2 was additionally shown to be required during the exit of embryonic stem cells (ESCs) from the ground state of pluripotency, to allow their transition into epiblast-like cells (EpiLCs), and here, we use that system to gain further understanding of normal Hmga2 function. RESULTS: We demonstrated that Hmga2 KO pluripotent stem cells fail to develop into EpiLCs. By using this experimental system, we studied the chromatin changes that take place upon the induction of EpiLCs and we observed that the loss of Hmga2 affects the histone mark H3K27me3, whose levels are higher in Hmga2 KO cells. Accordingly, a sustained expression of polycomb repressive complex 2 (PRC2), responsible for H3K27me3 deposition, was observed in KO cells. However, gene expression differences between differentiating wt vs Hmga2 KO cells did not show any significant enrichments of PRC2 targets. Similarly, endogenous Hmga2 association to chromatin in epiblast stem cells did not show any clear relationships with gene expression modification observed in Hmga2 KO. Hmga2 ChIP-seq confirmed that this protein preferentially binds to the chromatin regions associated with nuclear lamina. Starting from this observation, we demonstrated that nuclear lamina underwent severe alterations when Hmga2 KO or KD cells were induced to exit from the naïve state and this phenomenon is accompanied by a mislocalization of the heterochromatin mark H3K9me3 within the nucleus. 
As nuclear lamina (NL) is involved in the organization of 3D chromatin structure, we explored the possible effects of Hmga2 loss on this phenomenon. The analysis of Hi-C data in wt and Hmga2 KO cells allowed us to observe that inter-TAD (topologically associated domains) interactions in Hmga2 KO cells are different from those observed in wt cells. These differences clearly show a peculiar compartmentalization of inter-TAD interactions in chromatin regions associated or not to nuclear lamina. CONCLUSIONS: Overall, our results indicate that Hmga2 interacts with heterochromatic lamin-associated domains, and highlight a role for Hmga2 in the crosstalk between chromatin and nuclear lamina, affecting the establishment of inter-TAD interactions.

Subjects

Nuclear Envelope, Pluripotent Stem Cells, Chromatin/genetics, Chromatin/metabolism, HMGA2 Protein/genetics, HMGA2 Protein/metabolism, Heterochromatin/metabolism, Histones/genetics, Nuclear Envelope/metabolism, Pluripotent Stem Cells/metabolism, Polycomb Repressive Complex 2/genetics

8.

Whole-Exome and Transcriptome Sequencing Expands the Genotype of Majewski Osteodysplastic Primordial Dwarfism Type II.

Marzano, Flaviana; Chiara, Matteo; Consiglio, Arianna; D'Amato, Gabriele; Gentile, Mattia; Mirabelli, Valentina; Piane, Maria; Savio, Camilla; Fabiani, Marco; D'Elia, Domenica; Sbisà, Elisabetta; Scarano, Gioacchino; Lonardo, Fortunato; Tullo, Apollonia; Pesole, Graziano; Faienza, Maria Felicia.

Int J Mol Sci ; 24(15), 2023 Jul 31.

Article in English | MEDLINE | ID: mdl-37569667

ABSTRACT

Microcephalic Osteodysplastic Primordial Dwarfism type II (MOPDII) represents the most common form of primordial dwarfism. MOPD clinical features include severe prenatal and postnatal growth retardation, postnatal severe microcephaly, hypotonia, and an increased risk for cerebrovascular disease and insulin resistance. Autosomal recessive biallelic loss-of-function genomic variants in the centrosomal pericentrin (PCNT) gene on chromosome 21q22 cause MOPDII. Over the past decade, exome sequencing (ES) and massive RNA sequencing have been effectively employed both to discover novel disease genes and to expand the genotypes of well-known diseases. In this paper we report the results of both RNA sequencing and ES of three patients affected by MOPDII, with the aim of exploring whether differentially expressed genes and previously uncharacterized gene variants, in addition to PCNT pathogenic variants, could be associated with the complex phenotype of this disease. We discovered a downregulation of key factors involved in growth, such as IGF1R, IGF2R, and RAF1, in all three investigated patients. Moreover, ES identified a shortlist of genes associated with deleterious, rare variants in MOPDII patients. Our results suggest that Next Generation Sequencing (NGS) technologies can be successfully applied for the molecular characterization of the complex genotypic background of MOPDII.

Subjects

Dwarfism, Microcephaly, Osteochondrodysplasias, Humans, Female, Pregnancy, Microcephaly/genetics, Exome/genetics, Transcriptome, Fetal Growth Retardation/genetics, Dwarfism/genetics, Osteochondrodysplasias/genetics, Genotype, Mutation

9.

YAP contributes to DNA methylation remodeling upon mouse embryonic stem cell differentiation.

Passaro, Fabiana; De Martino, Ilaria; Zambelli, Federico; Di Benedetto, Giorgia; Barbato, Matteo; D'Erchia, Anna Maria; Manzari, Caterina; Pesole, Graziano; Mutarelli, Margherita; Cacchiarelli, Davide; Antonini, Dario; Parisi, Silvia; Russo, Tommaso.

J Biol Chem ; 296: 100138, 2021.

Article in English | MEDLINE | ID: mdl-33268382

ABSTRACT

The Yes-associated protein (YAP), one of the major effectors of the Hippo pathway together with its related protein WW-domain-containing transcription regulator 1 (WWTR1; also known as TAZ), mediates a range of cellular processes from proliferation and death to morphogenesis. YAP and TAZ regulate a large number of target genes, acting as coactivators of DNA-binding transcription factors or as negative regulators of transcription by interacting with the nucleosome remodeling and histone deacetylase complexes. YAP is expressed in self-renewing embryonic stem cells (ESCs), although it is still debated whether it plays any crucial roles in the control of either stemness or differentiation. Here we show that the transient downregulation of YAP in mouse ESCs perturbs cellular homeostasis, leading to the inability to differentiate properly. Bisulfite genomic sequencing revealed that this transient knockdown caused a genome-wide alteration of the DNA methylation remodeling that takes place during the early steps of differentiation, suggesting that the phenotype we observed might be due to the dysregulation of some of the mechanisms involved in regulation of ESC exit from pluripotency. By gene expression analysis, we identified two molecules that could have a role in the altered genome-wide methylation profile: the long noncoding RNA ephemeron, whose rapid upregulation is crucial for the transition of ESCs into epiblast, and the methyltransferase-like protein Dnmt3l, which, during embryo development, cooperates with Dnmt3a and Dnmt3b to contribute to the de novo DNA methylation that governs early steps of ESC differentiation. These data suggest a new role for YAP in the governance of the epigenetic dynamics of exit from pluripotency.

Subjects

Signal Transducing Adaptor Proteins/metabolism, Cell Differentiation, DNA (Cytosine-5-)-Methyltransferases/metabolism, DNA Methylation, Mouse Embryonic Stem Cells/cytology, Signal Transducing Adaptor Proteins/genetics, Animals, DNA (Cytosine-5-)-Methyltransferases/genetics, Mice, Mouse Embryonic Stem Cells/metabolism, Signal Transduction, YAP-Signaling Proteins, DNA Methyltransferase 3B

10.

Comparative Genomics Reveals Early Emergence and Biased Spatiotemporal Distribution of SARS-CoV-2.

Chiara, Matteo; Horner, David S; Gissi, Carmela; Pesole, Graziano.

Mol Biol Evol ; 38(6): 2547-2565, 2021 05 19.

Article in English | MEDLINE | ID: mdl-33605421

ABSTRACT

Effective systems for the analysis of molecular data are fundamental for monitoring the spread of infectious diseases and studying pathogen evolution. The rapid identification of emerging viral strains, and/or genetic variants potentially associated with novel phenotypic features is one of the most important objectives of genomic surveillance of human pathogens and represents one of the first lines of defense for the control of their spread. During the COVID-19 pandemic, several taxonomic frameworks have been proposed for the classification of SARS-CoV-2 isolates. These systems, which are typically based on phylogenetic approaches, represent essential tools for epidemiological studies as well as contributing to the study of the origin of the outbreak. Here, we propose an alternative, reproducible, and transparent phenetic method to study changes in SARS-CoV-2 genomic diversity over time. We suggest that our approach can complement other systems and facilitate the identification of biologically relevant variants in the viral genome. To demonstrate the validity of our approach, we present comparative genomic analyses of more than 175,000 genomes. Our method delineates 22 distinct SARS-CoV-2 haplogroups, which, based on the distribution of high-frequency genetic variants, fall into four major macrohaplogroups. We highlight biased spatiotemporal distributions of SARS-CoV-2 genetic profiles and show that seven of the 22 haplogroups (and all four haplogroup clusters) showed a broad geographic distribution within China by the time the outbreak was widely recognized, suggesting early emergence and widespread cryptic circulation of the virus well before its isolation in January 2020.
General patterns of genomic variability are remarkably similar within all major SARS-CoV-2 haplogroups, with UTRs consistently exhibiting the greatest variability, with s2m, a conserved secondary structure element of unknown function in the 3'-UTR of the viral genome showing evidence of a functional shift. Although several polymorphic sites that are specific to one or more haplogroups were predicted to be under positive or negative selection, overall our analyses suggest that the emergence of novel types is unlikely to be driven by convergent evolution and independent fixation of advantageous substitutions, or by selection of recombined strains. In the absence of extensive clinical metadata for most available genome sequences, and in the context of extensive geographic and temporal biases in the sampling, many questions regarding the evolution and clinical characteristics of SARS-CoV-2 isolates remain open. However, our data indicate that the approach outlined here can be usefully employed in the identification of candidate SARS-CoV-2 genetic variants of clinical and epidemiological importance.

Subjects

COVID-19/genetics, Molecular Evolution, Viral Genome, Genomics, Phylogeny, SARS-CoV-2/genetics, Humans

11.

Critical assessment of bioinformatics methods for the characterization of pathological repeat expansions with single-molecule sequencing data.

Chiara, Matteo; Zambelli, Federico; Picardi, Ernesto; Horner, David S; Pesole, Graziano.

Brief Bioinform ; 21(6): 1971-1986, 2020 12 01.

Article in English | MEDLINE | ID: mdl-31792498

ABSTRACT

A number of studies have reported the successful application of single-molecule sequencing technologies to the determination of the size and sequence of pathological expanded microsatellite repeats over the last 5 years. However, different custom bioinformatics pipelines were employed in each study, preventing meaningful comparisons and somewhat limiting the reproducibility of the results. In this review, we provide a brief summary of state-of-the-art methods for the characterization of expanded repeats alleles, along with a detailed comparison of bioinformatics tools for the determination of repeat length and sequence, using both real and simulated data. Our reanalysis of publicly available human genome sequencing data suggests a modest, but statistically significant, increase of the error rate of single-molecule sequencing technologies at genomic regions containing short tandem repeats. However, we observe that all the methods herein tested, irrespective of the strategy used for the analysis of the data (either based on the alignment or assembly of the reads), show high levels of sensitivity in both the detection of expanded tandem repeats and the estimation of the expansion size, suggesting that approaches based on single-molecule sequencing technologies are highly effective for the detection and quantification of tandem repeat expansions and contractions.

Subjects

Computational Biology, High-Throughput Nucleotide Sequencing, Microsatellite Repeats, Molecular Sequence Data, DNA Sequence Analysis, Alleles, Chromosome Mapping, Human Genome, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, Humans, Reproducibility of Results, DNA Sequence Analysis/methods
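
The review above compares tools that estimate repeat length from long reads using alignment or assembly. As a much simpler illustration of the quantity being estimated, and not a stand-in for any of the reviewed tools, the sketch below counts the longest uninterrupted run of a repeat unit in a single read by exact matching. The function name `longest_repeat_run` is hypothetical, and exact matching would undercount on real single-molecule reads, where sequencing errors interrupt the repeat.

```python
import re


def longest_repeat_run(read, motif):
    """Return (start, copies) for the longest uninterrupted run of motif
    in read, or (-1, 0) if the motif does not occur.

    start is 0-based; copies is the number of consecutive exact copies.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    pattern = re.compile("(?:" + re.escape(motif.upper()) + ")+")
    best = (-1, 0)
    for match in pattern.finditer(read.upper()):
        copies = (match.end() - match.start()) // len(motif)
        if copies > best[1]:  # keep the first run on ties
            best = (match.start(), copies)
    return best
```

For example, `longest_repeat_run("TTCAGCAGCAGCAGTTCAGCAGT", "CAG")` returns `(2, 4)`: the run of four CAG units starting at position 2 is longer than the later run of two.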

12.

CorGAT: a tool for the functional annotation of SARS-CoV-2 genomes.

Chiara, Matteo; Zambelli, Federico; Tangaro, Marco Antonio; Mandreoli, Pietro; Horner, David S; Pesole, Graziano.

Bioinformatics ; 36(22-23): 5522-5523, 2021 Apr 01.

Article in English | MEDLINE | ID: mdl-33346830

ABSTRACT

SUMMARY: While over 200 000 genomic sequences are currently available through dedicated repositories, ad hoc methods for the functional annotation of SARS-CoV-2 genomes do not harness all currently available resources for the annotation of functionally relevant genomic sites. Here, we present CorGAT, a novel tool for the functional annotation of SARS-CoV-2 genomic variants. By comparisons with other state-of-the-art methods we demonstrate that, by providing a more comprehensive and rich annotation, our method can facilitate the identification of evolutionary patterns in the genome of SARS-CoV-2. AVAILABILITY AND IMPLEMENTATION: Galaxy: http://corgat.cloud.ba.infn.it/galaxy; software: https://github.com/matteo14c/CorGAT/tree/Revision_V1; docker: https://hub.docker.com/r/laniakeacloud/galaxy_corgat. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

13.

VINYL: Variant prIoritizatioN bY survivaL analysis.

Chiara, Matteo; Mandreoli, Pietro; Tangaro, Marco Antonio; D'Erchia, Anna Maria; Sorrentino, Sandro; Forleo, Cinzia; Horner, David S; Zambelli, Federico; Pesole, Graziano.

Bioinformatics ; 36(24): 5590-5599, 2021 Apr 05.

Article in English | MEDLINE | ID: mdl-33367501

ABSTRACT

MOTIVATION: Clinical applications of genome re-sequencing technologies typically generate large amounts of data that need to be carefully annotated and interpreted to identify genetic variants potentially associated with pathological conditions. In this context, accurate and reproducible methods for the functional annotation and prioritization of genetic variants are of fundamental importance. RESULTS: In this article, we present VINYL, a flexible and fully automated system for the functional annotation and prioritization of genetic variants. Extensive analyses of both real and simulated datasets suggest that VINYL can identify clinically relevant genetic variants more accurately than equivalent state-of-the-art methods, allowing a more rapid and effective prioritization of genetic variants in different experimental settings. As such, we believe that VINYL can establish itself as a valuable tool to assist healthcare operators and researchers in clinical genomics investigations. AVAILABILITY AND IMPLEMENTATION: VINYL is available at http://beaconlab.it/VINYL and https://github.com/matteo14c/VINYL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

14.

ITSoneWB: profiling global taxonomic diversity of eukaryotic communities on Galaxy.

Tangaro, Marco A; Defazio, Giuseppe; Fosso, Bruno; Licciulli, Vito Flavio; Grillo, Giorgio; Donvito, Giacinto; Lavezzo, Enrico; Baruzzo, Giacomo; Pesole, Graziano; Santamaria, Monica.

Bioinformatics ; 37(22): 4253-4254, 2021 11 18.

Article in English | MEDLINE | ID: mdl-34117876

ABSTRACT

SUMMARY: ITSoneWB (ITSone WorkBench) is a Galaxy-based bioinformatic environment where comprehensive and high-quality reference data are connected with established pipelines and new tools in an automated and easy-to-use service targeted at global taxonomic analysis of eukaryotic communities based on Internal Transcribed Spacer 1 variants high-throughput sequencing. AVAILABILITY AND IMPLEMENTATION: ITSoneWB has been deployed on the INFN-Bari ReCaS cloud facility and is freely available on the web at http://itsonewb.cloud.ba.infn.it/galaxy. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Eukaryota, Software, Computational Biology, High-Throughput Nucleotide Sequencing, Data Accuracy

15.

Microbiome composition indicate dysbiosis and lower richness in tumor breast tissues compared to healthy adjacent paired tissue, within the same women.

Esposito, Maria Valeria; Fosso, Bruno; Nunziato, Marcella; Casaburi, Giorgio; D'Argenio, Valeria; Calabrese, Alessandra; D'Aiuto, Massimiliano; Botti, Gerardo; Pesole, Graziano; Salvatore, Francesco.

BMC Cancer ; 22(1): 30, 2022 Jan 03.

Article in English | MEDLINE | ID: mdl-34980006

ABSTRACT

BACKGROUND: Breast cancer (BC) is the most common malignancy in women, in whom it accounts for 20% of total neoplasia incidence. Most BCs are considered sporadic, and a number of factors, including family history, age, hormonal cycles and diet, have been reported to be BC risk factors. The gut microbiota also plays a role in breast cancer development. In fact, its imbalance has been associated with various human diseases, including cancer, although a cause-effect relationship has never been proven. METHODS: The aim of this work was to characterize the breast tissue microbiome in 34 women affected by BC using an NGS-based method, analyzing the tumoral and the adjacent non-tumoral tissue of each patient. RESULTS: The healthy and tumor tissues differed in bacterial composition and richness: the number of Amplicon Sequence Variants (ASVs) was higher in healthy tissues than in tumor tissues (p = 0.001). Moreover, our analyses, which resolved taxa from phylum down to species for each sample, revealed major differences in the two richest phyla, namely Proteobacteria and Actinobacteria. Notably, the levels of Actinobacteria and Proteobacteria were, respectively, higher and lower in healthy than in tumor tissues. CONCLUSIONS: Our study provides information about the breast tissue microbial composition, as compared with very closely adjacent healthy tissue (paired samples within the same woman); the differences found may have diagnostic and therapeutic implications; further studies are necessary to clarify whether the differences found in the breast tissue microbiome are simply an association or a contributing pathogenetic cause of BC. A comparison of results from similar studies does not appear to support a universal microbiome signature, but rather distinct signatures that depend on the location and environment of each cohort.

Subjects

Breast Neoplasms/microbiology , Breast/microbiology , Dysbiosis/microbiology , Gastrointestinal Microbiome/genetics , Adult , Biodiversity , Female , Humans , Middle Aged , 16S Ribosomal RNA/analysis

16.

Laniakea@ReCaS: exploring the potential of customisable Galaxy on-demand instances as a cloud-based service.

Tangaro, Marco Antonio; Mandreoli, Pietro; Chiara, Matteo; Donvito, Giacinto; Antonacci, Marica; Parisi, Antonio; Bianco, Angelica; Romano, Angelo; Bianchi, Daniela Manila; Cangelosi, Davide; Uva, Paolo; Molineris, Ivan; Nosi, Vladimir; Calogero, Raffaele A; Alessandri, Luca; Pedrini, Elena; Mordenti, Marina; Bonetti, Emanuele; Sangiorgi, Luca; Pesole, Graziano; Zambelli, Federico.

BMC Bioinformatics ; 22(Suppl 15): 544, 2021 Nov 08.

Article in English | MEDLINE | ID: mdl-34749633

ABSTRACT

BACKGROUND: Improving the availability and usability of data and analytical tools is a critical precondition for further advancing modern biological and biomedical research. For instance, one of the many ramifications of the COVID-19 global pandemic has been to make even more evident the importance of having bioinformatics tools and data readily actionable by researchers through convenient access points and supported by adequate IT infrastructures. One of the most successful efforts in improving the availability and usability of bioinformatics tools and data is represented by the Galaxy workflow manager and its thriving community. In 2020 we introduced Laniakea, a software platform conceived to streamline the configuration and deployment of "on-demand" Galaxy instances over the cloud. By facilitating the set-up and configuration of Galaxy web servers, Laniakea provides researchers with a powerful and highly customisable platform for executing complex bioinformatics analyses. The system can be accessed through a dedicated and user-friendly web interface that allows the Galaxy web server's initial configuration and deployment. RESULTS: "Laniakea@ReCaS", the first instance of a Laniakea-based service, is managed by ELIXIR-IT and was officially launched in February 2020, after about one year of development and testing that involved several users. Researchers can request access to Laniakea@ReCaS through an open-ended call for use-cases. Ten project proposals have been accepted since then, totalling 18 Galaxy on-demand virtual servers that employ ~ 100 CPUs, ~ 250 GB of RAM and ~ 5 TB of storage and serve several different communities and purposes. Herein, we present eight use cases demonstrating the versatility of the platform. 
CONCLUSIONS: During this first year of activity, the Laniakea-based service emerged as a flexible platform that facilitated the rapid development of bioinformatics tools, the efficient delivery of training activities, and the provision of public bioinformatics services in different settings, including food safety and clinical research. Laniakea@ReCaS provides a proof of concept of how enabling access to appropriate, reliable IT resources and ready-to-use bioinformatics tools can considerably streamline researchers' work.

Subjects

COVID-19 , Cloud Computing , Computational Biology , Humans , SARS-CoV-2 , Software

17.

Elucidating the editome: bioinformatics approaches for RNA editing detection.

Diroma, Maria Angela; Ciaccia, Loredana; Pesole, Graziano; Picardi, Ernesto.

Brief Bioinform ; 20(2): 436-447, 2019 03 22.

Article in English | MEDLINE | ID: mdl-29040360

ABSTRACT

RNA editing is a widespread co/posttranscriptional mechanism affecting primary RNAs by specific nucleotide modifications, which plays relevant roles in molecular processes including regulation of gene expression and/or the processing of noncoding RNAs. In recent years, the detection of editing sites has been improved through the availability of high-throughput RNA sequencing (RNA-Seq) technologies. Accurate bioinformatics pipelines are essential for the analysis of next-generation sequencing (NGS) data to ensure the correct identification of edited sites. Several pipelines, using various read mappers and variant callers with a wide range of adjustable parameters, are available for the detection of RNA editing events. In this review, we discuss some of the most recent and popular tools and provide guidelines for RNA-Seq data generation and analysis for the detection of RNA editing in massive transcriptome data. Using simulated and real data sets, we provide an overview of their behavior, emphasizing that RNA editing detection in NGS data sets remains a challenging task.

Subjects

Computational Biology/methods , Human Genome , RNA Editing , Transcriptome , High-Throughput Nucleotide Sequencing/methods , Humans , RNA Sequence Analysis/methods , Software

18.

HPC-REDItools: a novel HPC-aware tool for improved large scale RNA-editing analysis.

Flati, Tiziano; Gioiosa, Silvia; Spallanzani, Nicola; Tagliaferri, Ilario; Diroma, Maria Angela; Pesole, Graziano; Chillemi, Giovanni; Picardi, Ernesto; Castrignanò, Tiziana.

BMC Bioinformatics ; 21(Suppl 10): 353, 2020 Aug 21.

Article in English | MEDLINE | ID: mdl-32838738

ABSTRACT

BACKGROUND: RNA editing is a widespread co-/post-transcriptional mechanism that alters primary RNA sequences through the modification of specific nucleotides, and it can increase both transcriptome and proteome diversity. The automatic detection of RNA editing from RNA-seq data is computationally intensive and limited to small data sets, thus preventing a reliable genome-wide characterisation of this process. RESULTS: In this work we introduce HPC-REDItools, an upgraded tool for the accurate discovery of RNA-editing events from large dataset repositories. AVAILABILITY: https://github.com/BioinfoUNIBA/REDItools2 . CONCLUSIONS: HPC-REDItools is dramatically faster than the previous version, REDItools, enabling big-data analysis by means of an MPI-based implementation and scaling almost linearly with the number of available cores.

Subjects

Computing Methodologies , RNA Editing/genetics , Software , Algorithms , Base Sequence , Genome , Transcriptome/genetics

19.

ELIXIR-IT HPC@CINECA: high performance computing resources for the bioinformatics community.

Castrignanò, Tiziana; Gioiosa, Silvia; Flati, Tiziano; Cestari, Mirko; Picardi, Ernesto; Chiara, Matteo; Fratelli, Maddalena; Amente, Stefano; Cirilli, Marco; Tangaro, Marco Antonio; Chillemi, Giovanni; Pesole, Graziano; Zambelli, Federico.

BMC Bioinformatics ; 21(Suppl 10): 352, 2020 Aug 21.

Article in English | MEDLINE | ID: mdl-32838759

ABSTRACT

BACKGROUND: The advent of Next Generation Sequencing (NGS) technologies and the concomitant reduction in sequencing costs allow unprecedented high-throughput profiling of biological systems in a cost-efficient manner. Modern biological experiments are becoming increasingly both data and computationally intensive, and the wealth of publicly available biological data is introducing bioinformatics into the "Big Data" era. For these reasons, the value of effectively applying High Performance Computing (HPC) architectures is increasingly recognized by bioinformaticians as well. Here we describe pilot programs for provisioning HPC resources to bioinformaticians, run by the Italian Node of ELIXIR (ELIXIR-IT) in collaboration with CINECA, the main Italian supercomputing center. RESULTS: In April 2016, CINECA and ELIXIR-IT launched the pilot Call "ELIXIR-IT HPC@CINECA", offering streamlined access to HPC resources for bioinformatics. Resources are made available either through web front-ends to dedicated workflows developed at CINECA or by providing direct access to the High Performance Computing systems through a standard command-line interface tailored for bioinformatics data analysis. This offers the biomedical research community a production-scale environment, continuously updated with the latest versions of publicly available reference datasets and bioinformatic tools. Currently, 63 research projects have gained access to the HPC@CINECA program, for a total allocation of ~ 8 million CPU hours and, for data storage, ~ 100 TB of permanent and ~ 300 TB of temporary space. CONCLUSIONS: Three years after the beginning of the ELIXIR-IT HPC@CINECA program, we can appreciate its impact on the Italian bioinformatics community and draw some considerations. Several Italian researchers who applied to the program have gained access to one of the top-ranking public scientific supercomputing facilities in Europe. Those investigators had the opportunity to significantly reduce computational turnaround times in their research projects and to process massive amounts of data, pursuing research approaches that would otherwise have been difficult or impossible to undertake. Moreover, by taking advantage of the wealth of documentation and training material provided by CINECA, participants had the opportunity to improve their skills in the use of HPC systems and to be better positioned to apply to similar EU programs of greater scale, such as PRACE. To illustrate the effective usage and impact of the resources awarded by the program in different research applications, we report five successful use cases whose findings have already been published in peer-reviewed journals.

Subjects

Computational Biology , Computing Methodologies , Software , Algorithms , Animals , Cell Line , Genetic Databases , Gene Fusion , Genome , Humans , Prunus persica/genetics , RNA Editing , Swallows/genetics

20.

RNentropy: an entropy-based tool for the detection of significant variation of gene expression across multiple RNA-Seq experiments.

Zambelli, Federico; Mastropasqua, Francesca; Picardi, Ernesto; D'Erchia, Anna Maria; Pesole, Graziano; Pavesi, Giulio.

Nucleic Acids Res ; 46(8): e46, 2018 05 04.

Article in English | MEDLINE | ID: mdl-29390085

ABSTRACT

RNA sequencing (RNA-Seq) has become the experimental standard in transcriptome studies. While most of the bioinformatic pipelines for the analysis of RNA-Seq data and the identification of significant changes in transcript abundance are based on the comparison of two conditions, it is common practice to perform several experiments in parallel (e.g. from different individuals, developmental stages, tissues), for the identification of genes showing a significant variation of expression across all the conditions studied. In this work we present RNentropy, a methodology based on information theory devised for this task, which, given expression estimates from any number of RNA-Seq samples and conditions, identifies genes or transcripts with a significant variation of expression across all the conditions studied, together with the samples in which they are over- or under-expressed. To show the capabilities offered by our methodology, we applied it to different RNA-Seq datasets: 48 biological replicates of two different yeast conditions; samples extracted from six human tissues of three individuals; seven different mouse brain cell types; human liver samples from six individuals. The results, and their comparison with those of different state-of-the-art bioinformatic methods, show that RNentropy can provide a quick and in-depth analysis of significant changes in gene expression profiles over any number of conditions.

Subjects

Gene Expression Profiling/statistics & numerical data , RNA Sequence Analysis/statistics & numerical data , Software , Animals , Brain/metabolism , Computational Biology/methods , Nucleic Acid Databases/statistics & numerical data , Fungal Genes , Genetic Markers , Humans , Liver/metabolism , Male , Mice , Mutation , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Spatio-Temporal Analysis
