Search | VHL Regional Portal

1.

Overview of data preprocessing for machine learning applications in human microbiome research.

Ibrahimi, Eliana; Lopes, Marta B; Dhamo, Xhilda; Simeon, Andrea; Shigdel, Rajesh; Hron, Karel; Stres, Blaz; D'Elia, Domenica; Berland, Magali; Marcos-Zambrano, Laura Judith.

Front Microbiol ; 14: 1250909, 2023.

Article in English | MEDLINE | ID: mdl-37869650

ABSTRACT

Although metagenomic sequencing is now the preferred technique to study microbiome-host interactions, analyzing and interpreting microbiome sequencing data presents challenges primarily attributed to the statistical specificities of the data (e.g., sparse, over-dispersed, compositional, inter-variable dependency). This mini review explores preprocessing and transformation methods applied in recent human microbiome studies to address microbiome data analysis challenges. Our results indicate a limited adoption of transformation methods targeting the statistical characteristics of microbiome sequencing data. Instead, there is a prevalent usage of relative and normalization-based transformations that do not specifically account for the specific attributes of microbiome data. The information on preprocessing and transformations applied to the data before analysis was incomplete or missing in many publications, leading to reproducibility concerns, comparability issues, and questionable results. We hope this mini review will provide researchers and newcomers to the field of human microbiome research with an up-to-date point of reference for various data transformation tools and assist them in choosing the most suitable transformation method based on their research questions, objectives, and data characteristics.
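The contrast drawn in the abstract above, between relative-abundance scaling and transformations that target compositionality, can be illustrated with a small sketch. The abstract does not name a specific method; the centered log-ratio (CLR) transform is used here only as one well-known compositional example, and the pseudocount of 0.5 for zero counts and the toy count vector are assumptions made for illustration.

```python
import math


def relative_abundance(counts):
    """Scale one sample's taxon counts so that they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount is an assumed, simple way to handle zeros (sparsity);
    other zero-replacement strategies exist.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals dividing by the geometric mean.
    return [value - mean_log for value in logs]


# Hypothetical counts for four taxa in a single sample.
sample = [120, 0, 30, 850]
print(relative_abundance(sample))
print(clr(sample))
```

CLR values sum to zero within a sample and, when no pseudocount is added, do not change if every count is multiplied by the same factor, which is why the transform is often described as respecting the compositional nature of sequencing data.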

2.

Advancing microbiome research with machine learning: key findings from the ML4Microbiome COST action.

D'Elia, Domenica; Truu, Jaak; Lahti, Leo; Berland, Magali; Papoutsoglou, Georgios; Ceci, Michelangelo; Zomer, Aldert; Lopes, Marta B; Ibrahimi, Eliana; Gruca, Aleksandra; Nechyporenko, Alina; Frohme, Marcus; Klammsteiner, Thomas; Pau, Enrique Carrillo-de Santa; Marcos-Zambrano, Laura Judith; Hron, Karel; Pio, Gianvito; Simeon, Andrea; Suharoschi, Ramona; Moreno-Indias, Isabel; Temko, Andriy; Nedyalkova, Miroslava; Apostol, Elena-Simona; Truica, Ciprian-Octavian; Shigdel, Rajesh; Telalovic, Jasminka Hasic; Bongcam-Rudloff, Erik; Przymus, Piotr; Jordamovic, Naida Babic; Falquet, Laurent; Tarazona, Sonia; Sampri, Alexia; Isola, Gaetano; Pérez-Serrano, David; Trajkovik, Vladimir; Klucar, Lubos; Loncar-Turukalo, Tatjana; Havulinna, Aki S; Jansen, Christian; Bertelsen, Randi J; Claesson, Marcus Joakim.

Front Microbiol ; 14: 1257002, 2023.

Article in English | MEDLINE | ID: mdl-37808321

ABSTRACT

The rapid development of machine learning (ML) techniques has opened up the data-dense field of microbiome research for novel therapeutic, diagnostic, and prognostic applications targeting a wide range of disorders, which could substantially improve healthcare practices in the era of precision medicine. However, several challenges must be addressed to exploit the benefits of ML in this field fully. In particular, there is a need to establish "gold standard" protocols for conducting ML analysis experiments and improve interactions between microbiome researchers and ML experts. The Machine Learning Techniques in Human Microbiome Studies (ML4Microbiome) COST Action CA18131 is a European network established in 2019 to promote collaboration between discovery-oriented microbiome researchers and data-driven ML experts to optimize and standardize ML approaches for microbiome analysis. This perspective paper presents the key achievements of ML4Microbiome, which include identifying predictive and discriminatory 'omics' features, improving repeatability and comparability, developing automation procedures, and defining priority areas for the novel development of ML methods targeting the microbiome. The insights gained from ML4Microbiome will help to maximize the potential of ML in microbiome research and pave the way for new and improved healthcare practices.

3.

Whole-Exome and Transcriptome Sequencing Expands the Genotype of Majewski Osteodysplastic Primordial Dwarfism Type II.

Marzano, Flaviana; Chiara, Matteo; Consiglio, Arianna; D'Amato, Gabriele; Gentile, Mattia; Mirabelli, Valentina; Piane, Maria; Savio, Camilla; Fabiani, Marco; D'Elia, Domenica; Sbisà, Elisabetta; Scarano, Gioacchino; Lonardo, Fortunato; Tullo, Apollonia; Pesole, Graziano; Faienza, Maria Felicia.

Int J Mol Sci ; 24(15), 2023 Jul 31.

Article in English | MEDLINE | ID: mdl-37569667

ABSTRACT

Microcephalic Osteodysplastic Primordial Dwarfism type II (MOPDII) represents the most common form of primordial dwarfism. MOPD clinical features include severe prenatal and postnatal growth retardation, postnatal severe microcephaly, hypotonia, and an increased risk for cerebrovascular disease and insulin resistance. Autosomal recessive biallelic loss-of-function genomic variants in the centrosomal pericentrin (PCNT) gene on chromosome 21q22 cause MOPDII. Over the past decade, exome sequencing (ES) and massive RNA sequencing have been effectively employed both to discover novel disease genes and to expand the genotypes of well-known diseases. In this paper, we report the results of both RNA sequencing and ES of three patients affected by MOPDII, with the aim of exploring whether differentially expressed genes and previously uncharacterized gene variants, in addition to PCNT pathogenic variants, could be associated with the complex phenotype of this disease. We discovered a downregulation of key factors involved in growth, such as IGF1R, IGF2R, and RAF1, in all three investigated patients. Moreover, ES identified a shortlist of genes associated with deleterious, rare variants in MOPDII patients. Our results suggest that Next Generation Sequencing (NGS) technologies can be successfully applied for the molecular characterization of the complex genotypic background of MOPDII.

Subjects

Dwarfism , Microcephaly , Osteochondrodysplasias , Humans , Female , Pregnancy , Microcephaly/genetics , Exome/genetics , Transcriptome , Fetal Growth Retardation/genetics , Dwarfism/genetics , Osteochondrodysplasias/genetics , Genotype , Mutation

4.

Ten simple rules on how to develop a stakeholder engagement plan.

Hollmann, Susanne; Regierer, Babette; Bechis, Jaele; Tobin, Lesley; D'Elia, Domenica.

PLoS Comput Biol ; 18(10): e1010520, 2022 10.

Article in English | MEDLINE | ID: mdl-36227852

ABSTRACT

To make research responsible and research outcomes meaningful, it is necessary to communicate our research and to involve as many relevant stakeholders as possible, especially in application-oriented research, including information and communications technology (ICT) research. Nowadays, stakeholder engagement is of fundamental importance to project success and achieving the expected impact and is often mandatory in a third-party funding context. Ultimately, research and development can only be successful if people react positively to the results and benefits generated by a project. For the wider acceptance of research outcomes, it is therefore essential that the public is made aware of and has an opportunity to discuss the results of research undertaken through two-way communication (interpersonal communication) with researchers. Responsible Research and Innovation (RRI), an approach that anticipates and assesses potential implications and societal expectations regarding research and innovation, aims to foster inclusive and sustainable research and innovation. Research and innovation processes need to become more responsive and adaptive to these grand challenges. This implies, among other things, the introduction of broader foresight and impact assessments for new technologies beyond their anticipated market benefits and risks. Therefore, this article provides a structured workflow that explains "how to develop a stakeholder engagement plan" step by step.

Subjects

Communication , Stakeholder Participation , Humans , Research Personnel

5.

Statistical and Machine Learning Techniques in Human Microbiome Studies: Contemporary Challenges and Solutions.

Moreno-Indias, Isabel; Lahti, Leo; Nedyalkova, Miroslava; Elbere, Ilze; Roshchupkin, Gennady; Adilovic, Muhamed; Aydemir, Onder; Bakir-Gungor, Burcu; Santa Pau, Enrique Carrillo-de; D'Elia, Domenica; Desai, Mahesh S; Falquet, Laurent; Gundogdu, Aycan; Hron, Karel; Klammsteiner, Thomas; Lopes, Marta B; Marcos-Zambrano, Laura Judith; Marques, Cláudia; Mason, Michael; May, Patrick; Pasic, Lejla; Pio, Gianvito; Pongor, Sándor; Promponas, Vasilis J; Przymus, Piotr; Saez-Rodriguez, Julio; Sampri, Alexia; Shigdel, Rajesh; Stres, Blaz; Suharoschi, Ramona; Truu, Jaak; Truica, Ciprian-Octavian; Vilne, Baiba; Vlachakis, Dimitrios; Yilmaz, Ercument; Zeller, Georg; Zomer, Aldert L; Gómez-Cabrero, David; Claesson, Marcus J.

Front Microbiol ; 12: 635781, 2021.

Article in English | MEDLINE | ID: mdl-33692771

ABSTRACT

The human microbiome has emerged as a central research topic in human biology and biomedicine. Current microbiome studies generate high-throughput omics data across different body sites, populations, and life stages. While many of the challenges in microbiome research are similar to those of other high-throughput studies, the quantitative analyses need to address the heterogeneity of data, specific statistical properties, and the remarkable variation in microbiome composition across individuals and body sites. This has led to a broad spectrum of statistical and machine learning challenges that range from study design, data processing, and standardization to analysis, modeling, cross-study comparison, prediction, data science ecosystems, and reproducible reporting. Nevertheless, although many statistics and machine learning approaches and tools have been developed, new techniques are needed to deal with emerging applications and the vast heterogeneity of microbiome data. We review and discuss emerging applications of statistical and machine learning techniques in human microbiome studies and introduce the COST Action CA18131 "ML4Microbiome" that brings together microbiome researchers and machine learning experts to address current challenges such as standardization of analysis pipelines for reproducibility of data analysis results, benchmarking, improvement, or development of existing and new tools and ontologies.

6.

Plant miRNAs Reduce Cancer Cell Proliferation by Targeting MALAT1 and NEAT1: A Beneficial Cross-Kingdom Interaction.

Marzano, Flaviana; Caratozzolo, Mariano Francesco; Consiglio, Arianna; Licciulli, Flavio; Liuni, Sabino; Sbisà, Elisabetta; D'Elia, Domenica; Tullo, Apollonia; Catalano, Domenico.

Front Genet ; 11: 552490, 2020.

Article in English | MEDLINE | ID: mdl-33193626

ABSTRACT

MicroRNAs (miRNAs) are ubiquitous regulators of gene expression, evolutionarily conserved in plants and mammals. In recent years, although a growing number of papers debate the role of plant miRNAs on human gene expression, the molecular mechanisms through which this effect is achieved are still not completely elucidated. Some evidence suggests that this interaction might be sequence specific, and in this work, we investigated this possibility by transcriptomic and bioinformatics approaches. Plant and human miRNA sequences from primary databases were collected and compared for their similarities (global or local alignments). Out of 2,588 human miRNAs, 1,606 showed a perfect match of their seed sequence with the 5' end of 3,172 plant miRNAs. Further selections were applied based on the role of the human target genes or of the miRNA in cell cycle regulation (as an oncogene, tumor suppressor, or a biomarker for prognosis or diagnosis in cancer). Based on these criteria, 20 human miRNAs were selected as potential functional analogs of 7 plant miRNAs, which were in turn transfected in different cell lines to evaluate their effect on cell proliferation. A significant decrease was observed in the colorectal carcinoma HCT116 cell line. RNA-Seq demonstrated that 446 genes were differentially expressed 72 h after transfection. Notably, we demonstrated that the plant mtr-miR-5754 and gma-miR4995 directly target the tumor-associated long non-coding RNA metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) and nuclear paraspeckle assembly transcript 1 (NEAT1) in a sequence-specific manner. In conclusion, according to other recent discoveries, our study strengthens and expands the hypothesis that plant miRNAs can have a regulatory effect in mammals by targeting both protein-coding and non-coding RNA, thus suggesting new biotechnological applications.
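The seed-matching step described in the abstract above can be sketched roughly as follows. The abstract does not give the exact matching rule, so this sketch assumes the common convention that the seed spans nucleotides 2 to 8 from the 5' end and treats two miRNAs as matching when those positions are identical. The function names and the three sequences are made up for illustration and are not taken from any database.

```python
def seed(mirna, start=2, end=8):
    """Return the assumed seed region (1-based positions start..end, inclusive)."""
    rna = mirna.upper().replace("T", "U")  # accept DNA-style input as well
    return rna[start - 1:end]


def shares_seed(human_mirna, plant_mirna):
    """True when both sequences carry the same seed at their 5' end."""
    return seed(human_mirna) == seed(plant_mirna)


# Toy sequences, invented for this example only.
human_toy = "AACGGUACUUGCAUGCAUGCAA"
plant_toy_match = "UACGGUACGGAUCCGAUCGAUU"
plant_toy_other = "UUUGGUACGGAUCCGAUCGAUU"

print(seed(human_toy))
print(shares_seed(human_toy, plant_toy_match))
print(shares_seed(human_toy, plant_toy_other))
```

A screen across full miRNA collections would apply the same comparison to every human and plant pair, or index the plant sequences by seed first to avoid the quadratic loop.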

7.

Ten simple rules on how to write a standard operating procedure.

Hollmann, Susanne; Frohme, Marcus; Endrullat, Christoph; Kremer, Andreas; D'Elia, Domenica; Regierer, Babette; Nechyporenko, Alina.

PLoS Comput Biol ; 16(9): e1008095, 2020 09.

Article in English | MEDLINE | ID: mdl-32881868

ABSTRACT

Research publications and data nowadays should be publicly available on the internet and, theoretically, usable for everyone to develop further research, products, or services. The long-term accessibility of research data is, therefore, fundamental in the economy of the research production process. However, the availability of data is not sufficient by itself; their quality must also be verifiable. Measures to ensure reuse and reproducibility need to include the entire research life cycle, from the experimental design to the generation of data, quality control, statistical analysis, interpretation, and validation of the results. Hence, high-quality records, particularly for providing a string of documents for the verifiable origin of data, are essential elements that can act as a certificate for potential users (customers). These records also improve the traceability and transparency of data and processes, thereby improving the reliability of results. Standards for data acquisition, analysis, and documentation have been fostered in the last decade, driven by grassroots initiatives of researchers and organizations such as the Research Data Alliance (RDA). Nevertheless, what is still largely missing in life science academic research are agreed procedures for complex routine research workflows. Here, well-crafted documentation such as standard operating procedures (SOPs) offers clear direction and instructions specifically designed to avoid deviations, an absolute necessity for reproducibility. Therefore, this paper provides a standardized workflow that explains step by step how to write an SOP to be used as a starting point for appropriate research documentation.

Subjects

Methods , Records , Writing/standards , Documentation , Humans , Reproducibility of Results , Research Design/standards , Workflow

8.

Microarray data and pathway analyses of peripheral blood mononuclear cells from healthy subjects after a three weeks grape-rich diet.

Milella, Rosa Anna; Gasparro, Marica; Alagna, Fiammetta; Cardone, Maria Francesca; Rotunno, Silvia; Ammollo, Concetta Tiziana; Semeraro, Fabrizio; Tullo, Apollonia; Marzano, Flaviana; Catalano, Domenico; Antonacci, Donato; Colucci, Mario; D'Elia, Domenica.

Data Brief ; 29: 105278, 2020 Apr.

Article in English | MEDLINE | ID: mdl-32123709

ABSTRACT

Using Human Gene Expression Microarrays (Agilent) technologies, we investigated changes of the level of gene expression in peripheral blood mononuclear cells of healthy subjects after 21 days of fresh table grape-rich diet and after an additional 28-day washout. Several hundreds of genes were differentially expressed after grape intake or after washout. The functional analysis of these genes detected significant changes in key processes such as inflammation and immunity, thrombosis, DNA and protein repair, autophagy and mitochondrial biogenesis. Moreover, fresh grape intake was found to influence the expression of many long non-coding RNA genes. The data can be valuable for researchers interested in nutrigenetics and nutrigenomics studies and are related to the research article "Gene expression signature induced by grape intake in healthy subjects reveals wide-spread beneficial effects on PBMCs" [1].

9.

Prediction of new associations between ncRNAs and diseases exploiting multi-type hierarchical clustering.

Barracchia, Emanuele Pio; Pio, Gianvito; D'Elia, Domenica; Ceci, Michelangelo.

BMC Bioinformatics ; 21(1): 70, 2020 Feb 24.

Article in English | MEDLINE | ID: mdl-32093606

ABSTRACT

BACKGROUND: The study of functional associations between ncRNAs and human diseases is a pivotal task of modern research to develop new and more effective therapeutic approaches. Nevertheless, it is not a trivial task since it involves entities of different types, such as microRNAs, lncRNAs or target genes whose expression also depends on endogenous or exogenous factors. Such a complexity can be faced by representing the involved biological entities and their relationships as a network and by exploiting network-based computational approaches able to identify new associations. However, existing methods are limited to homogeneous networks (i.e., consisting of only one type of objects and relationships) or can exploit only a small subset of the features of biological entities, such as the presence of a particular binding domain, enzymatic properties or their involvement in specific diseases. RESULTS: To overcome the limitations of existing approaches, we propose the system LP-HCLUS, which exploits a multi-type hierarchical clustering method to predict possibly unknown ncRNA-disease relationships. In particular, LP-HCLUS analyzes heterogeneous networks consisting of several types of objects and relationships, each possibly described by a set of features, and extracts multi-type clusters that are subsequently exploited to predict new ncRNA-disease associations. The extracted clusters are overlapping, hierarchically organized, involve entities of different types, and allow LP-HCLUS to catch multiple roles of ncRNAs in diseases at different levels of granularity. Our experimental evaluation, performed on heterogeneous attributed networks consisting of microRNAs, lncRNAs, diseases, genes and their known relationships, shows that LP-HCLUS is able to obtain better results with respect to existing approaches. 
The biological relevance of the obtained results was evaluated according to both quantitative (i.e., TPR@k, Areas Under the TPR@k, ROC and Precision-Recall curves) and qualitative (i.e., according to the consultation of the existing literature) criteria. CONCLUSIONS: The obtained results prove the utility of LP-HCLUS to conduct robust predictive studies on the biological role of ncRNAs in human diseases. The produced predictions can therefore be reliably considered as new, previously unknown, relationships among ncRNAs and diseases.

Subjects

Disease/genetics , MicroRNAs/metabolism , RNA, Long Noncoding/metabolism , Cluster Analysis , Humans , RNA, Untranslated/metabolism
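The TPR@k measure mentioned in the abstract of this entry can be illustrated with a short sketch. It assumes the usual reading of the metric: predicted associations are ranked by score, and TPR@k is the fraction of all known positive associations that appear among the top k predictions. The exact definition used by the authors may differ, and the ncRNA and disease identifiers below are hypothetical.

```python
def tpr_at_k(scores, positives, k):
    """Fraction of known positive pairs found among the k highest-scored pairs."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits = sum(1 for pair in ranked[:k] if pair in positives)
    return hits / len(positives)


def tpr_curve(scores, positives):
    """TPR@k for every k from 1 to the number of scored pairs."""
    return [tpr_at_k(scores, positives, k) for k in range(1, len(scores) + 1)]


# Hypothetical predicted (ncRNA, disease) pairs with made-up scores.
scores = {
    ("miR-a", "D1"): 0.9,
    ("miR-b", "D1"): 0.8,
    ("lnc-x", "D2"): 0.4,
    ("miR-a", "D2"): 0.2,
}
# Hypothetical set of associations treated as known (held-out) positives.
positives = {("miR-a", "D1"), ("lnc-x", "D2")}

curve = tpr_curve(scores, positives)
print(curve)
print(sum(curve) / len(curve))  # a simple normalized area under the TPR@k curve
```

Averaging the curve over k is one straightforward way to summarize it in a single number; it rewards methods that place the known associations near the top of the ranking.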

10.

The need for standardisation in life science research - an approach to excellence and trust.

Hollmann, Susanne; Kremer, Andreas; Baebler, Spela; Trefois, Christophe; Gruden, Kristina; Rudnicki, Witold R; Tong, Weida; Gruca, Aleksandra; Bongcam-Rudloff, Erik; Evelo, Chris T; Nechyporenko, Alina; Frohme, Marcus; Safránek, David; Regierer, Babette; D'Elia, Domenica.

F1000Res ; 9: 1398, 2020.

Article in English | MEDLINE | ID: mdl-33604028

ABSTRACT

Today, academic researchers benefit from the changes driven by digital technologies and the enormous growth of knowledge and data, as well as from globalisation, the enlargement of the scientific community, and the linkage between different scientific communities and society. To fully benefit from this development, however, information needs to be shared openly and transparently. Digitalisation plays a major role here because it permeates all areas of business, science and society and is one of the key drivers for innovation and international cooperation. To address the resulting opportunities, the EU promotes the development and use of collaborative ways to produce and share knowledge and data as early as possible in the research process, but also to appropriately secure results with the European strategy for Open Science (OS). It is now widely recognised that making research results more accessible to all societal actors contributes to more effective and efficient science; it also serves as a boost for innovation in the public and private sectors. However, for research data to be findable, accessible, interoperable and reusable, the use of standards is essential. At the metadata level, considerable efforts in standardisation have already been made (e.g., Data Management Plans and the FAIR Principles), whereas for the raw data these fundamental efforts are still fragmented and in some cases completely missing. The CHARME consortium, funded by the European Cooperation in Science and Technology (COST) Agency, has identified needs and gaps in the field of standardisation in the life sciences and also discussed potential hurdles for implementation of standards in current practice. Here, the authors suggest four measures in response to current challenges to ensure a high quality of life science research data and their re-usability for research and innovation.

Subjects

Biological Science Disciplines , Trust , International Cooperation , Metadata , Quality of Life

11.

Exploiting transfer learning for the reconstruction of the human gene regulatory network.

Mignone, Paolo; Pio, Gianvito; D'Elia, Domenica; Ceci, Michelangelo.

Bioinformatics ; 36(5): 1553-1561, 2020 03 01.

Article in English | MEDLINE | ID: mdl-31608946

ABSTRACT

MOTIVATION: The reconstruction of gene regulatory networks (GRNs) from gene expression data has received increasing attention in recent years, due to its usefulness in the understanding of regulatory mechanisms involved in human diseases. Most of the existing methods reconstruct the network through machine learning approaches, by analyzing known examples of interactions. However, (i) they often produce poor results when the amount of labeled examples is limited, or when no negative example is available and (ii) they are not able to exploit information extracted from GRNs of other (better studied) related organisms, when this information is available. RESULTS: In this paper, we propose a novel machine learning method that overcomes these limitations, by exploiting the knowledge about the GRN of a source organism for the reconstruction of the GRN of the target organism, by means of a novel transfer learning technique. Moreover, the proposed method is natively able to work in the positive-unlabeled setting, where no negative example is available, by fruitfully exploiting a (possibly large) set of unlabeled examples. In our experiments, we reconstructed the human GRN, by exploiting the knowledge of the GRN of Mus musculus. Results showed that the proposed method outperforms state-of-the-art approaches and identifies previously unknown functional relationships among the analyzed genes. AVAILABILITY AND IMPLEMENTATION: http://www.di.uniba.it/~mignone/systems/biosfer/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Algorithms , Gene Regulatory Networks , Animals , Computational Biology , Gene Expression , Gene Expression Profiling , Humans , Machine Learning , Mice

12.

NOTCH3 and CADASIL syndrome: a genetic and structural overview.

Papakonstantinou, Eleni; Bacopoulou, Flora; Brouzas, Dimitrios; Megalooikonomou, Vasileios; D'Elia, Domenica; Bongcam-Rudloff, Erik; Vlachakis, Dimitrios.

EMBnet J ; 24, 2019.

Article in English | MEDLINE | ID: mdl-31218211

ABSTRACT

CADASIL syndrome is a rare disease that belongs to a group of disorders called leukodystrophies. It is well established that NOTCH3 gene on chromosome 19 is primarily responsible for the development of the CADASIL syndrome. Herein, an attempt is made to shed light on the actual molecular mechanism underlying CADASIL syndrome, through insights extracted from comprehensive evolutionary studies and in silico modelling on Notch 3 protein. In particular, we suggest the use of optical coherence tomography angiography for the detection of early signs of small vessel diseases, which are the major precursors to a repertoire of neurodegenerative conditions, including CADASIL.

13.

Arena-Idb: a platform to build human non-coding RNA interaction networks.

Bonnici, Vincenzo; Caro, Giorgio De; Constantino, Giorgio; Liuni, Sabino; D'Elia, Domenica; Bombieri, Nicola; Licciulli, Flavio; Giugno, Rosalba.

BMC Bioinformatics ; 19(Suppl 10): 350, 2018 Oct 15.

Article in English | MEDLINE | ID: mdl-30367585

ABSTRACT

BACKGROUND: High-throughput technologies have provided the scientific community an unprecedented opportunity for large-scale analysis of genomes. Non-coding RNAs (ncRNAs), for a long time believed to be non-functional, are emerging as one of the most important and largest families of gene regulators and key elements for genome maintenance. Functional studies have been able to assign to ncRNAs a wide spectrum of functions in primary biological processes, and for this reason they are assuming a growing importance as a potential new family of cancer therapeutic targets. Nevertheless, the number of functionally characterized ncRNAs is still small compared to the number of newly discovered ncRNAs. Thus, platforms able to merge information from available resources and address data integration issues are necessary but still insufficient to elucidate the biological roles of ncRNAs. RESULTS: In this paper, we describe a platform called Arena-Idb for the retrieval of comprehensive and non-redundant annotated ncRNA interactions. Arena-Idb provides a framework for network reconstruction of ncRNA heterogeneous interactions (i.e., with other types of molecules) and relationships with human diseases, which guides the integration of data, extracted from different sources, via mapping of entities and minimization of ambiguity. CONCLUSIONS: Arena-Idb provides a schema and a visualization system to integrate ncRNA interactions that assists in discovering ncRNA functions through the extraction of heterogeneous interaction networks. Arena-Idb is available at http://arenaidb.ba.itb.cnr.it.

Subjects

Gene Regulatory Networks , RNA, Untranslated/genetics , Software , Databases, Genetic , Humans , User-Computer Interface

14.

The joint NETTAB/Integrative Bioinformatics 2015 Meeting: aims, topics and outcomes.

Romano, Paolo; Hofestädt, Ralf; Lange, Matthias; D'Elia, Domenica.

BMC Bioinformatics ; 18(Suppl 5): 101, 2017 Mar 23.

Article in English | MEDLINE | ID: mdl-28361713

ABSTRACT

The 15th International NETTAB workshop and the 11th Integrative Bioinformatics Symposium were held together in Bari, on October 14-16, 2015, as the Joint NETTAB/IB 2015 Meeting. A special topic for the meeting was "Bioinformatics for ncRNA", but the traditional topics of both meeting series were also included in the event. About 60 scientific contributions were presented, including six keynote lectures, one special guest lecture, and many oral communications and posters. A "Two-Day Hands-on Tutorial" event was organised before the workshop. Selected full papers from some of the best works presented in Bari were submitted either to the Journal of Integrative Bioinformatics or to a dedicated call for a Supplement of BMC Bioinformatics. Here, we provide an overview of meeting aims and scope. We also briefly introduce selected papers that have been either accepted for publication in this Supplement or published in the Journal of Integrative Bioinformatics, for a more complete presentation of the outcomes of the meeting.

Subjects

Computational Biology/methods , RNA, Untranslated , Animals , Humans

15.

ComiRNet: a web-based system for the analysis of miRNA-gene regulatory networks.

Pio, Gianvito; Ceci, Michelangelo; Malerba, Donato; D'Elia, Domenica.

BMC Bioinformatics ; 16 Suppl 9: S7, 2015.

Article in English | MEDLINE | ID: mdl-26051695

ABSTRACT

BACKGROUND: The understanding of mechanisms and functions of microRNAs (miRNAs) is fundamental for the study of many biological processes and for the elucidation of the pathogenesis of many human diseases. Technological advances represented by high-throughput technologies, such as microarray and next-generation sequencing, have significantly aided miRNA research in the last decade. Nevertheless, the identification of true miRNA targets and the complete elucidation of the rules governing their functional targeting remain nebulous. Computational tools have been proven to be fundamental for guiding experimental validations for the discovery of new miRNAs, for the identification of their targets and for the elucidation of their regulatory mechanisms. DESCRIPTION: ComiRNet (Co-clustered miRNA Regulatory Networks) is a web-based database specifically designed to provide biologists and clinicians with user-friendly and effective tools for the study of miRNA-gene target interaction data and for the discovery of miRNA functions and mechanisms. Data in ComiRNet are produced by a combined computational approach based on: 1) a semi-supervised ensemble-based classifier, which learns to combine miRNA-gene target interactions (MTIs) from several prediction algorithms, and 2) the biclustering algorithm HOCCLUS2, which exploits the large set of produced predictions, with the associated probabilities, to identify overlapping and hierarchically organized biclusters that represent miRNA-gene regulatory networks (MGRNs). CONCLUSIONS: ComiRNet represents a valuable resource for elucidating the miRNAs' role in complex biological processes by exploiting data on their putative function in the context of MGRNs. ComiRNet currently stores about 5 million predicted MTIs between 934 human miRNAs and 30,875 mRNAs, as well as 15 bicluster hierarchies, each of which represents MGRNs at different levels of granularity. The database can be freely accessed at: http://comirnet.di.uniba.it.

Subjects

Algorithms , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/methods , Internet , MicroRNAs/genetics , RNA, Messenger/genetics , Humans

16.

Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach.

Pio, Gianvito; Malerba, Donato; D'Elia, Domenica; Ceci, Michelangelo.

BMC Bioinformatics ; 15 Suppl 1: S4, 2014.

Article in English | MEDLINE | ID: mdl-24564296

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) are small non-coding RNAs that play a key role in the post-transcriptional regulation of many genes. Elucidating miRNA-regulated gene networks is crucial for understanding the mechanisms and functions of miRNAs in many biological processes, such as cell proliferation, development, differentiation and cell homeostasis, as well as in many types of human tumors. To this end, we recently presented the biclustering method HOCCLUS2 for the discovery of miRNA regulatory networks. Experiments on predicted interactions revealed that the statistical and biological consistency of the obtained networks is negatively affected by the poor reliability of the output of miRNA target prediction algorithms. Recently, some learning approaches have been proposed that learn to combine the outputs of distinct prediction algorithms and improve their accuracy. However, the application of classical supervised learning algorithms presents two challenges: i) the presence of only positive examples in datasets of experimentally verified interactions and ii) the unbalanced number of labeled and unlabeled examples. RESULTS: We present a learning algorithm that learns to combine the scores returned by several prediction algorithms by exploiting information conveyed by validated (positively labeled only) and unlabeled examples of interactions. To face the two related challenges, we resort to a semi-supervised ensemble learning setting. Results obtained using miRTarBase as the set of labeled (positive) interactions and mirDIP as the set of unlabeled interactions show a significant improvement, over competitive approaches, in the quality of the predictions. This solution also improves the effectiveness of HOCCLUS2 in discovering biologically realistic miRNA:mRNA regulatory networks from large-scale prediction data. Using the miR-17-92 gene cluster family as a reference system and comparing results with previous experiments, we find a large increase in the number of significantly enriched biclusters in pathways, consistent with miR-17-92 functions. CONCLUSION: The proposed approach proves to be fundamental for the computational discovery of miRNA regulatory networks from large-scale predictions. This paves the way to the systematic application of HOCCLUS2 for a comprehensive reconstruction of all the possible multiple interactions established by miRNAs in regulating the expression of gene networks, which would otherwise be impossible to reconstruct by considering only experimentally validated interactions.
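
As a rough illustration of the semi-supervised ensemble setting described in this abstract, the sketch below combines the scores of several prediction algorithms using only positive and unlabeled examples. It is not the authors' algorithm: it uses a generic bagging scheme in which each ensemble member treats a balanced random sample of unlabeled examples as provisional negatives and scores the remaining ones. The base learner (a small logistic regression), the function names, and all parameter values are assumptions made for illustration.

```python
import math
import random


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit a tiny logistic regression by batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(model, x):
    """Return the estimated probability that x is a true interaction."""
    w, b = model
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def pu_bagging(positives, unlabeled, n_models=25, seed=0):
    """Average out-of-bag scores over an ensemble trained on positive/unlabeled data.

    Each row is a list of scores, one per prediction algorithm (hypothetical input format).
    """
    rng = random.Random(seed)
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    # Balanced sampling: as many provisional negatives as positives,
    # always leaving at least one unlabeled example out of the bag.
    k = min(len(positives), len(unlabeled) - 1)
    for _ in range(n_models):
        sampled = set(rng.sample(range(len(unlabeled)), k))
        X = positives + [unlabeled[i] for i in sampled]
        y = [1] * len(positives) + [0] * len(sampled)
        model = train_logistic(X, y)
        # Score only the unlabeled examples this member did not train on.
        for i, x in enumerate(unlabeled):
            if i not in sampled:
                totals[i] += predict(model, x)
                counts[i] += 1
    # None marks an example that was never out of the bag.
    return [t / c if c else None for t, c in zip(totals, counts)]
```

Calling `pu_bagging(positives, unlabeled)` returns one combined score per unlabeled interaction. Sampling as many provisional negatives as there are positives is one simple way to cope with the imbalance between labeled and unlabeled examples that the abstract mentions.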

Subjects

Gene Regulatory Networks , MicroRNAs/genetics , Multigene Family , Messenger RNA/genetics , Algorithms , Gene Expression Regulation , Humans , MicroRNAs/metabolism , Messenger RNA/metabolism , Reproducibility of Results

17.

A platform independent RNA-Seq protocol for the detection of transcriptome complexity.

Calabrese, Claudia; Mangiulli, Marina; Manzari, Caterina; Paluscio, Anna Maria; Caratozzolo, Mariano Francesco; Marzano, Flaviana; Kurelac, Ivana; D'Erchia, Anna Maria; D'Elia, Domenica; Licciulli, Flavio; Liuni, Sabino; Picardi, Ernesto; Attimonelli, Marcella; Gasparre, Giuseppe; Porcelli, Anna Maria; Pesole, Graziano; Sbisà, Elisabetta; Tullo, Apollonia.

BMC Genomics ; 14: 855, 2013 Dec 05.

Article in English | MEDLINE | ID: mdl-24308330

ABSTRACT

BACKGROUND: Recent studies have demonstrated an unexpected complexity of transcription in eukaryotes. The majority of the genome is transcribed and only a small fraction of these transcripts is annotated as protein-coding genes and their splice variants. Indeed, most transcripts are the result of antisense, overlapping and non-coding RNA expression. In this context, one of the key aims of high-throughput transcriptome sequencing is the detection of all RNA species present in the cell, and the first crucial step for RNA-seq users is the choice of the strategy for cDNA library construction. The protocols developed so far use the entire library for a single sequencing run on a specific platform. RESULTS: We set up a unique protocol to generate and amplify a strand-specific cDNA library representative of all RNA species that may be implemented with all major platforms currently available on the market (Roche 454, Illumina, ABI/SOLiD). Our method is reproducible, fast, easy to perform and even allows starting from low-input total RNA. Furthermore, we provide a suitable bioinformatics tool for the analysis of the sequences produced following this protocol. CONCLUSION: We tested the efficiency of our strategy, showing that our method is platform-independent, thus allowing the simultaneous analysis of the same sample with different NGS technologies, and providing an accurate quantitative and qualitative portrait of complex whole transcriptomes.

Subjects

Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing , RNA Sequence Analysis/methods , Transcriptome , Animals , Tumor Cell Line , Chromosome Mapping , Expressed Sequence Tags , Gene Expression Regulation , Heterografts , Humans , Mice , Molecular Sequence Annotation

18.

A novel biclustering algorithm for the discovery of meaningful biological correlations between microRNAs and their target genes.

Pio, Gianvito; Ceci, Michelangelo; D'Elia, Domenica; Loglisci, Corrado; Malerba, Donato.

BMC Bioinformatics ; 14 Suppl 7: S8, 2013.

Article in English | MEDLINE | ID: mdl-23815553

ABSTRACT

BACKGROUND: microRNAs (miRNAs) are a class of small non-coding RNAs that have been recognized as ubiquitous post-transcriptional regulators. The analysis of interactions between different miRNAs and their target genes is necessary for understanding the role of miRNAs in the control of cell life and death. In this paper we propose a novel data mining algorithm, called HOCCLUS2, specifically designed to bicluster miRNAs and target messenger RNAs (mRNAs) on the basis of their experimentally verified and/or predicted interactions. Indeed, existing biclustering approaches, typically used to analyze gene expression data, fail when applied to miRNA:mRNA interactions since they usually do not extract possibly overlapping biclusters (miRNAs and their target genes may have multiple roles), extract a huge number of biclusters (difficult to browse and rank on the basis of their importance) and work on similarities of feature values (do not limit the analysis to reliable interactions). RESULTS: To overcome these limitations, HOCCLUS2 i) extracts possibly overlapping biclusters, to catch multiple roles of both miRNAs and their target genes; ii) extracts hierarchically organized biclusters, to facilitate bicluster browsing and to distinguish between universe and pathway-specific miRNAs; iii) extracts highly cohesive biclusters, to consider only reliable interactions; iv) ranks biclusters according to their functional similarities, computed on the basis of Gene Ontology, to facilitate bicluster analysis. CONCLUSIONS: Our results show that HOCCLUS2 is a valid tool to support biologists in the identification of context-specific miRNA regulatory modules and in the detection of possibly unknown miRNA target genes. Indeed, results prove that HOCCLUS2 is able to extract cohesiveness-preserving biclusters, when compared with competitive approaches, and statistically confirm (at a confidence level of 99%) that mRNAs which belong to the same bicluster are, on average, more functionally similar than mRNAs which belong to different biclusters. Finally, the hierarchy of biclusters provides useful insights into the intrinsic hierarchical organization of miRNAs and their potential multiple interactions on target genes.
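
To make the notions of overlapping and cohesive biclusters concrete, here is a minimal sketch. It is not HOCCLUS2: it grows one bicluster per miRNA seed by greedily adding miRNAs, and it measures cohesiveness as the fraction of miRNA:mRNA pairs inside a bicluster that appear in the interaction set. That cohesiveness definition, the threshold, the function names, and the placeholder identifiers (`miR-A`, `g1`, ...) are all assumptions made for illustration.

```python
from itertools import product


def cohesiveness(mirnas, mrnas, interactions):
    """Fraction of miRNA:mRNA pairs in the bicluster that are known interactions."""
    pairs = list(product(mirnas, mrnas))
    if not pairs:
        return 0.0
    return sum(1 for pair in pairs if pair in interactions) / len(pairs)


def seed_biclusters(interactions, min_cohesiveness=0.75):
    """Grow one bicluster per miRNA seed; the results may overlap."""
    targets = {}
    for mirna, mrna in interactions:
        targets.setdefault(mirna, set()).add(mrna)

    biclusters = set()
    for seed in sorted(targets):
        mirnas, mrnas = {seed}, set(targets[seed])
        for other in sorted(targets):
            if other == seed:
                continue
            cand_mirnas = mirnas | {other}
            cand_mrnas = mrnas | targets[other]
            # Accept the merge only if the enlarged bicluster stays cohesive.
            if cohesiveness(cand_mirnas, cand_mrnas, interactions) >= min_cohesiveness:
                mirnas, mrnas = cand_mirnas, cand_mrnas
        biclusters.add((frozenset(mirnas), frozenset(mrnas)))
    return biclusters


# Hypothetical toy interaction set (placeholder names, not real data).
interactions = {
    ("miR-A", "g1"), ("miR-A", "g2"),
    ("miR-B", "g1"), ("miR-B", "g2"),
    ("miR-C", "g2"), ("miR-C", "g3"),
}
result = seed_biclusters(interactions)
```

On this toy input the sketch yields two biclusters, {miR-A, miR-B} with {g1, g2} and {miR-C} with {g2, g3}. The gene `g2` belongs to both, which is the kind of overlap that a partition-based clustering could not represent.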

Subjects

Algorithms , Gene Expression Regulation , MicroRNAs/metabolism , Messenger RNA/genetics , Animals , Humans , MicroRNAs/genetics

19.

The 20th anniversary of EMBnet: 20 years of bioinformatics for the Life Sciences community.

D'Elia, Domenica; Gisel, Andreas; Eriksson, Nils-Einar; Kossida, Sophia; Mattila, Kimmo; Klucar, Lubos; Bongcam-Rudloff, Erik.

BMC Bioinformatics ; 10 Suppl 6: S1, 2009 Jun 16.

Article in English | MEDLINE | ID: mdl-19534734

ABSTRACT

The EMBnet Conference 2008, focusing on 'Leading Applications and Technologies in Bioinformatics', was organized by the European Molecular Biology network (EMBnet) to celebrate its 20th anniversary. Since its foundation in 1988, EMBnet has been working to promote collaborative development of bioinformatics services and tools to serve the European community of molecular biology laboratories. This conference was the first meeting organized by the network that was open to the international scientific community outside EMBnet. The conference covered a broad range of research topics in bioinformatics, with a main focus on new achievements and trends in emerging technologies supporting genomics, transcriptomics and proteomics analyses, such as high-throughput sequencing and data management, text and data mining, ontologies and Grid technologies. Papers selected for publication in this supplement to BMC Bioinformatics cover a broad range of the topics treated, and also provide an overview of the main bioinformatics research fields in which the EMBnet community is involved.

Subjects

Computational Biology/trends , Computational Biology/methods , Congresses as Topic/history , Genomics , 20th Century History , 21st Century History , Molecular Biology , Proteomics

20.

Computational annotation of UTR cis-regulatory modules through Frequent Pattern Mining.

Turi, Antonio; Loglisci, Corrado; Salvemini, Eliana; Grillo, Giorgio; Malerba, Donato; D'Elia, Domenica.

BMC Bioinformatics ; 10 Suppl 6: S25, 2009 Jun 16.

Article in English | MEDLINE | ID: mdl-19534751

ABSTRACT

BACKGROUND: Many studies report on the detection and functional characterization of cis-regulatory motifs in untranslated regions (UTRs) of mRNAs, but little is known about the nature and functional role of their distribution. To address this issue we have developed a computational approach based on data mining techniques. The idea is to mine frequent combinations of translation regulatory motifs, since their significant co-occurrences could reveal functional relationships important for the post-transcriptional control of gene expression. The experiments focused on targeted mitochondrial transcripts to elucidate the role of translational control in mitochondrial biogenesis and function. RESULTS: The analysis is based on a two-step procedure using a sequential pattern mining algorithm. The first step searches for frequent patterns (FPs) of motifs without taking into account their spatial displacement. In the second step, frequent sequential patterns (FSPs) of spaced motifs are generated by taking into account the conservation of spacers between each ordered pair of co-occurring motifs. The algorithm makes no assumption about the relation among motifs or the number of motifs involved in a pattern. Different FSPs can be found depending on different combinations of two parameters, i.e. the threshold of the minimum percentage of sequences supporting the pattern, and the granularity of spacer discretization. Results can be retrieved at the UTRminer web site: http://utrminer.ba.itb.cnr.it/. The discovered FPs of motifs amount to 216 in the overall dataset and to 140 in the human subset. For each FP, the system provides information on the discovered FSPs, if any. A variety of search options help users in browsing the web resource. The list of sequence IDs supporting each pattern can be used for the retrieval of information from the UTRminer database. CONCLUSION: Computational prediction of structural properties of regulatory sequences is not trivial. The presented data mining approach is able to overcome some limits observed in other competitive tools. Preliminary results on UTR sequences from nuclear transcripts targeting mitochondria are promising and make us confident in the effectiveness of the approach for future developments.
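
The two-step procedure and its two parameters can be sketched as follows. This is a simplified illustration, not the UTRminer implementation: step 1 is a plain level-wise (Apriori-style) search for motif sets, and step 2 uses only the first occurrence of each motif and measures spacers between motif start positions. The input format, the function names, the parameter names `min_support` and `granularity`, and the placeholder motif names (`M1`, `M2`) are assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(sequences, min_support):
    """Step 1: motif sets found in at least min_support (a fraction) of sequences,
    ignoring where the motifs are located."""
    n = len(sequences)
    motif_sets = [frozenset(motif for motif, _ in seq) for seq in sequences]
    frequent = {}
    size = 1
    candidates = {frozenset([m]) for s in motif_sets for m in s}
    while candidates:
        level = {}
        for cand in candidates:
            support = sum(1 for s in motif_sets if cand <= s) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        size += 1
        # Join frequent sets of the current size into candidates one motif larger.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
    return frequent


def frequent_sequential_patterns(sequences, pattern, min_support, granularity):
    """Step 2: order the motifs of a frequent pattern by position and keep the
    orderings whose discretized spacers recur in enough sequences."""
    counts = Counter()
    for seq in sequences:
        first = {}
        for motif, pos in seq:
            if motif in pattern and motif not in first:
                first[motif] = pos  # assumption: first occurrence only
        if len(first) < len(pattern):
            continue  # sequence does not support the pattern
        ordered = sorted(first.items(), key=lambda item: item[1])
        key = [ordered[0][0]]
        for (_, p1), (m2, p2) in zip(ordered, ordered[1:]):
            key.append((p2 - p1) // granularity)  # spacer bin index
            key.append(m2)
        counts[tuple(key)] += 1
    n = len(sequences)
    return {k: c / n for k, c in counts.items() if c / n >= min_support}


# Hypothetical input: each UTR is a list of (motif, start position) pairs.
utrs = [
    [("M1", 10), ("M2", 55)],
    [("M1", 5), ("M2", 52)],
    [("M2", 20), ("M1", 90)],
    [("M1", 30)],
]
fps = frequent_patterns(utrs, min_support=0.5)
fsps = frequent_sequential_patterns(
    utrs, frozenset({"M1", "M2"}), min_support=0.5, granularity=10
)
```

With a coarser `granularity`, spacers of different lengths fall into the same bin and more sequential patterns reach the support threshold. This is why different combinations of the two parameters yield different FSPs.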

Subjects

Computational Biology/methods , Untranslated Regions/genetics , Humans , RNA Sequence Analysis/methods
