Search | Portal Regional da BVS

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.

Zhou, Naihui; Jiang, Yuxiang; Bergquist, Timothy R; Lee, Alexandra J; Kacsoh, Balint Z; Crocker, Alex W; Lewis, Kimberley A; Georghiou, George; Nguyen, Huy N; Hamid, Md Nafiz; Davis, Larry; Dogan, Tunca; Atalay, Volkan; Rifaioglu, Ahmet S; Dalkiran, Alperen; Cetin Atalay, Rengul; Zhang, Chengxin; Hurto, Rebecca L; Freddolino, Peter L; Zhang, Yang; Bhat, Prajwal; Supek, Fran; Fernández, José M; Gemovic, Branislava; Perovic, Vladimir R; Davidovic, Radoslav S; Sumonja, Neven; Veljkovic, Nevena; Asgari, Ehsaneddin; Mofrad, Mohammad R K; Profiti, Giuseppe; Savojardo, Castrense; Martelli, Pier Luigi; Casadio, Rita; Boecker, Florian; Schoof, Heiko; Kahanda, Indika; Thurlby, Natalie; McHardy, Alice C; Renaux, Alexandre; Saidi, Rabie; Gough, Julian; Freitas, Alex A; Antczak, Magdalena; Fabris, Fabio; Wass, Mark N; Hou, Jie; Cheng, Jianlin; Wang, Zheng; Romero, Alfonso E.

Genome Biol ; 20(1): 244, 2019 Nov 19.

Article in English | MEDLINE | ID: mdl-31744546

ABSTRACT

BACKGROUND: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. RESULTS: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. CONCLUSION: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

Subjects

Molecular Sequence Annotation/trends , Animals , Biofilms , Candida albicans/genetics , Drosophila melanogaster/genetics , Genome, Bacterial , Genome, Fungal , Humans , Locomotion , Memory, Long-Term , Molecular Sequence Annotation/methods , Pseudomonas aeruginosa/genetics

Neurogenomic Signatures of Successes and Failures in Life-History Transitions in a Key Insect Pollinator.

Manfredini, Fabio; Romero, Alfonso E; Pedroso, Inti; Paccanaro, Alberto; Sumner, Seirian; Brown, Mark J F.

Genome Biol Evol ; 9(11): 3059-3072, 2017 Nov 01.

Article in English | MEDLINE | ID: mdl-29087523

ABSTRACT

Life-history transitions require major reprogramming at the behavioral and physiological level. Mating and reproductive maturation are known to trigger changes in gene transcription in reproductive tissues in a wide range of organisms, but we understand little about the molecular consequences of a failure to mate or become reproductively mature, and it is not clear to what extent these processes trigger neural as well as physiological changes. In this study, we examined the molecular processes underpinning the behavioral changes that accompany the major life-history transitions in a key pollinator, the bumblebee Bombus terrestris. We compared neuro-transcription in queens that succeeded or failed in switching from virgin and immature states, to mated and reproductively mature states. Both successes and failures were associated with distinct molecular profiles, illustrating how development during adulthood triggers distinct molecular profiles within a single caste of a eusocial insect. Failures in both mating and reproductive maturation were explained by a general up-regulation of brain gene transcription. We identified 21 genes that were highly connected in a gene coexpression network analysis: nine genes are involved in neural processes and four are regulators of gene expression. This suggests that negotiating life-history transitions involves significant neural processing and reprogramming, and not just changes in physiology. These findings provide novel insights into basic life-history transitions of an insect. Failure to mate or to become reproductively mature is an overlooked component of variation in natural systems, despite its prevalence in many sexually reproducing organisms, and deserves deeper investigation in the future.

Subjects

Bees/genetics , Gene Expression Regulation , Reproduction/genetics , Sexual Behavior, Animal , Animals , Bees/physiology , Brain/metabolism , Brain/physiology , Computational Biology , Databases, Genetic , Female , Gene Expression Profiling , Gene Regulatory Networks/genetics , Genetic Markers , Insect Proteins/genetics , Insect Proteins/metabolism , Insect Proteins/physiology , RNA/genetics , Reproduction/physiology

Activity Recognition for Diabetic Patients Using a Smartphone.

Cvetkovic, Bozidara; Janko, Vito; Romero, Alfonso E; Kafali, Özgür; Stathis, Kostas; Lustrek, Mitja.

J Med Syst ; 40(12): 256, 2016 Dec.

Article in English | MEDLINE | ID: mdl-27722975

ABSTRACT

Diabetes is a disease that has to be managed through an appropriate lifestyle. Technology can help with this, particularly when it is designed so that it does not impose an additional burden on the patient. This paper presents an approach that combines machine learning and symbolic reasoning to recognise high-level lifestyle activities using sensor data obtained primarily from the patient's smartphone. We compare five machine-learning methods, which differ in the amount of data manually labelled by the user, to investigate the trade-off between labelling effort and recognition accuracy. In an evaluation on real-life data, the highest accuracy of 83.4% was achieved by the MCAT method, which is capable of gradually adapting to each user.

Subjects

Accelerometry/instrumentation , Diabetes Mellitus/physiopathology , Machine Learning , Monitoring, Ambulatory/methods , Motor Activity/physiology , Smartphone , Algorithms , Electrocardiography , Geographic Information Systems , Humans

mutation3D: Cancer Gene Prediction Through Atomic Clustering of Coding Variants in the Structural Proteome.

Meyer, Michael J; Lapcevic, Ryan; Romero, Alfonso E; Yoon, Mark; Das, Jishnu; Beltrán, Juan Felipe; Mort, Matthew; Stenson, Peter D; Cooper, David N; Paccanaro, Alberto; Yu, Haiyuan.

Hum Mutat ; 37(5): 447-56, 2016 May.

Article in English | MEDLINE | ID: mdl-26841357

ABSTRACT

A new algorithm and Web server, mutation3D (http://mutation3d.org), proposes driver genes in cancer by identifying clusters of amino acid substitutions within tertiary protein structures. We demonstrate the feasibility of using a 3D clustering approach to implicate proteins in cancer based on explorations of single proteins using the mutation3D Web interface. On a large scale, we show that clustering with mutation3D is able to separate functional from nonfunctional mutations by analyzing a combination of 8,869 known inherited disease mutations and 2,004 SNPs overlaid together upon the same sets of crystal structures and homology models. Further, we present a systematic analysis of whole-genome and whole-exome cancer datasets to demonstrate that mutation3D identifies many known cancer genes as well as previously underexplored target genes. The mutation3D Web interface allows users to analyze their own mutation data in a variety of popular formats and provides seamless access to explore mutation clusters derived from over 975,000 somatic mutations reported by 6,811 cancer sequencing studies. The mutation3D Web interface is freely available with all major browsers supported.

Subjects

Amino Acid Substitution , Neoplasms/genetics , Proteome/genetics , Web Browser , Algorithms , Cluster Analysis , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Protein Structure, Tertiary , Proteome/chemistry

A network medicine approach to quantify distance between hereditary disease modules on the interactome.

Caniza, Horacio; Romero, Alfonso E; Paccanaro, Alberto.

Sci Rep ; 5: 17658, 2015 Dec 03.

Article in English | MEDLINE | ID: mdl-26631976

ABSTRACT

We introduce a MeSH-based method that accurately quantifies similarity between heritable diseases at the molecular level. This method effectively brings together the existing information about diseases that is scattered across the vast corpus of biomedical literature. We prove that sets of MeSH terms provide a highly descriptive representation of heritable disease and that the structure of MeSH provides a natural way of combining individual MeSH vocabularies. We show that our measure can be used effectively in the prediction of candidate disease genes. We developed a web application to query more than 28.5 million relationships between 7,574 hereditary diseases (96% of OMIM) based on our similarity measure.

Subjects

Genetic Diseases, Inborn , Medical Subject Headings , Data Mining/methods , Genes , Genetic Diseases, Inborn/genetics , Humans , Internet

An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo.

Artif Intell Med ; 61(2): 63-78, 2014 Jun.

Article in English | MEDLINE | ID: mdl-24726035

ABSTRACT

OBJECTIVE: In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies have focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. MATERIALS AND METHODS: We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. RESULTS: The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. CONCLUSIONS: Network integration is necessary to boost the performance of gene prioritization methods. Moreover, the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network.
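
As an illustration of one of the algorithms named in this abstract, the following is a minimal sketch of random walk with restart on a small weighted gene network. It is not the authors' implementation: the toy network, the gene labels A to E, the restart probability of 0.3, and the function name are all assumptions chosen for the example.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability.

    adj   : dict mapping node -> {neighbour: edge weight} (symmetric)
    seeds : set of nodes already associated with the disease
    """
    nodes = sorted(adj)
    # Total edge weight leaving each node, used to normalise transitions.
    degree = {u: sum(adj[u].values()) for u in nodes}
    # The walker restarts uniformly over the seed genes.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            if degree[u] == 0:
                # Isolated node: its probability mass stays where it is.
                nxt[u] += (1 - restart) * p[u]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / degree[u]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain A - B - C - D - E with unit weights.
network = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0, "E": 1.0},
    "E": {"D": 1.0},
}
scores = random_walk_with_restart(network, seeds={"A"})
# Candidate genes (non-seeds) ordered from most to least promising.
ranked = sorted((g for g in scores if g != "A"), key=scores.get, reverse=True)
print(ranked)
```

In this sketch, genes closer to the seed gene receive higher scores, which is the guilt-by-association intuition the abstract refers to. Integrating several networks would amount to combining their edge weights before running the walk; how the weights are chosen is outside the scope of this example.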

Subjects

Algorithms , Artificial Intelligence , Gene Regulatory Networks , Genomics/methods , Humans , Medical Subject Headings

GOssTo: a stand-alone application and a web tool for calculating semantic similarities on the Gene Ontology.

Caniza, Horacio; Romero, Alfonso E; Heron, Samuel; Yang, Haixuan; Devoto, Alessandra; Frasca, Marco; Mesiti, Marco; Valentini, Giorgio; Paccanaro, Alberto.

Bioinformatics ; 30(15): 2235-6, 2014 Aug 01.

Article in English | MEDLINE | ID: mdl-24659104

ABSTRACT

SUMMARY: We present GOssTo, the Gene Ontology semantic similarity Tool, a user-friendly software system for calculating semantic similarities between gene products according to the Gene Ontology. GOssTo is bundled with six semantic similarity measures, including both term- and graph-based measures, and has extension capabilities to allow the user to add new similarities. Importantly, for any measure, GOssTo can also calculate the Random Walk Contribution that has been shown to greatly improve the accuracy of similarity measures. GOssTo is very fast, easy to use, and it allows the calculation of similarities on a genomic scale in a few minutes on a regular desktop machine. CONTACT: alberto@cs.rhul.ac.uk AVAILABILITY: GOssTo is available both as a stand-alone application running on GNU/Linux, Windows and MacOS from www.paccanarolab.org/gossto and as a web application from www.paccanarolab.org/gosstoweb. The stand-alone application features a simple and concise command line interface for easy integration into high-throughput data processing pipelines.
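
To make the idea of a term-based semantic similarity concrete, here is a minimal sketch of the widely used Resnik measure on a tiny ontology. It is only an illustration under stated assumptions: the five-term ontology, the annotation counts, and the function names are hypothetical, and the code does not reproduce GOssTo's own implementation or its command line interface.

```python
import math

# Hypothetical toy ontology: each term lists its direct parents.
parents = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}

# Hypothetical number of gene products annotated directly to each term.
direct_counts = {
    "root": 0,
    "binding": 2,
    "catalysis": 10,
    "dna_binding": 4,
    "rna_binding": 4,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    found = {term}
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# An annotation to a term also counts for every ancestor of that term.
frequency = {t: 0 for t in parents}
for term, count in direct_counts.items():
    for anc in ancestors(term):
        frequency[anc] += count
total = frequency["root"]


def information_content(term):
    """Rarely used terms are more informative: IC = -log p(term)."""
    return -math.log(frequency[term] / total)


def resnik(term_a, term_b):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


print(resnik("dna_binding", "rna_binding"))
print(resnik("dna_binding", "catalysis"))
```

The two sibling terms share the moderately specific ancestor `binding`, so they score higher than a pair whose only common ancestor is the root. Similarities between gene products are usually derived by aggregating such term-level scores over the terms annotating each product.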

Subjects

Data Mining/methods , Gene Ontology , Internet , Semantics , Software , Proteins/genetics , Vocabulary, Controlled

A large-scale evaluation of computational protein function prediction.

Radivojac, Predrag; Clark, Wyatt T; Oron, Tal Ronnen; Schnoes, Alexandra M; Wittkop, Tobias; Sokolov, Artem; Graim, Kiley; Funk, Christopher; Verspoor, Karin; Ben-Hur, Asa; Pandey, Gaurav; Yunes, Jeffrey M; Talwalkar, Ameet S; Repo, Susanna; Souza, Michael L; Piovesan, Damiano; Casadio, Rita; Wang, Zheng; Cheng, Jianlin; Fang, Hai; Gough, Julian; Koskinen, Patrik; Törönen, Petri; Nokso-Koivisto, Jussi; Holm, Liisa; Cozzetto, Domenico; Buchan, Daniel W A; Bryson, Kevin; Jones, David T; Limaye, Bhakti; Inamdar, Harshal; Datta, Avik; Manjari, Sunitha K; Joshi, Rajendra; Chitale, Meghana; Kihara, Daisuke; Lisewski, Andreas M; Erdin, Serkan; Venner, Eric; Lichtarge, Olivier; Rentzsch, Robert; Yang, Haixuan; Romero, Alfonso E; Bhat, Prajwal; Paccanaro, Alberto; Hamp, Tobias; Kaßner, Rebecca; Seemayer, Stefan; Vicedo, Esmeralda; Schaefer, Christian.

Nat Methods ; 10(3): 221-7, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23353650

ABSTRACT

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.

Subjects

Computational Biology/methods , Molecular Biology/methods , Molecular Sequence Annotation , Proteins/physiology , Algorithms , Animals , Databases, Protein , Exoribonucleases/classification , Exoribonucleases/genetics , Exoribonucleases/physiology , Forecasting , Humans , Proteins/chemistry , Proteins/classification , Proteins/genetics , Species Specificity

Exploring the evolutionary path of plant MAPK networks.

Dóczi, Róbert; Okrész, László; Romero, Alfonso E; Paccanaro, Alberto; Bögre, László.

Trends Plant Sci ; 17(9): 518-25, 2012 Sep.

Article in English | MEDLINE | ID: mdl-22682803

ABSTRACT

The evolutionarily conserved mitogen-activated protein kinase (MAPK) signaling network comprises connected protein kinases arranged in MAPK modules. In this Opinion article, we analyze MAPK signaling components in evolutionarily representative species of the plant lineage and in Naegleria gruberi, a member of an early diverging eukaryotic clade. In Naegleria, there are two closely related MAPK kinases (MKKs) and a single conventional MAPK, whereas in several species of algae, there are two distinct MKKs and multiple MAPKs belonging to different groups. This suggests that the formation of multiple MAPK modules began early during plant evolution. The expansion of MAPK signaling components through gene duplications and the evolution of interaction motifs could have contributed to the highly connected complex MAPK signaling network that we know in Arabidopsis.

Subjects

Biological Evolution , MAP Kinase Signaling System/genetics , Mitogen-Activated Protein Kinases/genetics , Plants/enzymology , Conserved Sequence , Gene Duplication , Naegleria/enzymology , Naegleria/genetics , Phylogeny , Plants/genetics
