Search | BVS Economia da Saúde

Critical assessment of protein intrinsic disorder prediction (CAID) - Results of round 2.

Conte, Alessio Del; Mehdiabadi, Mahta; Bouhraoua, Adel; Miguel Monzon, Alexander; Tosatto, Silvio C E; Piovesan, Damiano.

Proteins ; 91(12): 1925-1934, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37621223

ABSTRACT

Protein intrinsic disorder (ID) is a complex and context-dependent phenomenon that covers a continuum between fully disordered states and folded states with long dynamic regions. The lack of a ground truth that fits all ID flavors and the potential for order-to-disorder transitions depending on specific conditions make ID prediction challenging. The CAID2 challenge aimed to evaluate the performance of different prediction methods across different benchmarks, leveraging the annotation provided by the DisProt database, which stores the coordinates of ID regions when there is experimental evidence in the literature. The CAID2 challenge demonstrated varying performance of different prediction methods across different benchmarks, highlighting the need for continued development of more versatile and efficient prediction software. Depending on the application, researchers may need to balance performance with execution time when selecting a predictor. Methods based on AlphaFold2 seem to be good ID predictors, but they are better at detecting the absence of order than at detecting ID regions as defined in DisProt. The CAID2 predictors can be freely used through the CAID Prediction Portal, and CAID has been integrated into OpenEBench, which will become the official platform for running future CAID challenges.

Subjects

Intrinsically Disordered Proteins , Proteins , Software , Databases, Protein

CAGI6 ID-Challenge: Assessment of phenotype and variant predictions in 415 children with Neurodevelopmental Disorders (NDDs).

Aspromonte, Maria Cristina; Conte, Alessio Del; Zhu, Shaowen; Tan, Wuwei; Shen, Yang; Zhang, Yexian; Li, Qi; Wang, Maggie Haitian; Babbi, Giulia; Bovo, Samuele; Martelli, Pier Luigi; Casadio, Rita; Althagafi, Azza; Toonsi, Sumyyah; Kulmanov, Maxat; Hoehndorf, Robert; Katsonis, Panagiotis; Williams, Amanda; Lichtarge, Olivier; Xian, Su; Surento, Wesley; Pejaver, Vikas; Mooney, Sean D; Sunderam, Uma; Srinivasan, Rajgopal; Murgia, Alessandra; Piovesan, Damiano; Tosatto, Silvio C E; Leonardi, Emanuela.

Res Sq ; 2023 Aug 02.

Article in English | MEDLINE | ID: mdl-37577579

ABSTRACT

In the context of the Critical Assessment of Genome Interpretation, 6th edition (CAGI6), the Genetics of Neurodevelopmental Disorders Lab in Padua proposed a new ID-challenge to offer the opportunity to develop computational methods for predicting patients' phenotypes and causal variants. Eight research teams and 30 models had access to the phenotype details and real genetic data, based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by Neurodevelopmental Disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions, with onset in infancy. In this study we evaluate the ability and accuracy of computational methods to predict comorbid phenotypes, based on the clinical features described for each patient, and causal variants. Finally, we asked participants to develop a method to find new possible genetic causes for patients without a genetic diagnosis. As in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (causative, putative pathogenic and contributing factors) were provided. Considering the overall clinical manifestation of our cohort, we provided the variant data and phenotypic traits of the 150 patients from the CAGI5 ID-Challenge as training and validation data for developing the prediction methods.

Critical assessment of protein intrinsic disorder prediction.

Necci, Marco; Piovesan, Damiano; Tosatto, Silvio C E.

Nat Methods ; 18(5): 472-481, 2021 May.

Article in English | MEDLINE | ID: mdl-33875885

ABSTRACT

Intrinsically disordered proteins, defying the traditional protein structure-function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
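The Fmax values quoted above can be illustrated with a minimal sketch. It assumes Fmax is the maximum, over all score thresholds, of the harmonic mean of precision and recall computed on pooled per-residue labels. The function name `f_max` and the toy data in the usage note are hypothetical; this is not the CAID assessment code.

```python
def f_max(scores, labels, thresholds=None):
    """Return (best_f1, best_threshold) over the candidate thresholds.

    scores: per-residue disorder scores (higher means more disordered).
    labels: per-residue reference labels, 1 = disordered, 0 = ordered.
    Assumption: a residue is predicted disordered when score >= threshold.
    """
    if thresholds is None:
        # Every distinct score is a candidate cut-off.
        thresholds = sorted(set(scores))
    best_f1, best_t = 0.0, None
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        if tp == 0:
            # Precision and recall are both zero (or undefined); skip.
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_f1, best_t = f1, t
    return best_f1, best_t
```

For example, calling `f_max([0.9, 0.8, 0.4, 0.3, 0.1], [1, 0, 1, 0, 0])` scans the five distinct scores as thresholds and reports the cut-off at which precision and recall are best balanced on this toy input.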

Subjects

Computational Biology , Intrinsically Disordered Proteins/chemistry , Amino Acid Sequence , Databases, Protein , Protein Binding , Protein Conformation , Protein Folding , Software

A comprehensive assessment of long intrinsic protein disorder from the DisProt database.

Necci, Marco; Piovesan, Damiano; Dosztányi, Zsuzsanna; Tompa, Peter; Tosatto, Silvio C E.

Bioinformatics ; 34(3): 445-452, 2018 Feb 01.

Article in English | MEDLINE | ID: mdl-28968848

ABSTRACT

Motivation: Intrinsic disorder (ID), i.e. the lack of a unique folded conformation at physiological conditions, is a common feature for many proteins, which requires specialized biochemical experiments that are not high-throughput. Missing X-ray residues from the PDB have been widely used as a proxy for ID when developing computational methods. This may lead to a systematic bias, where predictors deviate from biologically relevant ID. Large benchmarking sets on experimentally validated ID are scarce. Recently, the DisProt database has been renewed and expanded to include manually curated ID annotations for several hundred new proteins. This provides a large benchmark set which has not yet been used for training ID predictors. Results: Here, we describe the first systematic benchmarking of ID predictors on the new DisProt dataset. In contrast to previous assessments based on missing X-ray data, this dataset contains mostly long ID regions and a significant amount of fully ID proteins. The benchmarking shows that ID predictors work quite well on the new dataset, especially for long ID segments. However, a large fraction of ID still goes virtually undetected and the ranking of methods is different than for PDB data. In particular, many predictors appear to confound ID and regions outside X-ray structures. This suggests that the ID prediction methods capture different flavors of disorder and can benefit from highly accurate curated examples. Availability and implementation: The raw data used for the evaluation are available from URL: http://www.disprot.org/assessment/. Contact: silvio.tosatto@unipd.it. Supplementary information: Supplementary data are available at Bioinformatics online.

Subjects

Computational Biology/methods , Databases, Protein , Protein Conformation , Sequence Analysis, Protein/methods
