Results 1 - 10 of 10
1.
Proteins ; 91(12): 1925-1934, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37621223

ABSTRACT

Protein intrinsic disorder (ID) is a complex and context-dependent phenomenon that covers a continuum between fully disordered states and folded states with long dynamic regions. The lack of a ground truth that fits all ID flavors and the potential for order-to-disorder transitions depending on specific conditions make ID prediction challenging. The CAID2 challenge aimed to evaluate the performance of different prediction methods across different benchmarks, leveraging the annotation provided by the DisProt database, which stores the coordinates of ID regions when there is experimental evidence in the literature. The CAID2 challenge demonstrated varying performance of different prediction methods across different benchmarks, highlighting the need for continued development of more versatile and efficient prediction software. Depending on the application, researchers may need to balance performance with execution time when selecting a predictor. Methods based on AlphaFold2 seem to be good ID predictors, but they are better at detecting the absence of order than at detecting ID regions as defined in DisProt. The CAID2 predictors can be freely used through the CAID Prediction Portal, and CAID has been integrated into OpenEBench, which will become the official platform for running future CAID challenges.


Subjects
Intrinsically Disordered Proteins ; Proteins ; Software ; Databases, Protein
2.
Res Sq ; 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37577579

ABSTRACT

In the context of the Critical Assessment of Genome Interpretation, 6th edition (CAGI6), the Genetics of Neurodevelopmental Disorders Lab in Padua proposed a new ID-challenge to offer the opportunity to develop computational methods for predicting patients' phenotypes and causal variants. Eight research teams and 30 models had access to the phenotype details and real genetic data, based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by neurodevelopmental disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions, with onset in infancy. In this study we evaluate the ability and accuracy of computational methods to predict comorbid phenotypes based on the clinical features described in each patient and on causal variants. Finally, participants were asked to develop a method to find new possible genetic causes for patients without a genetic diagnosis. As in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (causative, putative pathogenic and contributing factors) were provided. Considering the overall clinical manifestation of our cohort, we provided the variant data and phenotypic traits of the 150 patients from the CAGI5 ID-Challenge for training and validation during development of the prediction methods.

3.
Nat Methods ; 18(5): 472-481, 2021 05.
Article in English | MEDLINE | ID: mdl-33875885

ABSTRACT

Intrinsically disordered proteins, defying the traditional protein structure-function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.


Subjects
Computational Biology ; Intrinsically Disordered Proteins/chemistry ; Amino Acid Sequence ; Databases, Protein ; Protein Binding ; Protein Conformation ; Protein Folding ; Software
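
The Fmax values quoted in the abstract above are the maximum F1 score reached as the decision threshold on a predictor's per-residue score is varied. The following is a minimal sketch of that idea, not the CAID evaluation code: the function name `f_max`, the evenly spaced threshold grid, and the pooling of all residues into two flat lists are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def f_max(scores: Sequence[float], labels: Sequence[int],
          steps: int = 100) -> Tuple[float, float]:
    """Return (best F1, threshold) over an evenly spaced threshold grid.

    scores: per-residue disorder scores in [0, 1] (hypothetical input).
    labels: 1 for a residue annotated as disordered, 0 otherwise.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    best_f, best_t = 0.0, 0.0
    for i in range(steps + 1):
        t = i / steps
        # A residue is called disordered when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        if tp == 0:
            continue  # precision and recall are both zero or undefined
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f:
            best_f, best_t = f1, t
    return best_f, best_t


if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.3, 0.2]
    toy_labels = [1, 1, 0, 0]
    print(f_max(toy_scores, toy_labels))
```

On the toy input the two classes are perfectly separable, so some threshold between 0.3 and 0.8 gives an F1 of 1. Real benchmarks differ in how residues are pooled across proteins, which this sketch does not model.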
4.
Hum Mutat ; 40(9): 1346-1363, 2019 09.
Article in English | MEDLINE | ID: mdl-31209962

ABSTRACT

Intellectual disability (ID) and autism spectrum disorder (ASD) are clinically and genetically heterogeneous diseases. Recent whole exome sequencing studies indicated that genes associated with different neurological diseases are shared across disorders and converge on common functional pathways. Using the Ion Torrent platform, we developed a low-cost next-generation sequencing gene panel that has been transferred into clinical practice, replacing single disease-gene analyses for the early diagnosis of individuals with ID/ASD. The gene panel was designed using an innovative in silico approach based on disease networks and mining data from public resources to score disease-gene associations. We analyzed 150 unrelated individuals with ID and/or ASD and a confident diagnosis has been reached in 26 cases (17%). Likely pathogenic mutations have been identified in another 15 patients, reaching a total diagnostic yield of 27%. Our data also support the pathogenic role of genes recently proposed to be involved in ASD. Although many of the identified variants need further investigation to be considered disease-causing, our results indicate the efficiency of the targeted gene panel on the identification of novel and rare variants in patients with ID and ASD.


Subjects
Autism Spectrum Disorder/diagnosis ; Computational Biology/methods ; High-Throughput Nucleotide Sequencing/methods ; Intellectual Disability/diagnosis ; Adolescent ; Adult ; Autism Spectrum Disorder/genetics ; Child ; Child, Preschool ; Comorbidity ; Computer Simulation ; Data Mining ; Databases, Genetic ; Early Diagnosis ; Female ; Genetic Association Studies ; Genetic Predisposition to Disease ; High-Throughput Nucleotide Sequencing/economics ; Humans ; Intellectual Disability/genetics ; Male ; Mutation ; Exome Sequencing/economics ; Exome Sequencing/methods ; Young Adult
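
The diagnostic yields reported in the abstract above follow directly from the case counts it gives (26 confident diagnoses and 15 further patients with likely pathogenic mutations, out of 150 individuals). A short check of that arithmetic, with the helper name `yield_percent` chosen here for illustration:

```python
def yield_percent(cases: int, cohort: int) -> float:
    """Diagnostic yield as a percentage of the cohort."""
    if cohort <= 0:
        raise ValueError("cohort size must be positive")
    return 100.0 * cases / cohort


COHORT = 150            # unrelated individuals analyzed
CONFIDENT = 26          # confident diagnoses
LIKELY_PATHOGENIC = 15  # additional patients with likely pathogenic mutations

confident_yield = yield_percent(CONFIDENT, COHORT)
total_yield = yield_percent(CONFIDENT + LIKELY_PATHOGENIC, COHORT)

if __name__ == "__main__":
    print(round(confident_yield))  # 17
    print(round(total_yield))      # 27
```

Both rounded values match the percentages stated in the abstract.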
5.
Hum Mutat ; 40(9): 1330-1345, 2019 09.
Article in English | MEDLINE | ID: mdl-31144778

ABSTRACT

The Critical Assessment of Genome Interpretation-5 intellectual disability challenge asked participants to use computational methods to predict patient clinical phenotypes and the causal variant(s) based on an analysis of their gene panel sequence data. Sequence data for 74 genes associated with intellectual disability (ID) and/or autism spectrum disorders (ASD) from a cohort of 150 patients with a range of neurodevelopmental manifestations (i.e. ID, autism, epilepsy, microcephaly, macrocephaly, hypotonia, ataxia) were made available for this challenge. For each patient, predictors had to report the causative variants and which of the seven phenotypes were present. Since neurodevelopmental disorders are characterized by strong comorbidity, tested individuals often present more than one pathological condition. Considering the overall clinical manifestation of each patient, the correct phenotype was predicted by at least one group for 93 individuals (62%). ID and ASD were the best predicted among the seven phenotypic traits. Also, causative or potentially pathogenic variants were predicted correctly by at least one group. However, the prediction of the correct causative variant seems to be insufficient to predict the correct phenotype. In some cases, the correct prediction was supported by rare or common variants in genes different from the causative one.


Subjects
Autism Spectrum Disorder/genetics ; Computational Biology/methods ; Intellectual Disability/genetics ; Sequence Analysis, DNA/methods ; Female ; Genetic Predisposition to Disease ; Humans ; Male ; Phenotype ; Quantitative Trait Loci
6.
Bioinformatics ; 34(3): 445-452, 2018 02 01.
Article in English | MEDLINE | ID: mdl-28968848

ABSTRACT

Motivation: Intrinsic disorder (ID), i.e. the lack of a unique folded conformation at physiological conditions, is a common feature for many proteins, which requires specialized biochemical experiments that are not high-throughput. Missing X-ray residues from the PDB have been widely used as a proxy for ID when developing computational methods. This may lead to a systematic bias, where predictors deviate from biologically relevant ID. Large benchmarking sets on experimentally validated ID are scarce. Recently, the DisProt database has been renewed and expanded to include manually curated ID annotations for several hundred new proteins. This provides a large benchmark set which has not yet been used for training ID predictors. Results: Here, we describe the first systematic benchmarking of ID predictors on the new DisProt dataset. In contrast to previous assessments based on missing X-ray data, this dataset contains mostly long ID regions and a significant amount of fully ID proteins. The benchmarking shows that ID predictors work quite well on the new dataset, especially for long ID segments. However, a large fraction of ID still goes virtually undetected and the ranking of methods is different than for PDB data. In particular, many predictors appear to confound ID and regions outside X-ray structures. This suggests that the ID prediction methods capture different flavors of disorder and can benefit from highly accurate curated examples. Availability and implementation: The raw data used for the evaluation are available from URL: http://www.disprot.org/assessment/. Contact: silvio.tosatto@unipd.it. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods ; Databases, Protein ; Protein Conformation ; Sequence Analysis, Protein/methods
7.
Hum Mutat ; 38(9): 1182-1192, 2017 09.
Article in English | MEDLINE | ID: mdl-28634997

ABSTRACT

Precision medicine aims to predict a patient's disease risk and best therapeutic options by using that individual's genetic sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. For CAGI 4, three challenges involved using exome-sequencing data: Crohn's disease, bipolar disorder, and warfarin dosing. Previous CAGI challenges included prior versions of the Crohn's disease challenge. Here, we discuss the range of techniques used for phenotype prediction as well as the methods used for assessing predictive models. Additionally, we outline some of the difficulties associated with making predictions and evaluating them. The lessons learned from the exome challenges can be applied to both research and clinical efforts to improve phenotype prediction from genotype. In addition, these challenges serve as a vehicle for sharing clinical and research exome data in a secure manner with scientists who have a broad range of expertise, contributing to a collaborative effort to advance our understanding of genotype-phenotype relationships.


Subjects
Bipolar Disorder/genetics ; Crohn Disease/genetics ; Exome Sequencing/methods ; Precision Medicine/methods ; Warfarin/therapeutic use ; Computational Biology/methods ; Databases, Genetic ; Genetic Predisposition to Disease ; Humans ; Information Dissemination ; Pharmacogenomic Variants ; Phenotype ; Warfarin/pharmacology
8.
Bioinformatics ; 31(2): 201-8, 2015 Jan 15.
Article in English | MEDLINE | ID: mdl-25246432

ABSTRACT

MOTIVATION: Intrinsically disordered regions are key for the function of numerous proteins. Due to the difficulties in experimental disorder characterization, many computational predictors have been developed with various disorder flavors. Their performance is generally measured on small sets mainly from experimentally solved structures, e.g. Protein Data Bank (PDB) chains. MobiDB has only recently started to collect disorder annotations from multiple experimental structures. RESULTS: MobiDB annotates disorder for UniProt sequences, allowing us to conduct the first large-scale assessment of fast disorder predictors on 25 833 different sequences with X-ray crystallographic structures. In addition to a comprehensive ranking of predictors, this analysis produced the following interesting observations. (i) The predictors cluster according to their disorder definition, with a consensus giving more confidence. (ii) Previous assessments appear over-reliant on data annotated at the PDB chain level and performance is lower on entire UniProt sequences. (iii) Long disordered regions are harder to predict. (iv) Depending on the structural and functional types of the proteins, differences in prediction performance of up to 10% are observed. AVAILABILITY: The datasets are available from Web site at URL: http://mobidb.bio.unipd.it/lsd. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Proteins/chemistry ; Sequence Analysis, Protein/methods ; Tumor Suppressor Protein p53/chemistry ; Crystallography, X-Ray ; Databases, Protein ; Humans ; Molecular Sequence Annotation ; Protein Structure, Tertiary
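
Observation (i) in the abstract above notes that a consensus across predictors gives more confidence. One simple way to form such a consensus is a per-residue majority vote, sketched below. This is an illustrative assumption, not the scheme used by MobiDB or by the assessed predictors: the function name `majority_consensus` and the 'D'/'S' string encoding (disordered/structured) are hypothetical.

```python
from typing import List


def majority_consensus(predictions: List[str]) -> str:
    """Combine per-residue calls ('D' or 'S') by strict majority vote.

    predictions: one string per predictor, all of the same length.
    A residue is called 'D' only if more than half the predictors say so.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for column in zip(*predictions):  # iterate residue by residue
        disordered_votes = sum(1 for call in column if call == "D")
        consensus.append("D" if 2 * disordered_votes > len(column) else "S")
    return "".join(consensus)


if __name__ == "__main__":
    print(majority_consensus(["DDSS", "DSSS", "DDDS"]))  # DDSS
```

With an even number of predictors a tie resolves to 'S' under this strict-majority rule; a different tie-breaking choice would be equally defensible.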
9.
Hum Mutat ; 35(7): 828-40, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24659262

ABSTRACT

CDKN2A codes for two oncosuppressors by alternative splicing of two first exons: p16INK4a and p14ARF. Germline mutations are found in about 40% of melanoma-prone families, and most of them are missense mutations mainly affecting p16INK4a. A growing number of p16INK4a variants of uncertain significance (VUS) are being identified but, unless their pathogenic role can be demonstrated, they cannot be used for identification of carriers at risk. Predicting the effect of these VUS by either a "standard" in silico approach, or functional tests alone, is rather difficult. Here, we report a protocol for the assessment of any p16INK4a VUS, which combines experimental and computational tools in an integrated approach. We analyzed p16INK4a VUS from melanoma patients as well as variants derived through permutation of conserved p16INK4a amino acids. Variants were expressed in a p16INK4a-null cell line (U2-OS) and tested for their ability to block proliferation. In parallel, these VUS underwent in silico prediction analysis and molecular dynamics simulations. Evaluation of in silico and functional data disclosed a high agreement for 15/16 missense mutations, suggesting that this approach could represent a pilot study for the definition of a protocol applicable to VUS in general, involved in other diseases, as well.


Subjects
Cyclin-Dependent Kinase Inhibitor p16/genetics ; Genetic Variation ; Melanoma/genetics ; Amino Acid Motifs ; Amino Acid Substitution ; Cell Line, Tumor ; Cell Proliferation ; Cyclin-Dependent Kinase Inhibitor p16/metabolism ; Humans ; Melanoma/diagnosis ; Models, Molecular ; Mutation, Missense ; Protein Conformation ; Skin Neoplasms ; Melanoma, Cutaneous Malignant
10.
Proteins ; 71(1): 261-77, 2008 Apr.
Article in English | MEDLINE | ID: mdl-17932912

ABSTRACT

In protein structure prediction, a considerable number of alternative models are usually produced, from which the final model subsequently has to be selected. Thus, a scoring function for the identification of the best model within an ensemble of alternative models is a key component of most protein structure prediction pipelines. QMEAN, which stands for Qualitative Model Energy ANalysis, is a composite scoring function describing the major geometrical aspects of protein structures. Five different structural descriptors are used. The local geometry is analyzed by a new kind of torsion angle potential over three consecutive amino acids. A secondary structure-specific distance-dependent pairwise residue-level potential is used to assess long-range interactions. A solvation potential describes the burial status of the residues. Two simple terms describing the agreement of predicted and calculated secondary structure and solvent accessibility, respectively, are also included. A variety of different implementations are investigated and several approaches to combine and optimize them are discussed. QMEAN was tested on several standard decoy sets including a molecular dynamics simulation decoy set as well as on a comprehensive data set of a total of 22,420 models from server predictions for the 95 targets of CASP7. In a comparison to five well-established model quality assessment programs, QMEAN shows a statistically significant improvement over nearly all quality measures describing the ability of the scoring function to identify the native structure and to discriminate good from bad models. The three-residue torsion angle potential turned out to be very effective in recognizing the native fold.


Subjects
Evaluation Studies as Topic ; Models, Molecular ; Proteins/chemistry ; Protein Conformation ; Protein Structure, Secondary ; Solvents
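
The abstract above describes a composite score built from five structural descriptors and used to pick the best model from an ensemble. The sketch below illustrates that general pattern with a weighted sum. It is not the QMEAN implementation: the term names, the weights, the assumption that every term is scaled so that higher is better, and the use of a plain linear combination are all hypothetical choices made for illustration.

```python
from typing import Dict

# Hypothetical labels for the five descriptors named in the abstract.
TERMS = ("torsion", "pairwise", "solvation", "ss_agreement", "acc_agreement")


def composite_score(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the five terms (assumed scaled so higher is better)."""
    return sum(weights[name] * terms[name] for name in TERMS)


def select_best(models: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> str:
    """Return the name of the model with the highest composite score."""
    if not models:
        raise ValueError("no models to rank")
    return max(models, key=lambda name: composite_score(models[name], weights))


if __name__ == "__main__":
    # Illustrative weights and term values only; not taken from the paper.
    weights = {name: 1.0 for name in TERMS}
    models = {
        "model_a": {"torsion": 0.2, "pairwise": 0.4, "solvation": 0.3,
                    "ss_agreement": 0.5, "acc_agreement": 0.5},
        "model_b": {"torsion": 0.6, "pairwise": 0.5, "solvation": 0.4,
                    "ss_agreement": 0.7, "acc_agreement": 0.6},
    }
    print(select_best(models, weights))  # model_b
```

In practice the weights would be fitted against models of known quality, which is the combination and optimization step the abstract refers to.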