1.
PLoS Comput Biol ; 12(8): e1005047, 2016 Aug.
Article in English | MEDLINE | ID: mdl-27536940

ABSTRACT

Developments in experimental and computational biology are advancing our understanding of how protein sequence variation impacts molecular protein function. However, the leap from the micro level of molecular function to the macro level of the whole organism, e.g. disease, remains barred. Here, we present new results emphasizing earlier work that suggested some links from molecular function to disease. We focused on non-synonymous single nucleotide variants, also referred to as single amino acid variants (SAVs). Building upon OMIA (Online Mendelian Inheritance in Animals), we introduced a curated set of 117 disease-causing SAVs in animals. Methods optimized to capture effects upon molecular function often correctly predict human (OMIM) and animal (OMIA) Mendelian disease-causing variants. We also predicted effects of human disease-causing variants in the mouse model, i.e. we put OMIM SAVs into mouse orthologs. Overall, fewer variants were predicted with effect in the model organism than in the original organism. Our results, along with other recent studies, demonstrate that predictions of molecular effects capture some important aspects of disease. Thus, in silico methods focusing on the micro level of molecular function can help to understand the macro system level of disease.


Subjects
Computational Biology/methods , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics , Proteins/genetics , Animals , Databases, Protein , Disease Models, Animal , Humans , Mice
2.
Nucleic Acids Res ; 42(Web Server issue): W337-43, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24799431

ABSTRACT

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.


Subjects
Protein Conformation , Software , Amino Acid Substitution , Binding Sites , Gene Ontology , Internet , Intrinsically Disordered Proteins/chemistry , Membrane Proteins/chemistry , Mutation , Protein Interaction Mapping , Proteins/analysis , Proteins/genetics , Proteins/metabolism , Sequence Alignment , Sequence Analysis, Protein , Sequence Homology, Amino Acid
3.
Nucleic Acids Res ; 42(Web Server issue): W350-5, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24848019

ABSTRACT

The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18=80±3% for eukaryotes and a six-state accuracy Q6=89±4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome not considering the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3.


Subjects
Proteins/analysis , Software , Archaeal Proteins/analysis , Artificial Intelligence , Bacterial Proteins/analysis , Internet , Sequence Homology, Amino Acid
4.
BMC Genomics ; 16 Suppl 8: S1, 2015.
Article in English | MEDLINE | ID: mdl-26110438

ABSTRACT

Elucidating the effects of naturally occurring genetic variation is one of the major challenges for personalized health and personalized medicine. Here, we introduce SNAP2, a novel neural network based classifier that improves over the state-of-the-art in distinguishing between effect and neutral variants. Our method's improved performance results from screening many potentially relevant protein features and from refining our development data sets. Cross-validated on >100k experimentally annotated variants, SNAP2 significantly outperformed other methods, attaining a two-state accuracy (effect/neutral) of 83%. SNAP2 also outperformed combinations of other methods. Performance increased for human variants but much more so for other organisms. Our method's carefully calibrated reliability index informs selection of variants for experimental follow up, with the most strongly predicted half of all effect variants predicted at over 96% accuracy. As expected, the evolutionary information from automatically generated multiple sequence alignments gave the strongest signal for the prediction. However, we also optimized our new method to perform surprisingly well even without alignments. This feature reduces prediction runtime by over two orders of magnitude, enables cross-genome comparisons, and renders our new method as the best solution for the 10-20% of sequence orphans. SNAP2 is available at: https://rostlab.org/services/snap2web.
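
The role of a reliability index in selecting variants for follow-up can be illustrated with a minimal sketch. It assumes each prediction is available as a (predicted label, observed label, reliability) tuple; the function name, the tuple layout and the reliability scale are hypothetical and are not taken from SNAP2 itself. The function reports accuracy and coverage for the predictions at or above a chosen reliability cutoff.

```python
def accuracy_at_reliability(predictions, min_reliability):
    """Accuracy and coverage for predictions with reliability >= min_reliability.

    predictions: iterable of (predicted, observed, reliability) tuples.
    The tuple layout is an assumption made for this sketch.
    Returns (accuracy, coverage); accuracy is None if nothing passes the cutoff.
    """
    predictions = list(predictions)
    # Keep only the predictions that the method itself considers reliable enough.
    kept = [(pred, obs) for pred, obs, rel in predictions if rel >= min_reliability]
    if not kept:
        return None, 0.0
    correct = sum(1 for pred, obs in kept if pred == obs)
    accuracy = correct / len(kept)
    # Coverage: fraction of all predictions that remain after filtering.
    coverage = len(kept) / len(predictions)
    return accuracy, coverage
```

Raising the cutoff typically trades coverage for accuracy, which is the trade-off an experimentalist faces when choosing how many variants to test.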


Subjects
Genetic Variation , Neural Networks, Computer , Protein Isoforms/metabolism , Software , Computational Biology , Evolution, Molecular , Humans , Protein Isoforms/genetics
5.
BMC Bioinformatics ; 14 Suppl 3: S7, 2013.
Article in English | MEDLINE | ID: mdl-23514582

ABSTRACT

BACKGROUND: Any method that de novo predicts protein function should do better than random. More challenging, it also ought to outperform simple homology-based inference. METHODS: Here, we describe a few methods that predict protein function exclusively through homology. Together, they set the bar or lower limit for future improvements. RESULTS AND CONCLUSIONS: During the development of these methods, we faced two surprises. Firstly, our most successful implementation for the baseline ranked very high at CAFA1. In fact, our best combination of homology-based methods fared only slightly worse than the top-of-the-line prediction method from the Jones group. Secondly, although the concept of homology-based inference is simple, this work revealed that the precise details of the implementation are crucial: not only did the methods span from top to bottom performers at CAFA, but also the reasons for these differences were unexpected. In this work, we also propose a new rigorous measure to compare predicted and experimental annotations. It puts more emphasis on the details of protein function than the other measures employed by CAFA and may best reflect the expectations of users. Clearly, the definition of proper goals remains one major objective for CAFA.
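
The general idea of homology-based inference can be sketched as follows. This is a deliberately naive baseline and not one of the implementations described in the abstract, whose details the abstract calls crucial. The function name, the input layout and the e-value cutoff of 1e-3 are all assumptions made for illustration: the sketch transfers the annotation of the best-scoring annotated search hit to the query.

```python
def infer_terms_by_homology(hits, annotations, max_evalue=1e-3):
    """Transfer function terms from the best annotated hit of a sequence search.

    hits: list of (target_id, evalue) pairs in any order (assumed input layout).
    annotations: dict mapping target_id to a set of function terms.
    max_evalue: illustrative cutoff, an assumption rather than a published value.
    Returns a set of terms, empty if no annotated hit passes the cutoff.
    """
    # Only hits that carry annotations and pass the cutoff can donate terms.
    candidates = [
        (evalue, target)
        for target, evalue in hits
        if target in annotations and evalue <= max_evalue
    ]
    if not candidates:
        return set()
    # Lowest e-value wins; ties are broken by target identifier.
    _, best_target = min(candidates)
    return set(annotations[best_target])
```

Variants of this baseline differ in which hits are allowed to contribute and how their terms are combined, which is where implementation details start to matter.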


Subjects
Proteins/physiology , Sequence Homology, Amino Acid , Algorithms , Proteins/genetics
6.
Am J Perinatol ; 30(1): 75-80, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22836819

ABSTRACT

OBJECTIVE: Determine the Bishop score most predictive of induction of labor (IOL) success for different maternal weight groups. STUDY DESIGN: Retrospective cohort study. Prospectively collected database utilized to determine the optimum Bishop score within each prepregnancy body mass index (BMI) category of term, nulliparous patients undergoing IOL. RESULTS: For the total group (n = 696), Bishop score ≥ 5 was most predictive of success (75% versus 56%, p < 0.0001). Within each BMI category, Bishop score ≥ 5 remained most predictive: normal weight (79% versus 64%, p < 0.01); overweight (72% versus 58%, p = 0.03); and obese (73% versus 45%, p < 0.0001). Overall, nonobese patients had more success than obese patients (70% versus 59%, p < 0.01). The nonobese group had more success than the obese group when the Bishop score was < 3 (57% versus 39%, p < 0.05) but not when it was ≥ 3 (72% versus 65%, p = 0.1). Also, there was a higher fraction of patients with Bishop score < 3 in the obese group compared with the nonobese group (25% versus 14%, p < 0.001). CONCLUSION: The optimum Bishop score for predicting successful IOL in nulliparous patients was 5 regardless of BMI class. The higher IOL failure rate observed in obese women was associated with lower starting Bishop scores and was compounded by higher failure rates in obese women with Bishop scores < 3.


Subjects
Body Mass Index , Cervix Uteri/physiology , Labor, Induced , Obesity/complications , Adult , Female , Humans , Logistic Models , Multivariate Analysis , Parity , Predictive Value of Tests , Pregnancy , Retrospective Studies , Young Adult
7.
Sci Rep ; 7(1): 1608, 2017 May 9.
Article in English | MEDLINE | ID: mdl-28487536

ABSTRACT

Any two unrelated individuals differ by about 10,000 single amino acid variants (SAVs). Do these impact molecular function? Experiments cannot answer this comprehensively, while state-of-the-art prediction methods can. We predicted the functional impacts of SAVs within human and for variants between human and other species. Several surprising results stood out. Firstly, four methods (CADD, PolyPhen-2, SIFT, and SNAP2) agreed within 10 percentage points on the percentage of rare SAVs predicted with effect. However, they differed substantially for the common SAVs: SNAP2 predicted, on average, more effect for common than for rare SAVs. Given the large ExAC data sets sampling 60,706 individuals, the differences were extremely significant (p-value < 2.2e-16). We provided evidence that SNAP2 might be closer to reality for common SAVs than the other methods, due to its different focus in development. Secondly, we predicted significantly higher fractions of SAVs with effect between healthy individuals than between species; the difference increased for more distantly related species. The same trends were maintained for subsets of only housekeeping proteins and when moving from exomes of 1,000 to 60,000 individuals. SAVs frozen at speciation might maintain protein function, while many variants within a species might bring about crucial changes, for better or worse.


Subjects
Genetic Variation , Humans , Mutation/genetics , Proteome/metabolism , Software
8.
F1000Res ; 3: 48, 2014.
Article in English | MEDLINE | ID: mdl-24860644

ABSTRACT

SUMMARY: The HeatMapViewer is a BioJS component that lays out and renders two-dimensional (2D) plots or heat maps that are ideally suited to visualize matrix-formatted data in biology, such as for the display of microarray experiments or the outcome of mutational studies and the study of SNP-like sequence variants. It can be easily integrated into documents and provides a powerful, interactive way to visualize heat maps in web applications. The software uses a scalable graphics technology that adapts the visualization component to any required resolution, a useful feature for a presentation with many different data points. The component can be applied to present various biological data types. Here, we present two such cases: showing gene expression data and visualizing mutability landscape analysis. AVAILABILITY: https://github.com/biojs/biojs; http://dx.doi.org/10.5281/zenodo.7706.

9.
J Mol Biol ; 425(21): 3937-48, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-23896297

ABSTRACT

Some mutations of protein residues matter more than others, and these are often conserved evolutionarily. The explosion of deep sequencing and genotyping increasingly requires the distinction between effect and neutral variants. The simplest approach predicts all mutations of conserved residues to have an effect; however, this works poorly, at best. Many computational tools that are optimized to predict the impact of point mutations provide more detail. Here, we expand the perspective from the view of single variants to the level of sketching the entire mutability landscape. This landscape is defined by the impact of substituting every residue at each position in a protein by each of the 19 non-native amino acids. We review some of the powerful conclusions about protein function, stability and their robustness to mutation that can be drawn from such an analysis. Large-scale experimental and computational mutagenesis experiments are increasingly furthering our understanding of protein function and of the genotype-phenotype associations. We also discuss how these can be used to improve predictions of protein function and pathogenicity of missense variants.
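
The definition of the mutability landscape given above, every position substituted by each of the 19 non-native amino acids, can be made concrete with a short sketch. The function names and the `score_fn` callback are hypothetical placeholders for any effect predictor; nothing here reproduces a specific published tool.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_savs(sequence):
    """Yield every single amino acid variant as (position, native, substitute).

    Positions are 1-based; each position yields its 19 non-native residues.
    """
    for position, native in enumerate(sequence, start=1):
        if native not in AMINO_ACIDS:
            raise ValueError(f"unexpected residue {native!r} at position {position}")
        for substitute in AMINO_ACIDS:
            if substitute != native:
                yield position, native, substitute


def mutability_landscape(sequence, score_fn):
    """Map each position to {substitute: score} using a caller-supplied scorer.

    score_fn(sequence, position, substitute) is a hypothetical stand-in
    for whichever prediction method or experimental readout is available.
    """
    landscape = {}
    for position, _native, substitute in enumerate_savs(sequence):
        landscape.setdefault(position, {})[substitute] = score_fn(
            sequence, position, substitute
        )
    return landscape
```

A protein of length L therefore yields 19 × L scores, which is the matrix that heat-map style displays of such landscapes are drawn from.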


Subjects
Genetic Predisposition to Disease , Mutation, Missense , Point Mutation , Proteins/genetics , Computational Biology/methods , Genetic Association Studies/methods , Humans , Models, Molecular , Protein Stability , Proteins/metabolism