Results 1 - 20 of 50
1.
Mol Cell Proteomics ; 22(10): 100629, 2023 10.
Article in English | MEDLINE | ID: mdl-37557955

ABSTRACT

Neurodegenerative dementias are progressive diseases that cause neuronal network breakdown in different brain regions, often because of the accumulation of misfolded proteins, such as amyloids, in the brain extracellular matrix or inside neurons or other cell types of the brain. Several diagnostic protein biomarkers in body fluids are being used and implemented, such as for Alzheimer's disease. However, there is still a lack of biomarkers for co-pathologies and other causes of dementia. Such biofluid-based biomarkers enable precision medicine approaches for diagnosis and treatment, make it possible to learn more about underlying disease processes, and facilitate the development of patient inclusion and evaluation tools in clinical trials. When designing studies to discover novel biofluid-based biomarkers, the choice of technology is an important starting point, but there are many technologies to choose from. To address this, we here review the technologies that are currently available in research settings and, in some cases, in clinical laboratory practice. The review presents a form of lexicon on each technology, addressing its use in research and clinics, its strengths and limitations, and a future perspective.


Subject(s)
Alzheimer Disease, Humans, Brain, Biomarkers, Neurons, Precision Medicine, Amyloid beta-Peptides
2.
Proteins ; 92(5): 649-664, 2024 May.
Article in English | MEDLINE | ID: mdl-38149328

ABSTRACT

Glial fibrillary acidic protein (GFAP) is a promising biomarker for brain and spinal cord disorders. Recent studies have highlighted the differences in the reliability of GFAP measurements in different biological matrices. The reason for these discrepancies is poorly understood as our knowledge of the protein's 3-dimensional conformation, proteoforms, and aggregation remains limited. Here, we investigate the structural properties of GFAP under different conditions. For this, we characterized recombinant GFAP proteins from various suppliers and applied hydrogen-deuterium exchange mass spectrometry (HDX-MS) to provide a snapshot of the conformational dynamics of GFAP in artificial cerebrospinal fluid (aCSF) compared to the phosphate buffer. Our findings indicate that recombinant GFAP exists in various conformational species. Furthermore, we show that GFAP dimers remained intact under denaturing conditions. HDX-MS experiments show an overall decrease in H-bonding and an increase in solvent accessibility of GFAP in aCSF compared to the phosphate buffer, with clear indications of mixed EX2 and EX1 kinetics. To understand possible structural interface regions and the evolutionary conservation profiles, we combined HDX-MS results with the predicted GFAP-dimer structure by AlphaFold-Multimer. We found that deprotected regions with high structural flexibility in aCSF overlap with predicted conserved dimeric 1B and 2B domain interfaces. Structural property predictions combined with the HDX data show an overall deprotection and signatures of aggregation in aCSF. We anticipate that the outcomes of this research will contribute to a deeper understanding of the structural flexibility of GFAP and ultimately shed light on its behavior in different biological matrices.


Subject(s)
Deuterium Exchange Measurement, Glial Fibrillary Acidic Protein, Phosphates, Humans, Deuterium Exchange Measurement/methods, Glial Fibrillary Acidic Protein/chemistry, Glial Fibrillary Acidic Protein/genetics, Glial Fibrillary Acidic Protein/physiology, Protein Conformation, Reproducibility of Results, Recombinant Proteins
3.
J Proteome Res ; 22(9): 3068-3080, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37606934

ABSTRACT

Cerebrospinal fluid (CSF) is an essential matrix for the discovery of neurological disease biomarkers. However, the high dynamic range of protein concentrations in CSF hinders the detection of the least abundant protein biomarkers by untargeted mass spectrometry. It is thus beneficial to gain a deeper understanding of the secretion processes within the brain. Here, we aim to explore if and how the secretion of brain proteins to the CSF can be predicted. By combining a curated CSF proteome and the brain elevated proteome of the Human Protein Atlas, brain proteins were classified as CSF or non-CSF secreted. A machine learning model was trained on a range of sequence-based features to differentiate between CSF and non-CSF groups and effectively predict the brain origin of proteins. The classification model achieves an area under the curve of 0.89 if using high confidence CSF proteins. The most important prediction features include the subcellular localization, signal peptides, and transmembrane regions. The classifier generalized well to the larger brain detected proteome and is able to correctly predict novel CSF proteins identified by affinity proteomics. In addition to elucidating the underlying mechanisms of protein secretion, the trained classification model can support biomarker candidate selection.
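The area under the ROC curve quoted in the abstract above can be read as the probability that a randomly chosen CSF protein receives a higher score than a randomly chosen non-CSF protein. The following is a minimal sketch of that pairwise definition, not the study's pipeline; the function name `roc_auc` and the labels and scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = CSF protein, 0 = non-CSF protein.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

The quadratic pairwise loop is chosen for readability; for large protein sets a rank-based computation gives the same value more efficiently.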


Subject(s)
Biomedical Research, Proteome, Humans, Brain, Protein Transport, Biological Transport, Cerebrospinal Fluid Proteins
4.
Bioinformatics ; 38(8): 2111-2118, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35150231

ABSTRACT

MOTIVATION: The interactions between proteins and other molecules are essential to many biological and cellular processes. Experimental identification of interface residues is a time-consuming, costly and challenging task, while protein sequence data are ubiquitous. Consequently, many computational and machine learning approaches have been developed over the years to predict such interface residues from sequence. However, the effectiveness of different Deep Learning (DL) architectures and learning strategies for protein-protein, protein-nucleotide and protein-small molecule interface prediction has not yet been investigated in great detail. Therefore, we here explore the prediction of protein interface residues using six DL architectures and various learning strategies with sequence-derived input features. RESULTS: We constructed a large dataset dubbed BioDL, comprising protein-protein interactions from the PDB, and DNA/RNA and small molecule interactions from the BioLip database. We also constructed six DL architectures, and evaluated them on the BioDL benchmarks. This shows that no single architecture performs best on all instances. An ensemble architecture, which combines all six architectures, does consistently achieve peak prediction accuracy. We confirmed these results on the published benchmark set by Zhang and Kurgan (ZK448), and on our own existing curated homo- and heteromeric protein interaction dataset. Our PIPENN sequence-based ensemble predictor outperforms current state-of-the-art sequence-based protein interface predictors on ZK448 on all interaction types, achieving an AUC-ROC of 0.718 for protein-protein, 0.823 for protein-nucleotide and 0.842 for protein-small molecule. AVAILABILITY AND IMPLEMENTATION: Source code and datasets are available at https://github.com/ibivu/pipenn/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Machine Learning, Proteins, Proteins/chemistry, Software, Amino Acid Sequence, Nucleotides, Computational Biology/methods
5.
BMC Bioinformatics ; 23(1): 487, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36384426

ABSTRACT

BACKGROUND: Current methods of high-dimensional unsupervised clustering of mass cytometry data lack means to monitor and evaluate clustering results. Whether unsupervised clustering is correct is typically evaluated by agreement with dimensionality reduction techniques or based on benchmarking with manually classified cells. The ambiguity and lack of reproducibility of sequential gating has been replaced with ambiguity in interpretation of clustering results. On the other hand, spurious overclustering of data leads to loss of statistical power. We have developed INFLECT, an R-package designed to give insight into clustering results and provide an optimal number of clusters. In our approach, a mass cytometry dataset is intentionally overclustered using FlowSOM to ensure that the smallest phenotypically different subsets are captured. A range of metacluster number endpoints are generated and evaluated using marker interquartile range and distribution unimodality checks. The fraction of marker distributions that pass these checks is taken as a measure of clustering success. The fraction of unimodal distributions within metaclusters is plotted against the number of generated metaclusters and reaches a plateau of diminishing returns. The inflection point at which this occurs gives an optimal trade-off between capturing cellular heterogeneity and retaining statistical power. RESULTS: We applied INFLECT to four publicly available mass cytometry datasets of different size and number of markers. The unimodality score consistently reached a plateau, with an inflection point dependent on dataset size and number of dimensions. We tested both ConsensusClusterPlus metaclustering and hierarchical clustering. While hierarchical clustering is less computationally expensive and thus faster, it achieved similar results to ConsensusClusterPlus. The four datasets consisted of labeled data and we compared INFLECT metaclustering to published results. INFLECT identified a higher optimal number of metaclusters for all datasets. We illustrated the underlying heterogeneity within labels, showing that these labels encompass distinct types of cells. CONCLUSION: INFLECT addresses a knowledge gap in high-dimensional cytometry analysis, namely assessing clustering results. This is done through monitoring marker distributions for interquartile range and unimodality across a range of metacluster numbers. The inflection point is the optimal trade-off between cellular heterogeneity and statistical power, applied in this work for FlowSOM clustering on mass cytometry datasets.
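The plateau idea described in the abstract above can be illustrated with a simple heuristic: pick the smallest number of metaclusters after which the fraction of passing marker distributions no longer improves by more than a small tolerance. This is only a sketch in Python (INFLECT itself is an R package), and it does not claim to reproduce INFLECT's actual inflection-point computation; the function `plateau_point`, the tolerance `tol`, and the numbers are assumptions made for illustration.

```python
def plateau_point(ks, fractions, tol=0.01):
    """Return the smallest k after which every further gain is below tol.

    ks        -- candidate numbers of metaclusters, in increasing order
    fractions -- fraction of marker distributions passing the checks at each k
    tol       -- hypothetical threshold for a negligible gain
    """
    if len(ks) != len(fractions) or not ks:
        raise ValueError("ks and fractions must be non-empty and of equal length")
    for i in range(len(ks)):
        # Gains between consecutive points from position i onwards.
        gains = [fractions[j + 1] - fractions[j] for j in range(i, len(ks) - 1)]
        if all(g < tol for g in gains):
            return ks[i]
    return ks[-1]


# Invented example curve that rises steeply and then flattens.
ks = [5, 10, 15, 20, 25, 30]
fractions = [0.52, 0.70, 0.83, 0.90, 0.905, 0.908]
best_k = plateau_point(ks, fractions)
```

A threshold on consecutive gains is easy to reason about but sensitive to noise in the curve; smoothing the fractions first would be a reasonable refinement.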


Subject(s)
Reproducibility of Results, Cluster Analysis, Biomarkers
6.
Bioinformatics ; 37(20): 3421-3427, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33974039

ABSTRACT

MOTIVATION: Antibodies play an important role in clinical research and biotechnology, with their specificity determined by the interaction with the antigen's epitope region, a special type of protein-protein interaction (PPI) interface. The ubiquitous availability of sequence data allows us to predict epitopes from sequence in order to focus time-consuming wet-lab experiments toward the most promising epitope regions. Here, we extend our previously developed sequence-based predictors for homodimer and heterodimer PPI interfaces to predict epitope residues that have the potential to bind an antibody. RESULTS: We collected and curated a high quality epitope dataset from the SAbDab database. Our generic PPI heterodimer predictor obtained an AUC-ROC of 0.666 when evaluated on the epitope test set. We then trained a random forest model specifically on the epitope dataset, reaching an AUC of 0.694. Further training on the combined heterodimer and epitope datasets improves our final predictor to an AUC of 0.703 on the epitope test set. This is better than the best state-of-the-art sequence-based epitope predictor, BepiPred-2.0. On one solved antibody-antigen structure of the COVID-19 virus spike receptor binding domain, our predictor reaches an AUC of 0.778. We added the SeRenDIP-CE Conformational Epitope predictors to our webserver, which is simple to use and only requires a single antigen sequence as input; this will help make the method immediately applicable in a wide range of biomedical and biomolecular research. AVAILABILITY AND IMPLEMENTATION: Webserver, source code and datasets at www.ibi.vu.nl/programs/serendipwww/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

7.
PLoS Comput Biol ; 16(5): e1007767, 2020 05.
Article in English | MEDLINE | ID: mdl-32365068

ABSTRACT

Many proteins have the potential to aggregate into amyloid fibrils, protein polymers associated with a wide range of human disorders such as Alzheimer's and Parkinson's disease. The thermodynamic stability of amyloid fibrils, in contrast to that of folded proteins, is not well understood: the balance between entropic and enthalpic terms, including the chain entropy and the hydrophobic effect, is poorly characterised. Using a combination of theory, in vitro experiments, simulations of a coarse-grained protein model and meta-data analysis, we delineate the enthalpic and entropic contributions that dominate amyloid fibril elongation. Our prediction of a characteristic temperature-dependent enthalpic signature is confirmed by the calorimetric experiments performed and by a meta-analysis of published data. From these results we are able to define the necessary conditions to observe cold denaturation of amyloid fibrils. Overall, we show that amyloid fibril elongation is associated with a negative heat capacity, the magnitude of which correlates closely with the hydrophobic surface area that is buried upon fibril formation, highlighting the importance of hydrophobicity for fibril stability.
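For orientation, the link between a negative heat capacity and cold denaturation can be sketched with the standard constant-heat-capacity relations of textbook thermodynamics. This is an illustration under stated assumptions, not the model used in the article: the heat capacity change ΔC_p is taken as temperature independent, T_0 is an arbitrary reference temperature, and all symbols are introduced here.

```latex
\begin{align}
\Delta H(T) &= \Delta H(T_0) + \Delta C_p\,(T - T_0) \\
\Delta S(T) &= \Delta S(T_0) + \Delta C_p \ln\frac{T}{T_0} \\
\Delta G(T) &= \Delta H(T) - T\,\Delta S(T) \\
\frac{\partial^2 \Delta G}{\partial T^2} &= -\frac{\Delta C_p}{T}
\end{align}
```

With ΔC_p < 0 for elongation, the curvature of ΔG(T) is positive, so the free energy of elongation has a single minimum and rises toward both lower and higher temperatures. Under these assumptions, fibrils can become unstable on cooling as well as on heating.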


Subject(s)
Amyloid/chemistry, Amyloid/physiology, Amyloid/metabolism, Amyloid beta-Peptides/chemistry, Amyloid beta-Peptides/physiology, Amyloidogenic Proteins/chemistry, Amyloidogenic Proteins/physiology, Humans, Hydrophobic and Hydrophilic Interactions, Theoretical Models, Molecular Dynamics Simulation, Protein Denaturation, Protein Folding, Temperature, Thermodynamics
8.
Bioinformatics ; 35(24): 5315-5317, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31368486

ABSTRACT

SUMMARY: PRALINE 2 is a toolkit for custom multiple sequence alignment workflows. It can be used to incorporate sequence annotations, such as secondary structure or (DNA) motifs, into the alignment scoring, as well as to customize many other aspects of a progressive multiple alignment workflow. AVAILABILITY AND IMPLEMENTATION: PRALINE 2 is implemented in Python and available as open source software on GitHub: https://github.com/ibivu/PRALINE/.


Subject(s)
Software, DNA, Protein Secondary Structure, Sequence Alignment
9.
Bioinformatics ; 35(22): 4794-4796, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31116381

ABSTRACT

MOTIVATION: Interpretation of ubiquitous protein sequence data has become a bottleneck in biomolecular research, due to a lack of structural and other experimental annotation data for these proteins. Prediction of protein interaction sites from sequence may be a viable substitute. We therefore recently developed a sequence-based random forest method for protein-protein interface prediction, which yielded significantly better performance than other methods on both homomeric and heteromeric protein-protein interactions. Here, we present a webserver that implements this method efficiently. RESULTS: With the aim of accelerating our previous approach, we obtained sequence conservation profiles by re-mastering the alignment of homologous sequences found by PSI-BLAST. This yielded a more than 10-fold speedup and at least the same accuracy as reported previously for our method; these results allowed us to offer the method as a webserver. The webserver interface is targeted to the non-expert user. The input is simply a sequence of the protein of interest, and the output is a table with scores indicating the likelihood of having an interaction interface at a certain position. As the method is sequence-based and not sensitive to the type of protein interaction, we expect this webserver to be of interest to many biological researchers in academia and in industry. AVAILABILITY AND IMPLEMENTATION: Webserver, source code and datasets are available at www.ibi.vu.nl/programs/serendipwww/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software, Algorithms, Amino Acid Sequence, Proteins, Protein Sequence Analysis
10.
Bioinformatics ; 34(13): i4-i12, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29950011

ABSTRACT

Motivation: Our society has become data-rich to the extent that research in many areas has become impossible without computational approaches. Educational programmes seem to be lagging behind this development. At the same time, there is a growing need not only for strong data science skills, but foremost for the ability to both translate between tools and methods on the one hand, and application and problems on the other. Results: Here we present our experiences with shaping and running a masters' programme in bioinformatics and systems biology in Amsterdam. From this, we have developed a comprehensive philosophy on how translation in training may be achieved in a dynamic and multidisciplinary research area, which is described here. We furthermore describe two requirements that enable translation, which we have found to be crucial: sufficient depth and focus on multidisciplinary topic areas, coupled with a balanced breadth from adjacent disciplines. Finally, we present concrete suggestions on how this may be implemented in practice, which may be relevant for the effectiveness of life science and data science curricula in general, and of particular interest to those who are in the process of setting up such curricula. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/education, Curriculum, Data Science/education, Humans
11.
PLoS Comput Biol ; 14(11): e1006547, 2018 11.
Article in English | MEDLINE | ID: mdl-30383764

ABSTRACT

Protein or DNA motifs are sequence regions which possess biological importance. These regions are often highly conserved among homologous sequences. The generation of multiple sequence alignments (MSAs) with a correct alignment of the conserved sequence motifs is still difficult to achieve, due to the fact that the contribution of these typically short fragments is overshadowed by the rest of the sequence. Here we extended the PRALINE multiple sequence alignment program with a novel motif-aware MSA algorithm in order to address this shortcoming. This method can incorporate explicit information about the presence of externally provided sequence motifs, which is then used in the dynamic programming step by boosting the amino acid substitution matrix towards the motif. The strength of the boost is controlled by a parameter, α. Using a benchmark set of alignments we confirm that a good compromise can be found that improves the matching of motif regions while not significantly reducing the overall alignment quality. By estimating α on an unrelated set of reference alignments we find there is indeed a strong conservation signal for motifs. A number of typical but difficult MSA use cases are explored to exemplify the problems in correctly aligning functional sequence motifs and how the motif-aware alignment method can be employed to alleviate these problems.
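The abstract above states only that the substitution matrix is boosted towards the motif with a strength α; it does not give the formula. The sketch below shows one plausible way such a boost could enter a pairwise score, as a linear blend between the ordinary substitution score and a motif-match bonus. The blend, the function `motif_aware_score`, the `motif_bonus` parameter, and the toy matrix are all assumptions for illustration and should not be read as PRALINE's actual scoring scheme.

```python
def motif_aware_score(a, b, in_motif_a, in_motif_b, substitution, alpha, motif_bonus=1.0):
    """Blend a substitution score with a bonus for aligning two motif positions.

    alpha = 0 reproduces the plain substitution score; alpha = 1 scores only
    on whether both positions are annotated as part of a motif.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    base = substitution[(a, b)]
    motif_term = motif_bonus if (in_motif_a and in_motif_b) else 0.0
    return (1.0 - alpha) * base + alpha * motif_term


# Hypothetical two-letter substitution matrix, for illustration only.
toy_matrix = {
    ("A", "A"): 4.0, ("A", "G"): 0.0,
    ("G", "A"): 0.0, ("G", "G"): 6.0,
}

plain = motif_aware_score("A", "G", True, True, toy_matrix, alpha=0.0)
boosted = motif_aware_score("A", "G", True, True, toy_matrix, alpha=0.5)
outside = motif_aware_score("A", "G", False, True, toy_matrix, alpha=0.5)
```

In a dynamic programming alignment, a score of this kind would replace the plain matrix lookup for each pair of positions, which is where a trade-off between motif matching and overall alignment quality would arise.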


Subject(s)
Amino Acid Motifs, DNA/chemistry, Proteins/chemistry, Sequence Alignment/standards, Algorithms, Amino Acid Sequence, Conserved Sequence, HIV-1/chemistry, Amino Acid Sequence Homology, Human Immunodeficiency Virus env Gene Products/chemistry
12.
Bioinformatics ; 32(12): i60-i69, 2016 06 15.
Article in English | MEDLINE | ID: mdl-27307645

ABSTRACT

MOTIVATION: Biological pathways play a key role in most cellular functions. To better understand these functions, diverse computational and cell biology researchers use biological pathway data for various analysis and modeling purposes. For specifying these biological pathways, a community of researchers has defined BioPAX and provided various tools for creating, validating and visualizing BioPAX models. However, a generic software framework for simulating BioPAX models is missing. Here, we attempt to fill this gap by introducing a generic simulation framework for BioPAX. The framework explicitly separates the execution model from the model structure as provided by BioPAX, with the advantage that the modelling process becomes more reproducible and intrinsically more modular; this ensures natural biological constraints are satisfied upon execution. The framework is based on the principles of discrete event systems and multi-agent systems, and is capable of automatically generating a hierarchical multi-agent system for a given BioPAX model. RESULTS: To demonstrate the applicability of the framework, we simulated two types of biological network models: a gene regulatory network modeling the haematopoietic stem cell regulators and a signal transduction network modeling the Wnt/β-catenin signaling pathway. We observed that the results of the simulations performed using our framework were entirely consistent with the simulation results reported by the researchers who developed the original models in a proprietary language. AVAILABILITY AND IMPLEMENTATION: The framework, implemented in Java, is open source and its source code, documentation and tutorial are available at http://www.ibi.vu.nl/programs/BioASF CONTACT: j.heringa@vu.nl.


Subject(s)
Gene Regulatory Networks, Biological Models, Signal Transduction, Software, Computer Simulation, Humans, Programming Languages
13.
Bioinformatics ; 32(11): 1678-85, 2016 06 01.
Article in English | MEDLINE | ID: mdl-26342232

ABSTRACT

MOTIVATION: The human microbiome plays a key role in health and disease. Thanks to comparative metatranscriptomics, the cellular functions that are deregulated by the microbiome in disease can now be computationally explored. Unlike gene-centric approaches, pathway-based methods provide a systemic view of such functions; however, they typically consider each pathway in isolation and in its entirety. They can therefore overlook the key differences that (i) span multiple pathways, (ii) contain bidirectionally deregulated components, (iii) are confined to a pathway region. To capture these properties, computational methods that reach beyond the scope of predefined pathways are needed. RESULTS: By integrating an existing module discovery algorithm into comparative metatranscriptomic analysis, we developed metaModules, a novel computational framework for automated identification of the key functional differences between health- and disease-associated communities. Using this framework, we recovered significantly deregulated subnetworks that were indeed recognized to be involved in two well-studied, microbiome-mediated oral diseases, such as butanoate production in periodontal disease and metabolism of sugar alcohols in dental caries. More importantly, our results indicate that our method can be used for hypothesis generation based on automated discovery of novel, disease-related functional subnetworks, which would otherwise require extensive and laborious manual assessment. AVAILABILITY AND IMPLEMENTATION: metaModules is available at https://bitbucket.org/alimay/metamodules/ CONTACT: a.may@vu.nl or s.abeln@vu.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Microbiota, Algorithms, Dental Caries, Humans
14.
Nucleic Acids Res ; 43(W1): W301-5, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25878034

ABSTRACT

Massively parallel sequencing of microbial genetic markers (MGMs) is used to uncover the species composition in a multitude of ecological niches. These sequencing runs often contain a sample with known composition that can be used to evaluate the sequencing quality or to detect novel sequence variants. With NGS-eval, the reads from such (mock) samples can be used to (i) explore the differences between the reads and their references and to (ii) estimate the sequencing error rate. This tool maps these reads to references and calculates as well as visualizes the different types of sequencing errors. Clearly, sequencing errors can only be accurately calculated if the reference sequences are correct. However, even with known strains, it is not straightforward to select the correct references from databases. We previously analysed a pyrosequencing dataset from a mock sample to estimate sequencing error rates and detected sequence variants in our mock community, allowing us to obtain an accurate error estimation. Here, we demonstrate the variant detection and error analysis capability of NGS-eval with Illumina MiSeq reads from the same mock community. While tailored towards the field of metagenomics, this server can be used for any type of MGM-based reads. NGS-eval is available at http://www.ibi.vu.nl/programs/ngsevalwww/.
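Estimating a sequencing error rate from a mock sample, as described in the abstract above, comes down to comparing each read with its reference and tallying the differences by type. The sketch below assumes the read and reference have already been pairwise aligned into equal-length strings with `-` for gaps. The functions `count_errors` and `error_rate`, the choice of denominator, and the example sequences are assumptions for illustration, not NGS-eval's implementation.

```python
from collections import Counter


def count_errors(aligned_read, aligned_ref):
    """Tally matches, mismatches, insertions and deletions in one alignment."""
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    counts = Counter()
    for read_base, ref_base in zip(aligned_read, aligned_ref):
        if read_base == "-" and ref_base == "-":
            continue  # column carries no information
        if ref_base == "-":
            counts["insertion"] += 1  # extra base in the read
        elif read_base == "-":
            counts["deletion"] += 1   # reference base missing from the read
        elif read_base != ref_base:
            counts["mismatch"] += 1
        else:
            counts["match"] += 1
    return counts


def error_rate(counts):
    """Errors per informative alignment column (an assumed definition)."""
    total = sum(counts.values())
    errors = total - counts["match"]
    return errors / total if total else 0.0


# Invented alignment: one deletion, one mismatch, one insertion.
counts = count_errors("ACGT-ACGTA", "ACGTTACCT-")
rate = error_rate(counts)
```

As the abstract notes, a tally like this is only as good as the reference: a genuine sequence variant in the mock community would be counted as an error unless the reference is corrected first.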


Subject(s)
Genetic Variation, High-Throughput Nucleotide Sequencing/methods, Metagenomics/methods, Software, Genetic Markers, Internet
15.
Phys Rev Lett ; 116(7): 078101, 2016 Feb 19.
Article in English | MEDLINE | ID: mdl-26943560

ABSTRACT

The hydrophobic effect stabilizes the native structure of proteins by minimizing the unfavorable interactions between hydrophobic residues and water through the formation of a hydrophobic core. Here, we include the entropic and enthalpic contributions of the hydrophobic effect explicitly in an implicit solvent model. This allows us to capture two important effects: a length-scale dependence and a temperature dependence for the solvation of a hydrophobic particle. This consistent treatment of the hydrophobic effect explains cold denaturation and heat capacity measurements of solvated proteins.


Subject(s)
Chemical Models, Proteins/chemistry, Cold Temperature, Hydrophobic and Hydrophilic Interactions, Monte Carlo Method, Peptides/chemistry, Protein Denaturation, Protein Folding, Water/chemistry
16.
PLoS Comput Biol ; 11(5): e1004277, 2015 May.
Article in English | MEDLINE | ID: mdl-26000449

ABSTRACT

The hydrophobic effect is the main driving force in protein folding. One can estimate the relative strength of this hydrophobic effect for each amino acid by mining a large set of experimentally determined protein structures. However, the hydrophobic force is known to be strongly temperature dependent. This temperature dependence is thought to explain the denaturation of proteins at low temperatures. Here we investigate if it is possible to extract this temperature dependence directly from a large set of protein structures determined at different temperatures. Using NMR structures filtered for sequence identity, we were able to extract hydrophobicity propensities for all amino acids at five different temperature ranges (spanning 265-340 K). These propensities show that the hydrophobicity becomes weaker at lower temperatures, in line with current theory. Alternatively, one can conclude that the temperature dependence of the hydrophobic effect has a measurable influence on protein structures. Moreover, this work provides a method for probing the individual temperature dependence of the different amino acid types, which is difficult to obtain by direct experiment.


Subject(s)
Amino Acids/chemistry, Proteins/chemistry, Algorithms, Computational Biology, X-Ray Crystallography, Protein Databases, Hydrophobic and Hydrophilic Interactions, Magnetic Resonance Spectroscopy, Statistical Models, Protein Denaturation, Protein Folding, Solvents, Temperature, Water/chemistry
17.
PLoS Comput Biol ; 11(10): e1004435, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26505754

ABSTRACT

It has been recently shown that the coarse-graining of the structures of polypeptide chains as self-avoiding tubes can provide an effective representation of the conformational space of proteins. In order to fully exploit the opportunities offered by such a 'tube model' approach, we present here a strategy to combine it with molecular dynamics simulations. This strategy is based on the incorporation of the 'CamTube' force field into the Gromacs molecular dynamics package. By considering the case of a 60-residue polyvaline chain, we show that CamTube molecular dynamics simulations can comprehensively explore the conformational space of proteins. We obtain this result by a 20 µs metadynamics simulation of the polyvaline chain that recapitulates the currently known protein fold universe. We further show that, if residue-specific interaction potentials are added to the CamTube force field, it is possible to fold a protein into a topology close to that of its native state. These results illustrate how the CamTube force field can be used to explore efficiently the universe of protein folds with good accuracy and very limited computational cost.


Subject(s)
Algorithms, Chemical Models, Molecular Dynamics Simulation, Protein Folding, Proteins/chemistry, Proteins/ultrastructure, Programming Languages, Protein Conformation, Software, Mechanical Stress
18.
Brief Bioinform ; 14(5): 589-98, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23603092

ABSTRACT

Teaching students with very diverse backgrounds can be extremely challenging. This article uses the Bioinformatics and Systems Biology MSc in Amsterdam as a case study to describe how the knowledge gap for students with heterogeneous backgrounds can be bridged. We show that a mix in backgrounds can be turned into an advantage by creating a stimulating learning environment for the students. In the MSc Programme, conversion classes help to bridge differences between students, by mending initial knowledge and skill gaps. Mixing students from different backgrounds in a group to solve a complex task creates an opportunity for the students to reflect on their own abilities. We explain how a truly interdisciplinary approach to teaching helps students of all backgrounds to achieve the MSc end terms. Moreover, transferable skills obtained by the students in such a mixed study environment are invaluable for their later careers.


Subject(s)
Computational Biology/education, Systems Biology/education, Curriculum, Graduate Education, Humans, Netherlands, Students
19.
Bioinformatics ; 30(11): 1530-8, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24519382

ABSTRACT

MOTIVATION: 16S rDNA pyrosequencing is a powerful approach that requires extensive usage of computational methods for delineating microbial compositions. Previously, it was shown that outcomes of studies relying on this approach depend strongly on the choice of pre-processing and clustering algorithms used. However, obtaining insights into the effects and accuracy of these algorithms is challenging due to difficulties in generating samples of known composition with high enough diversity. Here, we use in silico microbial datasets to better understand how the experimental data are transformed into taxonomic clusters by computational methods. RESULTS: We were able to qualitatively replicate the raw experimental pyrosequencing data after rigorously adjusting existing simulation software. This allowed us to simulate datasets of real-life complexity, which we used to assess the influence and performance of two widely used pre-processing methods along with 11 clustering algorithms. We show that the choice, order and mode of the pre-processing methods have a larger impact on the accuracy of the clustering pipeline than the clustering methods themselves. Without pre-processing, the difference between the performances of clustering methods is large. Depending on the clustering algorithm, the optimal analysis pipeline resulted in significant underestimations of the expected number of clusters (minimum: 3.4%; maximum: 13.6%), allowing us to make quantitative estimations of the bacterial complexity of real microbiome samples.


Subject(s)
Phylogeny, 16S Ribosomal RNA/genetics, DNA Sequence Analysis/methods, Algorithms, Classification, Cluster Analysis, Computer Simulation, Ribosomal DNA/chemistry, Ribosomal DNA/classification, Microbiota, Software
20.
Bioinformatics ; 30(3): 326-34, 2014 Feb 01.
Article in English | MEDLINE | ID: mdl-24273239

ABSTRACT

MOTIVATION: To assess whether two proteins will interact under physiological conditions, information on the interaction free energy is needed. Statistical learning techniques and docking methods for predicting protein-protein interactions cannot quantitatively estimate binding free energies. Full atomistic molecular simulation methods do have this potential, but their computational cost makes them infeasible for large-scale applications. Here we investigate whether applying coarse-grained (CG) molecular dynamics simulations is a viable alternative for complexes of known structure. RESULTS: We calculate the free energy barrier with respect to the bound state based on molecular dynamics simulations using both a full atomistic and a CG force field for the TCR-pMHC complex and the MP1-p14 scaffolding complex. We find that the free energy barriers from the CG simulations are comparable in accuracy to those from the full atomistic ones, while achieving a speedup of >500-fold. We also observe that extensive sampling is extremely important to obtain accurate free energy barriers, which is only within reach for the CG models. Finally, we show that the CG model preserves biological relevance of the interactions: (i) we observe a strong correlation between evolutionary likelihood of mutations and the impact on the free energy barrier with respect to the bound state; and (ii) we confirm the dominant role of the interface core in these interactions. Therefore, our results suggest that CG molecular simulations can realistically be used for the accurate prediction of protein-protein interaction strength. AVAILABILITY AND IMPLEMENTATION: The python analysis framework and data files are available for download at http://www.ibi.vu.nl/downloads/bioinformatics-2013-btt675.tgz.


Subject(s)
Molecular Dynamics Simulation, Protein Interaction Mapping/methods, Multiprotein Complexes/chemistry, Multiprotein Complexes/genetics, Mutation, Thermodynamics