Results 1 - 20 of 50
1.
Mol Cell Proteomics; 22(10): 100629, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37557955

ABSTRACT

Neurodegenerative dementias are progressive diseases that cause neuronal network breakdown in different brain regions, often because of the accumulation of misfolded proteins, such as amyloids, in the brain extracellular matrix or inside neurons or other cell types of the brain. Several diagnostic protein biomarkers in body fluids are being used and implemented, for example for Alzheimer's disease. However, there is still a lack of biomarkers for co-pathologies and other causes of dementia. Such biofluid-based biomarkers enable precision medicine approaches for diagnosis and treatment, allow us to learn more about underlying disease processes, and facilitate the development of patient inclusion and evaluation tools in clinical trials. When designing studies to discover novel biofluid-based biomarkers, the choice of technology is an important starting point, but there are many technologies to choose from. To address this, we here review the technologies that are currently available in research settings and, in some cases, in clinical laboratory practice. The review presents a form of lexicon on each technology, addressing its use in research and clinics, its strengths and limitations, and a future perspective.


Subjects
Alzheimer Disease, Humans, Brain, Biomarkers, Neurons, Precision Medicine, Amyloid beta-Peptides
2.
Proteins; 92(5): 649-664, 2024 May.
Article in English | MEDLINE | ID: mdl-38149328

ABSTRACT

Glial fibrillary acidic protein (GFAP) is a promising biomarker for brain and spinal cord disorders. Recent studies have highlighted the differences in the reliability of GFAP measurements in different biological matrices. The reason for these discrepancies is poorly understood as our knowledge of the protein's 3-dimensional conformation, proteoforms, and aggregation remains limited. Here, we investigate the structural properties of GFAP under different conditions. For this, we characterized recombinant GFAP proteins from various suppliers and applied hydrogen-deuterium exchange mass spectrometry (HDX-MS) to provide a snapshot of the conformational dynamics of GFAP in artificial cerebrospinal fluid (aCSF) compared to the phosphate buffer. Our findings indicate that recombinant GFAP exists in various conformational species. Furthermore, we show that GFAP dimers remained intact under denaturing conditions. HDX-MS experiments show an overall decrease in H-bonding and an increase in solvent accessibility of GFAP in aCSF compared to the phosphate buffer, with clear indications of mixed EX2 and EX1 kinetics. To understand possible structural interface regions and the evolutionary conservation profiles, we combined HDX-MS results with the predicted GFAP-dimer structure by AlphaFold-Multimer. We found that deprotected regions with high structural flexibility in aCSF overlap with predicted conserved dimeric 1B and 2B domain interfaces. Structural property predictions combined with the HDX data show an overall deprotection and signatures of aggregation in aCSF. We anticipate that the outcomes of this research will contribute to a deeper understanding of the structural flexibility of GFAP and ultimately shed light on its behavior in different biological matrices.


Subjects
Deuterium Exchange Measurement, Glial Fibrillary Acidic Protein, Phosphates, Humans, Deuterium Exchange Measurement/methods, Glial Fibrillary Acidic Protein/chemistry, Glial Fibrillary Acidic Protein/genetics, Glial Fibrillary Acidic Protein/physiology, Protein Conformation, Reproducibility of Results, Recombinant Proteins
3.
J Proteome Res; 22(9): 3068-3080, 2023 Sep 01.
Article in English | MEDLINE | ID: mdl-37606934

ABSTRACT

Cerebrospinal fluid (CSF) is an essential matrix for the discovery of neurological disease biomarkers. However, the high dynamic range of protein concentrations in CSF hinders the detection of the least abundant protein biomarkers by untargeted mass spectrometry. It is thus beneficial to gain a deeper understanding of the secretion processes within the brain. Here, we aim to explore if and how the secretion of brain proteins to the CSF can be predicted. By combining a curated CSF proteome and the brain elevated proteome of the Human Protein Atlas, brain proteins were classified as CSF or non-CSF secreted. A machine learning model was trained on a range of sequence-based features to differentiate between CSF and non-CSF groups and effectively predict the brain origin of proteins. The classification model achieves an area under the curve of 0.89 if using high confidence CSF proteins. The most important prediction features include the subcellular localization, signal peptides, and transmembrane regions. The classifier generalized well to the larger brain detected proteome and is able to correctly predict novel CSF proteins identified by affinity proteomics. In addition to elucidating the underlying mechanisms of protein secretion, the trained classification model can support biomarker candidate selection.
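The area under the curve reported above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `auc_roc` and the toy labels and scores are assumptions made for illustration only. The sketch uses the standard identity that the AUC-ROC equals the probability that a randomly chosen positive example (here, a hypothetical CSF protein) receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = positive, e.g. a hypothetical CSF protein)
    scores: iterable of classifier scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two positives, two negatives.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_auc = auc_roc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a rank-based implementation would be preferable for a proteome-scale dataset.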


Subjects
Biomedical Research, Proteome, Humans, Brain, Protein Transport, Biological Transport, Cerebrospinal Fluid Proteins
4.
Bioinformatics; 38(8): 2111-2118, 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35150231

ABSTRACT

MOTIVATION: The interactions between proteins and other molecules are essential to many biological and cellular processes. Experimental identification of interface residues is a time-consuming, costly and challenging task, while protein sequence data are ubiquitous. Consequently, many computational and machine learning approaches have been developed over the years to predict such interface residues from sequence. However, the effectiveness of different Deep Learning (DL) architectures and learning strategies for protein-protein, protein-nucleotide and protein-small molecule interface prediction has not yet been investigated in great detail. Therefore, we here explore the prediction of protein interface residues using six DL architectures and various learning strategies with sequence-derived input features. RESULTS: We constructed a large dataset dubbed BioDL, comprising protein-protein interactions from the PDB, and DNA/RNA and small molecule interactions from the BioLip database. We also constructed six DL architectures, and evaluated them on the BioDL benchmarks. This shows that no single architecture performs best on all instances. An ensemble architecture, which combines all six architectures, does consistently achieve peak prediction accuracy. We confirmed these results on the published benchmark set by Zhang and Kurgan (ZK448), and on our own existing curated homo- and heteromeric protein interaction dataset. Our PIPENN sequence-based ensemble predictor outperforms current state-of-the-art sequence-based protein interface predictors on ZK448 on all interaction types, achieving an AUC-ROC of 0.718 for protein-protein, 0.823 for protein-nucleotide and 0.842 for protein-small molecule. AVAILABILITY AND IMPLEMENTATION: Source code and datasets are available at https://github.com/ibivu/pipenn/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
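The idea of combining several predictors into one ensemble can be sketched as follows. This is a deliberately simple stand-in that averages per-residue scores; the abstract does not say how the PIPENN ensemble combines its six architectures, so the averaging rule, the function name `ensemble_average` and the toy scores are all assumptions for illustration.

```python
def ensemble_average(predictions):
    """Average per-residue interface scores across several models.

    predictions: list with one entry per model; each entry is a list of
    per-residue scores for the same protein sequence.
    Returns one averaged score per residue.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must score the same number of residues")
    # zip(*predictions) groups the scores of all models for each residue position
    return [sum(column) / len(predictions) for column in zip(*predictions)]


# Hypothetical scores from three models for a four-residue sequence.
model_scores = [
    [0.2, 0.8, 0.5, 0.1],
    [0.4, 0.6, 0.5, 0.3],
    [0.3, 0.7, 0.8, 0.2],
]
averaged = ensemble_average(model_scores)
```

Averaging tends to smooth out the errors of individual models, which is one intuition for why an ensemble can be more consistent than any single architecture.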


Subjects
Machine Learning, Proteins, Proteins/chemistry, Software, Amino Acid Sequence, Nucleotides, Computational Biology/methods
5.
BMC Bioinformatics; 23(1): 487, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36384426

ABSTRACT

BACKGROUND: Current methods of high-dimensional unsupervised clustering of mass cytometry data lack means to monitor and evaluate clustering results. Whether unsupervised clustering is correct is typically evaluated by agreement with dimensionality reduction techniques or by benchmarking against manually classified cells. The ambiguity and lack of reproducibility of sequential gating have been replaced with ambiguity in the interpretation of clustering results. On the other hand, spurious overclustering of data leads to loss of statistical power. We have developed INFLECT, an R package designed to give insight into clustering results and provide an optimal number of clusters. In our approach, a mass cytometry dataset is intentionally overclustered using FlowSOM to ensure that the smallest phenotypically different subsets are captured. A range of metacluster number endpoints is generated and evaluated using marker interquartile range and distribution unimodality checks. The fraction of marker distributions that pass these checks is taken as a measure of clustering success. The fraction of unimodal distributions within metaclusters is plotted against the number of generated metaclusters and reaches a plateau of diminishing returns. The inflection point at which this occurs gives an optimal trade-off between capturing cellular heterogeneity and retaining statistical power. RESULTS: We applied INFLECT to four publicly available mass cytometry datasets of different size and number of markers. The unimodality score consistently reached a plateau, with an inflection point dependent on dataset size and number of dimensions. We tested both ConsensusClusterPlus metaclustering and hierarchical clustering. While hierarchical clustering is less computationally expensive and thus faster, it achieved similar results to ConsensusClusterPlus. The four datasets consisted of labeled data and we compared INFLECT metaclustering to published results. INFLECT identified a higher optimal number of metaclusters for all datasets. We illustrated the underlying heterogeneity within labels, showing that these labels encompass distinct types of cells. CONCLUSION: INFLECT addresses a knowledge gap in high-dimensional cytometry analysis, namely assessing clustering results. This is done through monitoring marker distributions for interquartile range and unimodality across a range of metacluster numbers. The inflection point is the optimal trade-off between cellular heterogeneity and statistical power, applied in this work for FlowSOM clustering on mass cytometry datasets.
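Locating the point where a score curve flattens into a plateau can be sketched in Python. INFLECT itself is an R package and the abstract does not specify how its inflection point is computed, so the heuristic below (the point farthest from the straight line joining the first and last points of the curve) is a commonly used substitute and not the package's method. The function name `knee_point` and the toy values are assumptions.

```python
def knee_point(xs, ys):
    """Return the x value whose point lies farthest from the chord
    joining the first and last points of the curve (a simple knee heuristic)."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) points of equal length")
    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5
    if norm == 0:
        raise ValueError("first and last points must differ")

    best_x, best_dist = xs[0], -1.0
    for x, y in zip(xs, ys):
        # Perpendicular distance from (x, y) to the chord
        dist = abs(dy * (x - x0) - dx * (y - y0)) / norm
        if dist > best_dist:
            best_x, best_dist = x, dist
    return best_x


# Hypothetical curve: fraction of unimodal marker distributions
# against the number of metaclusters.
n_metaclusters = [5, 10, 15, 20, 25, 30]
unimodal_fraction = [0.40, 0.70, 0.85, 0.88, 0.89, 0.90]
suggested_k = knee_point(n_metaclusters, unimodal_fraction)
```

The distance is computed on the raw axes, so the result depends on how the two axes are scaled; rescaling both to the range 0 to 1 before calling the function is a reasonable refinement.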


Subjects
Reproducibility of Results, Cluster Analysis, Biomarkers
6.
Bioinformatics; 37(20): 3421-3427, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33974039

ABSTRACT

MOTIVATION: Antibodies play an important role in clinical research and biotechnology, with their specificity determined by the interaction with the antigen's epitope region, a special type of protein-protein interaction (PPI) interface. The ubiquitous availability of sequence data allows us to predict epitopes from sequence in order to focus time-consuming wet-lab experiments on the most promising epitope regions. Here, we extend our previously developed sequence-based predictors for homodimer and heterodimer PPI interfaces to predict epitope residues that have the potential to bind an antibody. RESULTS: We collected and curated a high-quality epitope dataset from the SAbDab database. Our generic PPI heterodimer predictor obtained an AUC-ROC of 0.666 when evaluated on the epitope test set. We then trained a random forest model specifically on the epitope dataset, reaching AUC 0.694. Further training on the combined heterodimer and epitope datasets improves our final predictor to AUC 0.703 on the epitope test set. This is better than the best state-of-the-art sequence-based epitope predictor, BepiPred-2.0. On one solved antibody-antigen structure of the COVID-19 virus spike receptor binding domain, our predictor reaches AUC 0.778. We added the SeRenDIP-CE Conformational Epitope predictors to our webserver, which is simple to use and only requires a single antigen sequence as input; this will help make the method immediately applicable in a wide range of biomedical and biomolecular research. AVAILABILITY AND IMPLEMENTATION: Webserver, source code and datasets at www.ibi.vu.nl/programs/serendipwww/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

7.
PLoS Comput Biol; 16(5): e1007767, 2020 May.
Article in English | MEDLINE | ID: mdl-32365068

ABSTRACT

Many proteins have the potential to aggregate into amyloid fibrils, protein polymers associated with a wide range of human disorders such as Alzheimer's and Parkinson's disease. The thermodynamic stability of amyloid fibrils, in contrast to that of folded proteins, is not well understood: the balance between entropic and enthalpic terms, including the chain entropy and the hydrophobic effect, is poorly characterised. Using a combination of theory, in vitro experiments, simulations of a coarse-grained protein model and meta-data analysis, we delineate the enthalpic and entropic contributions that dominate amyloid fibril elongation. Our prediction of a characteristic temperature-dependent enthalpic signature is confirmed by calorimetric experiments and a meta-analysis of published data. From these results we are able to define the necessary conditions to observe cold denaturation of amyloid fibrils. Overall, we show that amyloid fibril elongation is associated with a negative heat capacity, the magnitude of which correlates closely with the hydrophobic surface area that is buried upon fibril formation, highlighting the importance of hydrophobicity for fibril stability.
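The link between a negative heat capacity and cold denaturation can be made explicit with the standard thermodynamic relations for a temperature-independent heat capacity change. This is a textbook sketch, not the model derived in the paper; the reference temperature $T_0$ and the assumption that $\Delta C_p$ is constant are introduced here for illustration only.

```latex
\begin{align}
\Delta H(T) &= \Delta H(T_0) + \Delta C_p\,(T - T_0) \\
\Delta S(T) &= \Delta S(T_0) + \Delta C_p \ln\frac{T}{T_0} \\
\Delta G(T) &= \Delta H(T) - T\,\Delta S(T)
\end{align}
```

Since $\partial \Delta G / \partial T = -\Delta S(T)$ and $\partial^2 \Delta G / \partial T^2 = -\Delta C_p / T$, a negative $\Delta C_p$ makes $\Delta G(T)$ convex. Under these assumptions the elongation free energy has a single minimum and can become unfavorable at both high and low temperature, the latter corresponding to cold denaturation.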


Subjects
Amyloid/chemistry, Amyloid/physiology, Amyloid/metabolism, Amyloid beta-Peptides/chemistry, Amyloid beta-Peptides/physiology, Amyloidogenic Proteins/chemistry, Amyloidogenic Proteins/physiology, Humans, Hydrophobic and Hydrophilic Interactions, Theoretical Models, Molecular Dynamics Simulation, Protein Denaturation, Protein Folding, Temperature, Thermodynamics
8.
Bioinformatics; 35(24): 5315-5317, 2019 Dec 15.
Article in English | MEDLINE | ID: mdl-31368486

ABSTRACT

SUMMARY: PRALINE 2 is a toolkit for custom multiple sequence alignment workflows. It can be used to incorporate sequence annotations, such as secondary structure or (DNA) motifs, into the alignment scoring, as well as to customize many other aspects of a progressive multiple alignment workflow. AVAILABILITY AND IMPLEMENTATION: PRALINE 2 is implemented in Python and available as open source software on GitHub: https://github.com/ibivu/PRALINE/.


Subjects
Software, DNA, Protein Secondary Structure, Sequence Alignment
9.
Bioinformatics; 35(22): 4794-4796, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31116381

ABSTRACT

MOTIVATION: Interpretation of ubiquitous protein sequence data has become a bottleneck in biomolecular research, due to a lack of structural and other experimental annotation data for these proteins. Prediction of protein interaction sites from sequence may be a viable substitute. We therefore recently developed a sequence-based random forest method for protein-protein interface prediction, which performed significantly better than other methods on both homomeric and heteromeric protein-protein interactions. Here, we present a webserver that implements this method efficiently. RESULTS: With the aim of accelerating our previous approach, we obtained sequence conservation profiles by re-mastering the alignment of homologous sequences found by PSI-BLAST. This yielded a more than 10-fold speedup and at least the same accuracy as reported previously for our method; these results allowed us to offer the method as a webserver. The webserver interface is targeted at the non-expert user. The input is simply a sequence of the protein of interest, and the output is a table with scores indicating the likelihood of having an interaction interface at a certain position. As the method is sequence-based and not sensitive to the type of protein interaction, we expect this webserver to be of interest to many biological researchers in academia and in industry. AVAILABILITY AND IMPLEMENTATION: Webserver, source code and datasets are available at www.ibi.vu.nl/programs/serendipwww/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software, Algorithms, Amino Acid Sequence, Proteins, Protein Sequence Analysis
10.
Bioinformatics; 34(13): i4-i12, 2018 Jul 01.
Article in English | MEDLINE | ID: mdl-29950011

ABSTRACT

Motivation: Our society has become data-rich to the extent that research in many areas has become impossible without computational approaches. Educational programmes seem to be lagging behind this development. At the same time, there is a growing need not only for strong data science skills, but foremost for the ability to both translate between tools and methods on the one hand, and application and problems on the other. Results: Here we present our experiences with shaping and running a masters' programme in bioinformatics and systems biology in Amsterdam. From this, we have developed a comprehensive philosophy on how translation in training may be achieved in a dynamic and multidisciplinary research area, which is described here. We furthermore describe two requirements that enable translation, which we have found to be crucial: sufficient depth and focus on multidisciplinary topic areas, coupled with a balanced breadth from adjacent disciplines. Finally, we present concrete suggestions on how this may be implemented in practice, which may be relevant for the effectiveness of life science and data science curricula in general, and of particular interest to those who are in the process of setting up such curricula. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/education, Curriculum, Data Science/education, Humans
11.
PLoS Comput Biol; 14(11): e1006547, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30383764

ABSTRACT

Protein or DNA motifs are sequence regions which possess biological importance. These regions are often highly conserved among homologous sequences. The generation of multiple sequence alignments (MSAs) with a correct alignment of the conserved sequence motifs is still difficult to achieve, due to the fact that the contribution of these typically short fragments is overshadowed by the rest of the sequence. Here we extended the PRALINE multiple sequence alignment program with a novel motif-aware MSA algorithm in order to address this shortcoming. This method can incorporate explicit information about the presence of externally provided sequence motifs, which is then used in the dynamic programming step by boosting the amino acid substitution matrix towards the motif. The strength of the boost is controlled by a parameter, α. Using a benchmark set of alignments we confirm that a good compromise can be found that improves the matching of motif regions while not significantly reducing the overall alignment quality. By estimating α on an unrelated set of reference alignments we find there is indeed a strong conservation signal for motifs. A number of typical but difficult MSA use cases are explored to exemplify the problems in correctly aligning functional sequence motifs and how the motif-aware alignment method can be employed to alleviate these problems.
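The effect of boosting substitution scores towards a motif can be sketched in Python. The abstract does not give the exact form of the boost, so the additive rule below (add α to the substitution score when both aligned positions lie in an annotated motif) is one plausible reading and not PRALINE's actual formula. The toy matrix, the function name `boosted_score` and the parameter values are hypothetical.

```python
# Toy substitution scores for two residue types; a real workflow would
# use a full substitution matrix instead of this hypothetical table.
TOY_MATRIX = {
    ("A", "A"): 4, ("A", "G"): 0,
    ("G", "A"): 0, ("G", "G"): 6,
}


def boosted_score(res_a, res_b, in_motif_a, in_motif_b, alpha):
    """Substitution score plus a bonus of alpha when both positions
    fall inside an externally provided motif annotation."""
    base = TOY_MATRIX[(res_a, res_b)]
    bonus = alpha if (in_motif_a and in_motif_b) else 0.0
    return base + bonus


# With alpha = 2.5, a motif-motif pair scores higher than the same
# residue pair outside a motif, so the dynamic programming step is
# nudged towards aligning motif positions with each other.
inside = boosted_score("A", "G", True, True, alpha=2.5)
outside = boosted_score("A", "G", False, False, alpha=2.5)
```

Setting α to zero recovers the unmodified substitution scores, which matches the abstract's description of α as the parameter controlling the strength of the boost.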


Subjects
Amino Acid Motifs, DNA/chemistry, Proteins/chemistry, Sequence Alignment/standards, Algorithms, Amino Acid Sequence, Conserved Sequence, HIV-1/chemistry, Amino Acid Sequence Homology, Human Immunodeficiency Virus env Gene Products/chemistry
12.
Bioinformatics; 32(12): i60-i69, 2016 Jun 15.
Article in English | MEDLINE | ID: mdl-27307645

ABSTRACT

MOTIVATION: Biological pathways play a key role in most cellular functions. To better understand these functions, diverse computational and cell biology researchers use biological pathway data for various analysis and modeling purposes. For specifying these biological pathways, a community of researchers has defined BioPAX and provided various tools for creating, validating and visualizing BioPAX models. However, a generic software framework for simulating BioPAX models is missing. Here, we attempt to fill this gap by introducing a generic simulation framework for BioPAX. The framework explicitly separates the execution model from the model structure as provided by BioPAX, with the advantage that the modelling process becomes more reproducible and intrinsically more modular; this ensures natural biological constraints are satisfied upon execution. The framework is based on the principles of discrete event systems and multi-agent systems, and is capable of automatically generating a hierarchical multi-agent system for a given BioPAX model. RESULTS: To demonstrate the applicability of the framework, we simulated two types of biological network models: a gene regulatory network modeling the haematopoietic stem cell regulators and a signal transduction network modeling the Wnt/β-catenin signaling pathway. We observed that the results of the simulations performed using our framework were entirely consistent with the simulation results reported by the researchers who developed the original models in a proprietary language. AVAILABILITY AND IMPLEMENTATION: The framework, implemented in Java, is open source and its source code, documentation and tutorial are available at http://www.ibi.vu.nl/programs/BioASF. CONTACT: j.heringa@vu.nl.


Subjects
Gene Regulatory Networks, Biological Models, Signal Transduction, Software, Computer Simulation, Humans, Programming Languages
13.
Bioinformatics; 32(11): 1678-85, 2016 Jun 01.
Article in English | MEDLINE | ID: mdl-26342232

ABSTRACT

MOTIVATION: The human microbiome plays a key role in health and disease. Thanks to comparative metatranscriptomics, the cellular functions that are deregulated by the microbiome in disease can now be computationally explored. Unlike gene-centric approaches, pathway-based methods provide a systemic view of such functions; however, they typically consider each pathway in isolation and in its entirety. They can therefore overlook the key differences that (i) span multiple pathways, (ii) contain bidirectionally deregulated components, (iii) are confined to a pathway region. To capture these properties, computational methods that reach beyond the scope of predefined pathways are needed. RESULTS: By integrating an existing module discovery algorithm into comparative metatranscriptomic analysis, we developed metaModules, a novel computational framework for automated identification of the key functional differences between health- and disease-associated communities. Using this framework, we recovered significantly deregulated subnetworks that were indeed recognized to be involved in two well-studied, microbiome-mediated oral diseases, such as butanoate production in periodontal disease and metabolism of sugar alcohols in dental caries. More importantly, our results indicate that our method can be used for hypothesis generation based on automated discovery of novel, disease-related functional subnetworks, which would otherwise require extensive and laborious manual assessment. AVAILABILITY AND IMPLEMENTATION: metaModules is available at https://bitbucket.org/alimay/metamodules/ CONTACT: a.may@vu.nl or s.abeln@vu.nl SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Microbiota, Algorithms, Dental Caries, Humans
14.
Nucleic Acids Res; 43(W1): W301-5, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25878034

ABSTRACT

Massively parallel sequencing of microbial genetic markers (MGMs) is used to uncover the species composition in a multitude of ecological niches. These sequencing runs often contain a sample with known composition that can be used to evaluate the sequencing quality or to detect novel sequence variants. With NGS-eval, the reads from such (mock) samples can be used to (i) explore the differences between the reads and their references and to (ii) estimate the sequencing error rate. This tool maps these reads to references and calculates as well as visualizes the different types of sequencing errors. Clearly, sequencing errors can only be accurately calculated if the reference sequences are correct. However, even with known strains, it is not straightforward to select the correct references from databases. We previously analysed a pyrosequencing dataset from a mock sample to estimate sequencing error rates and detected sequence variants in our mock community, allowing us to obtain an accurate error estimation. Here, we demonstrate the variant detection and error analysis capability of NGS-eval with Illumina MiSeq reads from the same mock community. While tailored towards the field of metagenomics, this server can be used for any type of MGM-based reads. NGS-eval is available at http://www.ibi.vu.nl/programs/ngsevalwww/.
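Estimating a sequencing error rate from reads mapped to known references can be sketched in Python. This is not NGS-eval's implementation: it assumes the read has already been aligned to its reference, with `-` marking gaps, and the function names `count_errors` and `error_rate` and the toy alignment are hypothetical.

```python
def count_errors(aligned_ref, aligned_read):
    """Classify each column of a pairwise alignment ('-' marks a gap)."""
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("aligned sequences must have equal length")
    counts = {"match": 0, "substitution": 0, "insertion": 0, "deletion": 0}
    for ref_base, read_base in zip(aligned_ref, aligned_read):
        if ref_base == "-" and read_base == "-":
            continue                      # empty column, ignore
        if ref_base == "-":
            counts["insertion"] += 1      # extra base in the read
        elif read_base == "-":
            counts["deletion"] += 1       # reference base missing from the read
        elif ref_base != read_base:
            counts["substitution"] += 1
        else:
            counts["match"] += 1
    return counts


def error_rate(counts):
    """Fraction of alignment columns that are not matches."""
    total = sum(counts.values())
    errors = total - counts["match"]
    return errors / total if total else 0.0


# Hypothetical alignment of one mock-community read to its reference.
toy_counts = count_errors("ACGT-ACGT", "ACCTTAC-T")
toy_rate = error_rate(toy_counts)
```

As the abstract notes, such an estimate is only meaningful if the reference is correct: a genuine sequence variant in the mock community would otherwise be counted as a sequencing error.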


Subjects
Genetic Variation, High-Throughput Nucleotide Sequencing/methods, Metagenomics/methods, Software, Genetic Markers, Internet
15.
Phys Rev Lett; 116(7): 078101, 2016 Feb 19.
Article in English | MEDLINE | ID: mdl-26943560

ABSTRACT

The hydrophobic effect stabilizes the native structure of proteins by minimizing the unfavorable interactions between hydrophobic residues and water through the formation of a hydrophobic core. Here, we include the entropic and enthalpic contributions of the hydrophobic effect explicitly in an implicit solvent model. This allows us to capture two important effects: a length-scale dependence and a temperature dependence for the solvation of a hydrophobic particle. This consistent treatment of the hydrophobic effect explains cold denaturation and heat capacity measurements of solvated proteins.


Subjects
Chemical Models, Proteins/chemistry, Cold Temperature, Hydrophobic and Hydrophilic Interactions, Monte Carlo Method, Peptides/chemistry, Protein Denaturation, Protein Folding, Water/chemistry
16.
PLoS Comput Biol; 11(5): e1004277, 2015 May.
Article in English | MEDLINE | ID: mdl-26000449

ABSTRACT

The hydrophobic effect is the main driving force in protein folding. One can estimate the relative strength of this hydrophobic effect for each amino acid by mining a large set of experimentally determined protein structures. However, the hydrophobic force is known to be strongly temperature dependent. This temperature dependence is thought to explain the denaturation of proteins at low temperatures. Here we investigate if it is possible to extract this temperature dependence directly from a large set of protein structures determined at different temperatures. Using NMR structures filtered for sequence identity, we were able to extract hydrophobicity propensities for all amino acids at five different temperature ranges (spanning 265-340 K). These propensities show that the hydrophobicity becomes weaker at lower temperatures, in line with current theory. Alternatively, one can conclude that the temperature dependence of the hydrophobic effect has a measurable influence on protein structures. Moreover, this work provides a method for probing the individual temperature dependence of the different amino acid types, which is difficult to obtain by direct experiment.


Subjects
Amino Acids/chemistry, Proteins/chemistry, Algorithms, Computational Biology, X-Ray Crystallography, Protein Databases, Hydrophobic and Hydrophilic Interactions, Magnetic Resonance Spectroscopy, Statistical Models, Protein Denaturation, Protein Folding, Solvents, Temperature, Water/chemistry
17.
PLoS Comput Biol; 11(10): e1004435, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26505754

ABSTRACT

It has been recently shown that the coarse-graining of the structures of polypeptide chains as self-avoiding tubes can provide an effective representation of the conformational space of proteins. In order to fully exploit the opportunities offered by such a 'tube model' approach, we present here a strategy to combine it with molecular dynamics simulations. This strategy is based on the incorporation of the 'CamTube' force field into the Gromacs molecular dynamics package. By considering the case of a 60-residue polyvaline chain, we show that CamTube molecular dynamics simulations can comprehensively explore the conformational space of proteins. We obtain this result by a 20 µs metadynamics simulation of the polyvaline chain that recapitulates the currently known protein fold universe. We further show that, if residue-specific interaction potentials are added to the CamTube force field, it is possible to fold a protein into a topology close to that of its native state. These results illustrate how the CamTube force field can be used to explore efficiently the universe of protein folds with good accuracy and very limited computational cost.


Subjects
Algorithms, Chemical Models, Molecular Dynamics Simulation, Protein Folding, Proteins/chemistry, Proteins/ultrastructure, Programming Languages, Protein Conformation, Software, Mechanical Stress
18.
Brief Bioinform; 14(5): 589-98, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23603092

ABSTRACT

Teaching students with very diverse backgrounds can be extremely challenging. This article uses the Bioinformatics and Systems Biology MSc in Amsterdam as a case study to describe how the knowledge gap for students with heterogeneous backgrounds can be bridged. We show that a mix in backgrounds can be turned into an advantage by creating a stimulating learning environment for the students. In the MSc Programme, conversion classes help to bridge differences between students, by mending initial knowledge and skill gaps. Mixing students from different backgrounds in a group to solve a complex task creates an opportunity for the students to reflect on their own abilities. We explain how a truly interdisciplinary approach to teaching helps students of all backgrounds to achieve the MSc end terms. Moreover, transferable skills obtained by the students in such a mixed study environment are invaluable for their later careers.


Subjects
Computational Biology/education, Systems Biology/education, Curriculum, Graduate Education, Humans, Netherlands, Students
19.
Bioinformatics; 30(11): 1530-8, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24519382

ABSTRACT

MOTIVATION: 16S rDNA pyrosequencing is a powerful approach that requires extensive usage of computational methods for delineating microbial compositions. Previously, it was shown that outcomes of studies relying on this approach vastly depend on the choice of pre-processing and clustering algorithms used. However, obtaining insights into the effects and accuracy of these algorithms is challenging due to difficulties in generating samples of known composition with high enough diversity. Here, we use in silico microbial datasets to better understand how the experimental data are transformed into taxonomic clusters by computational methods. RESULTS: We were able to qualitatively replicate the raw experimental pyrosequencing data after rigorously adjusting existing simulation software. This allowed us to simulate datasets of real-life complexity, which we used to assess the influence and performance of two widely used pre-processing methods along with 11 clustering algorithms. We show that the choice, order and mode of the pre-processing methods have a larger impact on the accuracy of the clustering pipeline than the clustering methods themselves. Without pre-processing, the difference between the performances of clustering methods is large. Depending on the clustering algorithm, the optimal analysis pipeline resulted in significant underestimations of the expected number of clusters (minimum: 3.4%; maximum: 13.6%), allowing us to make quantitative estimations of the bacterial complexity of real microbiome samples.


Subjects
Phylogeny, 16S Ribosomal RNA/genetics, DNA Sequence Analysis/methods, Algorithms, Classification, Cluster Analysis, Computer Simulation, Ribosomal DNA/chemistry, Ribosomal DNA/classification, Microbiota, Software
20.
Bioinformatics; 30(3): 326-34, 2014 Feb 01.
Article in English | MEDLINE | ID: mdl-24273239

ABSTRACT

MOTIVATION: To assess whether two proteins will interact under physiological conditions, information on the interaction free energy is needed. Statistical learning techniques and docking methods for predicting protein-protein interactions cannot quantitatively estimate binding free energies. Full atomistic molecular simulation methods do have this potential, but are completely unfeasible for large-scale applications in terms of computational cost required. Here we investigate whether applying coarse-grained (CG) molecular dynamics simulations is a viable alternative for complexes of known structure. RESULTS: We calculate the free energy barrier with respect to the bound state based on molecular dynamics simulations using both a full atomistic and a CG force field for the TCR-pMHC complex and the MP1-p14 scaffolding complex. We find that the free energy barriers from the CG simulations are of similar accuracy as those from the full atomistic ones, while achieving a speedup of >500-fold. We also observe that extensive sampling is extremely important to obtain accurate free energy barriers, which is only within reach for the CG models. Finally, we show that the CG model preserves biological relevance of the interactions: (i) we observe a strong correlation between evolutionary likelihood of mutations and the impact on the free energy barrier with respect to the bound state; and (ii) we confirm the dominant role of the interface core in these interactions. Therefore, our results suggest that CG molecular simulations can realistically be used for the accurate prediction of protein-protein interaction strength. AVAILABILITY AND IMPLEMENTATION: The python analysis framework and data files are available for download at http://www.ibi.vu.nl/downloads/bioinformatics-2013-btt675.tgz.


Subjects
Molecular Dynamics Simulation, Protein Interaction Mapping/methods, Multiprotein Complexes/chemistry, Multiprotein Complexes/genetics, Mutation, Thermodynamics