ABSTRACT
The physiology of every living cell is regulated at some level by transporter proteins, which constitute a relevant portion of membrane-bound proteins and are involved in the movement of ions, small molecules and macromolecules across bio-membranes. The importance of transporter proteins is unquestionable. The prediction and study of previously unknown transporters can lead to the discovery of new biological pathways, drugs and treatments. Here we present PortPred, a tool to accurately identify transporter proteins and their substrate starting from the protein amino acid sequence. PortPred successfully combines pre-trained deep learning-based protein embeddings with machine learning classification approaches and outperforms other state-of-the-art methods. In addition, we present a comparison of the most promising protein sequence embeddings (UniRep, SeqVec, ProteinBERT, ESM-1b) and their performance on this specific task.
Subjects
Deep Learning, Amino Acid Sequence, Computational Biology/methods, Machine Learning, Membrane Transport Proteins/metabolism, Membrane Proteins/metabolism
ABSTRACT
Peroxisomes are ubiquitous membrane-bound organelles, and aberrant localisation of peroxisomal proteins contributes to the pathogenesis of several disorders. Many computational methods focus on assigning protein sequences to subcellular compartments, but there are no specific tools tailored for the sub-localisation (matrix vs. membrane) of peroxisomal proteins. We present here In-Pero, a new method for predicting sub-peroxisomal protein localisation. In-Pero combines standard machine learning approaches with recently proposed multi-dimensional deep-learning representations of the protein amino-acid sequence. It showed a classification accuracy above 0.9 in predicting peroxisomal matrix and membrane proteins. The method is trained and tested using a double cross-validation approach on a curated data set comprising 160 peroxisomal proteins with experimental evidence for sub-peroxisomal localisation. We further show that the proposed approach can be easily adapted (In-Mito) to the prediction of mitochondrial protein localisation, achieving performance superior to existing tools for certain classes of proteins (matrix and inner-membrane).
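The double (nested) cross-validation mentioned above can be illustrated with a minimal sketch. This is not the In-Pero implementation: the 1-D toy features, the k-nearest-neighbour classifier, the candidate values of k, and every function name below are assumptions chosen for illustration only. The inner loop selects a hyperparameter using only the training portion of each outer fold, and the outer loop estimates accuracy on samples that played no part in that selection.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training points (1-D toy features)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(fit_idx, eval_idx, xs, ys, k):
    """Fit on fit_idx, report the fraction of eval_idx classified correctly."""
    fx = [xs[i] for i in fit_idx]
    fy = [ys[i] for i in fit_idx]
    hits = sum(knn_predict(fx, fy, xs[i], k) == ys[i] for i in eval_idx)
    return hits / len(eval_idx)


def double_cross_validation(xs, ys, candidate_ks=(1, 3, 5), outer=5, inner=4):
    """Return one held-out accuracy per outer fold."""
    outer_folds = k_fold_indices(len(xs), outer, seed=1)
    outer_scores = []
    for o, test_idx in enumerate(outer_folds):
        train_idx = [i for j, f in enumerate(outer_folds) if j != o for i in f]
        # Inner folds are positions into train_idx, so test samples never leak in.
        inner_folds = k_fold_indices(len(train_idx), inner, seed=2)

        def inner_score(k):
            scores = []
            for v, val_pos in enumerate(inner_folds):
                val_idx = [train_idx[p] for p in val_pos]
                fit_idx = [train_idx[p]
                           for j, f in enumerate(inner_folds) if j != v
                           for p in f]
                scores.append(accuracy(fit_idx, val_idx, xs, ys, k))
            return sum(scores) / len(scores)

        best_k = max(candidate_ks, key=inner_score)  # model selection
        outer_scores.append(accuracy(train_idx, test_idx, xs, ys, best_k))
    return outer_scores


# Hypothetical toy data: 160 samples, two classes, one noisy feature each.
rng = random.Random(42)
ys = [i % 2 for i in range(160)]
xs = [rng.gauss(3.0 * y, 1.0) for y in ys]
scores = double_cross_validation(xs, ys)
print("outer-fold accuracies:", [round(s, 3) for s in scores])
```

Reporting the mean of the outer-fold accuracies, rather than the best inner-loop score, avoids the optimistic bias that arises when the same samples are used for both model selection and evaluation.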
Subjects
Deep Learning, Membrane Proteins/chemistry, Membrane Proteins/metabolism, Peroxisomes/metabolism, Software, Algorithms, Amino Acid Sequence, Mitochondrial Proteins/metabolism, Protein Transport, Reproducibility of Results
ABSTRACT
Computational approaches are practical for investigating putative peroxisomal proteins and for predicting sub-peroxisomal localization in unknown protein sequences. Advances in computational methods and Machine Learning (ML) can hasten the discovery of novel peroxisomal proteins and can be combined with more established computational methodologies. Here, we list and explain some of the most used tools and methodologies for novel peroxisomal protein detection and localization.
Subjects
Peroxisomes, Proteins, Peroxisomes/metabolism, Protein Transport, Amino Acid Sequence, Proteins/metabolism, Machine Learning
ABSTRACT
We present the OrganelX e-Science Web Server that provides a user-friendly implementation of the In-Pero and In-Mito classifiers for sub-peroxisomal and sub-mitochondrial localization of peroxisomal and mitochondrial proteins and the Is-PTS1 algorithm for detecting and validating potential peroxisomal proteins carrying a PTS1 signal sequence. The OrganelX e-Science Web Server is available at https://organelx.hpc.rug.nl/fasta/.
ABSTRACT
Peroxisomes are ubiquitous, oxidative subcellular organelles with important functions in cellular lipid metabolism and redox homeostasis. Loss of peroxisomal functions causes severe disorders with developmental and neurological abnormalities. Zebrafish are emerging as an attractive vertebrate model to study peroxisomal disorders as well as cellular lipid metabolism. Here, we combine bioinformatics analyses with molecular cell biology to reveal the first comprehensive inventory of Danio rerio peroxisomal proteins, which we systematically compare with those of human peroxisomes. Through bioinformatics analysis of all PTS1-carrying proteins, we demonstrate that D. rerio lacks two well-known mammalian peroxisomal proteins (BAAT and ZADH2/PTGR3) but possesses a putative peroxisomal malate synthase (Mlsl), and we verify differences in the presence of purine-degrading enzymes. Furthermore, we reveal novel candidate peroxisomal proteins in D. rerio, whose function and localisation are discussed. Our findings confirm the suitability of zebrafish as a vertebrate model for peroxisome research and open possibilities for the study of novel peroxisomal candidate proteins in zebrafish and humans.
ABSTRACT
The amyloid-β peptide (Aβ) is closely linked to the development of Alzheimer's disease. Molecular dynamics (MD) simulations have become an indispensable tool for studying the behavior of this peptide at the atomistic level. Two key aspects of MD simulations are the force field used for modeling the peptide and its environment, which is important for accurate modeling of the system of interest, and the length of the simulations, which determines whether or not equilibrium is reached. In this study we address these points by analyzing 30-µs MD simulations acquired for Aβ40 using seven different force fields. We assess the convergence of these simulations based on the convergence of various structural properties and of NMR and fluorescence spectroscopic observables. Moreover, we calculate Markov state models for the different MD simulations, which provide an unprecedented view of the thermodynamics and kinetics of the amyloid-β peptide. This further allows us to answer pertinent questions, such as: which force fields are suitable for modeling Aβ? (a99SB-UCB and a99SB-ILDN/TIP4P-D); what does the Aβ peptide really look like? (mostly extended and disordered); and how long does it take MD simulations of Aβ to attain equilibrium? (at least 20-30 µs). We believe the analyses presented in this study will provide a useful reference guide for important questions relating to the structure and dynamics of Aβ in particular, and by extension other similar disordered proteins.
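As a rough illustration of what a Markov state model provides, the sketch below estimates a transition matrix from a trajectory that has already been discretized into states, then derives the stationary distribution (the thermodynamic view) by power iteration. It is a minimal sketch under stated assumptions, not the workflow used in the study: the 20-frame trajectory, the three states, the lag time of one frame, and the function names are all hypothetical, and the simple row-normalized count estimator stands in for the more careful estimators used in practice.

```python
def transition_matrix(traj, n_states, lag=1):
    """Row-normalized transition counts between states separated by `lag` frames."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for a, b in zip(traj[:-lag], traj[lag:]):
        counts[a][b] += 1.0
    matrix = []
    for state, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            raise ValueError(f"state {state} has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(P, n_iter=1000):
    """Power iteration: repeatedly apply pi <- pi P starting from a uniform guess."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical discretized trajectory: one state label (0, 1 or 2) per frame.
traj = [0, 0, 1, 1, 1, 2, 2, 1, 0, 0, 1, 2, 2, 2, 1, 1, 0, 0, 0, 1]
P = transition_matrix(traj, n_states=3)
pi = stationary_distribution(P)

for row in P:
    print([round(p, 3) for p in row])
print("stationary distribution:", [round(p, 3) for p in pi])
```

The off-diagonal entries of the matrix describe the kinetics (how readily one state converts to another per lag time), while the stationary distribution gives the equilibrium populations; if a simulation is too short to sample transitions between states, neither quantity can be estimated reliably.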