1.
Article in English | MEDLINE | ID: mdl-38686701

ABSTRACT

CONTEXT: The role of glucagon-like peptide-1 (GLP-1) in type 2 diabetes (T2D) and obesity is not fully understood. OBJECTIVE: We investigate the association of cardiometabolic, diet and lifestyle parameters with fasting and postprandial GLP-1 in people at risk of, or living with, T2D. METHOD: We analysed cross-sectional data from the two Innovative Medicines Initiative (IMI) Diabetes Research on Patient Stratification (DIRECT) cohorts: cohort 1 (n=2127), individuals at risk of diabetes; and cohort 2 (n=789), individuals with new-onset T2D. RESULTS: Our multiple regression analysis reveals that fasting total GLP-1 is associated with an insulin-resistant phenotype, and we observe a strong independent relationship with male sex, increased adiposity and liver fat, particularly in the prediabetes population. In contrast, we showed that incremental GLP-1 decreases with worsening glycaemia, higher adiposity, liver fat, male sex and reduced insulin sensitivity in the prediabetes cohort. Higher fasting total GLP-1 was associated with a low intake of wholegrain, fruit and vegetables in people with prediabetes, and with a high intake of red meat and alcohol in people with diabetes. CONCLUSION: These studies provide novel insights into the association between fasting and incremental GLP-1, metabolic traits of diabetes and obesity, and dietary intake, and raise intriguing questions regarding the relevance of fasting GLP-1 in the pathophysiology of T2D.

2.
Nat Commun ; 14(1): 5062, 2023 08 21.
Article in English | MEDLINE | ID: mdl-37604891

ABSTRACT

We evaluate the shared genetic regulation of mRNA molecules, proteins and metabolites derived from whole blood from 3029 human donors. We find abundant allelic heterogeneity, where multiple variants regulate a particular molecular phenotype, and pleiotropy, where a single variant associates with multiple molecular phenotypes over multiple genomic regions. The highest proportion of shared genetic regulation is detected between gene expression and proteins (66.6%), with a further median shared genetic associations across 49 different tissues of 78.3% and 62.4% between plasma proteins and gene expression. We represent the genetic and molecular associations in networks including 2828 known GWAS variants, showing that GWAS variants are more often connected to gene expression in trans than other molecular phenotypes in the network. Our work provides a roadmap to understanding molecular networks and deriving the underlying mechanism of action of GWAS variants using different molecular phenotypes in an accessible tissue.


Subjects
Genomics, Multifactorial Inheritance, Humans, Phenotype, Messenger RNA, Research Personnel
3.
Nat Biotechnol ; 40(7): 1023-1025, 2022 07.
Article in English | MEDLINE | ID: mdl-34980915

ABSTRACT

Signal peptides (SPs) are short amino acid sequences that control protein secretion and translocation in all living organisms. SPs can be predicted from sequence data, but existing algorithms are unable to detect all known types of SPs. We introduce SignalP 6.0, a machine learning model that detects all five SP types and is applicable to metagenomic data.


Subjects
Language, Protein Sorting Signals, Algorithms, Amino Acid Sequence, Protein Sorting Signals/genetics, Proteins
4.
Comput Struct Biotechnol J ; 19: 6090-6097, 2021.
Article in English | MEDLINE | ID: mdl-34849210

ABSTRACT

Hidden Markov Models (HMMs) are amongst the most successful methods for predicting protein features in biological sequence analysis. However, there are biological problems where the Markovian assumption is not sufficient since the sequence context can provide useful information for prediction purposes. Several extensions of HMMs have appeared in the literature in order to overcome their limitations. We apply here a hybrid method that combines HMMs and Neural Networks (NNs), termed Hidden Neural Networks (HNNs), for biological sequence analysis in a straightforward manner. In this framework, the traditional HMM probability parameters are replaced by NN outputs. As a case study, we focus on the topology prediction of alpha-helical and beta-barrel membrane proteins. The HNNs show performance gains compared to standard HMMs and the respective predictors outperform the top-scoring methods in the field. The implementation of HNNs can be found in the package JUCHMME, downloadable from http://www.compgen.org/tools/juchmme, https://github.com/pbagos/juchmme. The updated PRED-TMBB2 and HMM-TM prediction servers can be accessed at www.compgen.org.

5.
Diabetes ; 70(9): 2092-2106, 2021 09.
Article in English | MEDLINE | ID: mdl-34233929

ABSTRACT

Differences in glucose metabolism among categories of prediabetes have not been systematically investigated. In this longitudinal study, participants (N = 2,111) underwent a 2-h 75-g oral glucose tolerance test (OGTT) at baseline and 48 months. HbA1c was also measured. We classified participants as having an isolated prediabetes defect (impaired fasting glucose [IFG], impaired glucose tolerance [IGT], or HbA1c indicative of prediabetes [IA1c]), two defects (IFG+IGT, IFG+IA1c, or IGT+IA1c), or all defects (IFG+IGT+IA1c). β-Cell function (BCF) and insulin sensitivity were assessed from OGTT. At baseline, participants with isolated defects, when pooled, showed impairment in both BCF and insulin sensitivity compared with healthy control subjects. Pooled groups with two or three defects showed progressive further deterioration. Among groups with an isolated defect, those with IGT showed lower insulin sensitivity, insulin secretion at reference glucose (ISRr), and insulin secretion potentiation (P < 0.002). Conversely, those with IA1c showed higher insulin sensitivity and ISRr (P < 0.0001). Among groups with two defects, we similarly found differences in both BCF and insulin sensitivity. At 48 months, we found higher type 2 diabetes incidence for progressively increasing number of prediabetes defects (odds ratio >2, P < 0.008). In conclusion, the prediabetes groups showed differences in type/degree of glucometabolic impairment. Compared with the pooled group with isolated defects, those with double or triple defect showed progressive differences in diabetes incidence.
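
The grouping by number of defects described above can be sketched as follows. The abstract does not state the cutoffs used, so the ranges below (ADA-style ranges for fasting glucose, 2-h glucose and HbA1c) and the function name `classify_prediabetes` are assumptions made purely for illustration.

```python
# Illustrative sketch only: the cutoffs are assumed (ADA-style ranges),
# not taken from the abstract. Diabetes-range values are not handled here.
IFG_RANGE = (5.6, 6.9)     # fasting plasma glucose, mmol/L (assumed)
IGT_RANGE = (7.8, 11.0)    # 2-h OGTT glucose, mmol/L (assumed)
IA1C_RANGE = (39.0, 47.0)  # HbA1c, mmol/mol (assumed)

GROUP_NAMES = {
    0: "no prediabetes defect",
    1: "isolated defect",
    2: "two defects",
    3: "all defects",
}


def in_range(value, bounds):
    """True if value lies inside the closed interval bounds."""
    low, high = bounds
    return low <= value <= high


def classify_prediabetes(fasting_glucose, two_hour_glucose, hba1c):
    """Return (list of defect labels, group name) for one participant."""
    defects = []
    if in_range(fasting_glucose, IFG_RANGE):
        defects.append("IFG")
    if in_range(two_hour_glucose, IGT_RANGE):
        defects.append("IGT")
    if in_range(hba1c, IA1C_RANGE):
        defects.append("IA1c")
    return defects, GROUP_NAMES[len(defects)]
```

For example, `classify_prediabetes(6.0, 8.5, 36.0)` places a participant in the two-defect group (IFG+IGT) under these assumed cutoffs.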


Subjects
Glucose Intolerance/metabolism, Glucose/metabolism, Glycated Hemoglobin/analysis, Insulin Resistance/physiology, Prediabetic State/metabolism, Adult, Aged, Blood Glucose, Fasting/blood, Female, Glucose Tolerance Test, Humans, Insulin Secretion, Male, Middle Aged, Phenotype
6.
Front Bioinform ; 1: 646581, 2021.
Article in English | MEDLINE | ID: mdl-36303794

ABSTRACT

OMPdb (www.ompdb.org) was introduced in 2011 as a database for β-barrel outer membrane proteins from Gram-negative bacteria, at which time it included 69,354 entries classified into 85 families. The database has been updated continuously using a collection of characteristic profile Hidden Markov Models able to discriminate between the different families of prokaryotic transmembrane β-barrels. The number of families has now increased to a total of 129 in the current, second major version of OMPdb. New additions have been made in parallel with efforts to update existing families and add novel families. Here, we present the upgrade of OMPdb, which from now on aims to become a global repository for all transmembrane β-barrel proteins, both eukaryotic and bacterial.

7.
Diabetes Care ; 44(2): 511-518, 2021 02.
Article in English | MEDLINE | ID: mdl-33323478

ABSTRACT

OBJECTIVE: We investigated the processes underlying glycemic deterioration in type 2 diabetes (T2D). RESEARCH DESIGN AND METHODS: A total of 732 recently diagnosed patients with T2D from the Innovative Medicines Initiative Diabetes Research on Patient Stratification (IMI DIRECT) study were extensively phenotyped over 3 years, including measures of insulin sensitivity (OGIS), β-cell glucose sensitivity (GS), and insulin clearance (CLIm) from mixed meal tests, liver enzymes, lipid profiles, and baseline regional fat from MRI. The associations between the longitudinal metabolic patterns and HbA1c deterioration, adjusted for changes in BMI and in diabetes medications, were assessed via stepwise multivariable linear and logistic regression. RESULTS: Faster HbA1c progression was independently associated with faster deterioration of OGIS and GS and increasing CLIm; visceral or liver fat, HDL-cholesterol, and triglycerides had further independent, though weaker, roles (R² = 0.38). A subgroup of patients with a markedly higher progression rate (fast progressors) was clearly distinguishable considering these variables only (discrimination capacity from area under the receiver operating characteristic = 0.94). The proportion of fast progressors was reduced from 56% to 8-10% in subgroups in which only one trait among OGIS, GS, and CLIm was relatively stable (odds ratios 0.07-0.09). T2D polygenic risk score and baseline pancreatic fat, glucagon-like peptide 1, glucagon, diet, and physical activity did not show an independent role. CONCLUSIONS: Deteriorating insulin sensitivity and β-cell function, increasing insulin clearance, high visceral or liver fat, and worsening of the lipid profile are the crucial factors mediating glycemic deterioration of patients with T2D in the initial phase of the disease. Stabilization of a single trait among insulin sensitivity, β-cell function, and insulin clearance may be relevant to prevent progression.


Subjects
Type 2 Diabetes Mellitus, Insulin Resistance, Insulin-Secreting Cells, Blood Glucose, HDL Cholesterol, Humans, Insulin
8.
EBioMedicine ; 58: 102932, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32763829

ABSTRACT

BACKGROUND: Dietary advice remains the cornerstone of prevention and management of type 2 diabetes (T2D). However, understanding the efficacy of dietary interventions is confounded by the challenges inherent in assessing free-living diet. Here we profiled dietary metabolites to investigate glycaemic deterioration and cardiometabolic risk in people at risk of or living with T2D. METHODS: We analysed data from plasma collected at baseline and 18-month follow-up in individuals from the Innovative Medicines Initiative (IMI) Diabetes Research on Patient Stratification (DIRECT) cohort 1 (n = 403 individuals with normal or impaired glucose regulation [prediabetic]) and cohort 2 (n = 458 individuals with new onset of T2D). A dietary metabolite profile model (Tpred) was constructed using multivariable regression of 113 plasma metabolites obtained from targeted metabolomics assays. The continuous Tpred score was used to explore the relationships between diet, glycaemic deterioration and cardio-metabolic risk via multiple linear regression models. FINDINGS: A higher Tpred score was associated with healthier diets high in wholegrain (β=3.36 g, 95% CI 0.31, 6.40 and β=2.82 g, 95% CI 0.06, 5.57) and lower energy intake (β=-75.53 kcal, 95% CI -144.71, -2.35 and β=-122.51 kcal, 95% CI -186.56, -38.46), and saturated fat (β=-0.92 g, 95% CI -1.56, -0.28 and β=-0.98 g, 95% CI -1.53, -0.42 g), respectively for cohort 1 and 2. In both cohorts a higher Tpred score was also associated with lower total body adiposity and favourable lipid profiles: HDL-cholesterol (β=0.07 mmol/L, 95% CI 0.03, 0.1), (β=0.08 mmol/L, 95% CI 0.04, 0.1), and triglycerides (β=-0.1 mmol/L, 95% CI -0.2, -0.03), (β=-0.2 mmol/L, 95% CI -0.3, -0.09), respectively for cohort 1 and 2. In cohort 2, the Tpred score was negatively associated with liver fat (β=-0.74%, 95% CI -0.67, -0.81), and lower fasting concentrations of HbA1c (β=-0.9 mmol/mol, 95% CI -1.5, -0.1), glucose (β=-0.2 mmol/L, 95% CI -0.4, -0.05) and insulin (β=-11.0 pmol/mol, 95% CI -19.5, -2.6). Longitudinal analysis showed that at 18-month follow-up a higher Tpred score was also associated with lower total body adiposity in both cohorts and lower fasting glucose (β=-0.2 mmol/L, 95% CI -0.3, -0.01) and insulin (β=-9.2 pmol/mol, 95% CI -17.9, -0.4) concentrations in cohort 2. INTERPRETATION: Plasma dietary metabolite profiling provides objective measures of diet intake, showing a relationship to glycaemic deterioration and cardiometabolic health. FUNDING: This work was supported by the Innovative Medicines Initiative Joint Undertaking under grant agreement no. 115317 (DIRECT), resources of which are composed of financial contribution from the European Union's Seventh Framework Programme (FP7/2007-2013) and EFPIA companies.


Subjects
Type 2 Diabetes Mellitus/diet therapy, Metabolomics/methods, Prediabetic State/diet therapy, Aged, Case-Control Studies, HDL Cholesterol/blood, Type 2 Diabetes Mellitus/blood, Healthy Diet, Energy Intake, Female, Humans, Male, Middle Aged, Prediabetic State/blood, Triglycerides/blood
9.
J Proteome Res ; 19(3): 1209-1221, 2020 03 06.
Article in English | MEDLINE | ID: mdl-32008325

ABSTRACT

Even though in the last few years several families of eukaryotic β-barrel outer membrane proteins have been discovered, their computational characterization and their annotation in public databases are far from complete. The PFAM database includes only very few characteristic profiles for these families, and in most cases, the profile hidden Markov models (pHMMs) have been trained using prokaryotic and eukaryotic proteins together. Here, we present for the first time a comprehensive computational analysis of eukaryotic transmembrane β-barrels. Twelve characteristic pHMMs were built, based on an extensive literature search, which can discriminate eukaryotic β-barrels from other classes of proteins (globular and bacterial β-barrel ones), as well as between mitochondrial and chloroplastic ones. We built eight novel profiles for the chloroplastic β-barrel families that are not present in the PFAM database, updated the profile for the MDM10 family (PF12519) in the PFAM database, and divided the porin family (PF01459) into two separate families, namely, VDAC and TOM40.


Subjects
Eukaryota, Porins, Eukaryota/genetics, Eukaryotic Cells, Mitochondria, Proteins
10.
Bioinformatics ; 35(24): 5309-5312, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31250907

ABSTRACT

SUMMARY: JUCHMME is an open-source software package designed to fit arbitrary custom Hidden Markov Models (HMMs) with a discrete alphabet of symbols. We incorporate a large collection of standard algorithms for HMMs as well as a number of extensions and evaluate the software on various biological problems. Importantly, the JUCHMME toolkit includes several additional features that allow for easy building and evaluation of custom HMMs, which could be a useful resource for the research community. AVAILABILITY AND IMPLEMENTATION: http://www.compgen.org/tools/juchmme, https://github.com/pbagos/juchmme. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Software, Sequence Analysis
11.
Protein J ; 38(3): 200-216, 2019 06.
Article in English | MEDLINE | ID: mdl-31119599

ABSTRACT

Ever since the signal hypothesis was proposed in 1971, the exact nature of signal peptides has been a focus point of research. The prediction of signal peptides and protein subcellular location from amino acid sequences has been an important problem in bioinformatics since the dawn of this research field, involving many statistical and machine learning technologies. In this review, we provide a historical account of how position-weight matrices, artificial neural networks, hidden Markov models, support vector machines and, lately, deep learning techniques have been used in the attempts to predict where proteins go. Because the secretory pathway was the first one to be studied both experimentally and through bioinformatics, our main focus is on the historical development of prediction methods for signal peptides that target proteins for secretion; prediction methods to identify targeting signals for other cellular compartments are treated in less detail.


Subjects
Computational Biology/methods, Protein Sorting Signals, Protein Transport, Bacteria/metabolism, Eukaryota/metabolism
12.
Nat Biotechnol ; 37(4): 420-423, 2019 04.
Article in English | MEDLINE | ID: mdl-30778233

ABSTRACT

Signal peptides (SPs) are short amino acid sequences in the amino terminus of many newly synthesized proteins that target proteins into, or across, membranes. Bioinformatic tools can predict SPs from amino acid sequences, but most cannot distinguish between various types of signal peptides. We present a deep neural network-based approach that improves SP prediction across all domains of life and distinguishes between three types of prokaryotic SPs.


Subjects
Computer Neural Networks, Protein Sorting Signals/genetics, Protein Sorting Signals/physiology, Algorithms, Amino Acid Sequence, Archaeal Proteins/classification, Archaeal Proteins/genetics, Archaeal Proteins/metabolism, Bacterial Proteins/classification, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Biotechnology, Computational Biology, Eukaryota/genetics, Eukaryota/metabolism, Protein Sequence Analysis, Software
13.
Bioinformatics ; 35(13): 2208-2215, 2019 07 01.
Article in English | MEDLINE | ID: mdl-30445435

ABSTRACT

MOTIVATION: Hidden Markov Models (HMMs) are probabilistic models widely used in applications in computational sequence analysis. HMMs are basically unsupervised models. However, in the most important applications, they are trained in a supervised manner. Training examples accompanied by labels corresponding to different classes are given as input and the set of parameters that maximize the joint probability of sequences and labels is estimated. A main problem with this approach is that, in the majority of the cases, labels are hard to find and thus the amount of training data is limited. On the other hand, there are plenty of unclassified (unlabeled) sequences deposited in the public databases that could potentially contribute to the training procedure. This approach is called semi-supervised learning and could be very helpful in many applications. RESULTS: We propose here a method for semi-supervised learning of HMMs that can incorporate labeled, unlabeled and partially labeled data in a straightforward manner. The algorithm is based on a variant of the Expectation-Maximization (EM) algorithm, where the missing labels of the unlabeled or partially labeled data are considered as the missing data. We apply the algorithm to several biological problems, namely, the prediction of transmembrane protein topology for alpha-helical and beta-barrel membrane proteins and the prediction of archaeal signal peptides. The results are very promising, since the algorithms presented here can significantly improve the prediction performance of even the top-scoring classifiers. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
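
One common way to treat missing labels in an HMM is to run the forward pass with each position restricted to the states compatible with its label, and to leave unlabeled positions unrestricted. The sketch below shows only that constrained forward computation, not the authors' EM variant; the function name, argument layout and dictionary-based model representation are all assumptions for illustration.

```python
def constrained_forward(obs, labels, states, state_label, start, trans, emit):
    """Return P(observations, known labels) via a label-constrained forward pass.

    labels[t] is a class label, or None when the label is missing.
    A state whose label disagrees with a known label gets probability
    zero at that position; unlabeled positions allow every state.
    """
    def allowed(state, t):
        return labels[t] is None or state_label[state] == labels[t]

    # Initialisation at the first position.
    forward = {
        s: (start[s] * emit[s][obs[0]] if allowed(s, 0) else 0.0)
        for s in states
    }
    # Recursion over the remaining positions.
    for t in range(1, len(obs)):
        forward = {
            s: (sum(forward[p] * trans[p][s] for p in states) * emit[s][obs[t]]
                if allowed(s, t) else 0.0)
            for s in states
        }
    return sum(forward.values())
```

With all labels set to `None` this reduces to the ordinary forward likelihood, and with every label known it gives the joint probability of sequence and labels, so labeled, partially labeled and unlabeled sequences can share one code path.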


Subjects
Supervised Machine Learning, Algorithms, Markov Chains, Statistical Models, Sequence Analysis
14.
J Bioinform Comput Biol ; 16(5): 1850019, 2018 10.
Article in English | MEDLINE | ID: mdl-30353782

ABSTRACT

Hidden Markov Models (HMMs) are probabilistic models widely used in computational molecular biology. However, the Markovian assumption regarding transition probabilities which dictates that the observed symbol depends only on the current state may not be sufficient for some biological problems. In order to overcome the limitations of the first order HMM, a number of extensions have been proposed in the literature to incorporate past information in HMMs conditioning either on the hidden states, or on the observations, or both. Here, we implement a simple extension of the standard HMM in which the current observed symbol (amino acid residue) depends both on the current state and on a series of observed previous symbols. The major advantage of the method is the simplicity in the implementation, which is achieved by properly transforming the observation sequence, using an extended alphabet. Thus, it can utilize all the available algorithms for the training and decoding of HMMs. We investigated the use of several encoding schemes and performed tests in a number of important biological problems previously studied by our team (prediction of transmembrane proteins and prediction of signal peptides). The evaluation shows that, when enough data are available, the performance increased by 1.8%-8.2% and the existing prediction methods may improve using this approach. The methods, for which the improvement was significant (PRED-TMBB2, PRED-TAT and HMM-TM), are available as web-servers freely accessible to academic users at www.compgen.org/tools/.
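
The idea of transforming the observation sequence with an extended alphabet can be sketched as below. The abstract mentions several encoding schemes without describing them, so this is just one plausible encoding; the function name `extend_alphabet` and the padding symbol are assumptions.

```python
def extend_alphabet(sequence, order=1, pad="-"):
    """Re-encode a sequence so each symbol carries its `order` predecessors.

    Position i becomes the string made of symbols i-order..i, with the
    start of the sequence padded by `pad`. A standard HMM run on the
    result has an emission table indexed by the current residue together
    with its preceding residues, while training and decoding code stay
    unchanged.
    """
    padded = pad * order + sequence
    return [padded[i:i + order + 1] for i in range(len(sequence))]
```

For example, `extend_alphabet("MKV", order=1)` gives `["-M", "MK", "KV"]`. The alphabet grows from 20 symbols to roughly 20^(order+1), which is consistent with the abstract's remark that gains appear when enough data are available.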


Subjects
Computational Biology/methods, Markov Chains, Algorithms, Membrane Proteins/chemistry, Membrane Proteins/metabolism, Molecular Models, Statistical Models, Protein Sorting Signals
15.
Curr Opin Struct Biol ; 50: 9-17, 2018 06.
Article in English | MEDLINE | ID: mdl-29100082

ABSTRACT

Transmembrane proteins perform a variety of important biological functions necessary for the survival and growth of the cells. Membrane proteins are built up by transmembrane segments that span the lipid bilayer. The segments can either be in the form of hydrophobic alpha-helices or beta-sheets which create a barrel. A fundamental aspect of the structure of transmembrane proteins is the membrane topology, that is, the number of transmembrane segments, their position in the protein sequence and their orientation in the membrane. Along these lines, many predictive algorithms for the prediction of the topology of alpha-helical and beta-barrel transmembrane proteins exist. The newest algorithms obtain an accuracy close to 80% both for alpha-helical and beta-barrel transmembrane proteins. However, lately it has been shown that the simplified picture presented when describing a protein family by its topology is limited. To demonstrate this, we highlight examples where the topology is either not conserved in a protein superfamily or where the structure cannot be described solely by the topology of a protein. The prediction of these non-standard features from sequence alone was not successful until the recent revolutionary progress in 3D-structure prediction of proteins.
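
The three components of topology named above (number of segments, their positions, and their orientation) can be read off a per-residue annotation string. The sketch below assumes a simple encoding that is not taken from the abstract: `M` for membrane-spanning residues, `i` for inside and `o` for outside; the function name is likewise hypothetical.

```python
from itertools import groupby


def parse_topology(topology, membrane_symbol="M"):
    """Summarize a per-residue topology string.

    Assumed encoding: 'M' = membrane-spanning residue, 'i' = inside,
    'o' = outside. Returns the number of transmembrane segments, their
    1-based (start, end) positions, and the location of the N-terminus.
    """
    segments = []
    position = 1
    for symbol, run in groupby(topology):
        length = len(list(run))
        if symbol == membrane_symbol:
            segments.append((position, position + length - 1))
        position += length
    # Orientation is given by the first non-membrane residue.
    n_terminus = next((s for s in topology if s != membrane_symbol), None)
    return {
        "n_segments": len(segments),
        "segments": segments,
        "n_terminus": n_terminus,
    }
```

For example, `parse_topology("iiMMMooMMMii")` reports two segments at positions 3-5 and 8-10 with the N-terminus inside.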


Subjects
Membrane Proteins/chemistry, Molecular Models, Quantitative Structure-Activity Relationship, Computational Biology/methods, Computer Simulation, Protein Databases, Protein Conformation, Software
16.
Bioinformatics ; 33(10): 1521-1527, 2017 May 15.
Article in English | MEDLINE | ID: mdl-28108451

ABSTRACT

MOTIVATION: In the context of genome-wide association studies (GWAS), there is a variety of statistical techniques in order to conduct the analysis, but, in most cases, the underlying genetic model is usually unknown. Under these circumstances, the classical Cochran-Armitage trend test (CATT) is suboptimal. Robust procedures that maximize the power and preserve the nominal type I error rate are preferable. Moreover, performing a meta-analysis using robust procedures is of great interest and has never been addressed in the past. The primary goal of this work is to implement several robust methods for analysis and meta-analysis in the statistical package Stata and subsequently to make the software available to the scientific community. RESULTS: The CATT under a recessive, additive and dominant model of inheritance as well as robust methods based on the Maximum Efficiency Robust Test statistic, the MAX statistic and the MIN2 were implemented in Stata. Concerning MAX and MIN2, we calculated their asymptotic null distributions relying on numerical integration resulting in a great gain in computational time without losing accuracy. All the aforementioned approaches were employed in a fixed or a random effects meta-analysis setting using summary data with weights equal to the reciprocal of the combined cases and controls. Overall, this is the first complete effort to implement procedures for analysis and meta-analysis in GWAS using Stata. AVAILABILITY AND IMPLEMENTATION: A Stata program and a web-server are freely available for academic users at http://www.compgen.org/tools/GWAR. CONTACT: pbagos@compgen.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
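
As a minimal sketch of the statistics named above, the Cochran-Armitage trend test for a 2x3 genotype table can be written with one score vector per genetic model, and MAX taken as the largest absolute Z over the three models. This is plain Python for illustration, not the Stata program described in the abstract; the function names are assumptions, and the null distribution of MAX needed for a p-value is not shown.

```python
import math

# Genotype scores per genetic model (genotypes ordered aa, Aa, AA).
MODEL_SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def catt_z(cases, controls, scores):
    """Cochran-Armitage trend statistic (Z) for a 2x3 genotype table.

    Assumes a non-degenerate table (the variance term must be positive).
    """
    r_total, s_total = sum(cases), sum(controls)
    n_total = r_total + s_total
    col_totals = [r + s for r, s in zip(cases, controls)]
    t = sum(x * (r * s_total - s * r_total)
            for x, r, s in zip(scores, cases, controls))
    sum_x2n = sum(x * x * n for x, n in zip(scores, col_totals))
    sum_xn = sum(x * n for x, n in zip(scores, col_totals))
    variance = (r_total * s_total / n_total) * (n_total * sum_x2n - sum_xn ** 2)
    return t / math.sqrt(variance)


def max_statistic(cases, controls):
    """MAX: the largest |Z| over the recessive, additive and dominant CATTs."""
    return max(abs(catt_z(cases, controls, x)) for x in MODEL_SCORES.values())
```

Because the true genetic model is usually unknown, taking the maximum over the three score vectors avoids committing to a single model in advance, which is the motivation the abstract gives for robust procedures.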


Subjects
Population Genetics/methods, Genome-Wide Association Study/statistics & numerical data, Meta-Analysis as Topic, Genetic Models, Software, Genetic Predisposition to Disease, Genomics/methods, Humans, Hypertension/genetics, Single Nucleotide Polymorphism, Statistics as Topic
18.
Nucleic Acids Res ; 45(D1): D219-D227, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899601

ABSTRACT

The Database of Protein Disorder (DisProt, URL: www.disprot.org) has been significantly updated and upgraded since its last major renewal in 2007. The current release holds information on more than 800 entries of IDPs/IDRs, i.e. intrinsically disordered proteins or regions that exist and function without a well-defined three-dimensional structure. We have re-curated previous entries to purge DisProt from conflicting cases, and also upgraded the functional classification scheme to reflect continuous advance in the field in the past 10 years or so. We define IDPs as proteins that are disordered along their entire sequence, i.e. entirely lack structural elements, and IDRs as regions that are at least five consecutive residues without well-defined structure. We base our assessment of disorder strictly on experimental evidence, such as X-ray crystallography and nuclear magnetic resonance (primary techniques) and a broad range of other experimental approaches (secondary techniques). Confident and ambiguous annotations are highlighted separately. DisProt 7.0 presents classified knowledge regarding the experimental characterization and functional annotations of IDPs/IDRs, and is intended to provide an invaluable resource for the research community for a better understanding of structural disorder and for developing better computational tools for studying disordered proteins.


Subjects
Protein Databases, Intrinsically Disordered Proteins, Animals, X-Ray Crystallography, Fluorescence Resonance Energy Transfer, Forecasting, Forms and Records Control, Humans, Intrinsically Disordered Proteins/classification, Biomolecular Nuclear Magnetic Resonance, Protein Conformation
19.
Bioinformatics ; 32(17): i665-i671, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27587687

ABSTRACT

MOTIVATION: The PRED-TMBB method is based on Hidden Markov Models and is capable of predicting the topology of beta-barrel outer membrane proteins and discriminating them from water-soluble ones. Here, we present an updated version of the method, PRED-TMBB2, with several newly developed features that improve its performance. The inclusion of a properly defined end state allows for better modeling of the beta-barrel domain, while different emission probabilities for the adjacent residues in strands are used to incorporate knowledge concerning the asymmetric amino acid distribution occurring there. Furthermore, the training was performed using newly developed algorithms in order to optimize the labels of the training sequences. Moreover, the method is retrained on a larger, non-redundant dataset which includes recently solved structures, and a newly developed decoding method was added to the already available options. Finally, the method now allows the incorporation of evolutionary information in the form of multiple sequence alignments. RESULTS: The results of a strict cross-validation procedure show that PRED-TMBB2 with homology information performs significantly better compared to other available prediction methods. It yields 76% in correct topology predictions and outperforms the best available predictor by 7%, with an overall SOV of 0.9. Regarding detection of beta-barrel proteins, PRED-TMBB2, using just the query sequence as input, achieves an MCC value of 0.92, outperforming even predictors that are designed for this task and are much slower. AVAILABILITY AND IMPLEMENTATION: The method, along with all datasets used, is freely available for academic users at http://www.compgen.org/tools/PRED-TMBB2 CONTACT: pbagos@compgen.org.
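
The MCC (Matthews correlation coefficient) quoted for the detection task is computed from the four confusion-matrix counts. The sketch below uses the standard formula; the function name and the zero-denominator convention are illustrative choices, and no counts from the paper are reproduced here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives and
    false negatives. The result ranges from -1 (total disagreement) to
    +1 (perfect prediction); 0 corresponds to chance-level prediction.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a marginal total is zero
    return (tp * tn - fp * fn) / denominator
```

Unlike plain accuracy, MCC stays informative when the classes are unbalanced, which is typical when beta-barrel proteins are a small fraction of the sequences screened.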


Subjects
Membrane Proteins, Algorithms, Computational Biology, Markov Chains, Secondary Protein Structure, Sequence Alignment, Amino Acid Sequence Homology
20.
Bioinformatics ; 32(10): 1571-3, 2016 05 15.
Article in English | MEDLINE | ID: mdl-26794316

ABSTRACT

Accurate topology prediction of transmembrane β-barrels is still an open question. Here, we present BOCTOPUS2, an improved topology prediction method for transmembrane β-barrels that can also identify the barrel domain, predict the topology and identify the orientation of residues in transmembrane β-strands. The major novelty of BOCTOPUS2 is the use of the dyad-repeat pattern of lipid and pore facing residues observed in transmembrane β-barrels. In a cross-validation test on a benchmark set of 42 proteins, BOCTOPUS2 predicts the correct topology in 69% of the proteins, an improvement of more than 10% over the best earlier method (BOCTOPUS) and in addition, it produces significantly fewer erroneous predictions on non-transmembrane β-barrel proteins. AVAILABILITY AND IMPLEMENTATION: BOCTOPUS2 webserver along with full dataset and source code is available at http://boctopus.bioinfo.se/ CONTACT: arne@bioinfo.se SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Membrane Proteins/chemistry, Computational Biology, Molecular Models, Programming Languages, Secondary Protein Structure