1.
Bioinformatics ; 36(17): 4583-4589, 2020 11 01.
Article in English | MEDLINE | ID: mdl-32449765

ABSTRACT

MOTIVATION: Understanding an enzyme's function is one of the most crucial problem domains in computational biology. Enzymes are a key component in all organisms and many industrial processes as they help in fighting diseases and speed up essential chemical reactions. They have wide applications and therefore, the discovery of new enzymatic proteins can accelerate biological research and commercial productivity. Biological experiments, to determine an enzyme's function, are time-consuming and resource expensive. RESULTS: In this study, we propose a novel computational approach to predict an enzyme's function up to the fourth level of the Enzyme Commission (EC) Number. Many studies have attempted to predict an enzyme's function. Yet, no approach has properly tackled the fourth and final level of the EC number. The fourth level holds great significance as it gives us the most specific information of how an enzyme performs its function. Our method uses innovative deep learning approaches along with an efficient hierarchical classification scheme to predict an enzyme's precise function. On a dataset of 11 353 enzymes and 402 classes, we achieved a hierarchical accuracy and Macro-F1 score of 91.2% and 81.9%, respectively, on the 4th level. Moreover, our method can be used to predict the function of enzyme isoforms with considerable success. This methodology is broadly applicable for genome-wide prediction that can subsequently lead to automated annotation of enzyme databases and the identification of better/cheaper enzymes for commercial activities. AVAILABILITY AND IMPLEMENTATION: The web-server can be freely accessed at http://hecnet.cbrlab.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology , Proteins , Genome
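
The abstract above reports a hierarchical accuracy and a Macro-F1 score at the fourth EC level. As a rough illustration of how such figures can be computed, the sketch below scores predicted EC numbers level by level. The EC strings are made up for the example, and treating "accuracy at level k" as an exact match of the first k fields is an assumption; the paper may define its hierarchical accuracy differently.

```python
def level_accuracy(true_ecs, pred_ecs, level):
    """Fraction of predictions whose first `level` EC fields match the true EC number."""
    hits = sum(
        t.split(".")[:level] == p.split(".")[:level]
        for t, p in zip(true_ecs, pred_ecs)
    )
    return hits / len(true_ecs)


def macro_f1(true_labels, pred_labels):
    """Unweighted mean of per-class F1 over the classes present in the true labels."""
    pairs = list(zip(true_labels, pred_labels))
    scores = []
    for c in sorted(set(true_labels)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up EC numbers for illustration only (not data from the study).
true_ecs = ["1.1.1.1", "1.1.1.2", "3.4.21.4", "2.7.11.1"]
pred_ecs = ["1.1.1.1", "1.1.1.1", "3.4.21.4", "2.7.10.1"]

for level in range(1, 5):
    print(level, level_accuracy(true_ecs, pred_ecs, level))
print("macro-F1 (level 4):", macro_f1(true_ecs, pred_ecs))
```

Macro-averaging weights every class equally, which is why it is usually lower than accuracy when many rare classes are missed.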
2.
Proc Natl Acad Sci U S A ; 115(7): 1511-1516, 2018 02 13.
Article in English | MEDLINE | ID: mdl-29378944

ABSTRACT

β-Barrel membrane proteins (βMPs) play important roles, but knowledge of their structures is limited. We have developed a method to predict their 3D structures. We predict strand registers and construct transmembrane (TM) domains of βMPs accurately, including proteins for which no prediction has been attempted before. Our method also accurately predicts structures from protein families with a limited number of sequences and proteins with novel folds. An average main-chain rmsd of 3.48 Å is achieved between predicted and experimentally resolved structures of TM domains, which is a significant improvement ([Formula: see text]3 Å) over a recent study. For βMPs with NMR structures, the deviation between predictions and experimentally solved structures is similar to the difference among the NMR structures, indicating excellent prediction accuracy. Moreover, we can now accurately model the extended β-barrels and loops in non-TM domains, increasing the overall coverage of structure prediction by [Formula: see text]%. Our method is general and can be applied to genome-wide structural prediction of βMPs.


Subjects
Membrane Proteins/chemistry , Models, Molecular , Fimbriae Proteins/chemistry , Protein Conformation, beta-Strand , Protein Domains , Voltage-Dependent Anion Channels/chemistry
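
The abstract above quotes a main-chain rmsd between predicted and experimental structures. A minimal sketch of the root-mean-square deviation itself is given below. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the optimal superposition step that structure comparisons normally perform first is not shown, and the coordinates are invented for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Invented coordinates in Å: the second set is the first shifted by 1 Å along z.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(predicted, experimental))
```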
3.
BMC Bioinformatics ; 21(1): 500, 2020 Nov 04.
Article in English | MEDLINE | ID: mdl-33148180

ABSTRACT

BACKGROUND: High-throughput experiments have generated a large amount of protein interaction data, which is being used to study protein networks. Studying complete protein networks can reveal more insight about healthy/disease states than studying proteins in isolation. Similarly, a comparative study of protein-protein interaction (PPI) networks of different species reveals important insights which may help in disease analysis and drug design. The study of PPI network alignment can also help in understanding the different biological systems of different species. It can also be used to transfer knowledge across species. Different aligners have been introduced in the last decade, but developing an accurate and scalable global alignment algorithm that ensures biologically significant alignments is still challenging. RESULTS: This paper presents a novel global pairwise network alignment algorithm, SAlign, which uses topological and biological information in the alignment process. The proposed algorithm incorporates sequence and structural information for computing biological scores, whereas previous algorithms only use sequence information. The alignment based on the proposed technique shows that the combined effect of structure and sequence results in significantly better pairwise alignments. We have compared SAlign with state-of-the-art algorithms on the basis of semantic similarity of alignment and the number of aligned nodes on multiple PPI network pairs. The results of SAlign on the network pairs which have a high percentage of proteins with available structure are 3-63% semantically better than all existing techniques. Furthermore, it also aligns 5-14% more nodes of these network pairs as compared to existing aligners. The results of SAlign on other PPI network pairs are comparable to or better than all existing techniques. We also introduce [Formula: see text], a Monte Carlo based alignment algorithm, that produces multiple network alignments with similar semantic similarity. This helps the user to pick biologically meaningful alignments. CONCLUSION: The proposed algorithm has the ability to find alignments that are more biologically significant/relevant as compared to the alignments of existing aligners. Furthermore, the proposed method is able to generate alternate alignments that help in studying different genes/proteins of the species.


Subjects
Algorithms , Protein Interaction Maps , Proteins/metabolism , Animals , Databases, Protein , Humans , Mice , Monte Carlo Method , Proteins/chemistry , Yeasts/metabolism
4.
Biol Chem ; 401(6-7): 687-697, 2020 05 26.
Article in English | MEDLINE | ID: mdl-32142473

ABSTRACT

In the past three decades, significant advances have been made in providing the biochemical background of TOM (translocase of the outer mitochondrial membrane)-mediated protein translocation into mitochondria. In the light of recent cryoelectron microscopy-derived structures of TOM isolated from Neurospora crassa and Saccharomyces cerevisiae, the interpretation of biochemical and biophysical studies of TOM-mediated protein transport into mitochondria now rests on a solid basis. In this review, we compare the subnanometer structure of N. crassa TOM core complex with that of yeast. Both structures reveal remarkably well-conserved symmetrical dimers of 10 membrane protein subunits. The structural data also validate predictions of weakly stable regions in the transmembrane β-barrel domains of the protein-conducting subunit Tom40, which signal the existence of β-strands located in interfaces of protein-protein interactions.


Subjects
Carrier Proteins/chemistry , Mitochondria/metabolism , Mitochondrial Membranes/metabolism , Neurospora crassa/enzymology , Saccharomyces cerevisiae/enzymology , Carrier Proteins/isolation & purification , Carrier Proteins/metabolism , Mitochondrial Precursor Protein Import Complex Proteins , Protein Conformation
5.
Bioinformatics ; 33(11): 1664-1671, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28158457

ABSTRACT

MOTIVATION: Transmembrane beta-barrel proteins (TMBs) serve a multitude of essential cellular functions in Gram-negative bacteria, mitochondria and chloroplasts. Transfer free energies (TFEs) of residues in the transmembrane (TM) region provide fundamental quantifications of thermodynamic stabilities of TMBs, which are important for the folding and the membrane insertion processes, and may help in understanding the structure-function relationship. However, experimental measurement of TFEs of TMBs is challenging. Although a recent computational method can be used to calculate TFEs, the results of which are in excellent agreement with experimentally measured values, this method does not scale up, and is limited to small TMBs. RESULTS: We have developed an approximation method that calculates TFEs of TM residues in TMBs accurately, with which depth-dependent transfer free energy profiles can be derived. Our results are in excellent agreement with experimental measurements. This method is efficient and applicable to all bacterial TMBs regardless of the size of the protein. AVAILABILITY AND IMPLEMENTATION: An online webserver is available at http://tanto.bioe.uic.edu/tmb-tfe. CONTACT: jliang@uic.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Membrane Proteins/chemistry , Thermodynamics , Algorithms , Bacterial Proteins/metabolism , Gram-Negative Bacteria/metabolism , Protein Structure, Secondary
6.
Bioinformatics ; 32(17): i658-i664, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27587686

ABSTRACT

MOTIVATION: As an increasing amount of protein-protein interaction (PPI) data becomes available, their computational interpretation has become an important problem in bioinformatics. The alignment of PPI networks from different species provides valuable information about conserved subnetworks, evolutionary pathways and functional orthologs. Although several methods have been proposed for global network alignment, there is a pressing need for methods that produce more accurate alignments in terms of both topological and functional consistency. RESULTS: In this work, we present a novel global network alignment algorithm, named ModuleAlign, which makes use of local topology information to define a module-based homology score. Based on a hierarchical clustering of functionally coherent proteins involved in the same module, ModuleAlign employs a novel iterative scheme to find the alignment between two networks. Evaluated on a diverse set of benchmarks, ModuleAlign outperforms state-of-the-art methods in producing functionally consistent alignments. By aligning Pathogen-Human PPI networks, ModuleAlign also detects a novel set of conserved human genes that pathogens preferentially target to cause pathogenesis. AVAILABILITY: http://ttic.uchicago.edu/∼hashemifar/ModuleAlign.html CONTACT: canzar@ttic.edu or j3xu.ttic.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Protein Interaction Mapping , Protein Interaction Maps , Humans , Proteins , Software
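
The abstract above speaks of the topological consistency of a global network alignment. One common way to quantify it is edge correctness: the share of edges in the first network whose endpoints map onto an edge of the second network. The sketch below illustrates that measure only; it is not ModuleAlign's own scoring function, and the node names and networks are invented.

```python
def edge_correctness(edges_a, edges_b, alignment):
    """Fraction of edges in network A that the alignment maps onto edges of network B."""
    if not edges_a:
        raise ValueError("network A has no edges")
    # Undirected edges: store each as a frozenset so (u, v) equals (v, u).
    target = {frozenset(edge) for edge in edges_b}
    conserved = 0
    for u, v in edges_a:
        if u in alignment and v in alignment:
            if frozenset((alignment[u], alignment[v])) in target:
                conserved += 1
    return conserved / len(edges_a)


# Invented toy networks and a one-to-one node mapping from A to B.
edges_a = [("a1", "a2"), ("a2", "a3"), ("a1", "a3")]
edges_b = [("b1", "b2"), ("b2", "b3"), ("b3", "b4")]
alignment = {"a1": "b1", "a2": "b2", "a3": "b3"}
print(edge_correctness(edges_a, edges_b, alignment))
```

Here two of the three edges in A are conserved, because the pair (b1, b3) is not an edge in B.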
7.
Brain Behav Immun ; 62: 35-40, 2017 May.
Article in English | MEDLINE | ID: mdl-27810376

ABSTRACT

The blood-brain barrier (BBB) plays an important role in the clinical expression of neuropsychiatric symptoms during systemic illness in health and neurological disease. Evidence from in vitro and preclinical in vivo studies indicates that systemic inflammation impairs blood-brain barrier function. In order to investigate this hypothesis, we evaluated the association between systemic inflammatory markers (leucocytes, erythrocyte sedimentation rate and C-reactive protein) and BBB function (cerebrospinal fluid/serum albumin ratio) in 1273 consecutive lumbar punctures. In the absence of cerebrospinal fluid (CSF) abnormality, systemic inflammation did not affect the CSF/serum albumin ratio. When CSF abnormality was present, systemic inflammation significantly predicted the CSF/serum albumin ratio. Amongst the systemic inflammatory markers, C-reactive protein was the predominant driver of this effect. Temporal analysis in this association study suggested causality. In conclusion, the diseased BBB has an increased susceptibility to systemic inflammation.


Subjects
Blood-Brain Barrier/physiopathology , C-Reactive Protein/metabolism , Inflammation/physiopathology , Leukocytes/metabolism , Adult , Aged , Biomarkers/metabolism , Blood Sedimentation , Blood-Brain Barrier/metabolism , Female , Humans , Inflammation/blood , Male , Middle Aged , Retrospective Studies , Serum Albumin/metabolism
8.
J Am Chem Soc ; 138(8): 2592-601, 2016 Mar 02.
Article in English | MEDLINE | ID: mdl-26860422

ABSTRACT

Knowledge of the transfer free energy of amino acids from aqueous solution to a lipid bilayer is essential for understanding membrane protein folding and for predicting membrane protein structure. Here we report a computational approach that can calculate the folding free energy of the transmembrane region of outer membrane β-barrel proteins (OMPs) by combining an empirical energy function with a reduced discrete state space model. We quantitatively analyzed the transfer free energies of 20 amino acid residues at the center of the lipid bilayer of OmpLA. Our results are in excellent agreement with the experimentally derived hydrophobicity scales. We further exhaustively calculated the transfer free energies of 20 amino acids at all positions in the TM region of OmpLA. We found that the asymmetry of the Gram-negative bacterial outer membrane as well as the TM residues of an OMP determine its functional fold in vivo. Our results suggest that the folding process of an OMP is driven by the lipid-facing residues in its hydrophobic core, and its NC-IN topology is determined by the differential stabilities of OMPs in the asymmetrical outer membrane. The folding free energy is further reduced by lipid A and assisted by general depth-dependent cooperativities that exist between polar and ionizable residues. Moreover, context-dependency of transfer free energies at specific positions in OmpLA predicts regions important for protein function as well as structural anomalies. Our computational approach is fast, efficient and applicable to any OMP.


Subjects
Bacterial Outer Membrane Proteins/chemistry , Models, Chemical , Phospholipases A1/chemistry , Amino Acids/chemistry , Hydrophobic and Hydrophilic Interactions , Models, Molecular , Protein Folding , Structure-Activity Relationship , Thermodynamics
9.
Bioinformatics ; 31(12): i133-41, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-26072475

ABSTRACT

MOTIVATION: Biological molecules perform their functions through interactions with other molecules. Structure alignment of interaction interfaces between biological complexes is an indispensable step in detecting their structural similarities, which are keys to understanding their evolutionary histories and functions. Although various structure alignment methods have been developed to successfully assess the similarities of protein structures or certain types of interaction interfaces, existing alignment tools cannot directly align arbitrary types of interfaces formed by protein, DNA or RNA molecules. Specifically, they require a 'blackbox preprocessing' to standardize interface types and chain identifiers. Yet their performance is limited and sometimes unsatisfactory. RESULTS: Here we introduce a novel method, PROSTA-inter, that automatically determines and aligns interaction interfaces between two arbitrary types of complex structures. Our method uses sequentially remote fragments to search for the optimal superimposition. The optimal residue matching problem is then formulated as a maximum weighted bipartite matching problem to detect the optimal sequence order-independent alignment. Benchmark evaluation on all non-redundant protein-DNA complexes in PDB shows significant performance improvement of our method over TM-align and iAlign (with the 'blackbox preprocessing'). Two case studies where our method discovers, for the first time, structural similarities between two pairs of functionally related protein-DNA complexes are presented. We further demonstrate the power of our method on detecting structural similarities between a protein-protein complex and a protein-RNA complex, which is biologically known as a protein-RNA mimicry case. AVAILABILITY AND IMPLEMENTATION: The PROSTA-inter web-server is publicly available at http://www.cbrc.kaust.edu.sa/prosta/.


Subjects
DNA-Binding Proteins/chemistry , DNA/chemistry , Multiprotein Complexes/chemistry , RNA-Binding Proteins/chemistry , RNA/chemistry , Algorithms , Binding Sites , Models, Molecular , Molecular Mimicry , Nucleic Acid Conformation , Protein Conformation , Sequence Alignment , Software
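
The abstract above formulates residue matching as a maximum weighted bipartite matching problem. The sketch below shows what that problem asks for on a tiny example, using exhaustive search over assignments. It is an illustration of the problem statement, not PROSTA-inter's implementation: the weight matrix is invented, and exhaustive search is only practical for a handful of residues (a Hungarian-style solver such as SciPy's `linear_sum_assignment` would be the usual choice at scale).

```python
from itertools import permutations


def best_matching(weights):
    """Return (total weight, [(row, col), ...]) of the heaviest one-to-one matching.

    weights[i][j] is the score for pairing residue i of interface A with
    residue j of interface B; the number of rows must not exceed the columns.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    if n_rows > n_cols:
        raise ValueError("expected at most as many rows as columns")
    best_score, best_cols = float("-inf"), None
    # Try every way of giving each row its own distinct column.
    for cols in permutations(range(n_cols), n_rows):
        score = sum(weights[i][j] for i, j in enumerate(cols))
        if score > best_score:
            best_score, best_cols = score, cols
    return best_score, list(enumerate(best_cols))


# Invented similarity scores; a greedy pick of 0.9 first would end with total 1.0.
weights = [
    [0.90, 0.80],
    [0.85, 0.10],
]
score, pairs = best_matching(weights)
print(score, pairs)
```

Because the pairing is chosen over all residues at once, the result does not depend on sequence order, which is the property the abstract highlights.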
10.
Bioinformatics ; 31(24): 3922-9, 2015 Dec 15.
Article in English | MEDLINE | ID: mdl-26286808

ABSTRACT

MOTIVATION: The inherent promiscuity of small molecules towards protein targets impedes our understanding of healthy versus diseased metabolism. This promiscuity also poses a challenge for the pharmaceutical industry as identifying all protein targets is important to assess (side) effects and repositioning opportunities for a drug. RESULTS: Here, we present a novel integrated structure- and system-based approach of drug-target prediction (iDTP) to enable the large-scale discovery of new targets for small molecules, such as pharmaceutical drugs, co-factors and metabolites (collectively called 'drugs'). For a given drug, our method uses sequence order-independent structure alignment, hierarchical clustering and probabilistic sequence similarity to construct a probabilistic pocket ensemble (PPE) that captures promiscuous structural features of different binding sites on known targets. A drug's PPE is combined with an approximation of its delivery profile to reduce false positives. In our cross-validation study, we use iDTP to predict the known targets of 11 drugs, with 63% sensitivity and 81% specificity. We then predicted novel targets for these drugs. Two that are of high pharmacological interest, the peroxisome proliferator-activated receptor gamma and the oncogene B-cell lymphoma 2, were successfully validated through in vitro binding experiments. Our method is broadly applicable for the prediction of protein-small molecule interactions with several novel applications to biological research and drug development. AVAILABILITY AND IMPLEMENTATION: The program, datasets and results are freely available to academic users at http://sfb.kaust.edu.sa/Pages/Software.aspx.


Subjects
Drug Discovery , Pharmaceutical Preparations/chemistry , Binding Sites , Computational Biology/methods , Humans , Metabolism , Probability , Proteins/chemistry , Proteins/metabolism
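
The abstract above summarizes cross-validation performance as sensitivity and specificity. For readers less familiar with the two terms, the sketch below computes both from confusion-matrix counts. The counts are arbitrary illustrative numbers, not figures from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    return tp / (tp + fn), tn / (tn + fp)


# Arbitrary counts: 8 true targets (5 recovered), 10 non-targets (8 correctly rejected).
sens, spec = sensitivity_specificity(tp=5, fn=3, tn=8, fp=2)
print(sens, spec)
```

Sensitivity measures how many known targets are recovered, while specificity measures how many non-targets are correctly rejected, so the two are reported together.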
11.
J Biomol Struct Dyn ; : 1-9, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38214492

ABSTRACT

High-throughput protein-protein interaction (PPI) profiling and computational techniques have resulted in generating a large amount of PPI network data. The study of PPI networks helps in understanding the biological processes of the proteins. The comparative study of the PPI networks helps in identifying the conserved interactions across the species. This article presents a novel local PPI network aligner 'GSLAlign' that consists of two stages. It first detects the communities from the PPI networks by applying the GraphSAGE algorithm using gene expression data. In the second stage, the detected communities are aligned using a community aligner that is based on protein sequence similarity. The community detection algorithm produces more separable and biologically accurate communities as compared to previous community detection algorithms. Moreover, the proposed community alignment algorithm achieves 3-8% better results in terms of semantic similarity as compared to previous local aligners. The average connectivity and coverage of the proposed algorithm are also better than the existing aligners. Communicated by Ramaswamy H. Sarma.

12.
AMIA Jt Summits Transl Sci Proc ; 2024: 409-418, 2024.
Article in English | MEDLINE | ID: mdl-38827107

ABSTRACT

Cancer outcomes are poor in resource-limited countries owing to high costs and an insufficient pathologist-population ratio. The advent of digital pathology has assisted in improving cancer outcomes; however, Whole Slide Image scanners are expensive and not affordable in low-income countries. Microscope-acquired images, on the other hand, are cheap to collect and can be more viable for automation of cancer detection. In this study, we propose LCH-Network, a novel method to identify the cancer mitotic count from microscope-acquired images. We introduced Label Mix, and also synthesized images using GANs to handle data imbalance. Moreover, we applied progressive resolution to handle different image scales for mitotic localization. We achieved an F1-score of 0.71 and outperformed other existing techniques. Our findings enable mitotic count estimation from microscopic images with a low-cost setup. Clinically, our method could help avoid presumptive treatment without a confirmed cancer diagnosis.

13.
Digit Health ; 10: 20552076241255471, 2024.
Article in English | MEDLINE | ID: mdl-38778869

ABSTRACT

Objective: The mitotic activity index is an important prognostic factor in the diagnosis of cancer. The task of mitosis detection is difficult as the nuclei are microscopic in size and partially labeled, and there are many more non-mitotic nuclei compared to mitotic ones. In this paper, we highlight the challenges of current mitosis detection pipelines and propose a method to tackle these challenges. Methods: Our proposed methodology is inspired by recent research on deep learning and an extensive analysis of the dataset and training pipeline. We first used the MiDoG'22 dataset for training, validation, and testing. We then tested the methodology without fine-tuning on the TUPAC'16 dataset and on a real-time case from Shaukat Khanum Memorial Cancer Hospital and Research Centre. Results: Our methodology has shown promising results both quantitatively and qualitatively. Quantitatively, our methodology achieved an F1-score of 0.87 on the MiDoG'22 dataset and an F1-score of 0.83 on the TUPAC dataset. Qualitatively, our methodology is generalizable and interpretable across various datasets and clinical settings. Conclusion: In this paper, we highlight the challenges of current mitosis detection pipelines and propose a method that can accurately predict mitotic nuclei. We illustrate the accuracy, generalizability, and interpretability of our approach across various datasets and clinical settings. Our methodology can speed up the adoption of computer-aided digital pathology in clinical settings.
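
The abstract above reports detection quality as an F1-score. As a hedged illustration of how such a score can be obtained for point detections, the sketch below matches predicted centres to annotated centres within a distance threshold and then computes F1. The greedy nearest-neighbour matching, the 5-pixel threshold and the coordinates are all assumptions made for this example; the datasets named in the abstract define their own evaluation protocols.

```python
import math


def detection_f1(predicted, annotated, max_dist):
    """Greedily match predicted points to annotated points; return (f1, tp, fp, fn)."""
    unmatched = list(annotated)
    tp = 0
    for px, py in predicted:
        best = None  # (distance, index) of the closest still-unmatched annotation
        for idx, (ax, ay) in enumerate(unmatched):
            dist = math.hypot(px - ax, py - ay)
            if dist <= max_dist and (best is None or dist < best[0]):
                best = (dist, idx)
        if best is not None:
            unmatched.pop(best[1])  # each annotation can be matched only once
            tp += 1
    fp = len(predicted) - tp
    fn = len(unmatched)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return f1, tp, fp, fn


# Invented pixel coordinates of detected and annotated mitotic figures.
predicted = [(10, 10), (50, 52), (200, 200)]
annotated = [(12, 9), (50, 50), (120, 80)]
f1, tp, fp, fn = detection_f1(predicted, annotated, max_dist=5)
print(f1, tp, fp, fn)
```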

14.
J Biol Chem ; 287(3): 2179-90, 2012 Jan 13.
Article in English | MEDLINE | ID: mdl-22117062

ABSTRACT

The outer mitochondrial membrane protein, the voltage-dependent anion channel (VDAC), is increasingly implicated in the control of apoptosis. Oligomeric assembly of VDAC1 was shown to be coupled to apoptosis induction, with oligomerization increasing substantially upon apoptosis induction and inhibited by apoptosis blockers. In this study, structure- and computation-based selection of the predicted VDAC1 dimerization site, in combination with site-directed mutagenesis, cysteine replacement, and chemical cross-linking, were employed to identify contact sites between VDAC1 molecules in dimers and higher oligomers. The predicted weakly stable β-strands were experimentally found to represent the interfaces between VDAC1 monomers composing the oligomer. Replacing hydrophobic amino acids with charged residues in β-strands 1, 2, and 19 interfered with VDAC1 oligomerization. The proximity of β-strands 1, 2, and 19 within the VDAC1 dimer and the existence of other association sites involving β-strand 16 were confirmed when a cysteine was introduced at defined positions in cysteineless VDAC1 mutants, together with the use of cysteine-specific cross-linker bis(maleimido)ethane. Moreover, the results suggest that VDAC1 also exists as a dimer that upon apoptosis induction undergoes conformational changes and that its oligomerization proceeds through a series of interactions involving two distinct interfaces. Dissection of VDAC1 dimerization/oligomerization as presented here provides structural insight into the oligomeric status of cellular VDAC1 under physiological and apoptotic conditions.


Subjects
Apoptosis/physiology , Protein Multimerization/physiology , Voltage-Dependent Anion Channel 1/metabolism , Animals , HEK293 Cells , Humans , Mice , Protein Structure, Quaternary , Protein Structure, Secondary , Rats , Structure-Activity Relationship , Voltage-Dependent Anion Channel 1/chemistry , Voltage-Dependent Anion Channel 1/genetics
15.
Biochim Biophys Acta ; 1818(4): 927-41, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22051023

ABSTRACT

We discuss recent progress in computational studies of membrane proteins based on physical models with parameters derived from bioinformatics analysis. We describe computational identification of membrane proteins and prediction of their topology from sequence, discovery of sequence and spatial motifs, and implications of these discoveries. The detection of evolutionary signal for understanding the substitution pattern of residues in the TM segments and for sequence alignment is also discussed. We further discuss empirical potential functions for energetics of inserting residues in the TM domain, for interactions between TM helices or strands, and their applications in predicting lipid-facing surfaces of the TM domain. Recent progress in structure prediction of membrane proteins is also reviewed, with further discussions on calculation of ensemble properties such as melting temperature based on a simplified state space model. Additional topics include prediction of oligomerization state of membrane proteins, identification of the interfaces for protein-protein interactions, and design of membrane proteins. This article is part of a Special Issue entitled: Protein Folding in Membranes.


Subjects
Computational Biology/methods , Membrane Proteins/chemistry , Membrane Proteins/metabolism , Models, Molecular , Amino Acid Sequence , Animals , Evolution, Molecular , Humans , Hydrophobic and Hydrophilic Interactions , Molecular Sequence Data , Protein Binding
16.
Sci Rep ; 13(1): 806, 2023 01 16.
Article in English | MEDLINE | ID: mdl-36646775

ABSTRACT

Long non-coding RNAs (lncRNAs), which were once considered transcriptional noise, are now in the limelight of current research. LncRNAs play a major role in regulating various biological processes such as imprinting, cell differentiation, and splicing. The mutations of lncRNAs are involved in various complex diseases. Identifying lncRNA-disease associations has gained a lot of attention, as predicting them efficiently will lead towards better disease treatment. In this study, we have developed a machine learning model that predicts disease-related lncRNAs by combining sequence and structure-based features. The features were trained on SVM and Random Forest classifiers. We have compared our method with the state-of-the-art and obtained the highest F1 score of 76% on the SVM classifier. Moreover, this study has overcome two serious limitations of the reported method, which are the lack of redundancy checking and the implementation of oversampling for balancing the positive and negative class. Our method has achieved improved performance among machine learning models reported for lncRNA-disease associations. Combining multiple features, specifically lncRNA sequence mutations, contributes significantly to disease-related lncRNA prediction.


Subjects
RNA, Long Noncoding , RNA, Long Noncoding/genetics , Computational Biology/methods , Machine Learning , Random Forest , Cell Differentiation
17.
Article in English | MEDLINE | ID: mdl-38083556

ABSTRACT

Recent advances in Natural Language Processing (NLP) have produced state-of-the-art results on several sequence-to-sequence (seq2seq) tasks. Enhancements in embedders and their training methodologies have shown significant improvement on downstream tasks. Word vector models like Word2Vec, FastText & GloVe were widely used over one-hot encoded vectors for years until the advent of deep contextualized embedders. Protein sequences consist of 20 naturally occurring amino acids that can be treated as the language of nature. These amino acids, in combination with each other, make up the biological functions. The choice of vector representation and architecture design for a biological task is highly dependent upon the nature of the task. We utilize unlabelled protein sequences to train a Convolution and Gated Recurrent Network (CGRN) embedder using the Masked Language Modeling (MLM) technique, which shows a significant performance boost in a resource-constrained setting on two downstream tasks, i.e., an F1-score (Q8) of 73.1% on Secondary Structure Prediction (SSP) & an F1-score of 84% on Intrinsically Disordered Region Prediction (IDRP). We also compare different architectures on downstream tasks to show the impact of the nature of the biological task on the performance of the model.


Subjects
Language , Natural Language Processing , Amino Acid Sequence , Unified Medical Language System , Amino Acids
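
The abstract above trains an embedder on unlabelled protein sequences with Masked Language Modeling. The sketch below shows only the data-preparation side of that idea: hiding a random subset of residues and keeping the hidden residues as prediction targets. The 15% mask rate, the `#` mask symbol and the example peptide are assumptions made for illustration; the abstract does not state the masking scheme the authors used.

```python
import random

MASK_TOKEN = "#"  # hypothetical mask symbol; any character outside the amino-acid alphabet works


def mask_sequence(sequence, mask_rate=0.15, seed=0):
    """Replace a random subset of residues with MASK_TOKEN.

    Returns the masked string and a dict {position: original residue} that a
    model would be trained to recover.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = max(1, round(len(sequence) * mask_rate))
    positions = rng.sample(range(len(sequence)), n_mask)
    tokens = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        tokens[pos] = MASK_TOKEN
    return "".join(tokens), targets


sequence = "MKTAYIAKQRQISFVKSHFS"  # arbitrary 20-residue example peptide
masked, targets = mask_sequence(sequence)
print(masked)
print(targets)
```

Because the targets come from the sequence itself, no functional or structural labels are needed at this stage, which is what makes the approach usable on unlabelled data.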
18.
Methods Mol Biol ; 2627: 321-328, 2023.
Article in English | MEDLINE | ID: mdl-36959455

ABSTRACT

β-barrel membrane proteins (βMPs), found in the outer membrane of Gram-negative bacteria, mitochondria, and chloroplasts, play important roles in membrane anchoring, pore formation, and enzyme activities. However, it is often difficult to determine their structures experimentally, and the knowledge of their structures is currently limited. We have developed a method to predict the 3D architectures of βMPs. We can accurately construct transmembrane domains of βMPs by predicting their strand registers, from which full 3D atomic structures are derived. Using 3D Beta-barrel Membrane Protein Predictor (3D-BMPP), we can further accurately model the extended beta barrels and loops in non-TM regions with overall greater structure prediction coverage. 3D-BMPP is a general technique that can be applied to protein families with limited sequences as well as proteins with novel folds. 3D-BMPP can be broadly applied to genome-wide βMP structure prediction.


Subjects
Bacterial Outer Membrane Proteins , Membrane Proteins , Membrane Proteins/genetics , Membrane Proteins/chemistry , Protein Domains , Bacterial Outer Membrane Proteins/genetics , Bacterial Outer Membrane Proteins/chemistry
19.
J Biomol Struct Dyn ; : 1-10, 2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37787617

ABSTRACT

Multidrug efflux is a well-established mechanism of drug resistance in bacterial pathogens like Salmonella Typhi. styMdtM (locus name; STY4874) is a multidrug efflux transporter of the major facilitator superfamily expressed in S. Typhi. Functional assays identified several residues important for its transport activity. Here, we used an AlphaFold model to identify additional residues for analysis by mutagenesis. Mutation of peripheral residue Cys185 had no effect on the structure or function of the transporter. However, substitution of channel-lining residues Tyr29 and Tyr231 completely abolished transport function. Finally, mutation of Gln294, which faces peripheral helices of the transporter, resulted in the loss of transport of some substrates. Crystallization studies yielded diffraction data for the wild-type protein at 4.5 Å resolution and allowed the unit cell parameters to be established as a = b = 64.3 Å, c = 245.4 Å, α = β = γ = 90°, in space group P4. Our studies represent a further stepping stone towards a mechanistic understanding of the clinically important multidrug transporter styMdtM. Communicated by Ramaswamy H. Sarma.
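
The unit cell parameters quoted in the abstract above are enough to compute the cell volume. The sketch below applies the general triclinic volume formula, which reduces to a × b × c when all angles are 90°. Only the cell lengths and angles are taken from the abstract; the function name is illustrative.

```python
import math


def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume of a crystallographic unit cell (lengths in Å, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # General formula: V = abc * sqrt(1 - cos²α - cos²β - cos²γ + 2 cosα cosβ cosγ)
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell parameters as reported in the abstract (tetragonal, space group P4).
volume = unit_cell_volume(64.3, 64.3, 245.4)
print(round(volume))
```

For this cell the volume comes to roughly 1.01 × 10^6 Å^3.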

20.
Biochim Biophys Acta ; 1808(4): 1092-102, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21167813

ABSTRACT

Membrane proteins function in the diverse environment of the lipid bilayer. Experimental evidence suggests that some lipid molecules bind tightly to specific sites on the membrane protein surface. These lipid molecules often act as co-factors and play important functional roles. In this study, we have assessed the evolutionary selection pressure experienced at lipid-binding sites in a set of α-helical and β-barrel membrane proteins using posterior probability analysis of the ratio of synonymous vs. nonsynonymous substitutions (ω-ratio). We have also carried out a geometric analysis of the membrane protein structures to identify residues in close contact with co-crystallized lipids. We found that residues forming cholesterol-binding sites in both β(2)-adrenergic receptor and Na(+)-K(+)-ATPase exhibit strong conservation, which can be characterized by an expanded cholesterol consensus motif for GPCRs. Our results suggest the functional importance of aromatic stacking interactions and interhelical hydrogen bonds in facilitating protein-cholesterol interactions, which is now reflected in the expanded motif. We also find that residues forming the cardiolipin-binding site in formate dehydrogenase-N γ-subunit and the phosphatidylglycerol binding site in KcsA are under strong purifying selection pressure. Although the lipopolysaccharide (LPS)-binding site in ferric hydroxamate uptake receptor (FhuA) is only weakly conserved, we show using a statistical mechanical model that LPS binds to the least stable FhuA β-strand and protects it from the bulk lipid. Our results suggest that specific lipid binding may be a general mechanism employed by β-barrel membrane proteins to stabilize weakly stable regions. Overall, we find that the residues forming specific lipid binding sites on the surfaces of membrane proteins often experience strong purifying selection pressure.


Subjects
Lipid Bilayers/chemistry , Membrane Lipids/chemistry , Membrane Proteins/chemistry , Protein Structure, Tertiary , Amino Acids/chemistry , Amino Acids/metabolism , Bacterial Outer Membrane Proteins/chemistry , Bacterial Outer Membrane Proteins/metabolism , Binding Sites , Biological Evolution , Cardiolipins/chemistry , Cardiolipins/metabolism , Cholesterol/chemistry , Cholesterol/metabolism , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/metabolism , Formate Dehydrogenases/chemistry , Formate Dehydrogenases/metabolism , Lipid Bilayers/metabolism , Lipopolysaccharides/chemistry , Lipopolysaccharides/metabolism , Membrane Lipids/metabolism , Membrane Proteins/metabolism , Models, Molecular , Protein Binding , Receptors, Adrenergic, beta-2/chemistry , Receptors, Adrenergic, beta-2/metabolism , Receptors, G-Protein-Coupled/chemistry , Receptors, G-Protein-Coupled/metabolism , Sodium-Potassium-Exchanging ATPase/chemistry , Sodium-Potassium-Exchanging ATPase/metabolism