Results 1 - 20 of 147
1.
Proc Natl Acad Sci U S A ; 120(33): e2305393120, 2023 08 15.
Article in English | MEDLINE | ID: mdl-37556498

ABSTRACT

Toxin-antitoxin (TA) systems are a large group of small genetic modules found in prokaryotes and their mobile genetic elements. Type II TAs are encoded as bicistronic (two-gene) operons that encode two proteins: a toxin and a neutralizing antitoxin. Using our tool NetFlax (standing for Network-FlaGs for toxins and antitoxins), we have performed a large-scale bioinformatic analysis of proteinaceous TAs, revealing interconnected clusters constituting a core network of TA-like gene pairs. To understand the structural basis of toxin neutralization by antitoxins, we have predicted the structures of 3,419 complexes with AlphaFold2. Together with mutagenesis and functional assays, our structural predictions provide insights into the neutralizing mechanism of the hyperpromiscuous Panacea antitoxin domain. In antitoxins composed of standalone Panacea, the domain mediates direct toxin neutralization, while in multidomain antitoxins the neutralization is mediated by other domains, such as PAD1, Phd-C, and ZFD. We hypothesize that Panacea acts as a sensor that regulates TA activation. We have experimentally validated 16 NetFlax TA systems and used domain annotations and metabolic labeling assays to predict their potential mechanisms of toxicity (such as membrane disruption, and inhibition of cell division or protein synthesis) as well as biological functions (such as antiphage defense). We have validated the antiphage activity of a RosmerTA system encoded by Gordonia phage Kita, and used fluorescence microscopy to confirm its predicted membrane-depolarizing activity. The interactive version of the NetFlax TA network that includes structural predictions can be accessed at http://netflax.webflags.se/.


Subject(s)
Antitoxins , Bacterial Toxins , Antitoxins/genetics , Bacterial Toxins/metabolism , Prokaryotic Cells/metabolism , Operon/genetics , Computational Biology , Bacterial Proteins/genetics , Bacterial Proteins/metabolism
2.
Bioinformatics ; 40(6)2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38781500

ABSTRACT

MOTIVATION: Today, the prediction of structures of large protein complexes solely from their sequence information requires prior knowledge of the stoichiometry of the complex. To address this challenge, we have enhanced the Monte Carlo Tree Search algorithms in MoLPC to enable the assembly of protein complexes while simultaneously predicting their stoichiometry. RESULTS: In MoLPC2, we have improved the predictions by allowing the sampling of alternative AlphaFold predictions. Using MoLPC2, we accurately predicted the structures of 50 out of 175 nonredundant protein complexes (TM-score ≥ 0.8) without knowing the stoichiometry. MoLPC2 provides new opportunities for predicting protein complex structures without stoichiometry information. AVAILABILITY AND IMPLEMENTATION: MoLPC2 is freely available at https://github.com/hychim/molpc2. A notebook is also available from the repository for easy use.


Subject(s)
Algorithms , Monte Carlo Method , Proteins , Software , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Protein Conformation , Protein Folding , Databases, Protein
3.
Bioinformatics ; 40(1)2024 01 02.
Article in English | MEDLINE | ID: mdl-38175787

ABSTRACT

MOTIVATION: Understanding metal-protein interaction can provide structural and functional insights into cellular processes. As the number of protein sequences increases, developing fast yet precise computational approaches to predict and annotate metal-binding sites becomes imperative. Quick and resource-efficient pre-trained protein language model (pLM) embeddings have successfully predicted binding sites from protein sequences despite not using structural or evolutionary features (multiple sequence alignments). Using residue-level embeddings from the pLMs, we have developed a sequence-based method (M-Ionic) to identify metal-binding proteins and predict residues involved in metal binding. RESULTS: On independent validation of recent proteins, M-Ionic reports an area under the curve (AUROC) of 0.83 (recall = 84.6%) in distinguishing metal binding from non-binding proteins compared to AUROC of 0.74 (recall = 61.8%) of the next best method. In addition to comparable performance to the state-of-the-art method for identifying metal-binding residues (Ca2+, Mg2+, Mn2+, Zn2+), M-Ionic provides binding probabilities for six additional ions (i.e. Cu2+, PO43-, SO42-, Fe2+, Fe3+, Co2+). We show that the pLM embedding of a single residue contains sufficient information about its neighbours to predict its binding properties. AVAILABILITY AND IMPLEMENTATION: M-Ionic can be used on your protein of interest using a Google Colab Notebook (https://bit.ly/40FrRbK). The GitHub repository (https://github.com/TeamSundar/m-ionic) contains all code and data.
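
To make the two metrics quoted above concrete, here is a minimal Python sketch of how recall and AUROC can be computed for a binary metal-binding classifier. The toy labels and scores, the 0.5 threshold and the function names are illustrative assumptions; none of this is M-Ionic code or data.

```python
def recall(labels, scores, threshold=0.5):
    """Fraction of true positives recovered at the given score threshold."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("no positive examples")
    return sum(s >= threshold for s in positives) / len(positives)


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = metal-binding protein, 0 = non-binding protein.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]
print(recall(toy_labels, toy_scores), auroc(toy_labels, toy_scores))
```

Recall depends on the chosen threshold, whereas AUROC summarizes ranking quality over all thresholds, which is presumably why the abstract reports both.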


Subject(s)
Metals , Proteins , Proteins/chemistry , Amino Acid Sequence , Binding Sites , Ions , Protein Domains , Metals/chemistry , Metals/metabolism
4.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37405868

ABSTRACT

MOTIVATION: Despite near-experimental accuracy on single-chain predictions, there is still scope for improvement among multimeric predictions. Methods like AlphaFold-Multimer and FoldDock can accurately model dimers. However, how well these methods fare on larger complexes is still unclear. Further, evaluation methods of the quality of multimeric complexes are not well established. RESULTS: We analysed the performance of AlphaFold-Multimer on a homology-reduced dataset of homo- and heteromeric protein complexes. We highlight the differences between the pairwise and multi-interface evaluation of chains within a multimer. We describe why certain complexes perform well on one metric (e.g. TM-score) but poorly on another (e.g. DockQ). We propose a new score, Predicted DockQ version 2 (pDockQ2), to estimate the quality of each interface in a multimer. Finally, we modelled protein complexes (from CORUM) and identified two highly confident structures that do not have sequence homology to any existing structures. AVAILABILITY AND IMPLEMENTATION: All scripts, models, and data used to perform the analysis in this study are freely available at https://gitlab.com/ElofssonLab/afm-benchmark.


Subject(s)
Computational Biology , Protein Conformation , Computational Biology/methods
5.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36692145

ABSTRACT

MOTIVATION: Protein-protein interaction (PPI) networks and transcriptional regulatory networks are critical in regulating cells and their signaling. A thorough understanding of PPIs can provide more insights into cellular physiology at normal and disease states. Although numerous methods have been proposed to predict PPIs, it is still challenging for interaction prediction between unknown proteins. In this study, a novel neural network named AFTGAN was constructed to predict multi-type PPIs. Regarding feature input, ESM-1b embedding containing much biological information for proteins was added as a protein sequence feature besides amino acid co-occurrence similarity and one-hot coding. An ensemble network was also constructed based on a transformer encoder containing an AFT module (performing the weight operation on vital protein sequence feature information) and graph attention network (extracting the relational features of protein pairs) for the part of the network framework. RESULTS: The experimental results showed that the Micro-F1 of the AFTGAN based on three partitioning schemes (BFS, DFS and the random mode) on the SHS27K and SHS148K datasets was 0.685, 0.711 and 0.867, as well as 0.745, 0.819 and 0.920, respectively, all higher than that of other popular methods. In addition, the experimental comparisons confirmed the performance superiority of the proposed model for predicting PPIs of unknown proteins on the STRING dataset. AVAILABILITY AND IMPLEMENTATION: The source code is publicly available at https://github.com/1075793472/AFTGAN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
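
Since the results above are reported as Micro-F1 on a multi-type (multi-label) task, a minimal Python sketch of that metric may help. The label names, the toy predictions and the function name are hypothetical; this is not AFTGAN code and does not reproduce the reported numbers.

```python
def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1: pool TP, FP and FN over all samples and labels."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical interaction-type labels for three protein pairs.
truth = [{"binding", "catalysis"}, {"activation"}, {"binding"}]
pred = [{"binding"}, {"activation", "inhibition"}, {"binding"}]
print(micro_f1(truth, pred))
```

Micro-averaging pools the counts before computing precision and recall, so frequent interaction types weigh more heavily than rare ones.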


Subject(s)
Neural Networks, Computer , Software , Proteins/chemistry , Amino Acid Sequence , Protein Interaction Maps
6.
Mol Cell Proteomics ; 21(10): 100413, 2022 10.
Article in English | MEDLINE | ID: mdl-36115577

ABSTRACT

The assembly of proteins and peptides into amyloid fibrils is causally linked to serious disorders such as Alzheimer's disease. Multiple proteins have been shown to prevent amyloid formation in vitro and in vivo, ranging from highly specific chaperone-client pairs to completely nonspecific binding of aggregation-prone peptides. The underlying interactions remain elusive. Here, we turn to the machine learning-based structure prediction algorithm AlphaFold2 to obtain models for the nonspecific interactions of β-lactoglobulin, transthyretin, or thioredoxin 80 with the model amyloid peptide amyloid β and the highly specific complex between the BRICHOS chaperone domain of the C-terminal region of lung surfactant protein C and its polyvaline target. Using a combination of native mass spectrometry (MS) and ion mobility MS, we show that nonspecific chaperoning is driven predominantly by hydrophobic interactions of amyloid β with hydrophobic surfaces in β-lactoglobulin, transthyretin, and thioredoxin 80, and in part regulated by oligomer stability. For the C-terminal region of lung surfactant protein C, native MS and hydrogen-deuterium exchange MS reveal that a disordered region recognizes the polyvaline target by forming a complementary β-strand. Hence, we show that AlphaFold2 and MS can yield atomistic models of hard-to-capture protein interactions that reveal different chaperoning mechanisms based on separate ligand properties and may provide possible clues for specific therapeutic intervention.


Subject(s)
Amyloid beta-Peptides , Amyloid , Humans , Amyloid/chemistry , Amyloid/metabolism , Amyloid beta-Peptides/chemistry , Amyloid beta-Peptides/metabolism , Prealbumin , Deuterium , Ligands , Molecular Chaperones/metabolism , Mass Spectrometry , Machine Learning , Thioredoxins , Lactoglobulins , Pulmonary Surfactant-Associated Proteins
7.
Nucleic Acids Res ; 50(D1): D480-D487, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850135

ABSTRACT

The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%. Higher quality and consistency of annotations is provided by a newly implemented reviewing process and training of curators. The increased curation capacity is fostered by the integration of DisProt with APICURON, a dedicated resource for the proper attribution and recognition of biocuration efforts. Better interoperability is provided through the adoption of the Minimum Information About Disorder (MIADE) standard, an active collaboration with the Gene Ontology (GO) and Evidence and Conclusion Ontology (ECO) consortia and the support of the ELIXIR infrastructure.


Subject(s)
Databases, Protein , Intrinsically Disordered Proteins/metabolism , Molecular Sequence Annotation , Software , Amino Acid Sequence , DNA/genetics , DNA/metabolism , Datasets as Topic , Gene Ontology , Humans , Internet , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/genetics , Protein Binding , RNA/genetics , RNA/metabolism
8.
BMC Biol ; 21(1): 47, 2023 02 28.
Article in English | MEDLINE | ID: mdl-36855050

ABSTRACT

BACKGROUND: NorQ, a member of the MoxR-class of AAA+ ATPases, and NorD, a protein containing a Von Willebrand Factor Type A (VWA) domain, are essential for non-heme iron (FeB) cofactor insertion into cytochrome c-dependent nitric oxide reductase (cNOR). cNOR catalyzes NO reduction, a key step of bacterial denitrification. This work aimed at elucidating the specific mechanism of NorQD-catalyzed FeB insertion, and the general mechanism of the MoxR/VWA interacting protein families. RESULTS: We show that NorQ-catalyzed ATP hydrolysis, an intact VWA domain in NorD, and specific surface carboxylates on cNOR are all features required for cNOR activation. Supported by BN-PAGE, low-resolution cryo-EM structures of NorQ and the NorQD complex show that NorQ forms a circular hexamer with a monomer of NorD binding both to the side and to the central pore of the NorQ ring. Guided by AlphaFold predictions, we assign the density that "plugs" the NorQ ring pore to the VWA domain of NorD with a protruding "finger" inserting through the pore and suggest this binding mode to be general for MoxR/VWA couples. CONCLUSIONS: Based on our results, we present a tentative model for the mechanism of NorQD-catalyzed cNOR remodeling and suggest many of its features to be applicable to the whole MoxR/VWA family.


Subject(s)
AAA Proteins , Paracoccus denitrificans , Molecular Chaperones , Norethindrone , Structure-Activity Relationship
9.
J Struct Biol ; 215(4): 108023, 2023 12.
Article in English | MEDLINE | ID: mdl-37652396

ABSTRACT

Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.


Subject(s)
Proteins , Tandem Repeat Sequences , Proteins/genetics , Proteins/chemistry , Tandem Repeat Sequences/genetics , Amino Acid Sequence
10.
Bioinformatics ; 38(4): 954-961, 2022 01 27.
Article in English | MEDLINE | ID: mdl-34788800

ABSTRACT

MOTIVATION: In the last decade, de novo protein structure prediction accuracy for individual proteins has improved significantly by utilising deep learning (DL) methods for harvesting the co-evolution information from large multiple sequence alignments (MSAs). The same approach can, in principle, also be used to extract information about evolutionary-based contacts across protein-protein interfaces. However, most earlier studies have not used the latest DL methods for inter-chain contact distance prediction. This article introduces a fold-and-dock method based on predicted residue-residue distances with trRosetta. RESULTS: The method can simultaneously predict the tertiary and quaternary structure of a protein pair, even when the structures of the monomers are not known. The straightforward application of this method to a standard dataset for protein-protein docking yielded limited success. However, using alternative methods for generating MSAs allowed us to accurately dock significantly more proteins. We also introduced a novel scoring function, PconsDock, that accurately separates 98% of correctly and incorrectly folded and docked proteins. The average performance of the method is comparable to the use of traditional, template-based or ab initio shape-complementarity-only docking methods. Moreover, the results of conventional and fold-and-dock approaches are complementary, and thus a combined docking pipeline could increase overall docking success significantly. This methodology contributed to the best model for one of the CASP14 oligomeric targets, H1065. AVAILABILITY AND IMPLEMENTATION: All scripts for predictions and analysis are available from https://github.com/ElofssonLab/bioinfo-toolbox/ and https://gitlab.com/ElofssonLab/benchmark5/. All models, joined alignments, and evaluation results are available from the following figshare repository: https://doi.org/10.6084/m9.figshare.14654886.v2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Proteins , Proteins/chemistry , Sequence Alignment , Computational Biology/methods
11.
Proteins ; 90(7): 1493-1505, 2022 07.
Article in English | MEDLINE | ID: mdl-35246997

ABSTRACT

Scoring docking solutions is a difficult task, and many methods have been developed for this purpose. In docking, only a handful of the hundreds of thousands of models generated by docking algorithms are acceptable, causing difficulties when developing scoring functions. Today's best scoring functions can significantly increase the number of top-ranked models but still fail for most targets. Here, we examine the possibility of utilizing predicted interface residues to score docking models generated during the scan stage of a docking algorithm. Many methods have been developed to infer the regions of a protein surface that interact with another protein, but most have not been benchmarked using docking algorithms. This study systematically tests different interface prediction methods for scoring >300,000 low-resolution rigid-body template-free docking decoys. Overall, we find that contact-based interface prediction by BIPSPI is the best method to score docking solutions, with >12% of first ranked docking models being acceptable. Additional experiments indicated precision as a high-importance metric when estimating interface prediction quality, focusing on docking constraints production. Finally, we discussed several limitations for adopting interface predictions as constraints in a docking protocol.
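
As a rough illustration of the idea described above (ranking docking decoys by their agreement with predicted interface residues), here is a minimal Python sketch. It uses a deliberately simple overlap score that is an assumption made for illustration, not the scoring scheme evaluated in the study; the residue numbers, decoy names and function names are all hypothetical.

```python
def interface_score(decoy_interface, predicted_interface):
    """Fraction of a decoy's interface residues that were also predicted
    to be interface residues (0.0 for an empty decoy interface)."""
    if not decoy_interface:
        return 0.0
    return len(decoy_interface & predicted_interface) / len(decoy_interface)


def rank_decoys(decoys, predicted_interface):
    """Return decoy names sorted from best to worst overlap score."""
    return sorted(
        decoys,
        key=lambda name: interface_score(decoys[name], predicted_interface),
        reverse=True,
    )


# Hypothetical residue numbers; none of this is data from the study.
predicted = {10, 11, 12, 45, 46}
decoys = {
    "decoy_a": {10, 11, 45, 80},
    "decoy_b": {70, 71, 72},
    "decoy_c": {11, 12},
}
print(rank_decoys(decoys, predicted))
```

With a score of this kind, false-positive interface predictions directly reward wrong decoys, which is consistent with the abstract's remark that precision matters when judging interface predictors for this purpose.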


Subject(s)
Proteins , Software , Algorithms , Benchmarking , Molecular Docking Simulation , Protein Binding , Protein Conformation , Protein Interaction Mapping/methods , Proteins/chemistry
12.
Bioinformatics ; 37(3): 360-366, 2021 04 20.
Article in English | MEDLINE | ID: mdl-32780838

ABSTRACT

MOTIVATION: Proteins are ubiquitous molecules whose function in biological processes is determined by their 3D structure. Experimental identification of a protein's structure can be time-consuming, prohibitively expensive and not always possible. Alternatively, protein folding can be modeled using computational methods, which however are not guaranteed to always produce optimal results. GraphQA is a graph-based method to estimate the quality of protein models, that possesses favorable properties such as representation learning, explicit modeling of both sequential and 3D structure, geometric invariance and computational efficiency. RESULTS: GraphQA performs similarly to state-of-the-art methods despite using a relatively low number of input features. In addition, the graph network structure provides an improvement over the architecture used in ProQ4 operating on the same input features. Finally, the individual contributions of GraphQA components are carefully evaluated. AVAILABILITY AND IMPLEMENTATION: PyTorch implementation, datasets, experiments and link to an evaluation server are available through this GitHub repository: github.com/baldassarreFe/graphqa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neural Networks, Computer , Proteins , Protein Folding
13.
PLoS Comput Biol ; 17(4): e1008798, 2021 04.
Article in English | MEDLINE | ID: mdl-33857128

ABSTRACT

Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryotic specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning it is often possible to predict a protein's structure. However, the unique sequence features present in repeat proteins have made it challenging to use direct coupling analysis for predicting contacts. Here, we show that deep learning-based methods (trRosetta, DeepMetaPsicov (DMP) and PconsC4) overcome this problem and can predict intra- and inter-unit contacts in repeat proteins. In a benchmark dataset of 815 repeat proteins, about 90% can be correctly modelled. Further, among 48 PFAM families lacking a protein structure, we produce models of 41 families with estimated high accuracy.


Subject(s)
Models, Molecular , Proteins/chemistry , Computational Biology/methods , Protein Conformation
14.
PLoS Comput Biol ; 17(6): e1009048, 2021 06.
Article in English | MEDLINE | ID: mdl-34081706

ABSTRACT

Recently, an increasing number of studies have demonstrated that miRNAs are involved in human diseases, indicating that miRNAs might be a potential pathogenic factor for various diseases. Therefore, figuring out the relationship between miRNAs and diseases plays a critical role in not only the development of new drugs, but also the formulation of individualized diagnosis and treatment. As the prediction of miRNA-disease association via biological experiments is expensive and time-consuming, computational methods have a positive effect on revealing the association. In this study, a novel prediction model integrating GCN, CNN and Squeeze-and-Excitation Networks (GCSENet) was constructed for the identification of miRNA-disease association. The model first captured features by GCN based on a heterogeneous graph including diseases, genes and miRNAs. Then, considering the different effects of genes on each type of miRNA and disease, as well as the different effects of the miRNA-gene and disease-gene relationships on miRNA-disease association, a feature weight was set and a combination of miRNA-gene and disease-gene associations was added as feature input for the convolution operation in CNN. Furthermore, the squeeze and excitation blocks of SENet were applied to determine the importance of each feature channel and enhance useful features by means of the attention mechanism, thus achieving a satisfactory prediction of miRNA-disease association. The proposed method was compared against other state-of-the-art methods. It achieved an AUROC score of 95.02% and an AUPR score of 95.55% in a 10-fold cross-validation, which led to the finding that the proposed method is superior to these popular methods on most of the performance evaluation indexes.


Subject(s)
Genetic Predisposition to Disease , MicroRNAs/genetics , Models, Biological , Algorithms , Humans , Machine Learning , Reproducibility of Results
15.
PLoS Comput Biol ; 17(8): e1009278, 2021 08.
Article in English | MEDLINE | ID: mdl-34403419

ABSTRACT

CPA/AT transporters are made up of a scaffold and a core domain. The core domain contains two non-canonical helices (broken or reentrant) that mediate the transport of ions, amino acids or other charged compounds. During evolution, these transporters have undergone substantial changes in structure, topology and function. To shed light on these structural transitions, we create models for all families using an integrated topology annotation method. We find that the CPA/AT transporters can be classified into four fold-types based on their structure: (1) the CPA-broken fold-type, (2) the CPA-reentrant fold-type, (3) the BART fold-type, and (4) a previously undescribed fold-type, the Reentrant-Helix-Reentrant fold-type. Several topological transitions are identified, including the transition between a broken and reentrant helix, one transition between a loop and a reentrant helix, complete changes of orientation, and changes in the number of scaffold helices. These transitions are mainly caused by gene duplication and shuffling events. Structural models, topology information and other details are presented in a searchable database, CPAfold (cpafold.bioinfo.se).


Subject(s)
Evolution, Molecular , Membrane Transport Proteins/chemistry , Animals , Humans , Models, Molecular , Protein Conformation
16.
Nucleic Acids Res ; 48(D1): D269-D276, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31713636

ABSTRACT

The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows the capture of more information from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be effectively used to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and thereby helping to illuminate the 'dark' proteome.


Subject(s)
Databases, Protein , Intrinsically Disordered Proteins/chemistry , Biological Ontologies , Data Curation , Molecular Sequence Annotation
17.
Proteins ; 89(12): 1770-1786, 2021 12.
Article in English | MEDLINE | ID: mdl-34519095

ABSTRACT

The potential of deep learning has been recognized in the protein structure prediction community for some time, and became indisputable after CASP13. In CASP14, deep learning has boosted the field to unanticipated levels reaching near-experimental accuracy. This success comes from advances transferred from other machine learning areas, as well as methods specifically designed to deal with protein sequences and structures, and their abstractions. Novel emerging approaches include (i) geometric learning, that is, learning on representations such as graphs, three-dimensional (3D) Voronoi tessellations, and point clouds; (ii) pretrained protein language models leveraging attention; (iii) equivariant architectures preserving the symmetry of 3D space; (iv) use of large meta-genome databases; (v) combinations of protein representations; and (vi) finally truly end-to-end architectures, that is, differentiable models starting from a sequence and returning a 3D structure. Here, we provide an overview and our opinion of the novel deep learning approaches developed in the last 2 years and widely used in CASP14.


Subject(s)
Amino Acid Sequence , Protein Conformation , Proteins , Software , Computational Biology , Databases, Protein , Deep Learning , Proteins/chemistry , Proteins/metabolism , Sequence Analysis, Protein
18.
J Biol Chem ; 294(45): 16663-16671, 2019 11 08.
Article in English | MEDLINE | ID: mdl-31537648

ABSTRACT

Assembly of the mitochondrial respiratory chain requires the coordinated synthesis of mitochondrial and nuclear encoded subunits, redox co-factor acquisition, and correct joining of the subunits to form functional complexes. The conserved Cbp3-Cbp6 chaperone complex binds newly synthesized cytochrome b and supports the ordered acquisition of the heme co-factors. Moreover, it functions as a translational activator by interacting with the mitoribosome. Cbp3 consists of two distinct domains: an N-terminal domain present in mitochondrial Cbp3 homologs and a highly conserved C-terminal domain comprising a ubiquinol-cytochrome c chaperone region. Here, we solved the crystal structure of this C-terminal domain from a bacterial homolog at 1.4 Å resolution, revealing a unique all-helical fold. This structure allowed mapping of the interaction sites of yeast Cbp3 with Cbp6 and cytochrome b via site-specific photo-cross-linking. We propose that mitochondrial Cbp3 homologs carry an N-terminal extension that positions the conserved C-terminal domain at the ribosomal tunnel exit for an efficient interaction with its substrate, the newly synthesized cytochrome b protein.


Subject(s)
Cytochromes b/metabolism , Membrane Proteins/metabolism , Mitochondria/metabolism , Molecular Chaperones/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Brucella abortus/metabolism , Crystallography, X-Ray , Cytochromes b/chemistry , Cytochromes b/genetics , Electron Transport Chain Complex Proteins/chemistry , Electron Transport Chain Complex Proteins/metabolism , Membrane Proteins/chemistry , Membrane Proteins/genetics , Mitochondrial Proteins/genetics , Mitochondrial Proteins/metabolism , Molecular Chaperones/chemistry , Molecular Chaperones/genetics , Protein Domains , Protein Interaction Domains and Motifs , Protein Structure, Tertiary , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/genetics , Sequence Alignment
19.
Bioinformatics ; 35(15): 2677-2679, 2019 08 01.
Article in English | MEDLINE | ID: mdl-30590407

ABSTRACT

MOTIVATION: Residue contact prediction was revolutionized recently by the introduction of direct coupling analysis (DCA). Further improvements, in particular for small families, have been obtained by the combination of DCA and deep learning methods. However, existing deep learning contact prediction methods often rely on a number of external programs and are therefore computationally expensive. RESULTS: Here, we introduce a novel contact predictor, PconsC4, which performs on par with state of the art methods. PconsC4 is heavily optimized, does not use any external programs and therefore is significantly faster and easier to use than other methods. AVAILABILITY AND IMPLEMENTATION: PconsC4 is freely available under the GPL license from https://github.com/ElofssonLab/PconsC4. Installation is easy using the pip command and works on any system with Python 3.5 or later and a GCC compiler. It does not require a GPU nor special hardware. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Deep Learning , Software
20.
PLoS Comput Biol ; 15(7): e1007186, 2019 07.
Article in English | MEDLINE | ID: mdl-31329574

ABSTRACT

Intrinsic disorder is more abundant in eukaryotic than prokaryotic proteins. Methods predicting intrinsic disorder are based on the amino acid sequence of a protein. Therefore, there must exist an underlying difference in the sequences between eukaryotic and prokaryotic proteins causing the (predicted) difference in intrinsic disorder. By comparing proteins from complete eukaryotic and prokaryotic proteomes, we show that the difference in intrinsic disorder emerges from the linker regions connecting Pfam domains. Eukaryotic proteins have more extended linker regions, and in addition, the eukaryotic linkers are significantly more disordered, 38% vs. 12-16% disordered residues. Next, we examined the underlying reason for the increase in disorder in eukaryotic linkers, and we found that the changes in abundance of only three amino acids cause the increase. Eukaryotic proteins contain 8.6% serine, while prokaryotic proteins have 6.5%; eukaryotic proteins also contain 5.4% proline and 5.3% isoleucine compared with 4.0% proline and ≈ 7.5% isoleucine in the prokaryotes. All three of these differences contribute to the increased disorder in eukaryotic proteins. It is tempting to speculate that the increase in serine frequencies in eukaryotes is related to regulation by kinases, but direct evidence for this is lacking. The differences are observed in all phyla, protein families, structural regions and types of protein but are most pronounced in disordered and linker regions. The observation that differences in the abundance of three amino acids cause the difference in disorder between eukaryotic and prokaryotic proteins raises the question: Are amino acid frequencies different in eukaryotic linkers because the linkers are more disordered or do the differences cause the increased disorder?
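
The serine, proline and isoleucine percentages above are amino acid composition values. A minimal Python sketch of how such percentages can be computed from a set of sequences follows; the toy sequences and the function name are hypothetical and far too small to say anything about real proteomes.

```python
from collections import Counter


def composition(sequences, residues="SPI"):
    """Percentage of each requested residue, pooled over all sequences."""
    counts = Counter()
    total = 0
    for seq in sequences:
        seq = seq.upper()
        counts.update(seq)   # count every residue letter
        total += len(seq)
    if total == 0:
        raise ValueError("no residues")
    return {aa: 100.0 * counts[aa] / total for aa in residues}


# Hypothetical toy sequences in one-letter amino acid code.
toy = ["MSSPIK", "APSI"]
print(composition(toy))
```

Running the same function separately on eukaryotic and prokaryotic sequence sets, or on linker regions only, would give the kind of side-by-side frequencies the abstract compares.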


Subject(s)
Eukaryotic Cells/metabolism , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Prokaryotic Cells/metabolism , Amino Acids/chemistry , Animals , Computational Biology , Databases, Protein , Evolution, Molecular , Humans , Intrinsically Disordered Proteins/genetics , Isoleucine/chemistry , Proline/chemistry , Protein Domains , Selection, Genetic , Serine/chemistry