Results 1 - 20 of 26
1.
Bioinformatics ; 40(4)2024 03 29.
Article in English | MEDLINE | ID: mdl-38514422

ABSTRACT

MOTIVATION: Deep learning algorithms applied to structural biology often struggle to converge to meaningful solutions when limited data is available, since they are required to learn complex physical rules from examples. State-of-the-art force fields, however, cannot interface with deep learning algorithms due to their implementation. RESULTS: We present MadraX, a force field implemented as a differentiable PyTorch module, able to interact with deep learning algorithms in an end-to-end fashion. AVAILABILITY AND IMPLEMENTATION: MadraX documentation, together with tutorials and installation guide, is available at madrax.readthedocs.io.


Subject(s)
Deep Learning; Algorithms; Documentation
2.
Nucleic Acids Res ; 49(W1): W52-W59, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34057475

ABSTRACT

We provide integrated protein sequence-based predictions via https://bio2byte.be/b2btools/. The aim of our predictions is to identify the biophysical behaviour or features of proteins that are not readily captured by structural biology and/or molecular dynamics approaches. Upload of a FASTA file or text input of a sequence provides integrated predictions from DynaMine backbone and side-chain dynamics, conformational propensities, and derived EFoldMine early folding, DisoMine disorder, and Agmata β-sheet aggregation. These predictions, several of which were previously not available online, capture 'emergent' properties of proteins, i.e. the inherent biophysical propensities encoded in their sequence, rather than context-dependent behaviour (e.g. final folded state). In addition, upload of a multiple sequence alignment (MSA) in a variety of formats enables exploration of the biophysical variation observed in homologous proteins. The associated plots indicate the biophysical limits of functionally relevant protein behaviour, with unusual residues flagged by a Gaussian mixture model analysis. The prediction results are available as JSON or CSV files and directly accessible via an API. Online visualisation is available as interactive plots, with brief explanations and tutorial pages included. The server and API employ an email-free token-based system that can be used to anonymously access previously generated results.


Subject(s)
Proteins/chemistry; Sequence Alignment; Sequence Analysis, Protein/methods; Software; Internet
3.
Bioinformatics ; 37(20): 3473-3479, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33983381

ABSTRACT

MOTIVATION: Proteins able to undergo liquid-liquid phase separation (LLPS) in vivo and in vitro are drawing a lot of interest, due to their functional relevance for cell life. Nevertheless, the proteome-scale experimental screening of these proteins seems unfeasible because, besides being expensive and time-consuming, LLPS is heavily influenced by multiple environmental conditions such as concentration, pH and temperature, thus requiring a combinatorial number of experiments for each protein. RESULTS: To overcome this problem, we propose a neural network model able to predict the LLPS behavior of proteins given specified experimental conditions, effectively predicting the outcome of in vitro experiments. Our model can be used to rapidly screen proteins and experimental conditions searching for LLPS, thus reducing the search space that needs to be covered experimentally. We experimentally validate Droppler's prediction on the TAR DNA-binding protein in different experimental conditions, showing the consistency of its predictions. AVAILABILITY AND IMPLEMENTATION: A Python implementation of Droppler is available at https://bitbucket.org/grogdrinker/droppler. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

4.
Nucleic Acids Res ; 48(W1): W36-W40, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32459331

ABSTRACT

Nuclear magnetic resonance (NMR) spectroscopy data provides valuable information on the behaviour of proteins in solution. The primary data to determine when studying proteins are the per-atom NMR chemical shifts, which reflect the local environment of atoms and provide insights into amino acid residue dynamics and conformation. Within an amino acid residue, chemical shifts present multi-dimensional and complexly cross-correlated information, making them difficult to analyse. The ShiftCrypt method, based on neural network auto-encoder architecture, compresses the per-amino acid chemical shift information in a single, interpretable, amino acid-type independent value that reflects the biophysical state of a residue. We here present the ShiftCrypt web server, which makes the method readily available. The server accepts chemical shifts input files in the NMR Exchange Format (NEF) or NMR-STAR format, executes ShiftCrypt and visualises the results, which are also accessible via an API. It also enables the "biophysically-based" pairwise alignment of two proteins based on their ShiftCrypt values. This approach uses Dynamic Time Warping and can optionally include their amino acid code information, and has applications in, for example, the alignment of disordered regions. The server uses a token-based system to ensure the anonymity of the users and results. The web server is available at www.bio2byte.be/shiftcrypt.


Subject(s)
Nuclear Magnetic Resonance, Biomolecular/methods; Proteins/chemistry; Software; Amino Acids/chemistry; Neural Networks, Computer; Protein Denaturation; Protein Folding; Protein Unfolding
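
The pairwise alignment described in this abstract relies on Dynamic Time Warping (DTW). The sketch below is a textbook DTW over two per-residue value profiles, with an absolute-difference local cost and a traceback of the warping path. It is not the ShiftCrypt server's implementation: the function name `dtw_align`, the cost function and the example values are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def dtw_align(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the DTW cost and warping path between two non-empty profiles.

    Hypothetical helper: textbook DTW, absolute difference as local cost.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # match both positions
                cost[i - 1][j],      # stretch b[j - 1] over another element of a
                cost[i][j - 1],      # stretch a[i - 1] over another element of b
            )
    # Trace back from the end, always moving to the cheapest predecessor.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Made-up per-residue values for two short proteins of different length.
profile_a = [0.1, 0.5, 0.9, 0.5]
profile_b = [0.1, 0.5, 0.5, 0.9, 0.5]
total_cost, warping_path = dtw_align(profile_a, profile_b)
print(total_cost, warping_path)
```

Each pair `(i, j)` in the returned path says that residue `i` of the first protein is aligned with residue `j` of the second, so one residue can be matched to several residues of the other profile.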
5.
Bioinformatics ; 36(7): 2076-2081, 2020 04 01.
Article in English | MEDLINE | ID: mdl-31904854

ABSTRACT

MOTIVATION: Protein beta-aggregation is an important but poorly understood phenomenon involved in diseases as well as in beneficial physiological processes. However, although it has been investigated for over 50 years, very little is known about its mechanisms of action. Moreover, the identification of regions involved in aggregation is still an open problem and the state-of-the-art methods are often inadequate in real-case applications. RESULTS: In this article we present AgMata, an unsupervised tool for the identification of such regions from the amino acid sequence, based on a generalized definition of statistical potentials that includes biophysical information. The tool outperforms the state-of-the-art methods on two different benchmarks. As a case study, we applied our tool to human ataxin-3, a protein involved in Machado-Joseph disease. Interestingly, AgMata identifies aggregation-prone residues that share the very same structural environment. Additionally, it successfully predicts the outcome of in vitro mutagenesis experiments, identifying point mutations that lead to an alteration of the aggregation propensity of the wild-type ataxin-3. AVAILABILITY AND IMPLEMENTATION: A Python implementation of the tool is available at https://bitbucket.org/bio2byte/agmata. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Machado-Joseph Disease; Proteins; Amino Acid Sequence; Ataxin-3; Humans
6.
PLoS Comput Biol ; 16(4): e1007722, 2020 04.
Article in English | MEDLINE | ID: mdl-32352965

ABSTRACT

Protein solubility is a key aspect for many biotechnological, biomedical and industrial processes, such as the production of active proteins and antibodies. In addition, understanding the molecular determinants of the solubility of proteins may be crucial to shed light on the molecular mechanisms of diseases caused by aggregation processes such as amyloidosis. Here we present SKADE, a novel neural network protein solubility predictor, and we show how it can provide novel insight into the protein solubility mechanisms, thanks to its neural attention architecture. First, we show that SKADE compares favourably with state-of-the-art tools while using just the protein sequence as input. Then, thanks to the neural attention mechanism, we use SKADE to investigate the patterns learned during training and we analyse its decision process. We use this property to show that, while the attention profiles do not correlate with obvious sequence aspects such as biophysical properties of the amino acids, they suggest that N- and C-termini are the most relevant regions for solubility prediction and are predictive for complex emergent properties such as aggregation-prone regions involved in beta-amyloidosis and contact density. Moreover, SKADE is able to identify mutations that increase or decrease the overall solubility of the protein, allowing it to be used to perform large-scale in silico mutagenesis of proteins in order to maximize their solubility.


Subject(s)
Computational Biology/methods; Nerve Net/physiology; Solubility; Algorithms; Amino Acid Sequence/physiology; Amino Acids; Animals; Computer Simulation; Humans; Models, Molecular; Protein Conformation; Proteins/chemistry; Proteins/metabolism; Software
7.
Bioinformatics ; 35(22): 4617-4623, 2019 11 01.
Article in English | MEDLINE | ID: mdl-30994888

ABSTRACT

MOTIVATION: Eukaryotic cells contain different membrane-delimited compartments, which are crucial for the biochemical reactions necessary to sustain cell life. Recent studies showed that cells can also trigger the formation of membraneless organelles composed of phase-separated proteins to respond to various stimuli. These condensates provide new ways to control the reactions and phase-separation proteins (PSPs) are thus revolutionizing how cellular organization is conceived. The small number of experimentally validated proteins, and the difficulty in discovering them, remain bottlenecks in PSP research. RESULTS: Here we present PSPer, the first in silico screening tool for prion-like RNA-binding PSPs. We show that it can prioritize PSPs among proteins containing similar RNA-binding domains, intrinsically disordered regions and prions. PSPer is thus suitable to screen proteomes, identifying the most likely PSPs for further experimental investigation. Moreover, its predictions are fully interpretable in the sense that it assigns specific functional regions to the predicted proteins, providing valuable information for experimental investigation of targeted mutations on these regions. Finally, we show that it can estimate the ability of artificially designed proteins to form condensates (r=-0.87), thus providing an in silico screening tool for protein design experiments. AVAILABILITY AND IMPLEMENTATION: PSPer is available at bio2byte.com/psp. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA-Binding Proteins/metabolism; Organelles; Prions; Proteome
8.
Bioinformatics ; 34(18): 3118-3125, 2018 09 15.
Article in English | MEDLINE | ID: mdl-29684140

ABSTRACT

Motivation: Evolutionary information is crucial for the annotation of proteins in bioinformatics. The number of retrieved homologs often correlates with the quality of predicted protein annotations related to structure or function. With a growing number of sequences available, fast and reliable methods for homology detection are essential, as they have a direct impact on predicted protein annotations. Results: We developed a discriminative, alignment-free algorithm for homology detection with quasi-linear complexity, enabling theoretically much faster homology searches. To reach this goal, we convert the protein sequence into numeric biophysical representations. These are shrunk to a fixed length using a novel vector quantization method which uses a Discrete Cosine Transform compression. We then compute, for each compressed representation, similarity scores between proteins with the Dynamic Time Warping algorithm and we feed them into a Random Forest. WARP's performance is comparable with state-of-the-art methods. Availability and implementation: The method is available at http://ibsquare.be/warp. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing; Proteins/chemistry; Algorithms; Amino Acid Sequence; Data Compression; Molecular Sequence Annotation; Software; Time Factors
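
The abstract above describes shrinking a numeric representation of a protein to a fixed length with a Discrete Cosine Transform. A minimal sketch of that idea follows, under stated assumptions: the Kyte-Doolittle hydropathy scale stands in for the paper's biophysical representations, and plain truncation to the first `k` DCT-II coefficients stands in for the paper's vector quantization method, which the abstract does not specify. The names `encode` and `dct_compress` are hypothetical.

```python
import math
from typing import List

# Kyte-Doolittle hydropathy values, used here only as an example numeric encoding
# (an assumption; the abstract does not say which representations are used).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence: str) -> List[float]:
    """Map a protein sequence to one numeric value per residue."""
    return [KD[residue] for residue in sequence]


def dct_compress(signal: List[float], k: int) -> List[float]:
    """Return the first k DCT-II coefficients of a non-empty signal.

    Coefficients are divided by the signal length so that sequences of
    different lengths give values on a comparable scale.
    """
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (t + 0.5) * f) for t, x in enumerate(signal)) / n
        for f in range(k)
    ]


# Sequences of different lengths end up as vectors of the same length.
short_vec = dct_compress(encode("MKVLA"), 4)
long_vec = dct_compress(encode("MKVLAAGIVGLLLAQWERT"), 4)
print(len(short_vec), len(long_vec))
```

With this scaling the first coefficient is the mean of the profile, and higher coefficients capture progressively finer variation along the sequence, which is why keeping only the first few gives a compact, length-independent summary.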
9.
Nucleic Acids Res ; 45(W1): W201-W206, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28498993

ABSTRACT

High-throughput sequencing methods are generating enormous amounts of genomic data, giving unprecedented insights into human genetic variation and its relation to disease. An individual human genome contains millions of Single Nucleotide Variants: to discriminate the deleterious from the benign ones, a variety of methods have been developed that predict whether a protein-coding variant likely affects the carrier individual's health. We present such a method, DEOGEN2, which incorporates heterogeneous information about the molecular effects of the variants, the domains involved, the relevance of the gene and the interactions in which it participates. This extensive contextual information is non-linearly mapped into one single deleteriousness score for each variant. Since for the non-expert user it is sometimes still difficult to assess what this score means, how it relates to the encoded protein, and where it originates from, we developed an interactive online framework (http://deogen2.mutaframe.com/) to better present the DEOGEN2 deleteriousness predictions of all possible variants in all human proteins. The prediction is visualized so both expert and non-expert users can gain insights into the meaning, protein context and origins of each prediction.


Subject(s)
Amino Acid Substitution; Proteins/genetics; Software; Computer Graphics; Genetic Variation; Humans; Internet; Protein Domains/genetics; Protein Folding
10.
Bioinformatics ; 33(24): 3902-3908, 2017 Dec 15.
Article in English | MEDLINE | ID: mdl-28666322

ABSTRACT

MOTIVATION: Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In recent years many different algorithms have been developed and various kinds of information, from sequence conservation to secondary structure, have been used to improve alignment performance. This is especially relevant for proteins with highly divergent sequences. However, recent works suggest that different features may have different importance in diverse protein classes and it would be an advantage to have more customizable approaches, capable of dealing with different alignment definitions. RESULTS: Here we present Rigapollo, a highly flexible pairwise alignment method based on a pairwise HMM-SVM that can use any type of information to build alignments. Rigapollo lets the user decide the optimal features to align their protein class of interest. It outperforms current state-of-the-art methods on two well-known benchmark datasets when aligning highly divergent sequences. AVAILABILITY AND IMPLEMENTATION: A Python implementation of the algorithm is available at http://ibsquare.be/rigapollo. CONTACT: wim.vranken@vub.be. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Sequence Alignment/methods; Sequence Analysis, Protein/methods; Support Vector Machine; Algorithms; Markov Chains; Protein Structure, Secondary; Proteins/chemistry; Software
11.
Hum Mutat ; 38(1): 86-94, 2017 01.
Article in English | MEDLINE | ID: mdl-27667481

ABSTRACT

Cysteines are among the rarest amino acids in nature, and are both functionally and structurally very important for proteins. The ability of cysteines to form disulfide bonds is especially relevant, both for constraining the folded state of the protein and for performing enzymatic duties. But how does the variation record of human proteins reflect their functional importance and structural role, especially with regard to deleterious mutations? We created HUMCYS, a manually curated dataset of single amino acid variants that (1) have a known disease/neutral phenotypic outcome and (2) cause the loss of a cysteine, in order to investigate how mutated cysteines relate to structural aspects such as surface accessibility and cysteine oxidation state. We also have developed a sequence-based in silico cysteine oxidation predictor to overcome the scarcity of experimentally derived oxidation annotations, and applied it to extend our analysis to classes of proteins for which the experimental determination of their structure is technically challenging, such as transmembrane proteins. Our investigation shows that we can gain insights into the reason behind the outcome of cysteine losses in otherwise uncharacterized proteins, and we discuss the possible molecular mechanisms leading to deleterious phenotypes, such as the involvement of the mutated cysteine in a structurally or enzymatically relevant disulfide bond.


Subject(s)
Cysteine/genetics; Models, Biological; Mutation; Oxidation-Reduction; Algorithms; Amino Acid Substitution; Codon; Computational Biology/methods; Databases, Genetic; Genetic Association Studies; Humans; Intracellular Space/metabolism; Polymorphism, Single Nucleotide; Protein Transport; Reproducibility of Results; Software; Web Browser
12.
Bioinformatics ; 31(8): 1219-25, 2015 Apr 15.
Article in English | MEDLINE | ID: mdl-25492406

ABSTRACT

MOTIVATION: Cysteine residues have particular structural and functional relevance in proteins because of their ability to form covalent disulfide bonds. Bioinformatics tools that can accurately predict cysteine bonding states are already available, whereas it remains challenging to infer the disulfide connectivity pattern of unknown protein sequences. Improving accuracy in this area is highly relevant for the structural and functional annotation of proteins. RESULTS: We predict the intra-chain disulfide bond connectivity patterns starting from known cysteine bonding states with an evolutionary-based unsupervised approach called Sephiroth that relies on high-quality alignments obtained with HHblits and is based on a coarse-grained cluster-based modelling of tandem cysteine mutations within a protein family. We compared our method with state-of-the-art unsupervised predictors and achieve a performance improvement of 25-27% while requiring an order of magnitude fewer aligned homologous sequences (∼10^3 instead of ∼10^4). AVAILABILITY AND IMPLEMENTATION: The software described in this article and the datasets used are available at http://ibsquare.be/sephiroth. CONTACT: wvranken@vub.ac.be. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.


Subject(s)
Algorithms; Cysteine/chemistry; Disulfides/chemistry; Models, Statistical; Proteins/chemistry; Software; Amino Acid Sequence; Cluster Analysis; Cysteine/classification; Cysteine/genetics; Humans; Molecular Sequence Data; Mutation/genetics; Proteins/analysis; Proteins/genetics; Sequence Homology
13.
NAR Genom Bioinform ; 6(3): lqae082, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38984065

ABSTRACT

Protein dynamics and related conformational changes are essential for their function but difficult to characterise and interpret. Amino acids in a protein behave according to their local energy landscape, which is determined by their local structural context and environmental conditions. The lowest energy state for a given residue can correspond to sharply defined conformations, e.g. in a stable helix, or can cover a wide range of conformations, e.g. in intrinsically disordered regions. A good definition of such low energy states is therefore important to describe the behaviour of a residue and how it changes with its environment. We propose a data-driven probabilistic definition of six low energy conformational states typically accessible for amino acid residues in proteins. This definition is based on solution NMR information of 1322 proteins through a combined analysis of structure ensembles with interpreted chemical shifts. We further introduce a conformational state variability parameter that captures, based on an ensemble of protein structures from molecular dynamics or other methods, how often a residue moves between these conformational states. The approach enables a different perspective on the local conformational behaviour of proteins that is complementary to their static interpretation from single structure models.

14.
Materials (Basel) ; 17(8)2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38673077

ABSTRACT

The laser surface texturing (LST) technique has recently been used to enhance adhesion bond strength in various coating applications and to create structures with controlled hydrophobic or superhydrophobic surfaces. The texturing processing parameters can be adjusted to tune the surface's polarity, thereby controlling the ratio between the polar and dispersed components of the surface free energy and determining its hydrophobic character. The aim of this work is to systematically select appropriate laser and scan head parameters for high-quality surface topography of metal-based materials. A correlation between texturing parameters and wetting properties was made in view of several technological applications, i.e., for the proper growth of conformal layers onto laser-textured metal surfaces. Surface analyses, carried out by scanning electron microscopy and profilometry, reveal the presence of periodic microchannels decorated with laser-induced periodic surface structures (LIPSS) in the direction parallel to the microchannels. The water contact angle varies widely from about 20° to 100°, depending on the treated material (titanium, nickel, etc.). Nowadays, reducing the wettability transition time from hydrophilicity to hydrophobicity, while also changing environmental conditions, remains a challenge. Therefore, the characteristics of environmental dust and its influence on the properties of the picosecond laser-textured surface (e.g., chemical bonding of samples) have been studied while monitoring ambient conditions.

15.
J Mol Biol ; 434(12): 167579, 2022 06 30.
Article in English | MEDLINE | ID: mdl-35469832

ABSTRACT

The role of intrinsically disordered protein regions (IDRs) in cellular processes has become increasingly evident in recent years. These IDRs continue to challenge structural biology experiments because they lack a well-defined conformation, and bioinformatics approaches that accurately delineate disordered protein regions remain essential for their identification and further investigation. Typically, these predictors use the protein amino acid sequence, without taking into account likely sequence-dependent emergent properties, such as protein backbone dynamics. Here we present DisoMine, a method that predicts protein 'long disorder' with recurrent neural networks from simple predictions of protein dynamics, secondary structure and early folding. The tool is fast and requires only a single sequence, making it applicable for large-scale screening, including poorly studied and orphan proteins. DisoMine is a top performer in its category and compares well to disorder prediction approaches using evolutionary information. DisoMine is freely available through an interactive webserver at https://bio2byte.be/disomine/.


Subject(s)
Intrinsically Disordered Proteins; Neural Networks, Computer; Sequence Analysis, Protein; Software; Amino Acid Sequence; Computational Biology/methods; Intrinsically Disordered Proteins/chemistry; Protein Structure, Secondary; Sequence Analysis, Protein/methods
16.
Curr Res Struct Biol ; 4: 167-174, 2022.
Article in English | MEDLINE | ID: mdl-35669450

ABSTRACT

Current human Single Amino acid Variant (SAV) databases provide a link between SAVs and their effect on the carrier individual's phenotype, often dividing them into Deleterious/Neutral variants. This is a very coarse-grained description of the genotype-to-phenotype relationship because it relies on unrealistic assumptions, such as the perfect Mendelian behavior of each SAV, and considers only dichotomic phenotypes. Moreover, the link between the effect of a SAV on a protein (its molecular phenotype) and the individual phenotype is often very complex, because multiple levels of biological abstraction connect the protein-level and individual-level phenotypes. Here we present HPMPdb, a manually curated database containing human SAVs associated with the detailed description of the molecular phenotype they cause on the affected proteins. With particular regard to machine learning (ML), this database can be used to let researchers go beyond the existing Deleterious/Neutral prediction paradigm, allowing them to build molecular phenotype predictors instead. Our class labels describe in a succinct way the effects that each SAV has on 15 protein molecular phenotypes, such as protein-protein interaction, small molecule binding, function, post-translational modifications (PTMs), sub-cellular localization, mimetic PTM, folding and protein expression. Moreover, we provide researchers with all necessary means to reproducibly train and test their models on our database. The webserver and the data described in this paper are available at hpmp.esat.kuleuven.be.

17.
Nat Commun ; 13(1): 961, 2022 02 18.
Article in English | MEDLINE | ID: mdl-35181656

ABSTRACT

Structural bioinformatics suffers from the lack of interfaces connecting biological structures and machine learning methods, making the application of modern neural network architectures impractical. This negatively affects the development of structure-based bioinformatics methods, causing a bottleneck in biological research. Here we present PyUUL (https://pyuul.readthedocs.io/), a library to translate biological structures into 3D tensors, allowing an out-of-the-box application of state-of-the-art deep learning algorithms. The library converts biological macromolecules to data structures typical of computer vision, such as voxels and point clouds, for which extensive machine learning research has been performed. Moreover, PyUUL allows out-of-the-box GPU and sparse calculation. Finally, we demonstrate how PyUUL can be used by researchers to address some typical bioinformatics problems, such as structure recognition and docking.


Subject(s)
Computational Biology/methods; Deep Learning; Imaging, Three-Dimensional/methods; Neural Networks, Computer; Algorithms; Humans; Protein Structural Elements/physiology
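
To illustrate what converting a structure into a voxel representation means in the abstract above, here is a minimal sketch that bins atom coordinates into a dense 3D occupancy grid using only the standard library. It does not use or reproduce the PyUUL API: the function `voxelize`, its `resolution` parameter and the example coordinates are assumptions for illustration.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]


def voxelize(coords: List[Coord], resolution: float = 1.0) -> List[List[List[int]]]:
    """Count atoms per cubic voxel of side `resolution` (same units as coords).

    Hypothetical helper; expects at least one coordinate.
    """
    mins = [min(c[d] for c in coords) for d in range(3)]
    maxs = [max(c[d] for c in coords) for d in range(3)]
    # Number of voxels needed along each axis to cover the bounding box.
    shape = [int((maxs[d] - mins[d]) // resolution) + 1 for d in range(3)]
    grid = [[[0] * shape[2] for _ in range(shape[1])] for _ in range(shape[0])]
    for c in coords:
        # Index of the voxel containing this atom, relative to the box origin.
        i, j, k = (int((c[d] - mins[d]) // resolution) for d in range(3))
        grid[i][j][k] += 1
    return grid


# Made-up atom coordinates (for example, in angstroms).
atoms = [(0.0, 0.0, 0.0), (0.4, 0.2, 0.1), (2.5, 1.0, 0.0)]
grid = voxelize(atoms, resolution=1.0)
print(len(grid), len(grid[0]), len(grid[0][0]))
```

A grid like this can be handed to convolutional models that expect image-like input. A real pipeline would typically add one channel per atom type and spread each atom over neighbouring voxels rather than counting point positions.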
18.
J Mol Cell Biol ; 13(1): 15-28, 2021 04 10.
Article in English | MEDLINE | ID: mdl-32976566

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a late-onset neurodegenerative disease selectively affecting motor neurons, leading to progressive paralysis. Although most cases are sporadic, ∼10% are familial. Similar proteins are found in aggregates in sporadic and familial ALS, and over the last decade, research has been focused on the underlying nature of this common pathology. Notably, TDP-43 inclusions are found in almost all ALS patients, while FUS inclusions have been reported in some familial ALS patients. Both TDP-43 and FUS possess 'low-complexity domains' (LCDs) and are considered as 'intrinsically disordered proteins', which form liquid droplets in vitro due to the weak interactions caused by the LCDs. Dysfunctional 'liquid-liquid phase separation' (LLPS) emerged as a new mechanism linking ALS-related proteins to pathogenesis. Here, we review the current state of knowledge on ALS-related gene products associated with a proteinopathy and discuss their status as LLPS proteins. In addition, we highlight the therapeutic potential of targeting LLPS for treating ALS.


Subject(s)
Amyotrophic Lateral Sclerosis/pathology; Intrinsically Disordered Proteins/metabolism; Protein Aggregation, Pathological/pathology; Amyotrophic Lateral Sclerosis/drug therapy; Amyotrophic Lateral Sclerosis/genetics; Autophagy/drug effects; DNA-Binding Proteins/antagonists & inhibitors; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Humans; Intrinsically Disordered Proteins/antagonists & inhibitors; Intrinsically Disordered Proteins/genetics; Molecular Chaperones/pharmacology; Molecular Chaperones/therapeutic use; Mutation; Oligonucleotides, Antisense/pharmacology; Oligonucleotides, Antisense/therapeutic use; Protein Aggregation, Pathological/drug therapy; Protein Aggregation, Pathological/genetics; Protein Folding/drug effects; RNA-Binding Protein FUS/antagonists & inhibitors; RNA-Binding Protein FUS/genetics; RNA-Binding Protein FUS/metabolism
19.
Nat Commun ; 11(1): 3314, 2020 07 03.
Article in English | MEDLINE | ID: mdl-32620861

ABSTRACT

The amyloid conformation can be adopted by a variety of sequences, but the precise boundaries of amyloid sequence space are still unclear. The currently charted amyloid sequence space is strongly biased towards hydrophobic, beta-sheet prone sequences that form the core of globular proteins and by Q/N/Y rich yeast prions. Here, we took advantage of the increasing amount of high-resolution structural information on amyloid cores currently available in the protein databank to implement a machine learning approach, named Cordax (https://cordax.switchlab.org), that explores amyloid sequence beyond its current boundaries. Clustering by t-Distributed Stochastic Neighbour Embedding (t-SNE) shows how our approach resulted in an expansion away from hydrophobic amyloid sequences towards clusters of lower aliphatic content and higher charge, or regions of helical and disordered propensities. These clusters uncouple amyloid propensity from solubility, representing sequence flavours compatible with surface-exposed patches in globular proteins, functional amyloids or sequences associated with liquid-liquid phase transitions.


Subject(s)
Algorithms; Amyloid/chemistry; Amyloidogenic Proteins/chemistry; Models, Chemical; Peptides/chemistry; Amyloid/metabolism; Amyloidogenic Proteins/metabolism; Amyloidosis/metabolism; Humans; Hydrophobic and Hydrophilic Interactions; Machine Learning; Peptides/metabolism; Protein Conformation; Protein Engineering/methods; Solubility
20.
Nat Commun ; 10(1): 2511, 2019 06 07.
Article in English | MEDLINE | ID: mdl-31175284

ABSTRACT

Chemical shifts (CS) are determined from NMR experiments and represent the resonance frequency of the spin of atoms in a magnetic field. They contain a mixture of information, encompassing the in-solution conformations a protein adopts, as well as the movements it performs. Due to their intrinsically multi-faceted nature, CS are difficult to interpret and visualize. Classical approaches for the analysis of CS aim to extract specific protein-related properties, thus discarding a large amount of information that cannot be directly linked to structural features of the protein. Here we propose an autoencoder-based method, called ShiftCrypt, that provides a way to analyze, compare and interpret CS in their native, multidimensional space. We show that ShiftCrypt conserves information about the most common structural features. In addition, it can be used to identify hidden similarities between diverse proteins and peptides, and differences between the same protein in two different binding states.


Subject(s)
Neural Networks, Computer; Nuclear Magnetic Resonance, Biomolecular/methods; Proteins/ultrastructure; Amino Acids; Biophysical Phenomena; Magnetic Resonance Imaging; Magnetic Resonance Spectroscopy; Models, Molecular; Protein Structure, Secondary