Results 1 - 20 of 38
1.
Nature ; 619(7971): 811-818, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37407817

ABSTRACT

RNA viruses have evolved elaborate strategies to protect their genomes, including 5' capping. However, until now no RNA 5' cap has been identified for hepatitis C virus (HCV), which causes chronic infection, liver cirrhosis and cancer. Here we demonstrate that the cellular metabolite flavin adenine dinucleotide (FAD) is used as a non-canonical initiating nucleotide by the viral RNA-dependent RNA polymerase, resulting in a 5'-FAD cap on the HCV RNA. The HCV FAD-capping frequency is around 75%, which is the highest observed for any RNA metabolite cap across all kingdoms of life. FAD capping is conserved among HCV isolates for the replication-intermediate negative strand and partially for the positive strand. It is also observed in vivo on HCV RNA isolated from patient samples and from the liver and serum of a human liver chimeric mouse model. Furthermore, we show that 5'-FAD capping protects RNA from RIG-I-mediated innate immune recognition but does not stabilize the HCV RNA. These results establish capping with cellular metabolites as a novel viral RNA-capping strategy, which could be used by other viruses and affect anti-viral treatment outcomes and persistence of infection.


Subject(s)
Flavin-Adenine Dinucleotide , Hepacivirus , RNA Caps , Viral RNA , Animals , Humans , Mice , Chimera/virology , Flavin-Adenine Dinucleotide/metabolism , Hepacivirus/genetics , Hepacivirus/immunology , Hepatitis C/virology , Innate Immunity Recognition , Liver/virology , RNA Stability , Viral RNA/chemistry , Viral RNA/genetics , Viral RNA/immunology , Viral RNA/metabolism , RNA-Dependent RNA Polymerase/metabolism , Virus Replication/genetics , RNA Caps/metabolism
2.
Nucleic Acids Res ; 52(13): 7971-7986, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38842942

ABSTRACT

We present the nuclear magnetic resonance spectroscopy (NMR) solution structure of the 5'-terminal stem loop 5_SL1 (SL1) of the SARS-CoV-2 genome. SL1 contains two A-form helical elements and two regions with non-canonical structure, namely an apical pyrimidine-rich loop and an asymmetric internal loop with one and two nucleotides at the 5'- and 3'-terminal part of the sequence, respectively. The conformational ensemble representing the averaged solution structure of SL1 was validated using NMR residual dipolar coupling (RDC) and small-angle X-ray scattering (SAXS) data. We show that the internal loop is the major binding site for fragments of low molecular weight. This internal loop of SL1 can be stabilized by an A12-C28 interaction that promotes the transient formation of an A+•C base pair. As a consequence, the pKa of the internal loop adenosine A12 is shifted to 5.8, compared to a pKa of 3.63 of free adenosine. Furthermore, applying a recently developed pH-differential mutational profiling (PD-MaP) approach, we not only recapitulated our NMR findings of SL1 but also unveiled multiple sites potentially sensitive to pH across the 5'-UTR of SARS-CoV-2.


Subject(s)
Nucleic Acid Conformation , Viral RNA , SARS-CoV-2 , SARS-CoV-2/genetics , SARS-CoV-2/chemistry , SARS-CoV-2/metabolism , Viral RNA/chemistry , Viral RNA/genetics , Viral RNA/metabolism , Hydrogen-Ion Concentration , Humans , Small-Angle Scattering , COVID-19/virology , COVID-19/genetics , Magnetic Resonance Spectroscopy , X-Ray Diffraction , Binding Sites , Viral Genome , Base Pairing , 5' Untranslated Regions , Molecular Models
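
The pKa shift reported in the abstract above (5.8 for the internal-loop adenosine A12 versus 3.63 for free adenosine) can be made concrete with the Henderson-Hasselbalch relation. The sketch below is only an illustration: the function name and the chosen pH values are assumptions, and a simple two-state protonation equilibrium is assumed rather than anything taken from the paper's analysis.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a site that is protonated at a given pH (two-state model)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PKA_A12 = 5.8    # internal-loop adenosine A12, value quoted in the abstract
PKA_FREE = 3.63  # free adenosine, value quoted in the abstract

# Illustrative pH values (assumed, not from the abstract)
for ph in (5.0, 6.0, 7.0):
    a12 = protonated_fraction(ph, PKA_A12)
    free = protonated_fraction(ph, PKA_FREE)
    print(f"pH {ph:.1f}: A12 protonated {a12:.3f}, free adenosine {free:.4f}")
```

Under this model the site is half protonated when the pH equals its pKa, so the shifted pKa keeps a noticeable protonated population at mildly acidic pH, where free adenosine would be almost fully deprotonated.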
3.
RNA ; 28(7): 937-946, 2022 07.
Article in English | MEDLINE | ID: mdl-35483823

ABSTRACT

We describe the conformational ensemble of the single-stranded r(UCAAUC) oligonucleotide obtained using extensive molecular dynamics (MD) simulations and Rosetta's FARFAR2 algorithm. The conformations observed in MD consist of A-form-like structures and variations thereof. These structures are not present in the pool generated using FARFAR2. By comparing with available nuclear magnetic resonance (NMR) measurements, we show that the presence of both A-form-like and other extended conformations is necessary to quantitatively explain experimental data. To further validate our results, we measure solution X-ray scattering (SAXS) data on the RNA hexamer and find that simulations result in more compact structures than observed from these experiments. The integration of simulations with NMR via a maximum entropy approach shows that small modifications to the MD ensemble lead to an improved description of the conformational ensemble. Nevertheless, we identify persisting discrepancies in matching experimental SAXS data.


Subject(s)
Molecular Dynamics Simulation , RNA , Magnetic Resonance Spectroscopy , Oligonucleotides , Protein Conformation , Small-Angle Scattering , X-Ray Diffraction
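
The abstract above mentions integrating simulations with NMR data via a maximum entropy approach. A minimal sketch of the general idea, for a single observable, is given here. It is not the authors' implementation: the function name, the uniform prior weights, the bisection bounds, and the toy numbers are all assumptions, and experimental error is ignored.

```python
import math


def maxent_weights(values, target, lo=-50.0, hi=50.0, tol=1e-10):
    """Weights w_i proportional to exp(-lam * values[i]) whose mean matches target.

    Assumes uniform prior weights and min(values) < target < max(values).
    """
    def weights(lam):
        exps = [-lam * v for v in values]
        shift = max(exps)  # subtract the maximum exponent for numerical stability
        w = [math.exp(e - shift) for e in exps]
        z = sum(w)
        return [x / z for x in w]

    def mean(lam):
        return sum(w * v for w, v in zip(weights(lam), values))

    # The reweighted mean decreases monotonically with lam, so bisection is enough.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return weights(0.5 * (lo + hi))


# Toy ensemble: one back-calculated observable per frame (hypothetical numbers)
values = [1.0, 2.0, 3.0, 4.0]
w = maxent_weights(values, target=3.0)
print([round(x, 3) for x in w])
```

The exponential form is the minimal perturbation of the prior weights that reproduces the target average, which is why small corrections to a reasonable ensemble are usually enough.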
4.
J Am Chem Soc ; 145(30): 16557-16572, 2023 08 02.
Article in English | MEDLINE | ID: mdl-37479220

ABSTRACT

Both experimental and theoretical structure determinations of RNAs have remained challenging due to the intrinsic dynamics of RNAs. We report here an integrated nuclear magnetic resonance/molecular dynamics (NMR/MD) structure determination approach to describe the dynamic structure of the CUUG tetraloop. We show that the tetraloop undergoes substantial dynamics, leading to averaging of the experimental data. These dynamics are particularly linked to the temperature-dependent presence of a hydrogen bond within the tetraloop. Interpreting the NMR data by a single structure represents the low-temperature structure well but fails to capture all conformational states occurring at a higher temperature. We integrate MD simulations, starting from structures of CUUG tetraloops within the Protein Data Bank, with an extensive set of NMR data, and provide a structural ensemble that describes the dynamic nature of the tetraloop and the experimental NMR data well. We thus show that one of the most stable and frequently found RNA tetraloops displays substantial dynamics, warranting such an integrated structural approach.


Subject(s)
Molecular Dynamics Simulation , RNA , RNA/chemistry , Nucleic Acid Conformation , Magnetic Resonance Spectroscopy , Temperature
5.
Nucleic Acids Res ; 48(11): 5839-5848, 2020 06 19.
Article in English | MEDLINE | ID: mdl-32427326

ABSTRACT

We provide an atomic-level description of the structure and dynamics of the UUCG RNA stem-loop by combining molecular dynamics simulations with experimental data. The integration of simulations with exact nuclear Overhauser enhancement data allowed us to characterize two distinct states of this molecule. The most stable conformation corresponds to the consensus three-dimensional structure. The second state is characterized by the absence of the peculiar non-Watson-Crick interactions in the loop region. By using machine learning techniques we identify a set of experimental measurements that are most sensitive to the presence of non-native states. We find that although our MD ensemble, as well as the consensus UUCG tetraloop structures, are in good agreement with experiments, there are remaining discrepancies. Together, our results show that (i) the MD simulations overstabilize a non-native loop conformation, (ii) eNOE data support its presence with a population of ≈10% and (iii) the structural interpretation of experimental data for dynamic RNAs is highly complex, even for a simple model system such as the UUCG tetraloop.


Subject(s)
Magnetic Resonance Spectroscopy , Molecular Dynamics Simulation , Movement , Nucleic Acid Conformation , Base Sequence , Bayes Theorem , Datasets as Topic , Entropy , RNA/chemistry
6.
J Am Chem Soc ; 143(22): 8333-8343, 2021 06 09.
Article in English | MEDLINE | ID: mdl-34039006

ABSTRACT

The 5' untranslated region (UTR) of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome is a conserved, functional and structured genomic region consisting of several RNA stem-loop elements. While the secondary structure of such elements has been determined experimentally, their three-dimensional structures are not known yet. Here, we predict structure and dynamics of five RNA stem loops in the 5'-UTR of SARS-CoV-2 by extensive atomistic molecular dynamics simulations, more than 0.5 ms of aggregate simulation time, in combination with enhanced sampling techniques. We compare simulations with available experimental data, describe the resulting conformational ensembles, and identify the presence of specific structural rearrangements in apical and internal loops that may be functionally relevant. Our atomic-detailed structural predictions reveal rich dynamics in these RNA molecules, could help the experimental characterization of these systems, and provide putative three-dimensional models for structure-based drug design studies.


Subject(s)
COVID-19/virology , Viral RNA/chemistry , SARS-CoV-2/genetics , 5' Untranslated Regions , Base Sequence , Viral Genome , Humans , Molecular Dynamics Simulation , Molecular Structure , Nucleic Acid Conformation , Viral RNA/genetics , SARS-CoV-2/chemistry
7.
RNA ; 25(2): 219-231, 2019 02.
Article in English | MEDLINE | ID: mdl-30420522

ABSTRACT

RNA molecules are highly dynamic systems characterized by a complex interplay between sequence, structure, dynamics, and function. Molecular simulations can potentially provide powerful insights into the nature of these relationships. The analysis of structures and molecular trajectories of nucleic acids can be nontrivial because it requires processing very high-dimensional data that are not easy to visualize and interpret. Here we introduce Barnaba, a Python library aimed at facilitating the analysis of nucleic acid structures and molecular simulations. The software consists of a variety of analysis tools that allow the user to (i) calculate distances between three-dimensional structures using different metrics, (ii) back-calculate experimental data from three-dimensional structures, (iii) perform cluster analysis and dimensionality reductions, (iv) search three-dimensional motifs in PDB structures and trajectories, and (v) construct elastic network models for nucleic acids and nucleic acids-protein complexes. In addition, Barnaba makes it possible to calculate torsion angles, pucker conformations, and to detect base-pairing/base-stacking interactions. Barnaba produces graphics that conveniently visualize both extended secondary structure and dynamics for a set of molecular conformations. The software is available as a command-line tool as well as a library, and supports a variety of file formats such as PDB, dcd, and xtc files. Source code, documentation, and examples are freely available at https://github.com/srnas/barnaba under GNU GPLv3 license.


Subject(s)
Computational Biology/methods , Nucleic Acid Conformation , RNA/ultrastructure , Software , Base Pairing/genetics , Protein Databases , Molecular Models
8.
PLoS Comput Biol ; 16(4): e1007870, 2020 04.
Article in English | MEDLINE | ID: mdl-32339173

ABSTRACT

Many proteins contain multiple folded domains separated by flexible linkers, and the ability to describe the structure and conformational heterogeneity of such flexible systems pushes the limits of structural biology. Using the three-domain protein TIA-1 as an example, we here combine coarse-grained molecular dynamics simulations with previously measured small-angle scattering data to study the conformation of TIA-1 in solution. We show that while the coarse-grained potential (Martini) in itself leads to too compact conformations, increasing the strength of protein-water interactions results in ensembles that are in very good agreement with experiments. We show how these ensembles can be refined further using a Bayesian/Maximum Entropy approach, and examine the robustness to errors in the energy function. In particular we find that as long as the initial simulation is relatively good, reweighting against experiments is very robust. We also study the relative information in X-ray and neutron scattering experiments and find that refining against the SAXS experiments leads to improvement in the SANS data. Our results suggest a general strategy for studying the conformation of multi-domain proteins in solution that combines coarse-grained simulations with small-angle X-ray scattering data that are generally most easy to obtain. These results may in turn be used to design further small-angle neutron scattering experiments that exploit contrast variation through 1H/2H isotope substitutions.


Subject(s)
Molecular Dynamics Simulation , Proteins , Small-Angle Scattering , X-Ray Diffraction , Algorithms , Computational Biology , Neutrons , Protein Conformation , Protein Domains , Proteins/analysis , Proteins/chemistry
9.
Chem Rev ; 118(8): 4177-4338, 2018 04 25.
Article in English | MEDLINE | ID: mdl-29297679

ABSTRACT

With both catalytic and genetic functions, ribonucleic acid (RNA) is perhaps the most pluripotent chemical species in molecular biology, and its functions are intimately linked to its structure and dynamics. Computer simulations, and in particular atomistic molecular dynamics (MD), allow structural dynamics of biomolecular systems to be investigated with unprecedented temporal and spatial resolution. We here provide a comprehensive overview of the fast-developing field of MD simulations of RNA molecules. We begin with an in-depth, evaluatory coverage of the most fundamental methodological challenges that set the basis for the future development of the field, in particular, the current developments and inherent physical limitations of the atomistic force fields and the recent advances in a broad spectrum of enhanced sampling methods. We also survey the closely related field of coarse-grained modeling of RNA systems. After dealing with the methodological aspects, we provide an exhaustive overview of the available RNA simulation literature, ranging from studies of the smallest RNA oligonucleotides to investigations of the entire ribosome. Our review encompasses tetranucleotides, tetraloops, a number of small RNA motifs, A-helix RNA, kissing-loop complexes, the TAR RNA element, the decoding center and other important regions of the ribosome, as well as assorted other systems. Extended sections are devoted to RNA-ion interactions, ribozymes, riboswitches, and protein/RNA complexes. Our overview is written for as broad an audience as possible, aiming to provide a much-needed interdisciplinary bridge between computation and experiment, together with a perspective on the future of the field.


Subject(s)
Molecular Dynamics Simulation , Nucleic Acid Conformation , RNA/chemistry , Catalysis , Computer Simulation , DNA/chemistry
10.
Nucleic Acids Res ; 46(4): 1674-1683, 2018 02 28.
Article in English | MEDLINE | ID: mdl-29272539

ABSTRACT

We introduce the SPlit-and-conQueR (SPQR) model, a coarse-grained (CG) representation of RNA designed for structure prediction and refinement. In our approach, the representation of a nucleotide consists of a point particle for the phosphate group and an anisotropic particle for the nucleoside. The interactions are, in principle, knowledge-based potentials inspired by the $\mathcal{E}$SCORE function, a base-centered scoring function. However, a special treatment is given to base-pairing interactions and certain geometrical conformations which are lost in a raw knowledge-based model. This results in a representation able to describe planar canonical and non-canonical base pairs and base-phosphate interactions and to distinguish sugar puckers and glycosidic torsion conformations. The model is applied to the folding of several structures, including duplexes with internal loops of non-canonical base pairs, tetraloops, junctions and a pseudoknot. For the majority of these systems, experimental structures are correctly predicted at the level of individual contacts. We also propose a method for efficiently reintroducing atomistic detail from the CG representation.


Subject(s)
Molecular Models , RNA/chemistry , Nucleotide Motifs , Nucleotides/chemistry , Double-Stranded RNA/chemistry
11.
Biochem Biophys Res Commun ; 498(2): 352-358, 2018 03 29.
Article in English | MEDLINE | ID: mdl-29248728

ABSTRACT

Coarse-grained models can be of great help to address the problem of structure prediction in nucleic acids. On one hand they can make the prediction more efficient, while on the other hand they can also help to identify the essential degrees of freedom and interactions for the description of a number of structures. With the aim to provide an all-atom representation in an explicit solvent to the predictions of our SPlit and conQueR (SPQR) coarse-grained model of RNA, we recently introduced a backmapping procedure which enforces the predicted structure into an atomistic one by means of steered molecular dynamics. These simulations minimize the ERMSD, a particular metric which deals exclusively with the relative arrangement of nucleobases, between the atomistic representation and the target structure. In this paper, we explore the effects of this approach on the resulting interaction networks and backbone conformations by applying it on a set of fragments using as a target their native structure. We find that the geometry of the target structures can be reliably recovered, with limitations in the regions with unpaired bases such as bulges. In addition, we observe that the folding pathway can also change depending on the parameters used in the definition of the ERMSD and the use of other metrics such as the RMSD.


Subject(s)
Molecular Dynamics Simulation , RNA/chemistry , Nucleic Acid Conformation , Nucleic Acids/chemistry , RNA/metabolism , Solvents/chemistry
12.
Nucleic Acids Res ; 44(12): 5883-91, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27091499

ABSTRACT

We introduce a method for predicting RNA folding pathways, with an application to the most important RNA tetraloops. The method is based on the idea that ensembles of three-dimensional fragments extracted from high-resolution crystal structures are heterogeneous enough to describe metastable as well as intermediate states. These ensembles are first validated by performing a quantitative comparison against available solution nuclear magnetic resonance (NMR) data of a set of RNA tetranucleotides. Notably, the agreement is better with respect to the one obtained by comparing NMR with extensive all-atom molecular dynamics simulations. We then propose a procedure based on diffusion maps and Markov models that makes it possible to obtain reaction pathways and their relative probabilities from fragment ensembles. This approach is applied to study the helix-to-loop folding pathway of all the tetraloops from the GNRA and UNCG families. The results give detailed insights into the folding mechanism that are compatible with available experimental data and clarify the role of intermediate states observed in previous simulation studies. The method is computationally inexpensive and can be used to study arbitrary conformational transitions.


Subject(s)
Oligoribonucleotides/chemistry , RNA Folding , RNA/chemistry , Diffusion , Kinetics , Magnetic Resonance Spectroscopy , Markov Chains , Molecular Dynamics Simulation , Motion , Nucleic Acid Conformation , Thermodynamics
13.
Biophys J ; 113(2): 257-267, 2017 Jul 25.
Article in English | MEDLINE | ID: mdl-28673616

ABSTRACT

We report a map of RNA tetraloop conformations constructed by calculating pairwise distances among all experimentally determined four-nucleotide hairpin loops. Tetraloops with similar structures are clustered together and, as expected, the two largest clusters are the canonical GNRA and UNCG folds. We identify clusters corresponding to known tetraloop folds such as GGUG, RNYA, AGNN, and CUUG. These clusters are represented in a simple two-dimensional projection that recapitulates the relationship among the different folds. The cluster analysis also identifies 20 novel tetraloop folds that are peculiar to specific positions in ribosomal RNAs and that are stabilized by tertiary interactions. In our RNA tetraloop database we find a significant number of non-GNRA and non-UNCG sequences adopting the canonical GNRA and UNCG folds. Conversely, we find a significant number of GNRA and UNCG sequences adopting non-GNRA and non-UNCG folds. Our analysis demonstrates that there is not a simple one-to-one, but rather a many-to-many mapping between tetraloop sequence and tetraloop fold.


Subject(s)
Nucleic Acid Conformation , RNA , Cluster Analysis , Nucleic Acid Databases , Genetic Models , Molecular Models , Principal Component Analysis , RNA/chemistry , RNA Stability
14.
Nucleic Acids Res ; 43(15): 7260-9, 2015 Sep 03.
Article in English | MEDLINE | ID: mdl-26187990

ABSTRACT

Elastic network models (ENMs) are valuable and efficient tools for characterizing the collective internal dynamics of proteins based on the knowledge of their native structures. The increasing evidence that the biological functionality of RNAs is often linked to their innate internal motions poses the question of whether ENM approaches can be successfully extended to this class of biomolecules. This issue is tackled here by considering various families of elastic networks of increasing complexity applied to a representative set of RNAs. The fluctuations predicted by the alternative ENMs are stringently validated by comparison against extensive molecular dynamics simulations and SHAPE experiments. We find that simulations and experimental data are systematically best reproduced by either an all-atom or a three-beads-per-nucleotide representation (sugar-base-phosphate), with the latter arguably providing the best balance of accuracy and computational complexity.


Subject(s)
Molecular Models , RNA/chemistry , Molecular Dynamics Simulation , Nucleic Acid Conformation , Osmolar Concentration
15.
Nucleic Acids Res ; 42(21): 13306-14, 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25355509

ABSTRACT

The intricate network of interactions observed in RNA three-dimensional structures is often described in terms of a multitude of geometrical properties, including helical parameters, base pairing/stacking, hydrogen bonding and backbone conformation. We show that a simple molecular representation consisting of one oriented bead per nucleotide can account for the fundamental structural properties of RNA. In this framework, canonical Watson-Crick, non-Watson-Crick base-pairing and base-stacking interactions can be unambiguously identified within a well-defined interaction shell. We validate this representation by performing two independent, complementary tests. First, we use it to construct a sequence-independent, knowledge-based scoring function for RNA structural prediction, which compares favorably to fully atomistic, state-of-the-art techniques. Second, we define a metric to measure deviation between RNA structures that directly reports on the differences in the base-base interaction network. The effectiveness of this metric is tested with respect to the ability to discriminate between structurally and kinetically distant RNA conformations, performing better compared to standard techniques. Taken together, our results suggest that this minimalist, nucleobase-centric representation captures the main interactions that are relevant for describing RNA structure and dynamics.


Subject(s)
RNA/chemistry , Base Pairing , Kinetics , Molecular Models , Nucleic Acid Conformation , RNA Folding
16.
BMC Bioinformatics ; 16 Suppl 9: S6, 2015.
Article in English | MEDLINE | ID: mdl-26051557

ABSTRACT

INTRODUCTION: Riboswitches are cis-acting regulatory RNA elements prevalently located in the leader sequences of bacterial mRNA. An adenine sensing riboswitch cis-regulates adenosine deaminase gene (add) in Vibrio vulnificus. The structural mechanism regulating its conformational changes upon ligand binding mostly remains to be elucidated. In this open framework it has been suggested that the ligand stabilizes the interaction of the distal "kissing loop" complex. Using accurate full-atom molecular dynamics with explicit solvent in combination with enhanced sampling techniques and advanced analysis methods it could be possible to provide a more detailed perspective on the formation of these tertiary contacts. METHODS: In this work, we used umbrella sampling simulations to study the thermodynamics of the kissing loop complex in the presence and in the absence of the cognate ligand. We enforced the breaking/formation of the loop-loop interaction restraining the distance between the two loops. We also assessed the convergence of the results by using two alternative initialization protocols. A structural analysis was performed using a novel approach to analyze base contacts. RESULTS: Contacts between the two loops were progressively lost when larger inter-loop distances were enforced. Inter-loop Watson-Crick contacts survived at larger separation when compared with non-canonical pairing and stacking interactions. Intra-loop stacking contacts remained formed upon loop undocking. Our simulations qualitatively indicated that the ligand could stabilize the kissing loop complex. We also compared with previously published simulation studies. DISCUSSION AND CONCLUSIONS: Kissing complex stabilization given by the ligand was compatible with available experimental data. However, the dependence of its value on the initialization protocol of the umbrella sampling simulations posed some questions on the quantitative interpretation of the results and called for better converged enhanced sampling simulations.


Subject(s)
Adenine/chemistry , Molecular Dynamics Simulation , RNA/chemistry , Riboswitch/genetics , Humans , Nucleic Acid Conformation , Thermodynamics
17.
Proteins ; 82(2): 288-99, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23934827

ABSTRACT

We propose a method to formulate probabilistic models of protein structure in atomic detail, for a given amino acid sequence, based on Bayesian principles, while retaining a close link to physics. We start from two previously developed probabilistic models of protein structure on a local length scale, which concern the dihedral angles in main chain and side chains, respectively. Conceptually, this constitutes a probabilistic and continuous alternative to the use of discrete fragment and rotamer libraries. The local model is combined with a nonlocal model that involves a small number of energy terms according to a physical force field, and some information on the overall secondary structure content. In this initial study we focus on the formulation of the joint model and the evaluation of the use of an energy vector as a descriptor of a protein's nonlocal structure; hence, we derive the parameters of the nonlocal model from the native structure without loss of generality. The local and nonlocal models are combined using the reference ratio method, which is a well-justified probabilistic construction. For evaluation, we use the resulting joint models to predict the structure of four proteins. The results indicate that the proposed method and the probabilistic models show considerable promise for probabilistic protein structure prediction and related applications.


Subject(s)
Molecular Models , Statistical Models , Algorithms , Amino Acid Sequence , Bacterial Proteins/chemistry , Bayes Theorem , Hydrogen Bonding , Secondary Protein Structure , Tertiary Protein Structure , Protein Structural Homology , Thermodynamics
18.
J Comput Chem ; 34(19): 1697-705, 2013 Jul 15.
Article in English | MEDLINE | ID: mdl-23619610

ABSTRACT

We present a new software framework for Markov chain Monte Carlo sampling for simulation, prediction, and inference of protein structure. The software package contains implementations of recent advances in Monte Carlo methodology, such as efficient local updates and sampling from probabilistic models of local protein structure. These models form a probabilistic alternative to the widely used fragment and rotamer libraries. Combined with an easily extendible software architecture, this makes PHAISTOS well suited for Bayesian inference of protein structure from sequence and/or experimental data. Currently, two force-fields are available within the framework: PROFASI and OPLS-AA/L, the latter including the generalized Born surface area solvent model. A flexible command-line and configuration-file interface allows users quickly to set up simulations with the desired configuration. PHAISTOS is released under the GNU General Public License v3.0. Source code and documentation are freely available from http://phaistos.sourceforge.net. The software is implemented in C++ and has been tested on Linux and OSX platforms.


Subject(s)
Markov Chains , Monte Carlo Method , Proteins/chemistry , Software , Bayes Theorem , Computer Simulation , Chemical Models , Protein Conformation
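
For readers unfamiliar with the Markov chain Monte Carlo sampling mentioned in the abstract above, here is a generic Metropolis sketch on a one-dimensional harmonic energy. It does not use or imitate the PHAISTOS code or interface (the package itself is written in C++); the function name, the toy energy, the step size, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def metropolis(energy, x0, n_steps, step=1.0, kT=1.0, seed=0):
    """Sample x with probability proportional to exp(-energy(x) / kT)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)  # symmetric local proposal
        e_new = energy(x_new)
        # Metropolis criterion: downhill moves are always accepted,
        # uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
            accepted += 1
        samples.append(x)  # a rejected move repeats the current state
    return samples, accepted / n_steps


# Toy energy (hypothetical): harmonic well, whose exact distribution is a unit Gaussian
samples, acc = metropolis(lambda x: 0.5 * x * x, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"acceptance {acc:.2f}, mean {mean:.2f}, variance {var:.2f}")
```

For this toy energy at kT = 1 the sampled mean and variance should approach 0 and 1, which gives a quick sanity check on the sampler.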
19.
Nat Commun ; 14(1): 4175, 2023 07 13.
Article in English | MEDLINE | ID: mdl-37443362

ABSTRACT

Proteins play important roles in biology, biotechnology and pharmacology, and missense variants are a common cause of disease. Discovering functionally important sites in proteins is a central but difficult problem because of the lack of large, systematic data sets. Sequence conservation can highlight residues that are functionally important but is often convoluted with a signal for preserving structural stability. We here present a machine learning method to predict functional sites by combining statistical models for protein sequences with biophysical models of stability. We train the model using multiplexed experimental data on variant effects and validate it broadly. We show how the model can be used to discover active sites, as well as regulatory and binding sites. We illustrate the utility of the model by prospective prediction and subsequent experimental validation on the functional consequences of missense variants in HPRT1 which may cause Lesch-Nyhan syndrome, and pinpoint the molecular mechanisms by which they cause disease.


Subject(s)
Hypoxanthine Phosphoribosyltransferase , Lesch-Nyhan Syndrome , Humans , Prospective Studies , Hypoxanthine Phosphoribosyltransferase/genetics , Hypoxanthine Phosphoribosyltransferase/metabolism , Proteins/genetics , Missense Mutation
20.
J Pers Med ; 12(10)2022 Sep 26.
Article in English | MEDLINE | ID: mdl-36294727

ABSTRACT

BACKGROUND: The application of Machine Learning (ML) to genetic individual-level data represents a foreseeable advancement for the field, which is still in its infancy. Here, we aimed to evaluate the feasibility and accuracy of an ML-based model for disease risk prediction applied to Primary Biliary Cholangitis (PBC). METHODS: Genome-wide significant variants identified in subjects of European ancestry in the recently released second international meta-analysis of GWAS in PBC were used as input data. Quality-checked, individual genomic data from two Italian cohorts were used. The ML included the following steps: import of genotype and phenotype data, genetic variant selection, supervised classification of PBC by genotype, generation of "if-then" rules for disease prediction by logic learning machine (LLM), and model validation in a different cohort. RESULTS: The training cohort included 1345 individuals: 444 were PBC cases and 901 were healthy controls. After pre-processing, 41,899 variants entered the analysis. Several configurations of parameters related to feature selection were simulated. The best LLM model reached an Accuracy of 71.7%, a Matthews correlation coefficient of 0.29, a Youden's value of 0.21, a Sensitivity of 0.28, a Specificity of 0.93, a Positive Predictive Value of 0.66, and a Negative Predictive Value of 0.72. Thirty-eight rules were generated. The rule with the highest covering (19.14) included the following genes: RIN3, KANSL1, TIMMDC1, TNPO3. The validation cohort included 834 individuals: 255 cases and 579 controls. By applying the ruleset derived in the training cohort, the Area under the Curve of the model was 0.73. CONCLUSIONS: This study represents the first illustration of an ML model applied to common variants associated with PBC. Our approach is computationally feasible, leverages individual-level data to generate intelligible rules, and can be used for disease prediction in at-risk individuals.
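
The performance figures quoted in the abstract above (accuracy, Matthews correlation coefficient, Youden's value, sensitivity, specificity, and predictive values) all derive from a binary confusion matrix. The sketch below shows the standard definitions. The function name and the counts are hypothetical and are not the study's data; the one thing that can be checked against the abstract is that Youden's value equals sensitivity plus specificity minus one (0.28 + 0.93 - 1 = 0.21).

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "youden": sens + spec - 1.0,
        "mcc": (tp * tn - fp * fn) / mcc_den,
    }


# Hypothetical counts for illustration only (not taken from the study)
m = binary_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Reporting the Matthews coefficient alongside accuracy is useful when classes are imbalanced, as in a case-control cohort with roughly twice as many controls as cases.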
