1.
Nature ; 619(7971): 811-818, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37407817

ABSTRACT

RNA viruses have evolved elaborate strategies to protect their genomes, including 5' capping. However, until now no RNA 5' cap has been identified for hepatitis C virus (HCV) [1,2], which causes chronic infection, liver cirrhosis and cancer [3]. Here we demonstrate that the cellular metabolite flavin adenine dinucleotide (FAD) is used as a non-canonical initiating nucleotide by the viral RNA-dependent RNA polymerase, resulting in a 5'-FAD cap on the HCV RNA. The HCV FAD-capping frequency is around 75%, which is the highest observed for any RNA metabolite cap across all kingdoms of life [4-8]. FAD capping is conserved among HCV isolates for the replication-intermediate negative strand and partially for the positive strand. It is also observed in vivo on HCV RNA isolated from patient samples and from the liver and serum of a human liver chimeric mouse model. Furthermore, we show that 5'-FAD capping protects RNA from RIG-I-mediated innate immune recognition but does not stabilize the HCV RNA. These results establish capping with cellular metabolites as a novel viral RNA-capping strategy, which could be used by other viruses and affect anti-viral treatment outcomes and persistence of infection.


Subjects
Flavin-Adenine Dinucleotide , Hepacivirus , RNA Caps , Viral RNA , Animals , Humans , Mice , Chimera/virology , Flavin-Adenine Dinucleotide/metabolism , Hepacivirus/genetics , Hepacivirus/immunology , Hepatitis C/virology , Innate Immunity Recognition , Liver/virology , RNA Stability , Viral RNA/chemistry , Viral RNA/genetics , Viral RNA/immunology , Viral RNA/metabolism , RNA-Dependent RNA Polymerase/metabolism , Virus Replication/genetics , RNA Caps/metabolism
2.
Nucleic Acids Res ; 52(13): 7971-7986, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38842942

ABSTRACT

We present the nuclear magnetic resonance spectroscopy (NMR) solution structure of the 5'-terminal stem loop 5_SL1 (SL1) of the SARS-CoV-2 genome. SL1 contains two A-form helical elements and two regions with non-canonical structure, namely an apical pyrimidine-rich loop and an asymmetric internal loop with one and two nucleotides at the 5'- and 3'-terminal part of the sequence, respectively. The conformational ensemble representing the averaged solution structure of SL1 was validated using NMR residual dipolar coupling (RDC) and small-angle X-ray scattering (SAXS) data. We show that the internal loop is the major binding site for fragments of low molecular weight. This internal loop of SL1 can be stabilized by an A12-C28 interaction that promotes the transient formation of an A+•C base pair. As a consequence, the pKa of the internal loop adenosine A12 is shifted to 5.8, compared to a pKa of 3.63 of free adenosine. Furthermore, applying a recently developed pH-differential mutational profiling (PD-MaP) approach, we not only recapitulated our NMR findings of SL1 but also unveiled multiple sites potentially sensitive to pH across the 5'-UTR of SARS-CoV-2.
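To make the reported pKa shift concrete, the following is a minimal sketch that applies the Henderson-Hasselbalch relation to the two pKa values quoted in the abstract (5.8 for A12, 3.63 for free adenosine). It assumes a single, non-cooperative titration site; the function name and the pH values in the loop are illustrative choices, not taken from the paper.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a single titratable site that is protonated at a given pH.

    Henderson-Hasselbalch for one independent site (an assumption; coupling
    to RNA folding or to other sites is ignored here).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values quoted in the abstract.
PKA_A12 = 5.8
PKA_FREE_ADENOSINE = 3.63

if __name__ == "__main__":
    # The pH grid is an arbitrary illustration.
    for ph in (5.0, 6.0, 7.0):
        a12 = protonated_fraction(ph, PKA_A12)
        free = protonated_fraction(ph, PKA_FREE_ADENOSINE)
        print(f"pH {ph:.1f}: A12 {a12:.4f}, free adenosine {free:.4f}")
```

At any given pH the site with the higher pKa carries the larger protonated population, which is why a shifted pKa makes the A+•C pair accessible closer to neutral pH.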


Subjects
Nucleic Acid Conformation , Viral RNA , SARS-CoV-2 , SARS-CoV-2/genetics , SARS-CoV-2/chemistry , SARS-CoV-2/metabolism , Viral RNA/chemistry , Viral RNA/genetics , Viral RNA/metabolism , Hydrogen-Ion Concentration , Humans , Small-Angle Scattering , COVID-19/virology , COVID-19/genetics , Magnetic Resonance Spectroscopy , X-Ray Diffraction , Binding Sites , Viral Genome , Base Pairing , 5' Untranslated Regions , Molecular Models
3.
RNA ; 28(7): 937-946, 2022 07.
Article in English | MEDLINE | ID: mdl-35483823

ABSTRACT

We describe the conformational ensemble of the single-stranded r(UCAAUC) oligonucleotide obtained using extensive molecular dynamics (MD) simulations and Rosetta's FARFAR2 algorithm. The conformations observed in MD consist of A-form-like structures and variations thereof. These structures are not present in the pool generated using FARFAR2. By comparing with available nuclear magnetic resonance (NMR) measurements, we show that the presence of both A-form-like and other extended conformations is necessary to quantitatively explain experimental data. To further validate our results, we measure solution X-ray scattering (SAXS) data on the RNA hexamer and find that simulations result in more compact structures than observed from these experiments. The integration of simulations with NMR via a maximum entropy approach shows that small modifications to the MD ensemble lead to an improved description of the conformational ensemble. Nevertheless, we identify persisting discrepancies in matching experimental SAXS data.
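The maximum entropy idea mentioned in the abstract can be sketched in its simplest form: reweight simulation frames as little as possible so that one ensemble-averaged observable matches an experimental value. The sketch below assumes a single observable, a uniform prior over frames, and no experimental error; the function name, the bisection bounds and the toy numbers in the usage note are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def maxent_reweight(calc, target, lam_lo=-50.0, lam_hi=50.0, tol=1e-10):
    """Return (weights, lam) with weights w_i proportional to exp(-lam * calc_i)
    such that the weighted average of `calc` equals `target`.

    Simplifying assumptions: one observable, uniform prior, no error model.
    """
    calc = np.asarray(calc, dtype=float)
    shifted = calc - calc.mean()  # shifting does not change normalized weights

    def weights(lam):
        logw = -lam * shifted
        logw -= logw.max()  # avoid overflow in exp
        w = np.exp(logw)
        return w / w.sum()

    def average(lam):
        return float(weights(lam) @ calc)

    # The weighted average decreases monotonically with lam
    # (its derivative is minus the weighted variance).
    if not (average(lam_hi) <= target <= average(lam_lo)):
        raise ValueError("target cannot be reached by reweighting these frames")

    for _ in range(200):
        mid = 0.5 * (lam_lo + lam_hi)
        if average(mid) > target:
            lam_lo = mid
        else:
            lam_hi = mid
        if lam_hi - lam_lo < tol:
            break

    lam = 0.5 * (lam_lo + lam_hi)
    return weights(lam), lam
```

For example, `maxent_reweight([1.0, 2.0, 3.0, 4.0], 3.0)` returns weights that favor the frames with larger values. Practical schemes handle many observables at once and include experimental uncertainties, which this sketch omits.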


Subjects
Molecular Dynamics Simulation , RNA , Magnetic Resonance Spectroscopy , Oligonucleotides , Protein Conformation , Small-Angle Scattering , X-Ray Diffraction
4.
J Am Chem Soc ; 145(30): 16557-16572, 2023 08 02.
Article in English | MEDLINE | ID: mdl-37479220

ABSTRACT

Both experimental and theoretical structure determinations of RNAs have remained challenging due to the intrinsic dynamics of RNAs. We report here an integrated nuclear magnetic resonance/molecular dynamics (NMR/MD) structure determination approach to describe the dynamic structure of the CUUG tetraloop. We show that the tetraloop undergoes substantial dynamics, leading to averaging of the experimental data. These dynamics are particularly linked to the temperature-dependent presence of a hydrogen bond within the tetraloop. Interpreting the NMR data by a single structure represents the low-temperature structure well but fails to capture all conformational states occurring at a higher temperature. We integrate MD simulations, starting from structures of CUUG tetraloops within the Protein Data Bank, with an extensive set of NMR data, and provide a structural ensemble that describes the dynamic nature of the tetraloop and the experimental NMR data well. We thus show that one of the most stable and frequently found RNA tetraloops displays substantial dynamics, warranting such an integrated structural approach.


Subjects
Molecular Dynamics Simulation , RNA , RNA/chemistry , Nucleic Acid Conformation , Magnetic Resonance Spectroscopy , Temperature
5.
Nucleic Acids Res ; 48(11): 5839-5848, 2020 06 19.
Article in English | MEDLINE | ID: mdl-32427326

ABSTRACT

We provide an atomic-level description of the structure and dynamics of the UUCG RNA stem-loop by combining molecular dynamics simulations with experimental data. The integration of simulations with exact nuclear Overhauser enhancement (eNOE) data allowed us to characterize two distinct states of this molecule. The most stable conformation corresponds to the consensus three-dimensional structure. The second state is characterized by the absence of the peculiar non-Watson-Crick interactions in the loop region. By using machine learning techniques we identify a set of experimental measurements that are most sensitive to the presence of non-native states. We find that although our MD ensemble, as well as the consensus UUCG tetraloop structures, are in good agreement with experiments, there are remaining discrepancies. Together, our results show that (i) the MD simulations overstabilize a non-native loop conformation, (ii) eNOE data support its presence with a population of ≈10% and (iii) the structural interpretation of experimental data for dynamic RNAs is highly complex, even for a simple model system such as the UUCG tetraloop.


Subjects
Magnetic Resonance Spectroscopy , Molecular Dynamics Simulation , Movement , Nucleic Acid Conformation , Base Sequence , Bayes Theorem , Datasets as Topic , Entropy , RNA/chemistry
6.
J Am Chem Soc ; 143(22): 8333-8343, 2021 06 09.
Article in English | MEDLINE | ID: mdl-34039006

ABSTRACT

The 5' untranslated region (UTR) of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome is a conserved, functional and structured genomic region consisting of several RNA stem-loop elements. While the secondary structure of such elements has been determined experimentally, their three-dimensional structures are not yet known. Here, we predict the structure and dynamics of five RNA stem loops in the 5'-UTR of SARS-CoV-2 by extensive atomistic molecular dynamics simulations, more than 0.5 ms of aggregate simulation time, in combination with enhanced sampling techniques. We compare simulations with available experimental data, describe the resulting conformational ensembles, and identify the presence of specific structural rearrangements in apical and internal loops that may be functionally relevant. Our atomic-detailed structural predictions reveal rich dynamics in these RNA molecules, could help the experimental characterization of these systems, and provide putative three-dimensional models for structure-based drug design studies.


Subjects
COVID-19/virology , Viral RNA/chemistry , SARS-CoV-2/genetics , 5' Untranslated Regions , Base Sequence , Viral Genome , Humans , Molecular Dynamics Simulation , Molecular Structure , Nucleic Acid Conformation , Viral RNA/genetics , SARS-CoV-2/chemistry
7.
RNA ; 25(2): 219-231, 2019 02.
Article in English | MEDLINE | ID: mdl-30420522

ABSTRACT

RNA molecules are highly dynamic systems characterized by a complex interplay between sequence, structure, dynamics, and function. Molecular simulations can potentially provide powerful insights into the nature of these relationships. The analysis of structures and molecular trajectories of nucleic acids can be nontrivial because it requires processing very high-dimensional data that are not easy to visualize and interpret. Here we introduce Barnaba, a Python library aimed at facilitating the analysis of nucleic acid structures and molecular simulations. The software consists of a variety of analysis tools that allow the user to (i) calculate distances between three-dimensional structures using different metrics, (ii) back-calculate experimental data from three-dimensional structures, (iii) perform cluster analysis and dimensionality reductions, (iv) search three-dimensional motifs in PDB structures and trajectories, and (v) construct elastic network models for nucleic acids and nucleic acids-protein complexes. In addition, Barnaba makes it possible to calculate torsion angles and pucker conformations and to detect base-pairing/base-stacking interactions. Barnaba produces graphics that conveniently visualize both extended secondary structure and dynamics for a set of molecular conformations. The software is available as a command-line tool as well as a library, and supports a variety of file formats such as PDB, dcd, and xtc files. Source code, documentation, and examples are freely available at https://github.com/srnas/barnaba under the GNU GPLv3 license.


Subjects
Computational Biology/methods , Nucleic Acid Conformation , RNA/ultrastructure , Software , Base Pairing/genetics , Protein Databases , Molecular Models
8.
PLoS Comput Biol ; 16(4): e1007870, 2020 04.
Article in English | MEDLINE | ID: mdl-32339173

ABSTRACT

Many proteins contain multiple folded domains separated by flexible linkers, and the ability to describe the structure and conformational heterogeneity of such flexible systems pushes the limits of structural biology. Using the three-domain protein TIA-1 as an example, we here combine coarse-grained molecular dynamics simulations with previously measured small-angle scattering data to study the conformation of TIA-1 in solution. We show that while the coarse-grained potential (Martini) in itself leads to overly compact conformations, increasing the strength of protein-water interactions results in ensembles that are in very good agreement with experiments. We show how these ensembles can be refined further using a Bayesian/Maximum Entropy approach, and examine the robustness to errors in the energy function. In particular we find that as long as the initial simulation is relatively good, reweighting against experiments is very robust. We also study the relative information in X-ray and neutron scattering experiments and find that refining against the SAXS experiments leads to improvement in the SANS data. Our results suggest a general strategy for studying the conformation of multi-domain proteins in solution that combines coarse-grained simulations with small-angle X-ray scattering data, which are generally the easiest to obtain. These results may in turn be used to design further small-angle neutron scattering experiments that exploit contrast variation through 1H/2H isotope substitutions.


Subjects
Molecular Dynamics Simulation , Proteins , Small-Angle Scattering , X-Ray Diffraction , Algorithms , Computational Biology , Neutrons , Protein Conformation , Protein Domains , Proteins/analysis , Proteins/chemistry
9.
Chem Rev ; 118(8): 4177-4338, 2018 04 25.
Article in English | MEDLINE | ID: mdl-29297679

ABSTRACT

With both catalytic and genetic functions, ribonucleic acid (RNA) is perhaps the most pluripotent chemical species in molecular biology, and its functions are intimately linked to its structure and dynamics. Computer simulations, and in particular atomistic molecular dynamics (MD), allow structural dynamics of biomolecular systems to be investigated with unprecedented temporal and spatial resolution. We here provide a comprehensive overview of the fast-developing field of MD simulations of RNA molecules. We begin with an in-depth, evaluatory coverage of the most fundamental methodological challenges that set the basis for the future development of the field, in particular, the current developments and inherent physical limitations of the atomistic force fields and the recent advances in a broad spectrum of enhanced sampling methods. We also survey the closely related field of coarse-grained modeling of RNA systems. After dealing with the methodological aspects, we provide an exhaustive overview of the available RNA simulation literature, ranging from studies of the smallest RNA oligonucleotides to investigations of the entire ribosome. Our review encompasses tetranucleotides, tetraloops, a number of small RNA motifs, A-helix RNA, kissing-loop complexes, the TAR RNA element, the decoding center and other important regions of the ribosome, as well as assorted other systems. Extended sections are devoted to RNA-ion interactions, ribozymes, riboswitches, and protein/RNA complexes. Our overview is written for as broad an audience as possible, aiming to provide a much-needed interdisciplinary bridge between computation and experiment, together with a perspective on the future of the field.


Subjects
Molecular Dynamics Simulation , Nucleic Acid Conformation , RNA/chemistry , Catalysis , Computer Simulation , DNA/chemistry
10.
Nucleic Acids Res ; 46(4): 1674-1683, 2018 02 28.
Article in English | MEDLINE | ID: mdl-29272539

ABSTRACT

We introduce the SPlit-and-conQueR (SPQR) model, a coarse-grained (CG) representation of RNA designed for structure prediction and refinement. In our approach, the representation of a nucleotide consists of a point particle for the phosphate group and an anisotropic particle for the nucleoside. The interactions are, in principle, knowledge-based potentials inspired by the ℰSCORE function, a base-centered scoring function. However, a special treatment is given to base-pairing interactions and certain geometrical conformations which are lost in a raw knowledge-based model. This results in a representation able to describe planar canonical and non-canonical base pairs and base-phosphate interactions and to distinguish sugar puckers and glycosidic torsion conformations. The model is applied to the folding of several structures, including duplexes with internal loops of non-canonical base pairs, tetraloops, junctions and a pseudoknot. For the majority of these systems, experimental structures are correctly predicted at the level of individual contacts. We also propose a method for efficiently reintroducing atomistic detail from the CG representation.


Subjects
Molecular Models , RNA/chemistry , Nucleotide Motifs , Nucleotides/chemistry , Double-Stranded RNA/chemistry
11.
Biochem Biophys Res Commun ; 498(2): 352-358, 2018 03 29.
Article in English | MEDLINE | ID: mdl-29248728

ABSTRACT

Coarse-grained models can be of great help to address the problem of structure prediction in nucleic acids. On one hand they can make the prediction more efficient, while on the other hand they can also help to identify the essential degrees of freedom and interactions for the description of a number of structures. With the aim to provide an all-atom representation in an explicit solvent to the predictions of our SPlit and conQueR (SPQR) coarse-grained model of RNA, we recently introduced a backmapping procedure which enforces the predicted structure into an atomistic one by means of steered molecular dynamics. These simulations minimize the ERMSD, a particular metric which deals exclusively with the relative arrangement of nucleobases, between the atomistic representation and the target structure. In this paper, we explore the effects of this approach on the resulting interaction networks and backbone conformations by applying it on a set of fragments using as a target their native structure. We find that the geometry of the target structures can be reliably recovered, with limitations in the regions with unpaired bases such as bulges. In addition, we observe that the folding pathway can also change depending on the parameters used in the definition of the ERMSD and the use of other metrics such as the RMSD.


Subjects
Molecular Dynamics Simulation , RNA/chemistry , Nucleic Acid Conformation , Nucleic Acids/chemistry , RNA/metabolism , Solvents/chemistry
12.
Nucleic Acids Res ; 44(12): 5883-91, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27091499

ABSTRACT

We introduce a method for predicting RNA folding pathways, with an application to the most important RNA tetraloops. The method is based on the idea that ensembles of three-dimensional fragments extracted from high-resolution crystal structures are heterogeneous enough to describe metastable as well as intermediate states. These ensembles are first validated by performing a quantitative comparison against available solution nuclear magnetic resonance (NMR) data of a set of RNA tetranucleotides. Notably, the agreement is better with respect to the one obtained by comparing NMR with extensive all-atom molecular dynamics simulations. We then propose a procedure based on diffusion maps and Markov models that makes it possible to obtain reaction pathways and their relative probabilities from fragment ensembles. This approach is applied to study the helix-to-loop folding pathway of all the tetraloops from the GNRA and UNCG families. The results give detailed insights into the folding mechanism that are compatible with available experimental data and clarify the role of intermediate states observed in previous simulation studies. The method is computationally inexpensive and can be used to study arbitrary conformational transitions.


Subjects
Oligoribonucleotides/chemistry , RNA Folding , RNA/chemistry , Diffusion , Kinetics , Magnetic Resonance Spectroscopy , Markov Chains , Molecular Dynamics Simulation , Motion , Nucleic Acid Conformation , Thermodynamics
13.
Biophys J ; 113(2): 257-267, 2017 Jul 25.
Article in English | MEDLINE | ID: mdl-28673616

ABSTRACT

We report a map of RNA tetraloop conformations constructed by calculating pairwise distances among all experimentally determined four-nucleotide hairpin loops. Tetraloops with similar structures are clustered together and, as expected, the two largest clusters are the canonical GNRA and UNCG folds. We identify clusters corresponding to known tetraloop folds such as GGUG, RNYA, AGNN, and CUUG. These clusters are represented in a simple two-dimensional projection that recapitulates the relationship among the different folds. The cluster analysis also identifies 20 novel tetraloop folds that are peculiar to specific positions in ribosomal RNAs and that are stabilized by tertiary interactions. In our RNA tetraloop database we find a significant number of non-GNRA and non-UNCG sequences adopting the canonical GNRA and UNCG folds. Conversely, we find a significant number of GNRA and UNCG sequences adopting non-GNRA and non-UNCG folds. Our analysis demonstrates that there is not a simple one-to-one, but rather a many-to-many mapping between tetraloop sequence and tetraloop fold.


Subjects
Nucleic Acid Conformation , RNA , Cluster Analysis , Nucleic Acid Databases , Genetic Models , Molecular Models , Principal Component Analysis , RNA/chemistry , RNA Stability
14.
Nucleic Acids Res ; 43(15): 7260-9, 2015 Sep 03.
Article in English | MEDLINE | ID: mdl-26187990

ABSTRACT

Elastic network models (ENMs) are valuable and efficient tools for characterizing the collective internal dynamics of proteins based on the knowledge of their native structures. The increasing evidence that the biological functionality of RNAs is often linked to their innate internal motions poses the question of whether ENM approaches can be successfully extended to this class of biomolecules. This issue is tackled here by considering various families of elastic networks of increasing complexity applied to a representative set of RNAs. The fluctuations predicted by the alternative ENMs are stringently validated by comparison against extensive molecular dynamics simulations and SHAPE experiments. We find that simulations and experimental data are systematically best reproduced by either an all-atom or a three-beads-per-nucleotide representation (sugar-base-phosphate), with the latter arguably providing the best balance of accuracy and computational complexity.
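As a rough illustration of how an elastic network turns a native structure into predicted fluctuations, here is a sketch of the Gaussian network model (GNM), one of the simplest ENM variants. It is not the all-atom or three-beads-per-nucleotide model evaluated in the paper: it assumes one bead per site, a single distance cutoff (10 Å here, an arbitrary choice), and uniform spring constants, and it reports fluctuations in arbitrary units.

```python
import numpy as np


def gnm_fluctuations(coords, cutoff=10.0):
    """Relative mean-square fluctuation of each bead in a Gaussian network model.

    coords: (N, 3) array of bead positions.
    cutoff: beads closer than this distance are joined by a spring
            (the value is an assumption, not taken from the paper).
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contact = (dist < cutoff) & ~np.eye(n, dtype=bool)

    # Kirchhoff (connectivity) matrix: -1 for each contact, degree on the diagonal.
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))

    # Fluctuations are proportional to the diagonal of the pseudo-inverse;
    # the pseudo-inverse discards the zero mode (rigid translation).
    return np.diag(np.linalg.pinv(kirchhoff))
```

For a straight chain of beads in which only nearest neighbours are in contact, the terminal beads come out more mobile than the central ones, which is the qualitative behaviour such models are meant to capture.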


Subjects
Molecular Models , RNA/chemistry , Molecular Dynamics Simulation , Nucleic Acid Conformation , Osmolar Concentration
15.
Nucleic Acids Res ; 42(21): 13306-14, 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25355509

ABSTRACT

The intricate network of interactions observed in RNA three-dimensional structures is often described in terms of a multitude of geometrical properties, including helical parameters, base pairing/stacking, hydrogen bonding and backbone conformation. We show that a simple molecular representation consisting in one oriented bead per nucleotide can account for the fundamental structural properties of RNA. In this framework, canonical Watson-Crick, non-Watson-Crick base-pairing and base-stacking interactions can be unambiguously identified within a well-defined interaction shell. We validate this representation by performing two independent, complementary tests. First, we use it to construct a sequence-independent, knowledge-based scoring function for RNA structural prediction, which compares favorably to fully atomistic, state-of-the-art techniques. Second, we define a metric to measure deviation between RNA structures that directly reports on the differences in the base-base interaction network. The effectiveness of this metric is tested with respect to the ability to discriminate between structurally and kinetically distant RNA conformations, performing better compared to standard techniques. Taken together, our results suggest that this minimalist, nucleobase-centric representation captures the main interactions that are relevant for describing RNA structure and dynamics.


Subjects
RNA/chemistry , Base Pairing , Kinetics , Molecular Models , Nucleic Acid Conformation , RNA Folding
16.
BMC Bioinformatics ; 16 Suppl 9: S6, 2015.
Article in English | MEDLINE | ID: mdl-26051557

ABSTRACT

INTRODUCTION: Riboswitches are cis-acting regulatory RNA elements prevalently located in the leader sequences of bacterial mRNA. An adenine sensing riboswitch cis-regulates adenosine deaminase gene (add) in Vibrio vulnificus. The structural mechanism regulating its conformational changes upon ligand binding mostly remains to be elucidated. In this open framework it has been suggested that the ligand stabilizes the interaction of the distal "kissing loop" complex. Using accurate full-atom molecular dynamics with explicit solvent in combination with enhanced sampling techniques and advanced analysis methods it could be possible to provide a more detailed perspective on the formation of these tertiary contacts. METHODS: In this work, we used umbrella sampling simulations to study the thermodynamics of the kissing loop complex in the presence and in the absence of the cognate ligand. We enforced the breaking/formation of the loop-loop interaction restraining the distance between the two loops. We also assessed the convergence of the results by using two alternative initialization protocols. A structural analysis was performed using a novel approach to analyze base contacts. RESULTS: Contacts between the two loops were progressively lost when larger inter-loop distances were enforced. Inter-loop Watson-Crick contacts survived at larger separation when compared with non-canonical pairing and stacking interactions. Intra-loop stacking contacts remained formed upon loop undocking. Our simulations qualitatively indicated that the ligand could stabilize the kissing loop complex. We also compared with previously published simulation studies. DISCUSSION AND CONCLUSIONS: Kissing complex stabilization given by the ligand was compatible with available experimental data. However, the dependence of its value on the initialization protocol of the umbrella sampling simulations posed some questions on the quantitative interpretation of the results and called for better converged enhanced sampling simulations.


Subjects
Adenine/chemistry , Molecular Dynamics Simulation , RNA/chemistry , Riboswitch/genetics , Humans , Nucleic Acid Conformation , Thermodynamics
17.
Proteins ; 82(2): 288-99, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23934827

ABSTRACT

We propose a method to formulate probabilistic models of protein structure in atomic detail, for a given amino acid sequence, based on Bayesian principles, while retaining a close link to physics. We start from two previously developed probabilistic models of protein structure on a local length scale, which concern the dihedral angles in main chain and side chains, respectively. Conceptually, this constitutes a probabilistic and continuous alternative to the use of discrete fragment and rotamer libraries. The local model is combined with a nonlocal model that involves a small number of energy terms according to a physical force field, and some information on the overall secondary structure content. In this initial study we focus on the formulation of the joint model and the evaluation of the use of an energy vector as a descriptor of a protein's nonlocal structure; hence, we derive the parameters of the nonlocal model from the native structure without loss of generality. The local and nonlocal models are combined using the reference ratio method, which is a well-justified probabilistic construction. For evaluation, we use the resulting joint models to predict the structure of four proteins. The results indicate that the proposed method and the probabilistic models show considerable promise for probabilistic protein structure prediction and related applications.


Subjects
Molecular Models , Statistical Models , Algorithms , Amino Acid Sequence , Bacterial Proteins/chemistry , Bayes Theorem , Hydrogen Bonding , Protein Secondary Structure , Protein Tertiary Structure , Protein Structural Homology , Thermodynamics
18.
J Comput Chem ; 34(19): 1697-705, 2013 Jul 15.
Article in English | MEDLINE | ID: mdl-23619610

ABSTRACT

We present a new software framework for Markov chain Monte Carlo sampling for simulation, prediction, and inference of protein structure. The software package contains implementations of recent advances in Monte Carlo methodology, such as efficient local updates and sampling from probabilistic models of local protein structure. These models form a probabilistic alternative to the widely used fragment and rotamer libraries. Combined with an easily extendible software architecture, this makes PHAISTOS well suited for Bayesian inference of protein structure from sequence and/or experimental data. Currently, two force-fields are available within the framework: PROFASI and OPLS-AA/L, the latter including the generalized Born surface area solvent model. A flexible command-line and configuration-file interface allows users quickly to set up simulations with the desired configuration. PHAISTOS is released under the GNU General Public License v3.0. Source code and documentation are freely available from http://phaistos.sourceforge.net. The software is implemented in C++ and has been tested on Linux and OSX platforms.
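The core idea of Markov chain Monte Carlo sampling can be shown with a textbook Metropolis sampler. The sketch below is written in Python for a one-dimensional toy energy and is unrelated to the PHAISTOS code base (which is C++); the energy function, step size, temperature and all names are illustrative assumptions.

```python
import math
import random


def metropolis(energy, x0, n_steps, step=1.0, kt=1.0, seed=0):
    """Sample from exp(-energy(x) / kt) with symmetric uniform proposals.

    Returns the list of visited states (one per step, rejected moves repeat
    the current state, as the Metropolis scheme requires).
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kt):
            x, e = x_new, e_new
        samples.append(x)
    return samples


def harmonic(x):
    """Toy energy: a harmonic well with unit force constant."""
    return 0.5 * x * x
```

With `harmonic` and `kt=1.0` the target distribution is a standard normal, so the sample mean should approach 0 and the sample variance should approach 1 for long runs. Protein-structure samplers replace the scalar state with a full conformation and the uniform proposal with structure-aware moves.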


Subjects
Markov Chains , Monte Carlo Method , Proteins/chemistry , Software , Bayes Theorem , Computer Simulation , Chemical Models , Protein Conformation
19.
Nat Commun ; 14(1): 4175, 2023 07 13.
Article in English | MEDLINE | ID: mdl-37443362

ABSTRACT

Proteins play important roles in biology, biotechnology and pharmacology, and missense variants are a common cause of disease. Discovering functionally important sites in proteins is a central but difficult problem because of the lack of large, systematic data sets. Sequence conservation can highlight residues that are functionally important but is often convoluted with a signal for preserving structural stability. We here present a machine learning method to predict functional sites by combining statistical models for protein sequences with biophysical models of stability. We train the model using multiplexed experimental data on variant effects and validate it broadly. We show how the model can be used to discover active sites, as well as regulatory and binding sites. We illustrate the utility of the model by prospective prediction and subsequent experimental validation on the functional consequences of missense variants in HPRT1 which may cause Lesch-Nyhan syndrome, and pinpoint the molecular mechanisms by which they cause disease.


Subjects
Hypoxanthine Phosphoribosyltransferase , Lesch-Nyhan Syndrome , Humans , Prospective Studies , Hypoxanthine Phosphoribosyltransferase/genetics , Hypoxanthine Phosphoribosyltransferase/metabolism , Proteins/genetics , Missense Mutation
20.
J Pers Med ; 12(10)2022 Sep 26.
Article in English | MEDLINE | ID: mdl-36294727

ABSTRACT

BACKGROUND: The application of Machine Learning (ML) to genetic individual-level data represents a foreseeable advancement for the field, which is still in its infancy. Here, we aimed to evaluate the feasibility and accuracy of an ML-based model for disease risk prediction applied to Primary Biliary Cholangitis (PBC). METHODS: Genome-wide significant variants identified in subjects of European ancestry in the recently released second international meta-analysis of GWAS in PBC were used as input data. Quality-checked, individual genomic data from two Italian cohorts were used. The ML included the following steps: import of genotype and phenotype data, genetic variant selection, supervised classification of PBC by genotype, generation of "if-then" rules for disease prediction by logic learning machine (LLM), and model validation in a different cohort. RESULTS: The training cohort included 1345 individuals: 444 were PBC cases and 901 were healthy controls. After pre-processing, 41,899 variants entered the analysis. Several configurations of parameters related to feature selection were simulated. The best LLM model reached an Accuracy of 71.7%, a Matthews correlation coefficient of 0.29, a Youden's value of 0.21, a Sensitivity of 0.28, a Specificity of 0.93, a Positive Predictive Value of 0.66, and a Negative Predictive Value of 0.72. Thirty-eight rules were generated. The rule with the highest covering (19.14) included the following genes: RIN3, KANSL1, TIMMDC1, TNPO3. The validation cohort included 834 individuals: 255 cases and 579 controls. By applying the ruleset derived in the training cohort, the Area under the Curve of the model was 0.73. CONCLUSIONS: This study represents the first illustration of an ML model applied to common variants associated with PBC. Our approach is computationally feasible, leverages individual-level data to generate intelligible rules, and can be used for disease prediction in at-risk individuals.
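The performance measures listed in the abstract all derive from a binary confusion matrix. The sketch below shows the standard definitions; the abstract does not report the underlying counts, so the counts in the usage note are hypothetical toy values and do not reproduce the study's results.

```python
from math import sqrt


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "youden": sensitivity + specificity - 1.0,
        # Matthews correlation coefficient; defined as 0 when a margin is empty.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }
```

For instance, `binary_metrics(tp=30, fn=70, tn=180, fp=20)` (hypothetical counts) describes a classifier with low sensitivity and high specificity. In such a case accuracy can look reasonable while the Youden index and the Matthews correlation coefficient stay modest, which is why the abstract reports several measures side by side.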