Results 1 - 12 of 12
1.
Brief Bioinform ; 24(4), 2023 07 20.
Article in English | MEDLINE | ID: mdl-37332013

ABSTRACT

We report the structure-based pathogenicity relationship identifier (SPRI), a novel computational tool for accurate evaluation of pathological effects of missense single mutations and prediction of higher-order spatially organized units of mutational clusters. SPRI can effectively extract properties determining pathogenicity encoded in protein structures, and can identify deleterious missense mutations of germ line origin associated with Mendelian diseases, as well as mutations of somatic origin associated with cancer drivers. It compares favorably to other methods in predicting deleterious mutations. Furthermore, SPRI can discover spatially organized pathogenic higher-order spatial clusters (patHOS) of deleterious mutations, including those of low recurrence, and can be used for discovery of candidate cancer driver genes and driver mutations. We further demonstrate that SPRI can take advantage of AlphaFold2 predicted structures and can be deployed for saturation mutation analysis of the whole human proteome.


Subject(s)
Mutation, Missense , Neoplasms , Humans , Virulence , Mutation , Neoplasms/genetics , Computational Biology/methods
2.
J Am Chem Soc ; 140(3): 1105-1115, 2018 01 24.
Article in English | MEDLINE | ID: mdl-29262680

ABSTRACT

Outer membrane protein G (OmpG) from Escherichia coli has exhibited pH-dependent gating that can be employed by bacteria to alter the permeability of their outer membranes in response to environmental changes. We developed a computational model, Protein Topology of Zoetic Loops (Pretzel), to investigate the roles of OmpG extracellular loops implicated in gating. The key interactions predicted by our model were verified by single-channel recording data. Our results indicate that the gating equilibrium is primarily controlled by an electrostatic interaction network formed between the gating loop and charged residues in the lumen. The results shed light on the mechanism of OmpG gating and will provide a fundamental basis for the engineering of OmpG as a nanopore sensor. Our computational Pretzel model could be applied to other outer membrane proteins that contain intricate dynamic loops that are functionally important.
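
As a rough illustration of why pH can shift an electrostatic network such as the one described above, the protonated fraction of a single isolated titratable group follows the Henderson-Hasselbalch relation. This sketch is not part of Pretzel; the function name and the pKa of 6.0 are assumptions chosen only for illustration.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of an isolated titratable group that is protonated at a given pH.

    Henderson-Hasselbalch for a single site: 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical pKa; real side-chain pKa values depend on the protein environment.
ASSUMED_PKA = 6.0

for ph in (5.0, 6.0, 7.0):
    fraction = protonated_fraction(ph, ASSUMED_PKA)
    print(f"pH {ph:.1f}: protonated fraction = {fraction:.3f}")
```

The charge state of such a group changes most steeply within about one pH unit of its pKa, which is where a pH-sensitive interaction network would be expected to respond most strongly.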


Subject(s)
Bacterial Outer Membrane Proteins/metabolism , Escherichia coli K12/metabolism , Escherichia coli Proteins/metabolism , Porins/metabolism , Bacterial Outer Membrane Proteins/chemistry , Escherichia coli K12/chemistry , Escherichia coli Proteins/chemistry , Hydrogen-Ion Concentration , Ion Channel Gating , Models, Molecular , Porins/chemistry , Protein Conformation , Static Electricity
3.
Curr Opin Struct Biol ; 71: 200-214, 2021 12.
Article in English | MEDLINE | ID: mdl-34399301

ABSTRACT

Computational three-dimensional chromatin modeling has helped uncover principles of genome organization. Here, we discuss methods for modeling three-dimensional chromatin structures, with focus on a minimalistic polymer model which inverts population Hi-C into single-cell conformations. Utilizing only basic physical properties, this model reveals that a few specific Hi-C interactions can fold chromatin into conformations consistent with single-cell imaging, Dip-C, and FISH measurements. Aggregated single-cell chromatin conformations also reproduce Hi-C frequencies. This approach allows quantification of structural heterogeneity and discovery of many-body interaction units and has revealed additional insights, including (1) topologically associating domains as a byproduct of folding driven by specific interactions, (2) cell subpopulations with different structural scaffolds are developmental stage dependent, and (3) the functional landscape of many-body units within enhancer-rich regions. We also discuss these findings in relation to the genome structure-function relationship.
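
The statement that aggregated single-cell conformations reproduce Hi-C frequencies can be pictured with a toy calculation: count, for each pair of loci, the fraction of conformations in which the pair lies within a contact distance. This is a minimal sketch, not the authors' pipeline; the function name, the bead coordinates, and the cutoff of 1.5 are all assumptions.

```python
import math
from itertools import combinations


def contact_frequencies(ensemble, cutoff):
    """Fraction of conformations in which beads i and j lie within `cutoff`.

    `ensemble` is a list of conformations; each conformation is a list of
    (x, y, z) bead coordinates, with the same bead count in every conformation.
    """
    n_beads = len(ensemble[0])
    counts = {pair: 0 for pair in combinations(range(n_beads), 2)}
    for conformation in ensemble:
        for i, j in counts:
            if math.dist(conformation[i], conformation[j]) <= cutoff:
                counts[(i, j)] += 1
    return {pair: count / len(ensemble) for pair, count in counts.items()}


# Two hypothetical three-bead conformations: one extended, one folded.
toy_ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
]
frequencies = contact_frequencies(toy_ensemble, cutoff=1.5)
print(frequencies)
```

In this toy ensemble the end-to-end pair (0, 2) is in contact in only one of the two conformations, so its aggregated frequency is lower than that of neighboring beads, which is the kind of population-level signal a Hi-C map records.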


Subject(s)
Chromatin , Chromosomes , Chromatin Assembly and Disassembly , Genome , Molecular Conformation
4.
Methods Mol Biol ; 2186: 159-169, 2021.
Article in English | MEDLINE | ID: mdl-32918736

ABSTRACT

Bacterial porins often exhibit ion conductance and gating behavior which can be modulated by pH. However, the underlying control mechanism of gating is often complex, and direct inspection of the protein structure is generally insufficient for full mechanistic understanding. Here we describe Pretzel, a computational framework that can effectively model loop-based gating events in membrane proteins. Our method combines Monte Carlo conformational sampling, structure clustering, ensemble energy evaluation, and a topological gating criterion to model the equilibrium gating state under the pH environment of interest. We discuss details of applying Pretzel to the porin outer membrane protein G (OmpG).
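
One generic way to combine sampled conformations, their energies, and an open/closed label into an equilibrium estimate is Boltzmann weighting. The abstract does not say that Pretzel uses this exact formula, so the following is only a sketch under that assumption; `open_probability`, its inputs, and the choice of energies in units of kT are hypothetical.

```python
import math


def open_probability(energies_kt, is_open):
    """Boltzmann-weighted probability that the channel is open.

    `energies_kt` holds one energy per sampled conformation (or cluster),
    expressed in units of kT; `is_open` holds the matching booleans from
    whatever gating criterion is applied. Both are assumed inputs.
    """
    if not energies_kt or len(energies_kt) != len(is_open):
        raise ValueError("need equally sized, non-empty inputs")
    # Shift by the minimum energy so the exponentials stay numerically stable.
    e_min = min(energies_kt)
    weights = [math.exp(-(energy - e_min)) for energy in energies_kt]
    open_weight = sum(w for w, flag in zip(weights, is_open) if flag)
    return open_weight / sum(weights)


# Hypothetical two-state example: the closed state lies ln(3) kT above the open state.
p_open = open_probability([0.0, math.log(3.0)], [True, False])
print(f"P(open) = {p_open:.2f}")
```

Repeating such a calculation with energies evaluated at different pH values would give a pH-dependent gating curve, assuming the energy function captures the relevant protonation changes.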


Subject(s)
Bacterial Outer Membrane Proteins/chemistry , Escherichia coli Proteins/chemistry , Ion Channel Gating , Molecular Dynamics Simulation , Porins/chemistry , Bacterial Outer Membrane Proteins/metabolism , Escherichia coli Proteins/metabolism , Hydrogen-Ion Concentration , Monte Carlo Method , Porins/metabolism , Protein Domains
5.
Nat Commun ; 12(1): 205, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33420075

ABSTRACT

Single-cell chromatin studies provide insights into how chromatin structure relates to functions of individual cells. However, balancing high resolution and genome-wide coverage remains challenging. We describe a computational method for the reconstruction of large 3D-ensembles of single-cell (sc) chromatin conformations from population Hi-C that we apply to study embryogenesis in Drosophila. With minimal assumptions of physical properties and without adjustable parameters, our method generates large ensembles of chromatin conformations via deep-sampling. Our method identifies specific interactions, which constitute 5-6% of Hi-C frequencies, but surprisingly are sufficient to drive chromatin folding, giving rise to the observed Hi-C patterns. Modeled sc-chromatins quantify chromatin heterogeneity, revealing significant changes during embryogenesis. Furthermore, >50% of modeled sc-chromatins maintain topologically associating domains (TADs) in early embryos, when no population TADs are perceptible. Domain boundaries become fixated during development, with strong preference at binding-sites of insulator-complexes upon the midblastula transition. Overall, high-resolution 3D-ensembles of sc-chromatin conformations enable further in-depth interpretation of population Hi-C, improving understanding of the structure-function relationship of genome organization.


Subject(s)
Chromatin Assembly and Disassembly , Chromatin/chemistry , Drosophila/genetics , Embryonic Development , Animals , Biophysics , Chromosomes, Insect/chemistry , Chromosomes, Insect/genetics , Computational Biology , Genetic Heterogeneity , Genome , Models, Molecular , Molecular Conformation
6.
Genome Biol ; 21(1): 13, 2020 01 16.
Article in English | MEDLINE | ID: mdl-31948478

ABSTRACT

Chromatin interactions are important for gene regulation and cellular specialization. Emerging evidence suggests many-body spatial interactions play important roles in condensing super-enhancer regions into a cohesive transcriptional apparatus. Chromosome conformation studies using Hi-C are limited to pairwise, population-averaged interactions and are therefore unsuitable for direct assessment of many-body interactions. We describe a computational model, CHROMATIX, which reconstructs ensembles of single-cell chromatin structures by deconvolving Hi-C data and identifies significant many-body interactions. For a diverse set of highly active transcriptional loci with at least 2 super-enhancers, we detail the many-body functional landscape and show DNase accessibility, POLR2A binding, and decreased H3K27me3 are predictive of interaction-enriched regions.


Subject(s)
Chromatin/chemistry , Models, Genetic , Transcription, Genetic , Computational Biology/methods , Enhancer Elements, Genetic , Genome , Machine Learning , Promoter Regions, Genetic , Single-Cell Analysis
7.
Article in English | MEDLINE | ID: mdl-34085045

ABSTRACT

In this study, we focus on the following question: do genomic regions enriched in cancer variant mutations have significantly different chromatin folding patterns? We utilize publicly available Hi-C data to characterize chromatin folding patterns in healthy (GM12878) and cancer (K562) cells based on status of A/B compartmentalization and random vs non-random chromatin physical interactions. We then perform statistical testing to assess if chromatin folding patterns in cancer variant-enriched loci are significantly different from non-enriched loci. Our results indicate that loci with cancer variant status have significantly altered (FDR < 0.05) chromatin folding patterns.
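
The abstract reports significance at FDR < 0.05 but does not name the correction procedure. A widely used choice is the Benjamini-Hochberg adjustment, sketched below purely as an assumption about how such a threshold might be applied; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-locus p-values, not taken from the study.
raw = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

With these toy values only the first test survives the 0.05 threshold, even though three raw p-values fall below 0.05, which shows why the adjusted threshold is stricter than an unadjusted one.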

8.
Article in English | MEDLINE | ID: mdl-34136829

ABSTRACT

Missense SNPs are key factors contributing to many Mendelian disorders and complex diseases. Identifying whether a single amino acid substitution will lead to pathological effects is important for interpreting personal genomes and for precision medicine. In this study, we describe a novel method for predicting whether a missense SNP is likely to bring about pathological effects. Our approach integrates sequence information, biophysical properties, and topological properties of protein structures. On our test dataset of 500 deleterious and 500 neutral variants, our method achieves an accuracy of 0.823. The ROC curve of the model has an AUC of 0.910. Our method outperforms two well-known methods and is comparable with the widely used PolyPhen-2 method, while requiring a much smaller amount (approximately 25%) of training data. Our method can be used to aid in distinguishing driver and passenger mutations in cancer and in assessing missense mutations associated with rare diseases. It can also be used to identify mutations in rare diseases where only limited patient exome data exist.
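
For readers unfamiliar with the AUC figure quoted above, the area under the ROC curve equals the probability that a randomly chosen deleterious variant receives a higher score than a randomly chosen neutral one. The sketch below computes it by direct pair counting; the function name and the scores are made up for illustration and are unrelated to the study's data.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predictor scores for three deleterious and three neutral variants.
deleterious = [0.9, 0.8, 0.4]
neutral = [0.7, 0.3, 0.2]
auc = roc_auc(deleterious, neutral)
print(f"AUC = {auc:.3f}")
```

Pair counting is quadratic in the number of samples, which is fine for a 500 versus 500 test set; rank-based formulas give the same value more efficiently on larger data.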

9.
Article in English | MEDLINE | ID: mdl-35261984

ABSTRACT

With the rapid progress of cancer genome studies, many missense mutations in populations of somatic cells of different cancer types and at different stages have been identified. However, it is challenging to understand the implications of these cancer-related variants. We have developed a computational method that integrates structural, topographical, and evolutionary information for assessments of biochemical effects and the extent of deleteriousness of the cancer-related variants. We have mapped somatic missense mutations from the Catalogue of Somatic Mutations In Cancer (COSMIC) to 3D structures in the Protein Data Bank (PDB). Our results show that a large portion of these missense mutations is located on protein surface pockets, which often serve as a structural and functional unit of cancer variants. We provide detailed analysis of several examples and assessment on the importance of these variants, including prediction of previously unreported cancer-variants, along with independent evidence from the literature. Furthermore, we show our predictions can inform on the functional roles and the mechanism of predicted cancer variants.

10.
Article in English | MEDLINE | ID: mdl-29780972

ABSTRACT

Information on protein hydrogen exchange can help delineate key regions involved in protein-protein interactions and provides important insight towards determining functional roles of genetic variants and their possible mechanisms in disease processes. Previous studies have shown that the degree of hydrogen exchange is affected by hydrogen bond formations, solvent accessibility, proximity to other residues, and experimental conditions. However, a general predictive method for identifying residues capable of hydrogen exchange transferable to a broad set of proteins is lacking. We have developed a machine learning method based on random forest that can predict whether a residue experiences hydrogen exchange. Using data from the Start2Fold database, which contains information on 13,306 residues (3,790 of which experience hydrogen exchange and 9,516 which do not exchange), our method achieves good performance. Specifically, we achieve an overall out-of-bag (OOB) error, an unbiased estimate of the test set error, of 20.3 percent. Using a randomly selected test data set consisting of 500 residues experiencing hydrogen exchange and 500 which do not, our method achieves an accuracy of 0.79, a recall of 0.74, a precision of 0.82, and an F1 score of 0.78.
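
The four test-set metrics quoted above are mutually constrained, which can be checked with a small calculation. The confusion-matrix counts below are not reported in the abstract; they are one reconstruction, assumed here, that is consistent with a balanced 500/500 test set and the rounded values given.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Assumed counts: recall 0.74 on 500 exchanging residues gives 370 true positives;
# precision near 0.82 then implies about 81 false positives among 500 negatives.
metrics = classification_metrics(tp=370, fp=81, fn=130, tn=419)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed counts all four metrics round to the reported values, so the figures in the abstract are internally consistent.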

11.
Pac Symp Biocomput ; : 159-70, 2013.
Article in English | MEDLINE | ID: mdl-23424121

ABSTRACT

Despite thousands of reported studies unveiling gene-level signatures for complex diseases, few of these techniques work at the single-sample level with explicit underpinning of biological mechanisms. This presents both a critical dilemma in the field of personalized medicine as well as a plethora of opportunities for analysis of RNA-seq data. In this study, we hypothesize that the "Functional Analysis of Individual Microarray Expression" (FAIME) method we developed could be smoothly extended to RNA-seq data and unveil intrinsic underlying mechanism signatures across different scales of biological data for the same complex disease. Using publicly available RNA-seq data for gastric cancer, we confirmed the effectiveness of this method (i) to translate each sample transcriptome to pathway-scale scores, (ii) to predict deregulated pathways in gastric cancer against gold standards (FDR<5%, Precision=75%, Recall=92%), and (iii) to predict phenotypes in an independent dataset and expression platform (RNA-seq vs microarrays, Fisher Exact Test p<10⁻⁶). Measuring at a single-sample level, FAIME could differentiate cancer samples from normal ones; furthermore, it achieved comparable performance in identifying differentially expressed pathways as compared to state-of-the-art cross-sample methods. These results motivate future work on mechanism-level biomarker discovery predictive of diagnoses, treatment, and therapy.


Subject(s)
Precision Medicine/statistics & numerical data , Sequence Analysis, RNA/statistics & numerical data , Transcriptome , Cluster Analysis , Computational Biology , Data Interpretation, Statistical , Databases, Nucleic Acid/statistics & numerical data , Humans , Models, Statistical , Oligonucleotide Array Sequence Analysis/statistics & numerical data , RNA, Neoplasm/genetics , ROC Curve , Stomach Neoplasms/genetics
12.
J Parallel Distrib Comput ; 72(1): 83-93, 2012 Jan.
Article in English | MEDLINE | ID: mdl-23125479

ABSTRACT

Genome resequencing with short reads generated from pyrosequencing generally relies on mapping the short reads against a single reference genome. However, mapping of reads from multiple reference genomes is not possible using a pairwise mapping algorithm. In order to align the reads with respect to each other and the reference genomes, existing multiple sequence alignment (MSA) methods cannot be used because they do not take into account the position of these short reads with respect to the genome, and are highly inefficient for large numbers of sequences. In this paper, we develop a highly scalable parallel algorithm based on domain decomposition, referred to as P-Pyro-Align, to align such large numbers of reads from single or multiple reference genomes. The proposed alignment algorithm accurately aligns the erroneous reads, and has been implemented on a cluster of workstations using the MPI library. Experimental results for different problem sizes are analyzed in terms of execution time, quality of the alignments, and the ability of the algorithm to handle reads from multiple haplotypes. We report high-quality multiple alignments of up to 0.5 million reads. The algorithm is shown to be highly scalable and exhibits super-linear speedups with increasing number of processors.
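
The term "super-linear speedup" has a precise meaning: the serial-to-parallel runtime ratio exceeds the processor count, so parallel efficiency is above 1. The short sketch below makes that definition concrete; the timings are hypothetical and do not come from the paper.

```python
def speedup_and_efficiency(t_serial, t_parallel, n_procs):
    """Return (speedup, efficiency) for a parallel run.

    speedup = serial time / parallel time; efficiency = speedup / processors.
    """
    speedup = t_serial / t_parallel
    return speedup, speedup / n_procs


# Hypothetical timings in seconds on 4 processors.
speedup, efficiency = speedup_and_efficiency(t_serial=100.0, t_parallel=20.0, n_procs=4)
is_super_linear = speedup > 4
print(f"speedup = {speedup:.2f}, efficiency = {efficiency:.2f}, super-linear = {is_super_linear}")
```

Super-linear behavior commonly arises when splitting the problem lets each processor's share fit in faster memory, or when the per-domain work grows faster than linearly with domain size, so that smaller domains cost disproportionately less.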
