Results 1 - 16 of 16
1.
J Biol Chem ; : 107362, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38735478

ABSTRACT

Cooperative interactions in protein-protein interfaces demonstrate the interdependency or the linked network-like behavior and their effect on the coupling of proteins. Cooperative interactions could also cause ripple or allosteric effects at a distance in protein-protein interfaces. Although they are critically important in protein-protein interfaces, it is challenging to determine which amino acid pair interactions are cooperative. In this work we have used Bayesian network modeling, an interpretable machine learning method, combined with molecular dynamics trajectories to identify the residue pairs that show high cooperativity and their allosteric effect in the interface of G protein-coupled receptor (GPCR) complexes with Gα subunits. Our results reveal six GPCR:Gα contacts that are common to the different Gα subtypes and show strong cooperativity in the formation of the interface. Both the C-terminus helix5 and the core of the G protein are codependent entities and play an important role in GPCR coupling. We show that a promiscuous GPCR coupling to different Gα subtypes makes all the GPCR:Gα contacts that are specific to each Gα subtype (Gαs, Gαi and Gαq). This work underscores the potential of data-driven Bayesian network modeling in elucidating the intricate dependencies and selectivity determinants in GPCR:G protein complexes, offering valuable insights into the dynamic nature of these essential cellular signaling components.

2.
Res Sq ; 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38645262

ABSTRACT

Enhancers are fundamental to gene regulation. Post-translational modifications by the small ubiquitin-like modifiers (SUMO) modify chromatin regulation enzymes, including histone acetylases and deacetylases. However, it remains unclear whether SUMOylation regulates enhancer marks, acetylation at the 27th lysine residue of the histone H3 protein (H3K27Ac). To investigate whether SUMOylation regulates H3K27Ac, we performed genome-wide ChIP-seq analyses and discovered that knockdown (KD) of the SUMO activating enzyme catalytic subunit UBA2 reduced H3K27Ac at most enhancers. Bioinformatic analysis revealed that TFAP2C-binding sites are enriched in enhancers whose H3K27Ac was reduced by UBA2 KD. ChIP-seq analysis in combination with molecular biological methods showed that TFAP2C binding to enhancers increased upon UBA2 KD or inhibition of SUMOylation by a small-molecule SUMOylation inhibitor. However, this is not due to the SUMOylation of TFAP2C itself. Proteomics analysis of the TFAP2C interactome on the chromatin identified histone deacetylation (HDAC) and RNA splicing machineries that contain many SUMOylation targets. TFAP2C KD reduced HDAC1 binding to chromatin and increased H3K27Ac marks at enhancer regions, suggesting that TFAP2C is important in recruiting HDAC machinery. Taken together, our findings provide insights into the regulation of enhancer marks by SUMOylation and TFAP2C and suggest that SUMOylation of proteins in the HDAC machinery regulates their recruitment to enhancers.

3.
Cancers (Basel) ; 15(24)2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38136405

ABSTRACT

Next-generation cancer and oncology research needs to take full advantage of the multimodal structured, or graph, information, with the graph data types ranging from molecular structures to spatially resolved imaging and digital pathology, biological networks, and knowledge graphs. Graph Neural Networks (GNNs) efficiently combine the graph structure representations with the high predictive performance of deep learning, especially on large multimodal datasets. In this review article, we survey the landscape of recent (2020-present) GNN applications in the context of cancer and oncology research, and delineate six currently predominant research areas. We then identify the most promising directions for future research. We compare GNNs with graphical models and "non-structured" deep learning, and devise guidelines for cancer and oncology researchers or physician-scientists, asking the question of whether they should adopt the GNN methodology in their research pipelines.

4.
bioRxiv ; 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37873104

ABSTRACT

Cooperative interactions in protein-protein interfaces demonstrate the interdependency or the linked network-like behavior of interface interactions and their effect on the coupling of proteins. Cooperative interactions could also cause ripple or allosteric effects at a distance in protein-protein interfaces. Although they are critically important in protein-protein interfaces, it is challenging to determine which amino acid pair interactions are cooperative. In this work we have used Bayesian network modeling, an interpretable machine learning method, combined with molecular dynamics trajectories to identify the residue pairs that show high cooperativity and their allosteric effect in the interface of G protein-coupled receptor (GPCR) complexes with G proteins. Our results reveal a strong co-dependency in the formation of interface GPCR:G protein contacts. This observation indicates that cooperativity of GPCR:G protein interactions is necessary for the coupling and selectivity of G proteins and is thus critical for receptor function. We have identified subnetworks containing polar and hydrophobic interactions that are common among multiple GPCRs coupling to different G protein subtypes (Gs, Gi and Gq). These common subnetworks along with G protein-specific subnetworks together confer selectivity to the G protein coupling. This work underscores the potential of data-driven Bayesian network modeling in elucidating the intricate dependencies and selectivity determinants in GPCR:G protein complexes, offering valuable insights into the dynamic nature of these essential cellular signaling components.

5.
iScience ; 26(2): 106041, 2023 Feb 17.
Article in English | MEDLINE | ID: mdl-36818303

ABSTRACT

Modern artificial neural networks (ANNs) have long been designed on foundations of mathematics as opposed to their original foundations of biomimicry. However, the structure and function of these modern ANNs are often analogous to real-life biological networks. We propose that the ubiquitous information-theoretic principles underlying the development of ANNs are similar to the principles guiding the macro-evolution of biological networks and that insights gained from one field can be applied to the other. We generate hypotheses on the bow-tie network structure of the Janus kinase - signal transducers and activators of transcription (JAK-STAT) pathway, additionally informed by evolutionary considerations, and carry out ANN simulation experiments to demonstrate that an increase in the network's input and output complexity does not necessarily require a more complex intermediate layer. This observation should guide novel biomarker discovery: namely, to prioritize sections of the biological networks in which information is most compressed as opposed to biomarkers representing the periphery of the network.

6.
Res Sq ; 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38168398

ABSTRACT

While there are currently over 40 replicated genes with mapped risk alleles for Late Onset Alzheimer's disease (LOAD), the Apolipoprotein E locus E4 haplotype is still the biggest driver of risk, with odds ratios for neuropathologically confirmed E44 carriers exceeding 30 (95% confidence interval 16.59-58.75). We sought to address whether the APOE E4 haplotype modifies expression globally through networks of expression to increase LOAD risk. We have used the Human Brainome data to build expression networks comparing APOE E4 carriers to non-carriers using scalable mixed-datatypes Bayesian network (BN) modeling. We have found that VGF had the greatest explanatory weight. High expression of VGF is a protective signal, even on the background of APOE E4 alleles. LOAD risk signals, considering an APOE background, include high levels of SPECC1L, HLA-DRA and RANBP3L. Our findings nominate several new transcripts, taking a combined approach to network building including known LOAD risk loci.

7.
Math Biosci Eng ; 18(6): 8603-8621, 2021 Oct 09.
Article in English | MEDLINE | ID: mdl-34814315

ABSTRACT

Bayesian Network (BN) modeling is a prominent and increasingly popular computational systems biology method. It aims to construct network graphs from large heterogeneous biological datasets that reflect the underlying biological relationships. Currently, a variety of strategies exist for evaluating BN methodology performance, ranging from utilizing artificial benchmark datasets and models, to specialized biological benchmark datasets, to simulation studies that generate synthetic data from predefined network models. The last is arguably the most comprehensive approach; however, existing implementations often rely on explicit and implicit assumptions that may be unrealistic in a typical biological data analysis scenario, or are poorly equipped for automated arbitrary model generation. In this study, we develop a purely probabilistic simulation framework that addresses the demands of statistically sound simulation studies in an unbiased fashion. Additionally, we expand on our current understanding of the theoretical notions of causality and dependence / conditional independence in BNs and the Markov Blankets within.
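As an illustration of what "generating synthetic data from a predefined network model" can mean in its simplest form, the sketch below draws samples from a tiny discrete BN by forward (ancestral) sampling. It is not the framework described in the abstract; the three-node network, its probabilities, and the names `PARENTS`, `CPT`, `sample_once` and `simulate` are all hypothetical and chosen only for the example.

```python
import random

# Hypothetical binary network: A -> B and A -> C (an assumption for illustration).
PARENTS = {"A": [], "B": ["A"], "C": ["A"]}

# P(node = 1 | parent values), keyed by the tuple of parent values.
CPT = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.5, (1,): 0.9},
}

# Nodes listed so that every parent precedes its children.
ORDER = ["A", "B", "C"]


def sample_once(rng):
    """Draw one joint sample by visiting nodes in topological order."""
    row = {}
    for node in ORDER:
        key = tuple(row[p] for p in PARENTS[node])
        row[node] = 1 if rng.random() < CPT[node][key] else 0
    return row


def simulate(n, seed=0):
    """Return n independent samples from the network, reproducibly seeded."""
    rng = random.Random(seed)
    return [sample_once(rng) for _ in range(n)]


samples = simulate(20000)
freq_a = sum(s["A"] for s in samples) / len(samples)
with_a = [s for s in samples if s["A"] == 1]
freq_b_given_a = sum(s["B"] for s in with_a) / len(with_a)
print(f"P(A=1) ~ {freq_a:.3f}, P(B=1 | A=1) ~ {freq_b_given_a:.3f}")
```

A structure-learning method can then be scored by how well it recovers the known edges from `samples`, which is the general idea behind simulation-based benchmarking.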


Subject(s)
Systems Biology; Bayes Theorem; Computer Simulation
8.
Int J Mol Sci ; 22(5)2021 Feb 26.
Article in English | MEDLINE | ID: mdl-33652558

ABSTRACT

Cancer immunotherapy, specifically immune checkpoint blockade, has been found to be effective in the treatment of metastatic cancers. However, only a subset of patients achieve clinical responses. Elucidating pretreatment biomarkers predictive of sustained clinical response is a major research priority. Another research priority is evaluating changes in the immune system before and after treatment in responders vs. nonresponders. Our group has been studying immune networks as an accurate reflection of the global immune state. Flow cytometry (FACS, fluorescence-activated cell sorting) data characterizing immune cell panels in peripheral blood mononuclear cells (PBMC) from gastroesophageal adenocarcinoma (GEA) patients were used to analyze changes in immune networks in this setting. Here, we describe a novel computational pipeline to perform secondary analyses of FACS data using systems biology/machine learning techniques and concepts. The pipeline is centered around comparative Bayesian network analyses of immune networks and is capable of detecting strong signals that conventional methods (such as FlowJo manual gating) might miss. Future studies are planned to validate and follow up the immune biomarkers (and combinations/interactions thereof) associated with clinical responses identified with this computational pipeline.


Subject(s)
Adenocarcinoma; Flow Cytometry; Gastrointestinal Neoplasms; Immunotherapy; Leukocytes, Mononuclear; Adenocarcinoma/blood; Adenocarcinoma/immunology; Adenocarcinoma/therapy; Gastrointestinal Neoplasms/blood; Gastrointestinal Neoplasms/immunology; Gastrointestinal Neoplasms/therapy; Humans; Leukocytes, Mononuclear/immunology; Leukocytes, Mononuclear/metabolism; Leukocytes, Mononuclear/pathology
9.
Front Genet ; 11: 648, 2020.
Article in English | MEDLINE | ID: mdl-32625238

ABSTRACT

We propose a novel two-stage analysis strategy to discover candidate genes associated with particular cancer outcomes in large multimodal genomic cancer databases, such as The Cancer Genome Atlas (TCGA). During the first stage, we use mixed mutual information to perform variable selection; during the second stage, we use scalable Bayesian network (BN) modeling to identify candidate genes and their interactions. Two crucial features of the proposed approach are (i) the ability to handle mixed data types (continuous and discrete, genomic, epigenomic, etc.) and (ii) a flexible boundary between the variable selection and network modeling stages, a boundary that can be adjusted in accordance with the investigators' BN software scalability and hardware implementation. These two aspects result in high generalizability of the proposed analytical framework. We apply the above strategy to three different TCGA datasets (LGG, Brain Lower Grade Glioma; HNSC, Head and Neck Squamous Cell Carcinoma; STES, Stomach and Esophageal Carcinoma), linking multimodal molecular information (SNPs, mRNA expression, DNA methylation) to two clinical outcome variables (tumor status and patient survival). We identify 11 candidate genes, of which 6 have already been directly implicated in the cancer literature. One novel LGG prognostic factor suggested by our analysis, methylation of TMPRSS11F type II transmembrane serine protease, presents an intriguing direction for follow-up studies.
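The first stage described above (ranking variables by mutual information with an outcome and keeping the top few) can be sketched as follows. This is a simplified illustration for discrete variables only, whereas the abstract refers to mixed (continuous and discrete) mutual information. The function names, the toy data, and the cut-off `k` are assumptions made for the example; `k` stands in for the adjustable boundary between the selection and network-modeling stages.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def select_variables(data, outcome, k):
    """Return the names of the k variables sharing the most information with outcome."""
    scores = {name: mutual_information(col, outcome) for name, col in data.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (hypothetical): one outcome and three candidate variables.
outcome = [0, 0, 1, 1, 0, 1, 0, 1]
data = {
    "gene_a": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the outcome
    "gene_b": [0, 1, 0, 1, 0, 1, 0, 1],  # partially informative
    "gene_c": [1, 1, 1, 1, 0, 0, 0, 0],  # independent of the outcome in this sample
}

selected = select_variables(data, outcome, k=2)
print(selected)
```

Only the variables in `selected` would be passed on to the second, more expensive BN modeling stage; raising or lowering `k` trades completeness against what the BN software and hardware can handle.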

10.
Immunity ; 52(6): 1105-1118.e9, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32553173

ABSTRACT

The challenges in recapitulating in vivo human T cell development in laboratory models have posed a barrier to understanding human thymopoiesis. Here, we used single-cell RNA sequencing (sRNA-seq) to interrogate the rare CD34+ progenitor and the more differentiated CD34- fractions in the human postnatal thymus. CD34+ thymic progenitors were comprised of a spectrum of specification and commitment states characterized by multilineage priming followed by gradual T cell commitment. The earliest progenitors in the differentiation trajectory were CD7- and expressed a stem-cell-like transcriptional profile, but had also initiated T cell priming. Clustering analysis identified a CD34+ subpopulation primed for the plasmacytoid dendritic lineage, suggesting an intrathymic dendritic specification pathway. CD2 expression defined T cell commitment stages where loss of B cell potential preceded that of myeloid potential. These datasets delineate gene expression profiles spanning key differentiation events in human thymopoiesis and provide a resource for the further study of human T cell development.


Subject(s)
Cell Differentiation/genetics; Cell Lineage/genetics; Lymphopoiesis/genetics; T-Lymphocytes/metabolism; Thymocytes/metabolism; Animals; Biomarkers; Computational Biology; Gene Expression Profiling; Gene Expression Regulation, Developmental; High-Throughput Nucleotide Sequencing/methods; Humans; Immunophenotyping; Mice; Single-Cell Analysis; T-Lymphocytes/cytology; Thymocytes/cytology; Transcriptome
11.
J Mol Evol ; 87(4-6): 184-198, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31302723

ABSTRACT

Recent developments in sequencing and growth of bioinformatics resources provide us with vast depositories of protein network and single nucleotide polymorphism data. It allows us to re-examine, on a larger and more comprehensive scale, the relationship between protein-protein interactions and protein variability and evolutionary rates. This relationship has remained far from unambiguously resolved for quite a long time, reflecting shifting analysis approaches in the literature, and growing data availability. In this study, we utilized several public genomic databases to investigate this relationship in human, mouse, pig, chicken, and zebrafish. We observed strong non-linear relationship patterns (tending towards convex decreasing function shapes) between protein variability and the density of corresponding protein-protein interactions across all five species. To investigate further, we carried out stochastic simulations, modeling the interplay between protein connectivity and variability. Our results indicate that a simple negative linear correlation model, often suggested (or tacitly assumed) in the literature, as either a null or an alternative hypothesis, is not a good fit with the observed data. After considering different (but still relatively simple, and not overfitting) simulation models, we found that a convex decreasing protein variability-connectivity function (specifically, exponential decay) led to a much better fit with the real data. We conclude that simple correlation models might be inadequate for describing protein variability-connectivity interplay in vertebrates; they often tend towards false negatives (showing no more than marginal linear or rank correlation where there are in fact strong non-random patterns).
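To make the contrast between a linear model and a convex decreasing (exponential decay) model concrete, the sketch below fits both to the same points and compares their residual sums of squares. It is not the stochastic simulation used in the study; the data are synthetic, generated from an assumed decay curve, and the exponential model is fitted by ordinary least squares on log-transformed values, which requires all values to be positive. All names here are hypothetical.

```python
from math import exp, log


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def sse(ys, preds):
    """Sum of squared residuals between observations and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds))


def compare_models(xs, ys):
    """Return (linear_sse, exponential_sse) for the same data."""
    a, b = fit_line(xs, ys)
    linear_sse = sse(ys, [a + b * x for x in xs])
    # Exponential decay y = c * exp(slope * x), fitted as a line in log space.
    log_c, slope = fit_line(xs, [log(y) for y in ys])
    exponential_sse = sse(ys, [exp(log_c + slope * x) for x in xs])
    return linear_sse, exponential_sse


# Synthetic "variability versus connectivity" points from an assumed decay curve.
xs = list(range(10))
ys = [5.0 * exp(-0.5 * x) for x in xs]

linear_sse, exponential_sse = compare_models(xs, ys)
print(f"linear SSE = {linear_sse:.3f}, exponential SSE = {exponential_sse:.3e}")
```

On data of this shape the straight line leaves a large residual while the decay model fits almost exactly, which is the kind of mismatch that a weak linear or rank correlation can hide.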


Subject(s)
Evolution, Molecular; Models, Statistical; Stochastic Processes; Vertebrates/genetics; Animals; Computational Biology/methods; Computer Simulation; Databases, Protein; Humans; Protein Interaction Domains and Motifs/physiology
12.
Life (Basel) ; 8(1)2018 Feb 08.
Article in English | MEDLINE | ID: mdl-29419741

ABSTRACT

The identity/recognition of tRNAs, in the context of aminoacyl tRNA synthetases (and other molecules), is a complex phenomenon that has major implications ranging from the origins and evolution of translation machinery and genetic code to the evolution and speciation of tRNAs themselves to human mitochondrial diseases to artificial genetic code engineering. Deciphering it via laboratory experiments, however, is difficult and necessarily time- and resource-consuming. In this study, we propose a mathematically rigorous two-pronged in silico approach to identifying and classifying tRNA positions important for tRNA identity/recognition, rooted in machine learning and information-theoretic methodology. We apply Bayesian Network modeling to elucidate the structure of intra-tRNA-molecule relationships, and distribution divergence analysis to identify meaningful inter-molecule differences between various tRNA subclasses. We illustrate the complementary application of these two approaches using tRNA examples across the three domains of life, and identify and discuss important (informative) positions therein. In summary, we deliver to the tRNA research community a novel, comprehensive methodology for identifying the specific elements of interest in various tRNA molecules, which can be followed up by the corresponding experimental work and/or high-resolution position-specific statistical analyses.

13.
Proc Natl Acad Sci U S A ; 114(48): E10359-E10368, 2017 Nov 28.
Article in English | MEDLINE | ID: mdl-29133398

ABSTRACT

Long-range intrachromosomal interactions play an important role in 3D chromosome structure and function, but our understanding of how various factors contribute to the strength of these interactions remains poor. In this study we used a recently developed analysis framework for Bayesian network (BN) modeling to analyze publicly available datasets for intrachromosomal interactions. We investigated how 106 variables affect the pairwise interactions of over 10 million 5-kb DNA segments in the B-lymphocyte cell line GM12878. Strictly data-driven BN modeling indicates that the strength of intrachromosomal interactions (hic_strength) is directly influenced by only four types of factors: distance between segments, Rad21 or SMC3 (cohesin components), transcription at transcription start sites (TSS), and the number of CCCTC-binding factor (CTCF)-cohesin complexes between the interacting DNA segments. Subsequent studies confirmed that most high-intensity interactions have a CTCF-cohesin complex in at least one of the interacting segments. However, 46% have CTCF on only one side, and 32% are without CTCF. As expected, high-intensity interactions are strongly dependent on the orientation of the ctcf motif, and, moreover, we find that the interaction between enhancers and promoters is similarly dependent on ctcf motif orientation. Dependency relationships between transcription factors were also revealed, including known lineage-determining B-cell transcription factors (e.g., Ebf1) as well as potential novel relationships. Thus, BN analysis of large intrachromosomal interaction datasets is a useful tool for gaining insight into DNA-DNA, protein-DNA, and protein-protein interactions.


Subject(s)
Bayes Theorem; Chromatin/metabolism; DNA/metabolism; Models, Molecular; B-Lymphocytes; Binding Sites; Cell Cycle Proteins/metabolism; Cell Line; Chondroitin Sulfate Proteoglycans/metabolism; Chromatin/chemistry; Chromosomal Proteins, Non-Histone/metabolism; Computational Biology; DNA/chemistry; DNA-Binding Proteins/metabolism; Datasets as Topic; Humans; Molecular Conformation; Nuclear Proteins/metabolism; Nucleotide Motifs; Phosphoproteins/metabolism; Promoter Regions, Genetic; Protein Interaction Mapping/methods; Software; Transcription Factors/metabolism; Transcription Initiation Site; Transcription, Genetic
14.
J Comput Biol ; 24(4): 340-356, 2017 Apr.
Article in English | MEDLINE | ID: mdl-27681505

ABSTRACT

Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypotheses and testing and validating the existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, the software scalability and usability on less-than-exotic computer hardware are a priority, as well as the applicability of the algorithm and software to heterogeneous datasets containing many data types: single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite levels, epidemiological variables, endpoints, and phenotypes, etc.


Subject(s)
Algorithms; Bayes Theorem; Computational Biology/methods; Gene Regulatory Networks; Software; Genome-Wide Association Study; Humans; Metabolomics; Middle Aged; Models, Genetic; Prospective Studies
15.
Nucleic Acids Res ; 43(15): e100, 2015 Sep 03.
Article in English | MEDLINE | ID: mdl-25977295

ABSTRACT

Data on biological mechanisms of aging are mostly obtained from cross-sectional study designs. An inherent disadvantage of this design is that inter-individual differences can mask small but biologically significant age-dependent changes. A serially sampled design (same individual at different time points) would overcome this problem but is often limited by the relatively small numbers of available paired samples and the statistics being used. To overcome these limitations, we have developed a new vector-based approach, termed three-component analysis, which incorporates temporal distance, signal intensity and variance into one single score for gene ranking and is combined with gene set enrichment analysis. We tested our method on a unique age-based sample set of human skin fibroblasts and combined genome-wide transcription, DNA methylation and histone methylation (H3K4me3 and H3K27me3) data. Importantly, our method can now for the first time demonstrate a clear age-dependent decrease in expression of genes coding for proteins involved in translation and ribosome function. Using analogies with data from lower organisms, we propose a model where age-dependent down-regulation of protein translation-related components contributes to extending human lifespan.


Subject(s)
Aging/genetics; Epigenesis, Genetic; Gene Expression Profiling; Protein Biosynthesis; Adult; Aged; Algorithms; Cells, Cultured; Cluster Analysis; DNA Methylation; Down-Regulation; Forkhead Transcription Factors/metabolism; Histones/metabolism; Humans; Male; Methylation; Middle Aged; Skin/metabolism; Statistics, Nonparametric
16.
Pharmacogenomics ; 12(9): 1349-60, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21919609

ABSTRACT

Pharmacogenetics aims to elucidate the genetic factors underlying the individual's response to pharmacotherapy. Coupled with the recent (and ongoing) progress in high-throughput genotyping, sequencing and other genomic technologies, pharmacogenetics is rapidly transforming into pharmacogenomics, while pursuing the primary goals of identifying and studying the genetic contribution to drug therapy response and adverse effects, and existing drug characterization and new drug discovery. Accomplishment of both of these goals hinges on gaining a better understanding of the underlying biological systems; however, reverse-engineering biological system models from the massive datasets generated by the large-scale genetic epidemiology studies presents a formidable data analysis challenge. In this article, we review the recent progress made in developing such data analysis methodology within the paradigm of systems biology research that broadly aims to gain a 'holistic', or 'mechanistic' understanding of biological systems by attempting to capture the entirety of interactions between the components (genetic and otherwise) of the system.


Subject(s)
Drug-Related Side Effects and Adverse Reactions/genetics; Systems Biology; Artificial Intelligence; Drug Discovery; Epistasis, Genetic; Genotype; Humans; Molecular Epidemiology; Pharmacogenetics; Research Design; Statistics as Topic