Results 1 - 20 of 28
1.
Am J Hum Genet ; 95(6): 744-53, 2014 Dec 04.
Article in English | MEDLINE | ID: mdl-25434007

ABSTRACT

Schizophrenia (SZ) genome-wide association studies (GWASs) have identified common risk variants in >100 susceptibility loci; however, the contribution of rare variants at these loci remains largely unexplored. One of the strongly associated loci spans MIR137 (miR137) and MIR2682 (miR2682), two microRNA genes important for neuronal function. We sequenced ∼6.9 kb of MIR137/MIR2682 and upstream regulatory sequences in 2,610 SZ cases and 2,611 controls of European ancestry. We identified 133 rare variants with minor allele frequency (MAF) <0.5%. The rare variant burden in promoters and enhancers, but not insulators, was associated with SZ (p = 0.021 for MAF < 0.5%, p = 0.003 for MAF < 0.1%). A rare enhancer SNP, 1:g.98515539A>T, was present exclusively in 11 SZ cases (nominal p = 4.8 × 10⁻⁴). We further identified its risk allele T in 2 of 2,434 additional SZ cases, 11 of 4,339 bipolar (BP) cases, and 3 of 3,572 SZ/BP study controls and 1,688 population controls, yielding combined p values of 0.0007, 0.0013, and 0.0001 for SZ, BP, and SZ/BP, respectively. The risk allele T of 1:g.98515539A>T reduced enhancer activity of its flanking sequence by >50% in human neuroblastoma cells, predicting lower expression of MIR137/MIR2682. Both empirical and computational analyses showed weaker transcription factor (YY1) binding by the risk allele. Chromatin conformation capture (3C) assay further indicated that 1:g.98515539A>T influenced MIR137/MIR2682, but not the nearby DPYD or LOC729987. Our results suggest that rare noncoding risk variants are associated with SZ and BP at the MIR137/MIR2682 locus, with risk alleles decreasing MIR137/MIR2682 expression.


Subject(s)
Bipolar Disorder/genetics , Gene Expression Regulation/genetics , Genetic Variation , MicroRNAs/genetics , Schizophrenia/genetics , Alleles , Base Sequence , Cell Line, Tumor , Gene Frequency , Genes, Reporter , Genetic Loci , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Molecular Sequence Data , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , Risk , Sequence Analysis, DNA
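
As a rough check on the nominal p value quoted in record 1, the sketch below computes a one-sided Fisher's exact test for a variant seen in 11 of 2,610 cases and 0 of 2,611 controls. The abstract does not say which test the authors used, so the choice of test and the function name `one_sided_fisher_p` are assumptions made for illustration; counts are treated as carriers per individual.

```python
from math import comb

def one_sided_fisher_p(carriers_cases, carriers_controls, n_cases, n_controls):
    """Probability, under the hypergeometric null, that at least
    `carriers_cases` of all observed carriers fall among the cases."""
    total_carriers = carriers_cases + carriers_controls
    n_total = n_cases + n_controls
    denom = comb(n_total, total_carriers)
    p = 0.0
    # Sum the upper tail: k carriers among cases, the rest among controls.
    for k in range(carriers_cases, min(total_carriers, n_cases) + 1):
        p += comb(n_cases, k) * comb(n_controls, total_carriers - k) / denom
    return p

# Counts taken from the abstract; the test itself is an assumption.
p = one_sided_fisher_p(11, 0, 2610, 2611)
print(f"{p:.2e}")  # about 4.8e-04
```

With nearly equal group sizes, the result is close to (1/2)^11, which is why 11 case-only carriers give a p value near 5 × 10⁻⁴.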
2.
BMC Bioinformatics ; 17: 68, 2016 Feb 04.
Article in English | MEDLINE | ID: mdl-26846597

ABSTRACT

BACKGROUND: The intrinsic bendability of DNA plays an important role in a myriad of essential cellular mechanisms. The flexibility of a DNA fragment can be experimentally and computationally examined by its propensity for cyclization, quantified by the Jacobson-Stockmayer J factor. In this study, we use a well-established coarse-grained three-dimensional model of DNA and seven distinct sets of experimentally and computationally derived conformational parameters of the double helix to evaluate the role of structural parameters in calculating DNA cyclization. RESULTS: We calculate the cyclization rates of 86 DNA sequences with previously measured J factors and lengths between 57 and 325 bp as well as of 20,000 randomly generated DNA sequences with lengths between 350 and 4000 bp. Our comparison with experimental data is complemented with analysis of simulated data. CONCLUSIONS: Our data demonstrate that all sets of parameters yield very similar results for longer DNA fragments, regardless of the nucleotide sequence, which are in agreement with experimental measurements. However, for DNA fragments shorter than 100 bp, all sets of parameters performed poorly, yielding results that differ from the experimental measurements by several orders of magnitude. Our data show that DNA cyclization rates calculated using conformational parameters based on nucleosome packaging data are most similar to the experimental measurements. Overall, our study provides a comprehensive large-scale assessment of the role of structural parameters in calculating DNA cyclization rates.


Subject(s)
Biophysical Phenomena , DNA/chemistry , Nucleic Acid Conformation , Cyclization , Humans , Models, Molecular , Pliability
3.
PLoS Comput Biol ; 9(1): e1002881, 2013.
Article in English | MEDLINE | ID: mdl-23341768

ABSTRACT

Physicochemical properties of DNA, such as shape, affect protein-DNA recognition. However, the properties of DNA that are most relevant for predicting the binding sites of particular transcription factors (TFs) or classes of TFs have yet to be fully understood. Here, using a model that accurately captures the melting behavior and breathing dynamics (spontaneous local openings of the double helix) of double-stranded DNA, we simulated the dynamics of known binding sites of the TF and nucleoid-associated protein Fis in Escherichia coli. Our study involves simulations of breathing dynamics, analysis of large published in vitro and genomic datasets, and targeted experimental tests of our predictions. Our simulation results and available in vitro binding data indicate a strong correlation between DNA breathing dynamics and Fis binding. Indeed, we can define an average DNA breathing profile that is characteristic of Fis binding sites. This profile is significantly enriched among the identified in vivo E. coli Fis binding sites. To test our understanding of how Fis binding is influenced by DNA breathing dynamics, we designed base-pair substitutions, mismatch, and methylation modifications of DNA regions that are known to interact (or not interact) with Fis. The goal in each case was to make the local DNA breathing dynamics either closer to or farther from the breathing profile characteristic of a strong Fis binding site. For the modified DNA segments, we found that Fis-DNA binding, as assessed by gel-shift assay, changed in accordance with our expectations. We conclude that Fis binding is associated with DNA breathing dynamics, which in turn may be regulated by various nucleotide modifications.


Subject(s)
DNA-Binding Proteins/metabolism , DNA/metabolism , Escherichia coli Proteins/metabolism , Binding Sites , Models, Molecular , Protein Binding
4.
Nucleic Acids Res ; 40(20): 10116-23, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-22904068

ABSTRACT

The genome-wide mapping of the major gene expression regulators, the transcription factors (TFs) and their DNA binding sites, is of great importance for describing cellular behavior and phenotypic diversity. Presently, the methods for prediction of genomic TF binding produce a large number of false positives, most likely due to insufficient description of the physicochemical mechanisms of protein-DNA binding. Growing evidence suggests that, in the cell, the double-stranded DNA (dsDNA) is subject to local transient strand separations (breathing) that contribute to genomic functions. By using site-specific chromatin immunoprecipitations, gel shifts, BIOBASE data, and our model that accurately describes the melting behavior and breathing dynamics of dsDNA, we report a specific DNA breathing profile found at YY1 binding sites in cells. We find that the genomic flanking sequence variations and SNPs may exert long-range effects on DNA dynamics and predetermine YY1 binding. The ubiquitous TF YY1 has a fundamental role in essential biological processes by activating, initiating or repressing transcription depending upon the sequence context it binds. We anticipate that consensus binding sequences together with the related DNA dynamics profile may significantly improve the accuracy of predicting genomic TF binding sites and TF binding-related functional SNPs.


Subject(s)
DNA/chemistry , YY1 Transcription Factor/metabolism , Base Sequence , Binding Sites , Consensus Sequence , HeLa Cells , Humans , Molecular Dynamics Simulation , Plasminogen/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Protein Binding
5.
Physiol Rep ; 11(15): e15742, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37537137

ABSTRACT

Obesity continues to rise in juveniles, and obese children are more likely to develop metabolic syndrome (MetS) and related cardiovascular disease. Unfortunately, effective prevention and long-term treatment options remain limited. We determined the juvenile cardiac response to MetS in a swine model. Juvenile male swine were fed either an obesogenic diet, to induce MetS, or a lean diet, as a control (LD). Myocardial ischemia was induced with a surgically placed ameroid constrictor on the left circumflex artery. Physiological data were recorded, and at 22 weeks of age the animals underwent a terminal harvest procedure; myocardial tissue was extracted for total metabolic and proteomic LC-MS/MS and RNA-seq analysis, and the data underwent nonnegative matrix factorization for metabolic signatures. Glycolysis-related metabolites and enzymes were significantly altered in MetS versus LD. In MetS compared with LD, an imbalance in the expression of glycogen synthase 1 (GYS1) and glycogen phosphorylases (PYGM/PYGL) resulted in a loss of myocardial glycogen. Our findings are consistent with the concept that transcriptionally driven myocardial changes in glycogen and glucose metabolism-related enzymes lead to a deficiency of their metabolite products in MetS. This abnormal energy metabolism provides insight into the pathogenesis of the juvenile heart in MetS. This study reveals that MetS and ischemia diminish ATP availability in the myocardium by altering the glucose-G6P-pyruvate axis at the level of metabolites and gene expression of related enzymes. The observed severe glycogen depletion in MetS coincides with an imbalance in expression of GYS1 and both PYGM and PYGL. This altered energy substrate metabolism is a potential target of pharmacological agents for improving juvenile myocardial function in MetS and ischemia.


Subject(s)
Metabolic Syndrome , Pediatric Obesity , Swine , Male , Animals , Metabolic Syndrome/metabolism , Proteomics/methods , Myocardium/metabolism , Glycolysis , Ischemia/metabolism , Disease Models, Animal
6.
bioRxiv ; 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37745370

ABSTRACT

Motivation: The two strands of the DNA double helix locally and spontaneously separate and recombine in living cells due to the inherent thermal DNA motion. These dynamics result in transient openings in the double helix and are referred to as "DNA breathing" or "DNA bubbles." The propensity to form local transient openings is important in a wide range of biological processes, such as transcription, replication, and transcription factor binding. However, the modeling and computer simulation of these phenomena have remained a challenge due to the complex interplay of numerous factors, such as temperature, salt content, DNA sequence, hydrogen bonding, base stacking, and others. Results: We present pyDNA-EPBD, a parallel software implementation of the Extended Peyrard-Bishop-Dauxois (EPBD) nonlinear DNA model that allows us to describe some features of DNA dynamics in detail. Using the Markov Chain Monte Carlo (MCMC) algorithm, pyDNA-EPBD generates genomic-scale profiles of average base-pair openings, base flipping probability, and DNA bubble probability, and calculates the characteristically dynamic length, which indicates the number of base pairs statistically significantly affected by a single point mutation.
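
To make the idea in record 6 concrete, here is a minimal sketch of a Metropolis (MCMC) sampler for a PBD-style chain that estimates the average opening of each base pair. It is not the pyDNA-EPBD implementation: every name (`MORSE`, `average_opening`, and so on) is hypothetical, the parameter values are illustrative assumptions rather than values from the abstract, and the stacking constant is kept sequence-independent for brevity, whereas the extended model makes it sequence-dependent.

```python
import math
import random

# Illustrative parameters (assumptions): Morse depth D (eV) and width a (1/Å).
MORSE = {"A": (0.05, 4.2), "T": (0.05, 4.2), "G": (0.075, 6.9), "C": (0.075, 6.9)}
K, RHO, BETA = 0.025, 2.0, 0.35  # stacking constants (assumed values)
KT = 0.0267                      # thermal energy in eV, roughly 310 K

def stacking(y1, y2):
    """Anharmonic stacking energy between two neighbouring base pairs."""
    return 0.5 * K * (1.0 + RHO * math.exp(-BETA * (y1 + y2))) * (y1 - y2) ** 2

def local_energy(seq, y, n):
    """Energy terms that depend on the opening y[n] of base pair n."""
    d, a = MORSE[seq[n]]
    e = d * (math.exp(-a * y[n]) - 1.0) ** 2  # Morse (hydrogen-bond) term
    for m in (n - 1, n + 1):
        if 0 <= m < len(seq):
            e += stacking(y[n], y[m])
    return e

def average_opening(seq, sweeps=500, step=0.1, seed=0):
    """Metropolis sampling of openings; returns the mean opening per base pair."""
    rng = random.Random(seed)
    y = [0.0] * len(seq)
    totals = [0.0] * len(seq)
    for _ in range(sweeps):
        for n in range(len(seq)):
            old = y[n]
            e_old = local_energy(seq, y, n)
            y[n] = old + rng.uniform(-step, step)  # propose a small move
            delta = local_energy(seq, y, n) - e_old
            # Reject uphill moves with probability 1 - exp(-delta / kT).
            if delta > 0 and rng.random() >= math.exp(-delta / KT):
                y[n] = old
        for n in range(len(seq)):
            totals[n] += y[n]
    return [t / sweeps for t in totals]

profile = average_opening("GCGCGCATATATGCGCGC")
print([round(v, 3) for v in profile])
```

The sketch omits burn-in, multiple independent runs, and the bubble and flipping statistics the abstract mentions; it only shows how a per-base-pair opening profile can come out of MCMC sampling.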

7.
Nucleic Acids Res ; 38(6): 1790-5, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20019064

ABSTRACT

We assess the role of DNA breathing dynamics as a determinant of promoter strength and transcription start site (TSS) location. We compare DNA Langevin dynamic profiles of representative gene promoters, calculated with the extended non-linear PBD model of DNA with experimental data on transcription factor binding and transcriptional activity. Our results demonstrate that DNA dynamic activity at the TSS can be suppressed by mutations that do not affect basal transcription factor binding-DNA contacts. We use this effect to establish the separate contributions of transcription factor binding and DNA dynamics to transcriptional activity. Our results argue against a purely 'transcription factor-centric' view of transcription initiation, suggesting that both DNA dynamics and transcription factor binding are necessary conditions for transcription initiation.


Subject(s)
Gene Expression Regulation , Promoter Regions, Genetic , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic , Computer Simulation , DNA/chemistry , HeLa Cells , Humans , Mutation
8.
Sci Rep ; 12(1): 8539, 2022 May 20.
Article in English | MEDLINE | ID: mdl-35595786

ABSTRACT

Quantum annealers manufactured by D-Wave Systems, Inc., are computational devices capable of finding high-quality heuristic solutions of NP-hard problems. In this contribution, we explore the potential and effectiveness of such quantum annealers for computing Boolean tensor networks. Tensors offer a natural way to model high-dimensional data commonplace in many scientific fields, and representing a binary tensor as a Boolean tensor network is the task of expressing a tensor containing categorical (i.e., [Formula: see text]) values as a product of low-dimensional binary tensors. A Boolean tensor network is computed by Boolean tensor decomposition, and it is usually not exact. The aim of such decomposition is to minimize the given distance measure between the high-dimensional input tensor and the product of lower-dimensional (usually three-dimensional) tensors and matrices representing the tensor network. In this paper, we introduce and analyze three general algorithms for Boolean tensor networks: Tucker, Tensor Train, and Hierarchical Tucker networks. The computation of a Boolean tensor network is reduced to a sequence of Boolean matrix factorizations, which we show can be expressed as a quadratic unconstrained binary optimization problem suitable for solving on a quantum annealer. By using a novel method we introduce called parallel quantum annealing, we demonstrate that Boolean tensors with up to millions of elements can be decomposed efficiently using a D-Wave 2000Q quantum annealer.
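
The building block named in record 8, Boolean matrix factorization, can be illustrated with a small sketch of its objective: the Boolean product of two binary factors and its distance from the input matrix. Hamming distance is assumed here as the distance measure, the function names are hypothetical, and the QUBO reformulation and the annealer itself are not reproduced.

```python
def boolean_product(a, b):
    """Boolean matrix product: C[i][j] = OR over k of (A[i][k] AND B[k][j])."""
    return [[int(any(a[i][k] and b[k][j] for k in range(len(b))))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def hamming_distance(x, y):
    """Number of entries in which two equally sized binary matrices differ."""
    return sum(xv != yv for xr, yr in zip(x, y) for xv, yv in zip(xr, yr))

# A 3x3 binary matrix and a candidate rank-2 Boolean factorization.
X = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
A = [[1, 0],
     [1, 1],
     [0, 1]]
B = [[1, 1, 0],
     [0, 1, 1]]

print(hamming_distance(X, boolean_product(A, B)))  # prints 0: exact factorization
```

A factorization routine would search over binary `A` and `B` to minimize this distance; the abstract states that this search can be written as a quadratic unconstrained binary optimization problem.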

9.
IEEE/ACM Trans Comput Biol Bioinform ; 19(3): 1670-1682, 2022.
Article in English | MEDLINE | ID: mdl-33400654

ABSTRACT

A central challenge in protein modeling research and protein structure prediction in particular is known as decoy selection. The problem refers to selecting biologically-active/native tertiary structures among a multitude of physically-realistic structures generated by template-free protein structure prediction methods. Research on decoy selection is active. Clustering-based methods are popular, but they fail to identify good/near-native decoys on datasets where near-native decoys are severely under-sampled by a protein structure prediction method. Reasonable progress is reported by methods that additionally take into account the internal energy of a structure and employ it to identify basins in the energy landscape organizing the multitude of decoys. These methods, however, incur significant time costs for extracting basins from the landscape. In this paper, we propose a novel decoy selection method based on non-negative matrix factorization. We demonstrate that our method outperforms energy landscape-based methods. In particular, the proposed method addresses both the time cost issue and the challenge of identifying good decoys in a sparse dataset, successfully recognizing near-native decoys for both easy and hard protein targets.


Subject(s)
Algorithms , Proteins , Cluster Analysis , Protein Conformation , Protein Folding , Proteins/chemistry , Proteins/genetics
10.
Cell Genom ; 2(11): None, 2022 Nov 09.
Article in English | MEDLINE | ID: mdl-36388765

ABSTRACT

Mutational signature analysis is commonly performed in cancer genomic studies. Here, we present SigProfilerExtractor, an automated tool for de novo extraction of mutational signatures, and benchmark it against 13 other bioinformatics tools by using 34 scenarios encompassing 2,500 simulated signatures found in 60,000 synthetic genomes and 20,000 synthetic exomes. For simulations with 5% noise, reflecting high-quality datasets, SigProfilerExtractor outperforms other approaches by elucidating between 20% and 50% more true-positive signatures while yielding 5-fold fewer false-positive signatures. Applying SigProfilerExtractor to 4,643 whole-genome- and 19,184 whole-exome-sequenced cancers reveals four novel signatures. Two of the signatures are confirmed in independent cohorts, and one of these signatures is associated with tobacco smoking. In summary, this report provides a reference tool for analysis of mutational signatures, a comprehensive benchmarking of bioinformatics tools for extracting signatures, and several novel mutational signatures, including one putatively attributed to direct tobacco smoking mutagenesis in bladder tissues.

11.
Nucleic Acids Res ; 37(7): 2405-10, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19264801

ABSTRACT

No simple model exists that accurately describes the melting behavior and breathing dynamics of double-stranded DNA as a function of nucleotide sequence. This is especially true for homogenous and periodic DNA sequences, which exhibit large deviations in melting temperature from predictions made by additive thermodynamic contributions. Currently, no method exists for analysis of the DNA breathing dynamics of repeats and of highly G/C- or A/T-rich regions, even though such sequences are widespread in vertebrate genomes. Here, we extend the nonlinear Peyrard-Bishop-Dauxois (PBD) model of DNA to include a sequence-dependent stacking term, resulting in a model that can accurately describe the melting behavior of homogenous and periodic sequences. We collect melting data for several DNA oligos, and apply Monte Carlo simulations to establish force constants for the 10 dinucleotide steps (CG, CA, GC, AT, AG, AA, AC, TA, GG, TC). The experiments and numerical simulations confirm that the GG/CC dinucleotide stacking is remarkably unstable, compared with the stacking in GC/CG and CG/GC dinucleotide steps. The extended PBD model will facilitate thermodynamic and dynamic simulations of important genomic regions such as CpG islands and disease-related repeats.


Subject(s)
DNA/chemistry , Models, Chemical , Thermodynamics , Base Sequence , Computer Simulation , Monte Carlo Method , Nucleic Acid Denaturation
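
For orientation on record 11, the potential energy of the PBD model is commonly written as a Morse term per base pair plus an anharmonic stacking term per neighbouring pair. The form below follows that common convention and is not quoted from the abstract; the subscripted stacking constant $k_{n,n-1}$ is shorthand assumed here to express the sequence-dependent stacking term that the abstract describes as the extension.

```latex
U(y_1,\dots,y_N) =
  \sum_{n=1}^{N} D_n \left( e^{-a_n y_n} - 1 \right)^2
  + \sum_{n=2}^{N} \frac{k_{n,n-1}}{2}
    \left( 1 + \rho\, e^{-\beta\,(y_n + y_{n-1})} \right)
    \left( y_n - y_{n-1} \right)^2
```

Here $y_n$ is the opening of base pair $n$, $D_n$ and $a_n$ depend on whether the pair is A-T or G-C, and in the original model $k_{n,n-1}$ reduces to a single constant $k$. The extended model would assign one value of $k_{n,n-1}$ to each of the 10 dinucleotide steps listed in the abstract.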
12.
Sci Rep ; 11(1): 19752, 2021 Oct 05.
Article in English | MEDLINE | ID: mdl-34611227

ABSTRACT

Although metabolic syndrome (MetS) is linked to an elevated risk of cardiovascular disease (CVD), the cardiac-specific risk mechanism is unknown. Obesity, hypertension, and diabetes (all MetS components) are the most common form of CVD and represent risk factors for worse COVID-19 outcomes compared to their non-MetS peers. Here, we use obese Yorkshire pigs as a highly relevant animal model of human MetS, in which pigs develop the hallmarks of human MetS and reproducibly mimic the myocardial pathophysiology in patients. Myocardium-specific mass spectroscopy-derived metabolomics, proteomics, and transcriptomics enabled the identity and quality of proteins and metabolites to be investigated in the myocardium to greater depth. Myocardium-specific deregulation of pro-inflammatory markers, propensity for arterial thrombosis, and platelet aggregation was revealed by computational analysis of differentially enriched pathways between MetS and control animals. While key components of the complement pathway and the immune response to viruses are underexpressed, key N6-methyladenosine RNA methylation enzymes are largely overexpressed in MetS. Blood tests do not capture the entirety of metabolic changes that the myocardium undergoes, making this analysis of greater value than blood component analysis alone. Our findings create data associations to further characterize the MetS myocardium and disease vulnerability, emphasize the need for a multimodal therapeutic approach, and suggest a mechanism for observed worse outcomes in MetS patients with COVID-19 comorbidity.


Subject(s)
COVID-19/pathology , Disease Susceptibility , Metabolic Syndrome/pathology , Animals , Blood Coagulation Factors/genetics , Blood Coagulation Factors/metabolism , COVID-19/complications , COVID-19/virology , Cyclooxygenase 2/genetics , Cyclooxygenase 2/metabolism , Diet, High-Fat/veterinary , Disease Models, Animal , Humans , Immunity, Innate/genetics , Metabolic Syndrome/complications , Metabolic Syndrome/metabolism , Methyltransferases/genetics , Methyltransferases/metabolism , Myocardium/metabolism , Oxidative Stress/genetics , Platelet Aggregation , Receptors, Purinergic P2Y1/genetics , Receptors, Purinergic P2Y1/metabolism , Renin-Angiotensin System , Risk Factors , SARS-CoV-2/isolation & purification , Swine , Urokinase-Type Plasminogen Activator/genetics , Urokinase-Type Plasminogen Activator/metabolism
13.
PLoS Comput Biol ; 5(3): e1000313, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19282962

ABSTRACT

Establishing the general and promoter-specific mechanistic features of gene transcription initiation requires improved understanding of the sequence-dependent structural/dynamic features of promoter DNA. Experimental data suggest that a spontaneous dsDNA strand separation at the transcriptional start site is likely to be a requirement for transcription initiation in several promoters. Here, we use Langevin molecular dynamic simulations based on the Peyrard-Bishop-Dauxois nonlinear model of DNA (PBD LMD) to analyze the strand separation (bubble) dynamics of 80-bp-long promoter DNA sequences. We derive three dynamic criteria, bubble probability, bubble lifetime, and average strand separation, to characterize bubble formation at the transcriptional start sites of eight mammalian gene promoters. We observe that the most stable dsDNA openings do not necessarily coincide with the most probable openings and the highest average strand displacement, underscoring the advantages of proper molecular dynamic simulations. The dynamic profiles of the tested mammalian promoters differ significantly in overall profile and bubble probability, but the transcriptional start site is often distinguished by large (longer than 10 bp) and long-lived transient openings in the double helix. In support of these results are our experimental transcription data demonstrating that an artificial bubble-containing DNA template is transcribed bidirectionally by human RNA polymerase alone in the absence of any other transcription factors.


Subject(s)
DNA-Directed RNA Polymerases/chemistry , DNA/chemistry , DNA/ultrastructure , Models, Chemical , Models, Molecular , Promoter Regions, Genetic , Sequence Analysis, DNA/methods , Base Sequence , Computer Simulation , DNA-Directed RNA Polymerases/ultrastructure , Hot Temperature , Models, Genetic , Molecular Sequence Data
14.
Sci Rep ; 10(1): 3483, 2020 Feb 26.
Article in English | MEDLINE | ID: mdl-32103083

ABSTRACT

Although metabolic syndrome (MS) is a significant risk factor for cardiovascular disease (CVD), the cardiac response (MR) to MS remains unclear due to traditional MS models' narrow scope around a limited number of cell-cycle regulation biomarkers and drawbacks of limited human tissue samples. To date, we have developed the most comprehensive platform for studying MR to MS in a pig model closely related to human MS criteria. By incorporating comparative metabolomic, transcriptomic, functional analyses, and unsupervised machine learning (UML), we can discover unknown metabolic pathway connections and links among numerous biomarkers across the MS-associated issues in the heart. For the first time, we show severely diminished availability of glycolytic and citric acid cycle (CAC) pathway metabolites, as well as altered expression, GlcNAcylation, and activity of involved enzymes. A notable exception, however, is the excessive succinate accumulation despite reduced succinate dehydrogenase complex iron-sulfur subunit b (SDHB) expression and decreased content of precursor metabolites. Finally, the expression of metabolites and enzymes from the GABA-glutamate, GABA-putrescine, and the glyoxylate pathways significantly increases, suggesting an alternative cardiac means to replenish succinate and malate in MS. Our platform discovers potential therapeutic targets for MS-associated CVD within pathways that were previously unknown to correlate with the disease.


Subject(s)
Energy Metabolism , Metabolic Syndrome/pathology , Metabolome , Metabolomics/methods , Myocardium/metabolism , Animals , Biomarkers/metabolism , Citric Acid Cycle/genetics , Diet, High-Fat , Disease Models, Animal , Glycolysis/genetics , Male , Metabolic Syndrome/metabolism , Risk Factors , Succinate Dehydrogenase/metabolism , Succinic Acid/metabolism , Swine , Unsupervised Machine Learning
15.
Nat Commun ; 11(1): 3096, 2020 Jun 18.
Article in English | MEDLINE | ID: mdl-32555180

ABSTRACT

Intratumor heterogeneity (ITH) and tumor evolution have been well described for clear cell renal cell carcinomas (ccRCC), but they are less studied for other kidney cancer subtypes. Here we investigate ITH and clonal evolution of papillary renal cell carcinoma (pRCC) and rarer kidney cancer subtypes, integrating whole-genome sequencing and DNA methylation data. In 29 tumors, up to 10 samples from the center to the periphery of each tumor, and metastatic samples in 2 cases, enable phylogenetic analysis of spatial features of clonal expansion, which shows congruent patterns of genomic and epigenomic evolution. In contrast to previous studies of ccRCC, in pRCC, driver gene mutations and most arm-level somatic copy number alterations (SCNAs) are clonal. These findings suggest that a single biopsy would be sufficient to identify the important genetic drivers and that targeting large-scale SCNAs may improve pRCC treatment, which is currently poor. While type 1 pRCC displays near absence of structural variants (SVs), the more aggressive type 2 pRCC and the rarer subtypes have numerous SVs, which should be pursued for prognostic significance.


Subject(s)
Carcinoma, Renal Cell/genetics , Kidney Neoplasms/genetics , DNA Copy Number Variations/genetics , Epigenomics , Germ-Line Mutation/genetics , Humans , Phylogeny
16.
J Contam Hydrol ; 220: 66-97, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30528243

ABSTRACT

Unsupervised Machine Learning (ML) is becoming increasingly popular for solving various types of data analytics problems including feature extraction, blind source separation, exploratory analyses, model diagnostics, etc. Here, we have developed a new unsupervised ML method based on Nonnegative Tensor Factorization (NTF) for identification of the original groundwater types (including contaminant sources) present in geochemical mixtures observed in an aquifer. Frequently, groundwater types with different geochemical signatures are related to different background and/or contamination sources. The characterization of groundwater mixing processes is a challenging but very important task critical for any environmental management project aiming to characterize the fate and transport of contaminants in the subsurface and perform contaminant remediation. This task typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models which further complicates the analyses. Additionally, the application of inverse methods may introduce biases in the analyses through the assumptions made in the model development process. Here, we substitute the model inversion with unsupervised ML analysis. The ML analysis does not make any assumptions about underlying physical and geochemical processes occurring in the aquifer. Our ML methodology, called NTFk, is capable of identifying (1) the unknown number of groundwater types (contaminant sources) present in the aquifer, (2) the original geochemical concentrations (signatures) of these groundwater types and (3) spatial and temporal dynamics in the mixing of these groundwater types. 
These results are obtained only from the measured geochemical data without any additional site information. In general, the NTFk methodology allows for interpretation of large high-dimensional datasets representing diverse spatial and temporal components such as state variables and velocities. NTFk has been tested on synthetic and real-world site three-dimensional datasets. The NTFk algorithm is designed to work with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).


Subject(s)
Groundwater , Water Pollutants, Chemical , Environmental Monitoring , Isotopes
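
Record 16 mentions data expressed in delta notation. As a reminder of what that representation means, the sketch below converts between an isotope ratio and its delta value in per mil relative to a reference standard. The function names and the numeric ratios are hypothetical and chosen only for illustration; they are not taken from the abstract or from any real standard.

```python
def delta_permil(r_sample, r_standard):
    """Delta notation: relative deviation of a sample isotope ratio from a
    reference standard, expressed in parts per thousand (per mil)."""
    return (r_sample / r_standard - 1.0) * 1000.0

def ratio_from_delta(delta, r_standard):
    """Inverse conversion: recover the isotope ratio from a delta value."""
    return r_standard * (1.0 + delta / 1000.0)

# Made-up ratios for illustration only.
r_std = 0.002000
r_smp = 0.002020
d = delta_permil(r_smp, r_std)
print(round(d, 6))  # 10.0 per mil
```

Converting deltas back to ratios with `ratio_from_delta` is one plausible preprocessing step before a nonnegative factorization, since delta values can be negative while ratios cannot; whether NTFk does this internally is not stated in the abstract.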
17.
J Chem Theory Comput ; 15(11): 6343-6357, 2019 Nov 12.
Article in English | MEDLINE | ID: mdl-31476122

ABSTRACT

Phase separation in mixed lipid systems has been extensively studied both experimentally and theoretically because of its biological importance. A detailed description of such complex systems undoubtedly requires novel mathematical frameworks that are capable of decomposing and categorizing the evolution of thousands if not millions of lipids involved in the phenomenon. The interpretation and analysis of molecular dynamics (MD) simulations representing temporal and spatial changes in such systems are still a challenging task. Here, we present an unsupervised machine learning approach based on nonnegative matrix factorization called NMFk that successfully extracts latent (i.e., not directly observable) features from the second layer neighborhood profiles derived from coarse-grained MD simulations of a ternary lipid mixture. Our results demonstrate that NMFk extracts physically meaningful features that uniquely describe the phase separation such as locations and roles of different lipid types, formation of nanodomains, and timescales of lipid segregation.


Subject(s)
Lipids/chemistry , Unsupervised Machine Learning , 1,2-Dipalmitoylphosphatidylcholine/chemistry , Cholesterol/chemistry , Lipid Bilayers/chemistry , Molecular Dynamics Simulation , Phosphatidylcholines/chemistry
18.
PLoS One ; 14(12): e0225857, 2019.
Article in English | MEDLINE | ID: mdl-31790488

ABSTRACT

Although the high-fat-diet-induced metabolic syndrome (MetS) is a precursor of human cardiac pathology, the myocardial metabolic state in MetS is far from clear. The discrepancies in metabolite handling between human and small animal models and the difficulties inherent in obtaining human tissue complicate the identification of the myocardium-specific metabolic response in patients. Here we use the large animal model of swine that develops the hallmark criteria of human MetS. Our comparative metabolomics together with transcriptomics and computational nonnegative matrix factorization (NMF) interpretation of the data expose a significant decline in metabolites related to fatty acid oxidation, glycolysis, and the pentose phosphate pathway. Behind the reversal lies decreased expression of enzymes that operate in the pathways. We showed that diminished glycogen deposition is a metabolic signature of MetS in the pig myocardium. The depletion of glycogen arises from an imbalance in expression of genes that break down and synthesize glycogen. We show robust acetoacetate accumulation and activated expression of key enzymes in ketone body formation, catabolism and transporters, suggesting a shift in fuel utilization in MetS. A contrasting enrichment in O-GlcNAcylated proteins uncovers hexosamine pathway and O-GlcNAcase (OGA) expression involvement in the myocardial response to MetS. Although the hexosamine biosynthetic pathway (HBP) activity and the availability of the UDP-GlcNAc substrate in the MetS myocardium are low, the level of O-GlcNAcylated proteins is high as the O-GlcNAcase is significantly diminished. Our data support the perception of transcriptionally driven myocardial alterations in expression of standard fatty acids, glucose metabolism, glycogen, and ketone body related enzymes and subsequent paucity of their metabolite products in MetS. This aberrant energy metabolism in the MetS myocardium provides insight into the pathogenesis of CVD in MetS.


Subject(s)
Metabolic Networks and Pathways , Metabolic Syndrome/metabolism , Myocardium/metabolism , Animals , Cholesterol, Dietary/adverse effects , Diet , Glycosylation , Male , Metabolome , Metabolomics , N-Acetylglucosaminyltransferases/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Risk Factors , Swine , Unsupervised Machine Learning , beta-N-Acetylhexosaminidases/metabolism
19.
J Contam Hydrol ; 212: 134-142, 2018 05.
Article in English | MEDLINE | ID: mdl-29174719

ABSTRACT

Identification of the original groundwater types present in geochemical mixtures observed in an aquifer is a challenging but very important task. Frequently, some of the groundwater types are related to different infiltration and/or contamination sources associated with various geochemical signatures and origins. The characterization of groundwater mixing processes typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models, which further complicates the analyses. In this paper, we propose a new contaminant source identification approach that decomposes the observed mixtures using the Non-negative Matrix Factorization (NMF) method for Blind Source Separation (BSS), coupled with a custom semi-supervised clustering algorithm. Our methodology, called NMFk, is capable of identifying (a) the unknown number of groundwater types and (b) the original geochemical concentrations of the contaminant sources from measured geochemical mixtures with unknown mixing ratios, without any additional site information. NMFk is tested on synthetic and real-world site data. The NMFk algorithm works with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).
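
The decomposition step described in this abstract can be illustrated with a minimal sketch. It is not the NMFk algorithm: it shows only plain NMF with multiplicative updates on synthetic mixtures, and it omits the clustering step the authors use to estimate the number of sources. The function names (`nmf`, `relative_error`), the synthetic data, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=1000, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix X (m x n) as W @ H.

    W (m x k) plays the role of mixing ratios and H (k x n) the role of
    source signatures. Uses multiplicative updates for the Frobenius norm.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative because it only
        # multiplies by ratios of nonnegative quantities.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(X, W, H):
    """Relative Frobenius reconstruction error of W @ H against X."""
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)


# Hypothetical synthetic data: 3 sources, 5 constituents, 20 observation points.
rng = np.random.default_rng(1)
sources = rng.random((3, 5))                 # source concentrations
ratios = rng.dirichlet(np.ones(3), size=20)  # mixing ratios, each row sums to 1
mixtures = ratios @ sources                  # observed mixtures

# Reconstruction error for several candidate numbers of sources.
errors = {k: relative_error(mixtures, *nmf(mixtures, k)) for k in range(1, 5)}
for k, err in errors.items():
    print(f"k={k}: relative error {err:.4f}")
```

In a sketch like this, the error typically stops improving much once k reaches the true number of sources. Reconstruction error alone does not settle the choice of k, which is why the abstract pairs the factorization with a clustering algorithm.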


Subject(s)
Groundwater/chemistry , Supervised Machine Learning , Water Pollutants, Chemical/chemistry , Environmental Monitoring/methods , Isotopes/analysis
20.
PLoS One ; 13(12): e0206653, 2018.
Article in English | MEDLINE | ID: mdl-30532243

ABSTRACT

D-Wave quantum annealers represent a novel computational architecture and have attracted significant interest. Much of this interest has focused on the quantum behavior of D-Wave machines, and there have been few practical algorithms that use the D-Wave. Machine learning has been identified as an area where quantum annealing may be useful. Here, we show that the D-Wave 2X can be effectively used as part of an unsupervised machine learning method. This method takes a matrix as input and produces two low-rank matrices as output: one containing latent features in the data and another describing how the features can be combined to approximately reproduce the input matrix. Despite the limited number of bits in the D-Wave hardware, this method is capable of handling a large input matrix. The D-Wave only limits the rank of the two output matrices. We apply this method to learn the features from a set of facial images and compare the performance of the D-Wave to two classical tools. This method is able to learn facial features and accurately reproduce the set of facial images. The performance of the D-Wave shows some promise but has some limitations. It outperforms the two classical codes in a benchmark when only a short amount of computational time is allowed (200-20,000 microseconds), but these results suggest heuristics that would likely outperform the D-Wave in this benchmark.
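
One way to see why only the rank would be limited by the hardware is the sketch below. It rests on an assumption the abstract does not state: that the matrix describing how features are combined is binary, so each column of the input yields an independent optimization over k binary variables. Here a brute-force search stands in for the annealer; no D-Wave API is used, and the function names (`solve_binary_column`, `factorize_binary`) and all parameters are hypothetical.

```python
import itertools

import numpy as np


def solve_binary_column(W, x):
    """Find the binary vector h minimizing ||x - W @ h||^2.

    Brute force over all 2**k candidates; this small k-variable problem is
    the part an annealer would be asked to solve in this sketch.
    """
    k = W.shape[1]
    candidates = np.array(list(itertools.product([0.0, 1.0], repeat=k)))
    residuals = candidates @ W.T - x          # shape (2**k, m)
    best = np.argmin((residuals ** 2).sum(axis=1))
    return candidates[best]


def factorize_binary(X, k, n_outer=20, n_inner=50, seed=0, eps=1e-9):
    """Alternate between a binary step for H and a nonnegative step for W."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = np.zeros((k, n))
    for _ in range(n_outer):
        # Binary step: one independent problem per column, each of size k,
        # so the number of columns of X does not affect the problem size.
        H = np.column_stack([solve_binary_column(W, X[:, j]) for j in range(n)])
        # Nonnegative step: multiplicative updates for W with H held fixed.
        for _ in range(n_inner):
            W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical test matrix built from 3 features with binary combinations.
rng = np.random.default_rng(2)
W_true = rng.random((8, 3))
H_true = rng.integers(0, 2, size=(3, 12)).astype(float)
X = W_true @ H_true

W, H = factorize_binary(X, k=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Under this assumption, adding rows or columns to the input adds more small subproblems but does not enlarge any of them, while raising the rank k enlarges every subproblem.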


Subject(s)
Machine Learning , Models, Theoretical , Quantum Theory