Results 1 - 12 of 12
1.
J Chem Theory Comput ; 15(11): 6343-6357, 2019 Nov 12.
Article in English | MEDLINE | ID: mdl-31476122

ABSTRACT

Phase separation in mixed lipid systems has been extensively studied both experimentally and theoretically because of its biological importance. A detailed description of such complex systems undoubtedly requires novel mathematical frameworks that are capable of decomposing and categorizing the evolution of thousands if not millions of lipids involved in the phenomenon. The interpretation and analysis of molecular dynamics (MD) simulations representing temporal and spatial changes in such systems are still a challenging task. Here, we present an unsupervised machine learning approach based on nonnegative matrix factorization called NMFk that successfully extracts latent (i.e., not directly observable) features from the second layer neighborhood profiles derived from coarse-grained MD simulations of a ternary lipid mixture. Our results demonstrate that NMFk extracts physically meaningful features that uniquely describe the phase separation such as locations and roles of different lipid types, formation of nanodomains, and timescales of lipid segregation.


Subject(s)
Lipids/chemistry , Unsupervised Machine Learning , 1,2-Dipalmitoylphosphatidylcholine/chemistry , Cholesterol/chemistry , Lipid Bilayers/chemistry , Molecular Dynamics Simulation , Phosphatidylcholines/chemistry
2.
J Contam Hydrol ; 220: 66-97, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30528243

ABSTRACT

Unsupervised Machine Learning (ML) is becoming increasingly popular for solving various types of data analytics problems including feature extraction, blind source separation, exploratory analyses, model diagnostics, etc. Here, we have developed a new unsupervised ML method based on Nonnegative Tensor Factorization (NTF) for identification of the original groundwater types (including contaminant sources) present in geochemical mixtures observed in an aquifer. Frequently, groundwater types with different geochemical signatures are related to different background and/or contamination sources. The characterization of groundwater mixing processes is a challenging but very important task critical for any environmental management project aiming to characterize the fate and transport of contaminants in the subsurface and perform contaminant remediation. This task typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models which further complicates the analyses. Additionally, the application of inverse methods may introduce biases in the analyses through the assumptions made in the model development process. Here, we substitute the model inversion with unsupervised ML analysis. The ML analysis does not make any assumptions about underlying physical and geochemical processes occurring in the aquifer. Our ML methodology, called NTFk, is capable of identifying (1) the unknown number of groundwater types (contaminant sources) present in the aquifer, (2) the original geochemical concentrations (signatures) of these groundwater types and (3) spatial and temporal dynamics in the mixing of these groundwater types. 
These results are obtained only from the measured geochemical data without any additional site information. In general, the NTFk methodology allows for interpretation of large high-dimensional datasets representing diverse spatial and temporal components such as state variables and velocities. NTFk has been tested on synthetic and real-world site three-dimensional datasets. The NTFk algorithm is designed to work with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).


Subject(s)
Groundwater , Water Pollutants, Chemical , Environmental Monitoring , Isotopes
3.
PLoS One ; 13(12): e0206653, 2018.
Article in English | MEDLINE | ID: mdl-30532243

ABSTRACT

D-Wave quantum annealers represent a novel computational architecture and have attracted significant interest. Much of this interest has focused on the quantum behavior of D-Wave machines, and there have been few practical algorithms that use the D-Wave. Machine learning has been identified as an area where quantum annealing may be useful. Here, we show that the D-Wave 2X can be effectively used as part of an unsupervised machine learning method. This method takes a matrix as input and produces two low-rank matrices as output: one containing latent features in the data and another matrix describing how the features can be combined to approximately reproduce the input matrix. Despite the limited number of bits in the D-Wave hardware, this method is capable of handling a large input matrix. The D-Wave only limits the rank of the two output matrices. We apply this method to learn the features from a set of facial images and compare the performance of the D-Wave to two classical tools. This method is able to learn facial features and accurately reproduce the set of facial images. The performance of the D-Wave shows some promise, but has some limitations. It outperforms the two classical codes in a benchmark when only a short amount of computational time is allowed (200-20,000 microseconds), but these results suggest heuristics that would likely outperform the D-Wave in this benchmark.


Subject(s)
Machine Learning , Models, Theoretical , Quantum Theory
4.
PLoS One ; 13(3): e0193974, 2018.
Article in English | MEDLINE | ID: mdl-29518126

ABSTRACT

Factor analysis is broadly used as a powerful unsupervised machine learning tool for reconstruction of hidden features in recorded mixtures of signals. In the case of a linear approximation, the mixtures can be decomposed by a variety of model-free Blind Source Separation (BSS) algorithms. Most of the available BSS algorithms consider an instantaneous mixing of signals, while the case when the mixtures are linear combinations of signals with delays is less explored. Especially difficult is the case when the number of sources of the signals with delays is unknown and has to be determined from the data as well. To address this problem, in this paper, we present a new method based on Nonnegative Matrix Factorization (NMF) that is capable of identifying: (a) the unknown number of the sources, (b) the delays and speed of propagation of the signals, and (c) the locations of the sources. Our method can be used to decompose records of mixtures of signals with delays emitted by an unknown number of sources in a nondispersive medium, based only on recorded data. This is the case, for example, when electromagnetic signals from multiple antennas are received asynchronously; or mixtures of acoustic or seismic signals recorded by sensors located at different positions; or when a shift in frequency is induced by the Doppler effect. By applying our method to synthetic datasets, we demonstrate its ability to identify the unknown number of sources as well as the waveforms, the delays, and the strengths of the signals. Using Bayesian analysis, we also evaluate estimation uncertainties and identify the region of likelihood where the positions of the sources can be found.


Subject(s)
Factor Analysis, Statistical , Signal Processing, Computer-Assisted , Unsupervised Machine Learning , Algorithms , Bayes Theorem , Datasets as Topic , Fourier Analysis , Markov Chains , Monte Carlo Method
5.
J Contam Hydrol ; 212: 134-142, 2018 May.
Article in English | MEDLINE | ID: mdl-29174719

ABSTRACT

Identification of the original groundwater types present in geochemical mixtures observed in an aquifer is a challenging but very important task. Frequently, some of the groundwater types are related to different infiltration and/or contamination sources associated with various geochemical signatures and origins. The characterization of groundwater mixing processes typically requires solving complex inverse models representing groundwater flow and geochemical transport in the aquifer, where the inverse analysis accounts for available site data. Usually, the model is calibrated against the available data characterizing the spatial and temporal distribution of the observed geochemical types. Numerous different geochemical constituents and processes may need to be simulated in these models which further complicates the analyses. In this paper, we propose a new contaminant source identification approach that performs decomposition of the observation mixtures based on Non-negative Matrix Factorization (NMF) method for Blind Source Separation (BSS), coupled with a custom semi-supervised clustering algorithm. Our methodology, called NMFk, is capable of identifying (a) the unknown number of groundwater types and (b) the original geochemical concentration of the contaminant sources from measured geochemical mixtures with unknown mixing ratios without any additional site information. NMFk is tested on synthetic and real-world site data. The NMFk algorithm works with geochemical data represented in the form of concentrations, ratios (of two constituents; for example, isotope ratios), and delta notations (standard normalized stable isotope ratios).


Subject(s)
Groundwater/chemistry , Supervised Machine Learning , Water Pollutants, Chemical/chemistry , Environmental Monitoring/methods , Isotopes/analysis
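
Illustrative sketch (not part of the indexed record): the abstract above describes decomposing observed geochemical mixtures into nonnegative source signatures and mixing ratios. The Python below shows only the plain NMF step, using the standard multiplicative-update rule for the Frobenius norm. It is not the authors' NMFk: the number of sources `k` is supplied by hand rather than estimated, there is no clustering step, and the function name `nmf`, the synthetic data, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=2000, seed=0, eps=1e-9):
    """Factor a nonnegative matrix X (wells x constituents) as W @ H.

    W (wells x k) plays the role of mixing ratios and H (k x constituents)
    the role of source signatures. Plain multiplicative updates; k is given.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    W = rng.random((n_rows, k)) + eps
    H = rng.random((k, n_cols)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative and does not
        # increase the Frobenius reconstruction error.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic example (hypothetical numbers): 2 sources, 5 constituents, 20 wells.
data_rng = np.random.default_rng(1)
true_sources = data_rng.random((2, 5))
true_mixing = data_rng.random((20, 2))
X = true_mixing @ true_sources

W, H = nmf(X, k=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.2e}")
```

The recovered factors are only defined up to scaling and permutation of the sources, so comparing `H` with `true_sources` would need an extra matching step that this sketch leaves out.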
6.
Ground Water ; 56(1): 109-117, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28722824

ABSTRACT

In modeling solute transport with mobile-immobile mass transfer (MIMT), it is common to use an advection-dispersion equation (ADE) with a retardation factor, or retarded ADE. This is commonly referred to as making the local equilibrium assumption (LEA). Assuming local equilibrium, Eulerian textbook treatments derive the retarded ADE, ostensibly exactly. However, other authors have presented rigorous mathematical derivations of the dispersive effect of MIMT, applicable even in the case of arbitrarily fast mass transfer. We resolve the apparent contradiction between these seemingly exact derivations by adopting a Lagrangian point of view. We show that local equilibrium constrains the expected time immobile, whereas the retarded ADE actually embeds a stronger, nonphysical, constraint: that all particles spend the same amount of every time increment immobile. Eulerian derivations of the retarded ADE thus silently commit the gambler's fallacy, leading them to ignore dispersion due to mass transfer that is correctly modeled by other approaches. We then present a particle tracking simulation illustrating how poor an approximation the retarded ADE may be, even when mobile and immobile plumes are continually near local equilibrium. We note that classic "LEA" (actually, retarded ADE validity) criteria test for insignificance of MIMT-driven dispersion relative to hydrodynamic dispersion, rather than for local equilibrium.


Subject(s)
Groundwater , Models, Theoretical , Solutions , Water Movements
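
Illustrative sketch (not part of the indexed record): the abstract above contrasts local equilibrium, which constrains only the expected time immobile, with the retarded ADE, which in effect makes every particle spend the same fraction of each time increment immobile. The toy particle-tracking code below is one way to see the difference. It is not the authors' simulation: first-order switching between mobile and immobile states, the absence of hydrodynamic dispersion, and every name and parameter value are assumptions chosen for illustration.

```python
import numpy as np


def simulate_mimt(n_particles=2000, total_time=50.0, dt=0.01,
                  velocity=1.0, k_mi=1.0, k_im=1.0, seed=0):
    """Track particles that switch randomly between mobile and immobile states.

    k_mi is the assumed mobile-to-immobile rate and k_im the reverse rate.
    Mobile particles advect at `velocity`; immobile particles do not move.
    """
    rng = np.random.default_rng(seed)
    retardation = 1.0 + k_mi / k_im
    # Start at equilibrium: a fraction 1/R of the particles is mobile.
    mobile = rng.random(n_particles) < 1.0 / retardation
    x = np.zeros(n_particles)
    n_steps = int(round(total_time / dt))
    for _ in range(n_steps):
        x += velocity * dt * mobile          # only mobile particles move
        u = rng.random(n_particles)
        switch = np.where(mobile, u < k_mi * dt, u < k_im * dt)
        mobile = mobile ^ switch             # flip the state where a switch occurs
    return x, retardation


x, R = simulate_mimt()
# With no hydrodynamic dispersion, the retarded model puts every
# particle at the same position, velocity * total_time / R.
retarded_position = 1.0 * 50.0 / R
print("retarded-model position:", retarded_position)
print("particle mean and spread:", x.mean(), x.std())
```

Under these assumptions the particle mean stays close to the retarded-model position, while the particles also show a spread that the retarded model with zero dispersion cannot produce.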
7.
Chemosphere ; 182: 276-283, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28500972

ABSTRACT

High-explosive compounds including hexahydro-1,3,5-trinitro-1,3,5-triazine (RDX) were used extensively in weapons research and testing at Los Alamos National Laboratory (LANL). Liquid effluents containing RDX were released to an outfall pond that flowed to Cañon de Valle at LANL's Technical Area 16 (TA-16), resulting in the contamination of the alluvial, intermediate and regional groundwater bodies. Monitoring of groundwater within Cañon de Valle has shown persistent RDX in the intermediate perched zone located between 225 and 311 m below ground surface. Monitoring data also show detectable levels of RDX putative anaerobic degradation products. Batch and column experiments were conducted to determine the extent of adsorption-desorption and transport of RDX and its degradation products (MNX, DNX, and TNX) in major rock types that are within the RDX plume. All experiments were performed in the dark using water obtained from a well located at the center of the plume, which is fairly oxic and has a neutral pH of 7.5. Retardation factors and partitioning coefficient (Kd) values for RDX were calculated from batch experiments. Additionally, retardation factors and Kd values for RDX and its degradation products were calibrated from column experiments using a one-dimensional transport model with equilibrium sorption (linear isotherm). Results from the column and batch experiments showed little to no sorption of RDX to the aquifer materials tested, with retardation factors ranging from 1.0 to 1.8 and Kd values varying from 0 to 0.70 L/kg. Results also showed no measurable differences between the transport properties of RDX and its degradation products.


Subject(s)
Environmental Monitoring/methods , Geologic Sediments/chemistry , Triazines/chemistry , Volcanic Eruptions , Adsorption , Environmental Pollutants/chemistry , Explosive Agents/chemistry , New Mexico , Water Pollution/analysis
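
Illustrative sketch (not part of the indexed record): the abstract above reports retardation factors and Kd values under equilibrium sorption with a linear isotherm. For that case the textbook relation is R = 1 + (bulk density / porosity) x Kd. The helper below encodes it. The function names and the bulk density and porosity values in the example are hypothetical and are not taken from the study.

```python
def retardation_factor(kd_l_per_kg, bulk_density_kg_per_l, porosity):
    """Retardation factor for linear equilibrium sorption.

    R = 1 + (bulk_density / porosity) * Kd, with Kd in L/kg,
    bulk density in kg/L and porosity as a dimensionless fraction.
    """
    if kd_l_per_kg < 0:
        raise ValueError("Kd must be nonnegative")
    if bulk_density_kg_per_l <= 0:
        raise ValueError("bulk density must be positive")
    if not 0 < porosity <= 1:
        raise ValueError("porosity must lie in (0, 1]")
    return 1.0 + bulk_density_kg_per_l * kd_l_per_kg / porosity


def retarded_velocity(pore_velocity, retardation):
    """Average solute velocity implied by a given retardation factor."""
    return pore_velocity / retardation


# Hypothetical inputs: a non-sorbing solute (Kd = 0) is not retarded at all.
r_nonsorbing = retardation_factor(0.0, bulk_density_kg_per_l=1.6, porosity=0.3)
r_sorbing = retardation_factor(0.5, bulk_density_kg_per_l=1.5, porosity=0.3)
print(r_nonsorbing, r_sorbing)
```

A value of R near 1, as in the low end of the range quoted above, means the solute moves at nearly the pore-water velocity.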
8.
J Contam Hydrol ; 193: 74-85, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27639975

ABSTRACT

We develop empirically-grounded error envelopes for localization of a point contamination release event in the saturated zone of a previously uncharacterized heterogeneous aquifer into which a number of plume-intercepting wells have been drilled. We assume that flow direction in the aquifer is known exactly and velocity is known to within a factor of two of our best guess from well observations prior to source identification. Other aquifer and source parameters must be estimated by interpretation of well breakthrough data via the advection-dispersion equation. We employ high performance computing to generate numerous random realizations of aquifer parameters and well locations, simulate well breakthrough data, and then employ unsupervised machine optimization techniques to estimate the most likely spatial (or space-time) location of the source. Tabulating the accuracy of these estimates from the multiple realizations, we relate the size of 90% and 95% confidence envelopes to the data quantity (number of wells) and model quality (fidelity of ADE interpretation model to actual concentrations in a heterogeneous aquifer with channelized flow). We find that for purely spatial localization of the contaminant source, increased data quantities can make up for reduced model quality. For space-time localization, we find similar qualitative behavior, but significantly degraded spatial localization reliability and less improvement from extra data collection. Since the space-time source localization problem is much more challenging, we also tried a multiple-initial-guess optimization strategy. This greatly enhanced performance, but gains from additional data collection remained limited.


Subject(s)
Environmental Monitoring/methods , Groundwater/chemistry , Models, Theoretical , Water Movements , Water Pollutants, Chemical/analysis , Water Pollution, Chemical/analysis , Reproducibility of Results , Spatio-Temporal Analysis
9.
Article in English | MEDLINE | ID: mdl-25974474

ABSTRACT

Brownian motion, the classical diffusive process, maximizes the Boltzmann-Gibbs entropy. The Tsallis q entropy, which is nonadditive, was developed as an alternative to the classical entropy for systems which are nonergodic. A generalization of Brownian motion is provided that maximizes the Tsallis entropy rather than the Boltzmann-Gibbs entropy. This process is driven by a Brownian measure with a random diffusion coefficient. The distribution of this coefficient is derived as a function of q for 1

10.
Ground Water ; 49(3): 403-14, 2011.
Article in English | MEDLINE | ID: mdl-20550585

ABSTRACT

Identification of the pumping influences at monitoring wells caused by spatially and temporally variable water supply pumping can be a challenging, yet an important hydrogeological task. The information that can be obtained can be critical for conceptualization of the hydrogeological conditions and indications of the zone of influence of the individual pumping wells. However, the pumping influences are often intermittent and small in magnitude with variable production rates from multiple pumping wells. While these difficulties may support an inclination to abandon the existing dataset and conduct a dedicated cross-hole pumping test, that option can be challenging and expensive to coordinate and execute. This paper presents a method that utilizes a simple analytical modeling approach for analysis of a long-term water level record utilizing an inverse modeling approach. The methodology allows the identification of pumping wells influencing the water level fluctuations. Thus, the analysis provides an efficient and cost-effective alternative to designed and coordinated cross-hole pumping tests. We apply this method on a dataset from the Los Alamos National Laboratory site. Our analysis also provides (1) an evaluation of the information content of the transient water level data; (2) indications of potential structures of the aquifer heterogeneity inhibiting or promoting pressure propagation; and (3) guidance for the development of more complicated models requiring detailed specification of the aquifer heterogeneity.


Subject(s)
Models, Theoretical , Water Supply , Fresh Water , Pressure
11.
Ground Water ; 44(6): 814-25, 2006.
Article in English | MEDLINE | ID: mdl-17087753

ABSTRACT

Modern ground water characterization and remediation projects routinely require calibration and inverse analysis of large three-dimensional numerical models of complex hydrogeological systems. Hydrogeologic complexity can be prompted by various aquifer characteristics including complicated spatial hydrostratigraphy and aquifer recharge from infiltration through an unsaturated zone. To keep the numerical models computationally efficient, compromises are frequently made in the model development, particularly, about resolution of the computational grid and numerical representation of the governing flow equation. The compromise is required so that the model can be used in calibration, parameter estimation, performance assessment, and analysis of sensitivity and uncertainty in model predictions. However, grid properties and resolution as well as applied computational schemes can have large effects on forward-model predictions and on inverse parameter estimates. We investigate these effects for a series of one- and two-dimensional synthetic cases representing saturated and variably saturated flow problems. We show that "conformable" grids, despite neglecting terms in the numerical formulation, can lead to accurate solutions of problems with complex hydrostratigraphy. Our analysis also demonstrates that, despite slower computer run times and higher memory requirements for a given problem size, the control volume finite-element method showed an advantage over finite-difference techniques in accuracy of parameter estimation for a given grid resolution for most of the test problems.


Subject(s)
Models, Theoretical , Numerical Analysis, Computer-Assisted , Water/analysis , Calibration , Permeability , Uncertainty , Water Movements
12.
Ground Water ; 41(2): 200-11, 2003.
Article in English | MEDLINE | ID: mdl-12656286

ABSTRACT

Large-scale models are frequently used to estimate fluxes to small-scale models. The uncertainty associated with these flux estimates, however, is rarely addressed. We present a case study from the Española Basin, northern New Mexico, where we use a basin-scale model coupled with a high-resolution, nested site-scale model. Both models are three-dimensional and are analyzed by codes FEHM and PEST. Using constrained nonlinear optimization, we examine the effect of parameter uncertainty in the basin-scale model on the nonlinear confidence limits of predicted fluxes to the site-scale model. We find that some of the fluxes are very well constrained, while for others there is fairly large uncertainty. Site-scale transport simulation results, however, are relatively insensitive to the estimated uncertainty in the fluxes. We also compare parameter estimates obtained by the basin- and site-scale inverse models. Differences in the model grid resolution (scale of parameter estimation) result in differing delineation of hydrostratigraphic units, so the two models produce different estimates for some units. The effect is similar to the observed scale effect in medium properties owing to differences in tested volume. More important, estimation uncertainty of model parameters is quite different at the two scales. Overall, the basin inverse model resulted in significantly lower estimates of uncertainty, because of the larger calibration dataset available. This suggests that the basin-scale model contributes not only important boundary condition information but also improved parameter identification for some units. Our results demonstrate that caution is warranted when applying parameter estimates inferred from a large-scale model to small-scale simulations, and vice versa.


Subject(s)
Models, Theoretical , Water Movements , Water Supply , Calibration , Geological Phenomena , Geology , Reproducibility of Results