1.
J Appl Crystallogr ; 57(Pt 2): 392-402, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38596727

ABSTRACT

DLSIA (Deep Learning for Scientific Image Analysis) is a Python-based machine learning library that empowers scientists and researchers across diverse scientific domains with a range of customizable convolutional neural network (CNN) architectures for a wide variety of tasks in image analysis to be used in downstream data processing. DLSIA features easy-to-use architectures, such as autoencoders, tunable U-Nets and parameter-lean mixed-scale dense networks (MSDNets). Additionally, this article introduces sparse mixed-scale networks (SMSNets), generated using random graphs, sparse connections and dilated convolutions connecting different length scales. For verification, several DLSIA-instantiated networks and training scripts are employed in multiple applications, including inpainting for X-ray scattering data using U-Nets and MSDNets, segmenting 3D fibers in X-ray tomographic reconstructions of concrete using an ensemble of SMSNets, and leveraging autoencoder latent spaces for data compression and clustering. As experimental data continue to grow in scale and complexity, DLSIA provides accessible CNN construction and abstracts CNN complexities, allowing scientists to tailor their machine learning approaches, accelerate discoveries, foster interdisciplinary collaboration and advance research in scientific image analysis.
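The dilated convolutions mentioned in this abstract can be illustrated with a minimal pure-Python sketch. It does not use the DLSIA API; the function names `dilated_conv1d` and `receptive_field` are hypothetical and chosen only for illustration. The idea shown is that spacing the kernel taps `dilation` samples apart lets a small kernel connect features at different length scales.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation with kernel taps spaced `dilation` apart."""
    span = (len(kernel) - 1) * dilation  # distance covered by the kernel taps
    n_out = len(signal) - span
    if n_out <= 0:
        raise ValueError("signal too short for this kernel and dilation")
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(n_out)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With a two-tap kernel and a dilation of 2, each output sample combines inputs that are two positions apart; stacking layers with growing dilations widens the receptive field without adding parameters per layer.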

2.
Article in English | MEDLINE | ID: mdl-38130938

ABSTRACT

Scientific user facilities present a unique set of challenges for image processing due to the large volume of data generated from experiments and simulations. Furthermore, developing and implementing algorithms for real-time processing and analysis while correcting for any artifacts or distortions in images remains a complex task, given the computational requirements of the processing algorithms. In a collaborative effort across multiple Department of Energy national laboratories, the "MLExchange" project is focused on addressing these challenges. MLExchange is a Machine Learning framework deploying interactive web interfaces to enhance and accelerate data analysis. The platform allows users to easily upload, visualize, label, and train networks. The resulting models can be deployed on real data while both results and models could be shared with the scientists. The MLExchange web-based application for image segmentation allows for training, testing, and evaluating multiple machine learning models on hand-labeled tomography data. This environment provides users with an intuitive interface for segmenting images using a variety of machine learning algorithms and deep-learning neural networks. Additionally, these tools have the potential to overcome limitations in traditional image segmentation techniques, particularly for complex and low-contrast images.

3.
J Appl Crystallogr ; 55(Pt 5): 1277-1288, 2022 Oct 01.
Article in English | MEDLINE | ID: mdl-36249508

ABSTRACT

Image inpainting techniques are proposed for the reconstruction of gaps in experimental X-ray scattering data. The proposed methods use deep learning neural network architectures, such as convolutional autoencoders, tunable U-Nets, partial convolution neural networks and mixed-scale dense networks, to reconstruct the missing information in experimental scattering images. In particular, the recovered pixel intensities are evaluated against their corresponding ground-truth values using the mean absolute error and the correlation coefficient metrics. The results demonstrate that the proposed methods achieve better performance than traditional inpainting algorithms such as biharmonic functions. Overall, tunable U-Net and mixed-scale dense network architectures achieved the best reconstruction performance among all the tested algorithms, with correlation coefficient scores greater than 0.9980.
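The two evaluation metrics named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: images are flattened to plain lists, the correlation coefficient is taken to be the Pearson coefficient, and the helper name `masked_metrics` and its `mask` argument (true where a pixel was missing and then recovered) are hypothetical.

```python
import math


def masked_metrics(predicted, truth, mask):
    """Mean absolute error and Pearson correlation over the masked pixels only."""
    pairs = [(p, t) for p, t, m in zip(predicted, truth, mask) if m]
    if not pairs:
        raise ValueError("mask selects no pixels")
    n = len(pairs)
    mae = sum(abs(p - t) for p, t in pairs) / n

    mean_p = sum(p for p, _ in pairs) / n
    mean_t = sum(t for _, t in pairs) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in pairs)
    var_p = sum((p - mean_p) ** 2 for p, _ in pairs)
    var_t = sum((t - mean_t) ** 2 for _, t in pairs)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation undefined for constant data")
    return mae, cov / math.sqrt(var_p * var_t)
```

Restricting both metrics to the masked pixels keeps the untouched, trivially correct pixels from inflating the scores.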

4.
Rev Sci Instrum ; 93(6): 064103, 2022 Jun 01.
Article in English | MEDLINE | ID: mdl-35778015

ABSTRACT

Revealing the positions of all the atoms in large macromolecules is powerful but only possible with neutron macromolecular crystallography (NMC). Neutrons provide a sensitive and gentle probe for the direct detection of protonation states at near-physiological temperatures, free of the artifacts caused by x rays or electrons. Currently, NMC use is restricted by the requirement for large crystal volumes even at state-of-the-art instruments such as the macromolecular neutron diffractometer at the Spallation Neutron Source. EWALD's design will break the crystal volume barrier and, thus, open the door for new types of experiments, the study of grand challenge systems, and the more routine use of NMC in biology. EWALD is a single crystal diffractometer capable of collecting data from macromolecular crystals orders of magnitude smaller than what is currently feasible. The construction of EWALD at the Second Target Station will cause a revolution in NMC by enabling key discoveries in the biological, biomedical, and bioenergy sciences.


Subject(s)
Neutron Diffraction , Neutrons , Crystallography , Electrons , Macromolecular Substances/chemistry
5.
J Chem Phys ; 156(4): 041102, 2022 Jan 28.
Article in English | MEDLINE | ID: mdl-35105059

ABSTRACT

Advancements in x-ray free-electron lasers in producing ultrashort, ultrabright, and coherent x-ray pulses enable single-shot imaging of fragile nanostructures, such as superfluid helium droplets. This imaging technique gives unique access to the sizes and shapes of individual droplets. In the past, such droplet characteristics have only been indirectly inferred by ensemble averaging techniques. Here, we report on the size distributions of both pure and doped droplets collected from single-shot x-ray imaging and produced from the free-jet expansion of helium through a 5 µm diameter nozzle at 20 bars and nozzle temperatures ranging from 4.2 to 9 K. This work extends the measurement of large helium nanodroplets containing 10^9 to 10^11 atoms, which are shown to follow an exponential size distribution. Additionally, we demonstrate that the size distributions of the doped droplets follow those of the pure droplets at the same stagnation condition but with smaller average sizes.

6.
Article in English | MEDLINE | ID: mdl-38131031

ABSTRACT

Machine learning (ML) algorithms are increasingly helping scientific communities across different disciplines and institutions to address large and diverse data problems. However, many available ML tools are programmatically demanding and computationally costly. The MLExchange project aims to build a collaborative platform equipped with enabling tools that allow scientists and facility users who do not have a profound ML background to use ML and computational resources in scientific discovery. At a high level, we are targeting a full user experience where managing and exchanging ML algorithms, workflows, and data are readily available through web applications. Since each component is an independent container, the whole platform or its individual service(s) can be easily deployed at servers of different scales, ranging from a personal device (laptop, smart phone, etc.) to high-performance computing (HPC) clusters accessed (simultaneously) by many users. Thus, MLExchange supports flexible usage scenarios: users can either access the services and resources from a remote server or run the whole platform or its individual service(s) within their local network.

7.
J Appl Crystallogr ; 54(Pt 4): 1179-1188, 2021 Aug 01.
Article in English | MEDLINE | ID: mdl-34429723

ABSTRACT

The multitiered iterative phasing (MTIP) algorithm is used to determine the biological structures of macromolecules from fluctuation scattering data. It is an iterative algorithm that reconstructs the electron density of the sample by matching the computed fluctuation X-ray scattering data to the external observations, and by simultaneously enforcing constraints in real and Fourier space. This paper presents the first ever MTIP algorithm acceleration efforts on contemporary graphics processing units (GPUs). The Compute Unified Device Architecture (CUDA) programming model is used to accelerate the MTIP algorithm on NVIDIA GPUs. The computational performance of the CUDA-based MTIP algorithm implementation outperforms the CPU-based version by an order of magnitude. Furthermore, the Heterogeneous-Compute Interface for Portability (HIP) runtime APIs are used to demonstrate portability by accelerating the MTIP algorithm across NVIDIA and AMD GPUs.

8.
Acta Crystallogr D Struct Biol ; 77(Pt 5): 572-586, 2021 May 01.
Article in English | MEDLINE | ID: mdl-33950014

ABSTRACT

Structure-determination methods are needed to resolve the atomic details that underlie protein function. X-ray crystallography has provided most of our knowledge of protein structure, but is constrained by the need for large, well ordered crystals and the loss of phase information. The rapidly developing methods of serial femtosecond crystallography, micro-electron diffraction and single-particle reconstruction circumvent the first of these limitations by enabling data collection from nanocrystals or purified proteins. However, the first two methods also suffer from the phase problem, while many proteins fall below the molecular-weight threshold required for single-particle reconstruction. Cryo-electron tomography of protein nanocrystals has the potential to overcome these obstacles of mainstream structure-determination methods. Here, a data-processing scheme is presented that combines routines from X-ray crystallography and new algorithms that have been developed to solve structures from tomograms of nanocrystals. This pipeline handles image-processing challenges specific to tomographic sampling of periodic specimens and is validated using simulated crystals. The tolerance of this workflow to the effects of radiation damage is also assessed. The simulations indicate a trade-off between a wider tilt range to facilitate merging data from multiple tomograms and a smaller tilt increment to improve phase accuracy. Since phase errors, but not merging errors, can be overcome with additional data sets, these results recommend distributing the dose over a wide angular range rather than using a finer sampling interval to solve the protein structure.


Subject(s)
Algorithms , Crystallography, X-Ray/methods , Electron Microscope Tomography/methods , Image Processing, Computer-Assisted/methods , Nanoparticles/chemistry , Proteins/chemistry , Computer Simulation , Cryoelectron Microscopy/methods , Models, Molecular
9.
Article in English | MEDLINE | ID: mdl-38947249

ABSTRACT

Mathematical optimization lies at the core of many science and industry applications. One important issue with many current optimization strategies is a well-known trade-off between the number of function evaluations and the probability to find the global, or at least sufficiently high-quality local optima. In machine learning (ML), and by extension in active learning - for instance for autonomous experimentation - mathematical optimization is often used to find the underlying uncertain surrogate model from which subsequent decisions are made and therefore ML relies on high-quality optima to obtain the most accurate models. Active learning often has the added complexity of missing offline training data; therefore, the training has to be conducted during the data collection which can stall the acquisition if standard methods are used. In this work, we highlight recent efforts to create a high-performance hybrid optimization algorithm (HGDL), combining derivative-free global optimization strategies with local, derivative-based optimization, ultimately yielding an ordered list of unique local optima. Redundancies are avoided by deflating the objective function around earlier encountered optima. HGDL is designed to take full advantage of parallelism by having the most computationally expensive process, the local first and second-order-derivative-based optimizations, run in parallel on separate compute nodes in separate processes. In addition, the algorithm runs asynchronously; as soon as the first solution is found, it can be used while the algorithm continues to find more solutions. We apply the proposed optimization and training strategy to Gaussian-Process-driven stochastic function approximation and active learning.
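The deflation idea in this abstract can be illustrated in one dimension. The sketch below is a generic textbook deflation, not HGDL's actual operator or API: once a stationary point r is found, the gradient is divided by (x - r) so that a Newton iteration restarted from the same point is pushed towards a different solution. The objective f(x) = x^4 - 2x^2, the function names and the starting point are all assumptions made for illustration.

```python
def newton(func, x0, h=1e-6, tol=1e-10, max_iter=100):
    """Root finding by Newton's method with a central-difference slope."""
    x = x0
    for _ in range(max_iter):
        fx = func(x)
        if abs(fx) < tol:
            return x
        slope = (func(x + h) - func(x - h)) / (2 * h)
        if slope == 0:
            break
        x -= fx / slope
    raise RuntimeError("Newton iteration did not converge")


def gradient(x):
    """Gradient of the illustrative objective f(x) = x**4 - 2 * x**2."""
    return 4 * x**3 - 4 * x


def find_stationary_points(grad, x0, n_points):
    """Find several distinct roots of `grad`, deflating each one after it is found."""
    found = []
    for _ in range(n_points):
        def deflated(x, known=tuple(found)):
            value = grad(x)
            for r in known:
                value /= (x - r)  # remove the already-known root from the problem
            return value
        found.append(newton(deflated, x0))
    return found
```

Calling `find_stationary_points(gradient, 2.0, 3)` restarts from the same point three times, yet each run returns a different stationary point because the earlier ones have been deflated away. HGDL additionally combines this with a global search and runs the local optimizations in parallel, which this sketch does not attempt.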

10.
Acta Crystallogr D Struct Biol ; 76(Pt 8): 736-750, 2020 Aug 01.
Article in English | MEDLINE | ID: mdl-32744256

ABSTRACT

Intensity-based likelihood functions in crystallographic applications have the potential to enhance the quality of structures derived from marginal diffraction data. Their usage, however, is complicated by the difficulty of computing these target functions efficiently. Here, a numerical quadrature is developed that allows the rapid evaluation of intensity-based likelihood functions in crystallographic applications. By using a sequence of change-of-variable transformations, including a nonlinear domain-compression operation, an accurate, robust and efficient quadrature is constructed. The approach is flexible and can incorporate different noise models with relative ease.


Subject(s)
Crystallography, X-Ray/methods , Macromolecular Substances/chemistry , Likelihood Functions , Molecular Structure
11.
Acta Crystallogr D Struct Biol ; 75(Pt 11): 959-968, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31692470

ABSTRACT

A nonlinear least-squares method for refining a parametric expression describing the estimated errors of reflection intensities in serial crystallographic (SX) data is presented. This approach, which is similar to that used in the rotation method of crystallographic data collection at synchrotrons, propagates error estimates from photon-counting statistics to the merged data. Here, it is demonstrated that the application of this approach to SX data provides better SAD phasing ability, enabling the autobuilding of a protein structure that had previously failed to be built. Estimating the error in the merged reflection intensities requires the understanding and propagation of all of the sources of error arising from the measurements. One type of error, which is well understood, is the counting error introduced when the detector counts X-ray photons. Thus, if other types of random errors (such as readout noise) as well as uncertainties in systematic corrections (such as from X-ray attenuation) are completely understood, they can be propagated along with the counting error, as appropriate. In practice, most software packages propagate as much error as they know how to model and then include error-adjustment terms that scale the error estimates until they explain the variance among the measurements. If this is performed carefully, then during SAD phasing likelihood-based approaches can make optimal use of these error estimates, increasing the chance of a successful structure solution. In serial crystallography, SAD phasing has remained challenging, with the few examples of de novo protein structure solution each requiring many thousands of diffraction patterns. 
Here, the effects of different methods of treating the error estimates are estimated and it is shown that using a parametric approach that includes terms proportional to the known experimental uncertainty, the reflection intensity and the squared reflection intensity to improve the error estimates can allow SAD phasing even from weak zinc anomalous signal.
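The parametric error model described in this abstract has three contributions: the known experimental uncertainty, the reflection intensity, and the squared reflection intensity. One way such a model might be written is sketched below. The exact functional form and the parameter names `sdfac`, `sdb` and `sdadd` are assumptions made for illustration; the abstract states only which terms are included, and in practice the parameters would be refined by nonlinear least squares rather than supplied by hand.

```python
import math


def adjusted_sigma(intensity, sigma, sdfac, sdb, sdadd):
    """Adjusted error estimate for one reflection (assumed functional form).

    variance = sdfac^2 * (sigma^2 + sdb * I + (sdadd * I)^2)
    """
    variance = sdfac**2 * (
        sigma**2                      # known experimental (counting) uncertainty
        + sdb * intensity             # term proportional to the intensity
        + (sdadd * intensity) ** 2    # term proportional to the squared intensity
    )
    if variance < 0:
        raise ValueError("model gives a negative variance for these inputs")
    return math.sqrt(variance)
```

Setting `sdfac=1`, `sdb=0` and `sdadd=0` returns the input sigma unchanged, so the unadjusted counting-statistics estimate is a special case of the model.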


Subject(s)
Crystallography, X-Ray/methods , Models, Molecular , Thermolysin/chemistry , Crystallization/methods , Data Interpretation, Statistical , Datasets as Topic , Likelihood Functions
12.
Sci Data ; 5: 180201, 2018 Oct 02.
Article in English | MEDLINE | ID: mdl-30277481

ABSTRACT

Fluctuation X-ray scattering (FXS) is an emerging experimental technique in which solution scattering data are collected using X-ray exposures below rotational diffusion times, resulting in angularly anisotropic X-ray snapshots that provide several orders of magnitude more information than traditional solution scattering data. Such experiments can be performed using the ultrashort X-ray pulses provided by a free-electron laser source, allowing one to collect a large number of diffraction patterns in a relatively short time. Here, we describe a test data set for FXS, obtained at the Linac Coherent Light Source, consisting of close to 100 000 multi-particle diffraction patterns originating from approximately 50 to 200 Paramecium bursaria Chlorella virus particles per snapshot. In addition to the raw data, a selection of high-quality pre-processed diffraction patterns and a reference SAXS profile are provided.


Subject(s)
Phycodnaviridae , Scattering, Small Angle , X-Ray Diffraction
13.
Proc Natl Acad Sci U S A ; 115(46): 11772-11777, 2018 Nov 13.
Article in English | MEDLINE | ID: mdl-30373827

ABSTRACT

Fluctuation X-ray scattering (FXS) is an emerging experimental technique in which X-ray solution scattering data are collected from particles in solution using ultrashort X-ray exposures generated by a free-electron laser (FEL). FXS experiments overcome the low data-to-parameter ratios associated with traditional solution scattering measurements by providing several orders of magnitude more information in the final processed data. Here we demonstrate the practical feasibility of FEL-based FXS on a biological multiple-particle system and describe the data-processing techniques required to extract robust FXS data. By introducing an iterative noise-filtering technique, we also significantly reduce the number of snapshots needed. We showcase a successful ab initio electron density reconstruction from such an experiment, studying the Paramecium bursaria Chlorella virus (PBCV-1).


Subject(s)
Crystallography, X-Ray/methods , Photoelectron Spectroscopy/methods , Chlorella , Crystallography, X-Ray/statistics & numerical data , Photoelectron Spectroscopy/statistics & numerical data , Radiography/statistics & numerical data , Research Design , Scattering, Radiation , X-Ray Diffraction , X-Rays
15.
IUCrJ ; 2(Pt 3): 309-16, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25995839

ABSTRACT

X-ray scattering images collected on timescales shorter than rotation diffusion times using a (partially) coherent beam result in a significant increase in information content in the scattered data. These measurements, named fluctuation X-ray scattering (FXS), are typically performed on an X-ray free-electron laser (XFEL) and can provide fundamental insights into the structure of biological molecules, engineered nanoparticles or energy-related mesoscopic materials beyond what can be obtained with standard X-ray scattering techniques. In order to understand, use and validate experimental FXS data, the availability of basic data characteristics and operational properties is essential, but has been absent up to this point. In this communication, an intuitive view of the nature of FXS data and their properties is provided, the effect of FXS data on the derived structural models is highlighted, and generalizations of the Guinier and Porod laws that can ultimately be used to plan experiments and assess the quality of experimental data are presented.
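For context, the classical Guinier law that this abstract generalizes states that at small scattering vector q the intensity follows I(q) ≈ I0 · exp(-q² Rg² / 3), so ln I is linear in q². The sketch below fits that classical law only; it does not implement the FXS generalizations presented in the article, and the function name `guinier_fit` is hypothetical.

```python
import math


def guinier_fit(q_values, intensities):
    """Estimate (Rg, I0) from a straight-line fit of ln I against q^2."""
    xs = [q * q for q in q_values]
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals -Rg^2 / 3 under the Guinier law
    intercept = mean_y - slope * mean_x    # equals ln I0
    if slope >= 0:
        raise ValueError("intensity does not decay with q; Guinier fit not applicable")
    return math.sqrt(-3.0 * slope), math.exp(intercept)
```

The fit is only meaningful in the low-q region, conventionally where q·Rg is below roughly 1.3, so in practice the data are truncated before fitting.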

16.
Phys Chem Chem Phys ; 17(14): 8901-12, 2015 Apr 14.
Article in English | MEDLINE | ID: mdl-25747045

ABSTRACT

Multielectron catalytic reactions, such as water oxidation, nitrogen reduction, or hydrogen production in enzymes and inorganic catalysts often involve multimetallic clusters. In these systems, the reaction takes place between metals or metals and ligands to facilitate charge transfer, bond formation/breaking, substrate binding, and release of products. In this study, we present a method to detect X-ray emission signals from multiple elements simultaneously, which allows for the study of charge transfer and the sequential chemistry occurring between elements. Kβ X-ray emission spectroscopy (XES) probes charge and spin states of metals as well as their ligand environment. A wavelength-dispersive spectrometer based on the von Hamos geometry was used to disperse Kβ signals of multiple elements onto a position detector, enabling an XES spectrum to be measured in a single-shot mode. This overcomes the scanning needs of the scanning spectrometers, providing data free from temporal and normalization errors and therefore ideal to follow sequential chemistry at multiple sites. We have applied this method to study MnOx-based bifunctional electrocatalysts for the oxygen evolution reaction (OER) and the oxygen reduction reaction (ORR). In particular, we investigated the effects of adding a secondary element, Ni, to form MnNiOx and its impact on the chemical states and catalytic activity, by tracking the redox characteristics of each element upon sweeping the electrode potential. The detection scheme we describe here is general and can be applied to time-resolved studies of materials consisting of multiple elements, to follow the dynamics of catalytic and electron transfer reactions.


Subject(s)
Electrochemistry , Electrons , Metals/chemistry , Oxygen/chemistry , Spectrometry, X-Ray Emission/methods , Catalysis , Oxidation-Reduction , Water/chemistry
17.
Acta Crystallogr D Biol Crystallogr ; 70(Pt 12): 3299-309, 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25478847

ABSTRACT

X-ray diffraction patterns from still crystals are inherently difficult to process because the crystal orientation is not uniquely determined by measuring the Bragg spot positions. Only one of the three rotational degrees of freedom is directly coupled to spot positions; the other two rotations move Bragg spots in and out of the reflecting condition but do not change the direction of the diffracted rays. This hinders the ability to recover accurate structure factors from experiments that are dependent on single-shot exposures, such as femtosecond diffract-and-destroy protocols at X-ray free-electron lasers (XFELs). Here, additional methods are introduced to optimally model the diffraction. The best orientation is obtained by requiring, for the brightest observed spots, that each reciprocal-lattice point be placed into the exact reflecting condition implied by Bragg's law with a minimal rotation. This approach reduces the experimental uncertainties in noisy XFEL data, improving the crystallographic R factors and sharpening anomalous differences that are near the level of the noise.


Subject(s)
Crystallography, X-Ray/methods , Algorithms , Computer Simulation , Lasers , Likelihood Functions , Models, Chemical
18.
Science ; 345(6199): 906-9, 2014 Aug 22.
Article in English | MEDLINE | ID: mdl-25146284

ABSTRACT

Helium nanodroplets are considered ideal model systems to explore quantum hydrodynamics in self-contained, isolated superfluids. However, exploring the dynamic properties of individual droplets is experimentally challenging. In this work, we used single-shot femtosecond x-ray coherent diffractive imaging to investigate the rotation of single, isolated superfluid helium-4 droplets containing ~10^8 to 10^11 atoms. The formation of quantum vortex lattices inside the droplets is confirmed by observing characteristic Bragg patterns from xenon clusters trapped in the vortex cores. The vortex densities are up to five orders of magnitude larger than those observed in bulk liquid helium. The droplets exhibit large centrifugal deformations but retain axially symmetric shapes at angular velocities well beyond the stability range of viscous classical droplets.

19.
Nat Commun ; 5: 4371, 2014 Jul 09.
Article in English | MEDLINE | ID: mdl-25006873

ABSTRACT

The dioxygen we breathe is formed by light-induced oxidation of water in photosystem II. O2 formation takes place at a catalytic manganese cluster within milliseconds after the photosystem II reaction centre is excited by three single-turnover flashes. Here we present combined X-ray emission spectra and diffraction data of 2-flash (2F) and 3-flash (3F) photosystem II samples, and of a transient 3F' state (250 µs after the third flash), collected under functional conditions using an X-ray free electron laser. The spectra show that the initial O-O bond formation, coupled to Mn reduction, does not yet occur within 250 µs after the third flash. Diffraction data of all states studied exhibit an anomalous scattering signal from Mn but show no significant structural changes at the present resolution of 4.5 Å. This study represents the initial frames in a molecular movie of the structural changes during the catalytic reaction in photosystem II.


Subject(s)
Photosynthesis/physiology , Spectrometry, X-Ray Emission/methods , Water/metabolism , X-Ray Diffraction/methods , Cyanobacteria/metabolism , Models, Chemical , Oxidation-Reduction , Oxygen/metabolism , Photosystem II Protein Complex/chemistry , Photosystem II Protein Complex/metabolism
20.
Nat Methods ; 11(5): 545-8, 2014 May.
Article in English | MEDLINE | ID: mdl-24633409

ABSTRACT

X-ray free-electron laser (XFEL) sources enable the use of crystallography to solve three-dimensional macromolecular structures under native conditions and without radiation damage. Results to date, however, have been limited by the challenge of deriving accurate Bragg intensities from a heterogeneous population of microcrystals, while at the same time modeling the X-ray spectrum and detector geometry. Here we present a computational approach designed to extract meaningful high-resolution signals from fewer diffraction measurements.


Subject(s)
Lasers , Macromolecular Substances/chemistry , Bacillus/enzymology , Calcium/chemistry , Calibration , Computer Simulation , Crystallization , Crystallography, X-Ray , Electrons , Equipment Design , Likelihood Functions , Models, Chemical , Molecular Conformation , Muramidase/chemistry , Nanotechnology , Reproducibility of Results , Software , Thermolysin/chemistry , X-Rays , Zinc/chemistry