ABSTRACT
Serial crystallography experiments at synchrotron and X-ray free-electron laser (XFEL) sources are producing crystallographic data sets of ever-increasing volume. Although these experiments generate large data sets with high-frame-rate detectors (around 3520 frames per second), only a small percentage of the data are useful for downstream analysis. Thus, an efficient, real-time data classification pipeline is essential to differentiate reliably between useful and non-useful images, typically known as 'hit' and 'miss', respectively, and to keep only hit images on disk for further analysis such as peak finding and indexing. While feature-point extraction is a key component of modern approaches to image classification, existing approaches require computationally expensive patch preprocessing to handle perspective distortion. This paper proposes a pipeline to categorize the data, consisting of a real-time feature extraction algorithm called modified and parallelized FAST (MP-FAST), an image descriptor and a machine learning classifier. The primary operations of the proposed pipeline are parallelized on central processing units, graphics processing units and field-programmable gate arrays, and the performance of these implementations is compared. Finally, MP-FAST-based image classification is evaluated using a multi-layer perceptron on various data sets, including both synthetic and experimental data. This approach demonstrates superior performance compared with other feature extractors and classifiers.
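For orientation, the segment test at the core of the standard FAST detector can be sketched as follows. This is a minimal pure-Python illustration of textbook FAST, not the MP-FAST algorithm proposed in the paper, whose modifications and parallelization are not described in this abstract. The function names, the default threshold of 20 and the contiguity length n = 9 are assumptions chosen for the example.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3 used by standard FAST,
# listed in order around the circle.
CIRCLE = [
    (0, -3), (1, -3), (2, -2), (3, -1),
    (3, 0), (3, 1), (2, 2), (1, 3),
    (0, 3), (-1, 3), (-2, 2), (-3, 1),
    (-3, 0), (-3, -1), (-2, -2), (-1, -3),
]


def is_fast_corner(image, x, y, threshold=20, n=9):
    """Segment test: True if n contiguous circle pixels are all brighter than
    center + threshold or all darker than center - threshold."""
    center = image[y][x]
    # Classify each circle pixel: +1 brighter, -1 darker, 0 similar.
    states = []
    for dx, dy in CIRCLE:
        value = image[y + dy][x + dx]
        if value > center + threshold:
            states.append(1)
        elif value < center - threshold:
            states.append(-1)
        else:
            states.append(0)
    # Walk the circle twice so that runs wrapping past the last pixel count.
    for target in (1, -1):
        run = 0
        for state in states + states:
            run = run + 1 if state == target else 0
            if run >= n:
                return True
    return False


def detect_keypoints(image, threshold=20, n=9):
    """Return (x, y) positions passing the segment test (3-pixel border skipped)."""
    height, width = len(image), len(image[0])
    return [
        (x, y)
        for y in range(3, height - 3)
        for x in range(3, width - 3)
        if is_fast_corner(image, x, y, threshold, n)
    ]
```

On a diffraction image, an isolated bright spot on a flat background passes this test because every circle pixel is darker than the center. Each pixel is tested independently of the others, which is what makes the operation amenable to parallel hardware.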
ABSTRACT
Serial crystallography experiments at X-ray free-electron laser facilities produce massive amounts of data, but only a fraction of these data are useful for downstream analysis. Thus, it is essential to differentiate between acceptable and unacceptable data, generally known as 'hit' and 'miss', respectively. Image classification methods from artificial intelligence, specifically convolutional neural networks (CNNs), classify the data into hit and miss categories in order to achieve data reduction. The quantitative performance established in previous work indicates that CNNs successfully classify serial crystallography data into the desired categories [Ke, Brewster, Yu, Ushizima, Yang & Sauter (2018). J. Synchrotron Rad. 25, 655-670], but no qualitative evidence on the internal workings of these networks has been provided. For example, there are no visualization methods that highlight the features contributing to a specific prediction when classifying data in serial crystallography experiments. Therefore, existing deep learning methods, including CNNs classifying serial crystallography data, are like a 'black box'. To this end, presented here is a qualitative study to unpack the internal workings of CNNs, with the aim of visualizing information in the fundamental blocks of a standard network with serial crystallography data. The region(s) or part(s) of an image that contribute most to a hit or miss prediction are visualized.
ABSTRACT
Serial crystallography experiments produce massive amounts of experimental data. Yet in spite of these large-scale data sets, only a small percentage of the data are useful for downstream analysis. Thus, it is essential to differentiate reliably between acceptable data (hits) and unacceptable data (misses). To this end, a novel pipeline is proposed to categorize the data, which extracts features from the images, summarizes these features with the 'bag of visual words' method and then classifies the images using machine learning. In addition, a novel study of various feature extractors and machine learning classifiers is presented, with the aim of finding the best feature extractor and machine learning classifier for serial crystallography data. The study reveals that the oriented FAST and rotated BRIEF (ORB) feature extractor with a multilayer perceptron classifier gives the best results. Finally, the ORB feature extractor with multilayer perceptron is evaluated on various data sets including both synthetic and experimental data, demonstrating superior performance compared with other feature extractors and classifiers.
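The 'bag of visual words' step summarizes a variable number of local descriptors per image as one fixed-length histogram that a classifier can consume. The sketch below illustrates that idea only. It assumes a visual vocabulary is already available (in practice it is commonly learned by clustering training descriptors, for example with k-means, which is an assumption here and not a detail given in the abstract), and the function names and toy vectors are hypothetical. For 0/1 binary descriptors such as those of ORB, the squared Euclidean distance used here equals the Hamming distance.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor."""
    return min(
        range(len(vocabulary)),
        key=lambda k: squared_distance(descriptor, vocabulary[k]),
    )


def bovw_histogram(descriptors, vocabulary):
    """Normalized histogram of visual-word counts for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        # An image with no detected features maps to an all-zero vector.
        return [0.0] * len(vocabulary)
    return [count / total for count in counts]
```

The resulting histogram has the same length for every image regardless of how many features were detected, so it can serve directly as the input vector of a classifier such as a multilayer perceptron.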
ABSTRACT
Online reviews play an important role in consumer purchase decisions and have received much research attention. However, previous research has typically examined the effects of online review characteristics independent of firm marketing messages. We argue that how much average review rating influences consumers' decisions depends on the presence of a scarcity appeal and its congruence with review volume information. Through a lab experiment and analyses of real-world data from Amazon.com, we show that claiming a product to have limited supply moves consumers toward more heuristic processing but only when review volume is consistent with the scarcity information. In contrast, when review volume is incongruent with the supply-based scarcity message, the incongruence prompts consumers to process information more carefully and reduces their reliance on review valence.
ABSTRACT
Simultaneous low humidity, high temperature, and high wind speeds disturb the water balance in plants, intensify evapotranspiration, and can ultimately lead to crop damage. In addition, these events have been linked to flash droughts and can play a critical role in the spread of human-ignited wildfires. The spatial patterns and temporal changes of hot, dry, and windy events (HDWs) were analyzed in the central United States for two time periods: 1949 to 2018 (70 years) and 1969 to 2018 (50 years). The highest frequencies of HDWs were observed at stations in western Kansas and west Texas. Annually, the highest numbers of events coincided with the major heat waves and droughts of 1980 and 2011. Temporally, an overall decrease in HDWs was significant in the eastern regions of North Dakota and South Dakota, and an upward trend was significant in Texas and the western part of the Great Plains. Significant trends in HDWs co-occurred more frequently with significant trends in extreme temperatures than with significant trends in low humidity or strong wind events. The results of this study provide valuable information on the locations where HDWs are more likely to occur. This information could be used to improve water management strategies.
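A compound event of this kind can be flagged by requiring all three conditions on the same day and then counting flagged days per year. The sketch below shows that logic only. The threshold values, variable names and record layout are hypothetical placeholders for illustration and are not the criteria used in the study, which this abstract does not state.

```python
from collections import Counter

# Hypothetical thresholds for illustration only; not taken from the study.
T_MIN_C = 35.0       # daily maximum temperature, degrees Celsius
RH_MAX_PCT = 30.0    # daily minimum relative humidity, percent
WIND_MIN_MS = 7.0    # daily maximum wind speed, metres per second


def is_hdw(temp_c, rh_pct, wind_ms):
    """True when the hot, dry and windy conditions all hold on the same day."""
    return (
        temp_c >= T_MIN_C
        and rh_pct <= RH_MAX_PCT
        and wind_ms >= WIND_MIN_MS
    )


def hdw_counts_by_year(records):
    """Count HDW days per year from (year, temp_c, rh_pct, wind_ms) tuples."""
    counts = Counter()
    for year, temp_c, rh_pct, wind_ms in records:
        if is_hdw(temp_c, rh_pct, wind_ms):
            counts[year] += 1
    return dict(counts)
```

Annual counts of this form, computed per station, are the kind of series on which a spatial frequency map or a temporal trend analysis could then be based.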