ABSTRACT
Hyperaccumulators are the material basis and key to the phytoremediation of heavy metal-contaminated soils. Conventional methods for screening hyperaccumulators depend heavily on time- and labor-consuming sampling and chemical analysis. In this study, a novel spectral approach assisted by multi-task deep learning was proposed to streamline accumulating ecotype screening, heavy metal stress discrimination, and heavy metal quantification in plants. The significant Cd/Zn co-hyperaccumulator Sedum alfredii and its non-accumulating ecotype were stressed by Cd, Zn, and Pb. Spectral images of leaves were rapidly acquired by hyperspectral imaging. The self-designed deep learning architecture was composed of a shallow network (ENet) for accumulating ecotype identification and a multi-task network (HMNet) for simultaneous prediction of heavy metal stress type and accumulation. To further assess the robustness of the networks, they were compared with conventional machine learning models (i.e., partial least squares (PLS) and support vector machine (SVM)) on a series of evaluation metrics for classification, multi-label classification, and regression. S. alfredii with heavy metal accumulation capability was identified by ENet with 100% accuracy. HMNet reduced overfitting and outperformed the machine learning models, increasing the average exact match ratio (EMR) of heavy metal stress discrimination by 7.46% and the residual prediction deviation (RPD) of heavy metal concentration prediction by 53.59%. The method succeeded in rapidly and accurately discriminating heavy metal stress, with EMRs over 91% and accuracies over 96%, and in predicting heavy metal accumulation, with an average RPD of 3.29 for Zn, 2.57 for Cd, and 2.53 for Pb, indicating satisfactory practicability and potential for sensing heavy metal accumulation.
This study provides a relatively novel spectral method to facilitate hyperaccumulator screening and heavy metals accumulation prediction in the phytoremediation process.
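The exact match ratio (EMR) reported above is a multi-label metric: a sample counts as correct only when every label is predicted correctly, whereas per-label accuracy scores each label separately. The following is a minimal sketch in plain Python. The function names and the toy labels (columns standing for Cd, Zn, and Pb stress) are assumptions for illustration and are not taken from the study.

```python
def exact_match_ratio(y_true, y_pred):
    """Fraction of samples whose whole label vector is predicted exactly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def per_label_accuracy(y_true, y_pred):
    """Accuracy computed independently for each label column."""
    n_labels = len(y_true[0])
    return [
        sum(1 for t, p in zip(y_true, y_pred) if t[j] == p[j]) / len(y_true)
        for j in range(n_labels)
    ]


# Hypothetical labels: columns are [Cd, Zn, Pb] stress present (1) or absent (0).
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 1, 0]]

emr = exact_match_ratio(y_true, y_pred)
label_acc = per_label_accuracy(y_true, y_pred)
print(emr, label_acc)
```

Because one wrong label invalidates the whole sample, EMR is never higher than the lowest per-label accuracy, which is consistent with the abstract reporting EMRs over 91% alongside accuracies over 96%.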
Subjects
Environmental Biodegradation, Deep Learning, Heavy Metals, Sedum, Soil Pollutants, Sedum/drug effects, Sedum/metabolism, Heavy Metals/analysis, Soil Pollutants/metabolism, Soil Pollutants/toxicity, Soil Pollutants/analysis, Hyperspectral Imaging/methods, Plant Leaves/metabolism, Cadmium/metabolism, Cadmium/toxicity, Zinc/metabolism, Zinc/analysis, Support Vector Machine
ABSTRACT
Wet chemical methods are usually employed in the analysis of macronutrients such as potassium (K) and phosphorus (P), followed by traditional sensor techniques, including inductively coupled plasma optical emission spectrometry (ICP OES), flame atomic absorption spectrometry (FAAS), graphite furnace atomic absorption spectrometry (GF AAS), and inductively coupled plasma mass spectrometry (ICP-MS). Although these procedures have been established for many years, they are costly, time-consuming, and challenging to follow. This study investigated the combination of laser-induced breakdown spectroscopy (LIBS) and visible and near-infrared spectroscopy (Vis-NIR) for the quick detection of P and K in different varieties of organic fertilizers. Explainable AI (XAI), through the computation of Shapley additive explanation (SHAP) values, was used to extract the valuable features of both sensors. The characteristic variables from the two spectroscopic devices were combined to form the fused spectra. Then, P and K were determined using Support Vector Regression (SVR), Partial Least Squares Regression (PLSR), and Extremely Randomized Trees (ExtraTrees) models. The coefficient of determination (R2), root mean squared error (RMSE), and residual prediction deviation (RPD) showed that the fusion was more effective in detecting P (R2p = 0.9946, RMSEp = 0.0649% and RPD = 13.26) and K (R2p = 0.9976, RMSEp = 0.0508% and RPD = 20.28) than single-sensor detection. The outcomes indicated that the features extracted by XAI and the data fusion of LIBS and Vis-NIR could improve the prediction of P and K in different varieties of organic fertilizers.
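The three figures of merit quoted in this abstract are related: RPD is the standard deviation of the reference values divided by the RMSE of prediction, so a low RMSE relative to the spread of the samples gives a high RPD. The sketch below computes all three in plain Python. It assumes the sample standard deviation (n - 1 denominator) for RPD, which is one common convention and not necessarily the one used by the authors. The data are invented.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, RPD) for paired reference and predicted values."""
    n = len(y_true)
    mean_true = statistics.fmean(y_true)
    residual_ss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    total_ss = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(residual_ss / n)
    r2 = 1.0 - residual_ss / total_ss
    # Assumption: RPD uses the sample standard deviation of the references.
    rpd = statistics.stdev(y_true) / rmse
    return r2, rmse, rpd


# Hypothetical nutrient contents (%) and model predictions.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.1, 3.9, 5.1]

r2, rmse, rpd = regression_metrics(reference, predicted)
print(f"R2={r2:.4f} RMSE={rmse:.4f} RPD={rpd:.2f}")
```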
ABSTRACT
Laser-induced breakdown spectroscopy (LIBS) shows promising applications in the analysis of environmental heavy metals. However, direct analysis in water by LIBS faces the problems of droplet splashing and laser energy decay. In this study, a novel liquid-solid conversion method based on agarose films is proposed to provide easy-to-operate and sensitive detection of heavy metals. First, the water samples were converted into semi-solid hydrogels with the aid of agarose and then dried into agarose films to strengthen the signal intensities. The calibration curves of Cd, Pb and Cr were constructed. The proposed method was validated with standard heavy metal solutions and real water samples. The results showed that the values of R2 were 0.990, 0.989 and 0.975, and the values of the LOD were 0.011, 0.122 and 0.118 mg L-1 for Cd (I) 228.80, Pb (I) 405.78 and Cr (I) 427.48 nm, respectively. The RMSEs of validation were 0.068 (Cd), 0.107 (Pb) and 0.112 mg L-1 (Cr), and the recovery values were in the range of 91.2-107.9%. The agarose film-based liquid-solid conversion method achieved the desired ease of operation and sensitivity of LIBS in heavy metal detection, thereby showing good application prospects in heavy metal monitoring of water.
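A univariate calibration curve, a limit of detection (LOD), and a recovery value can be derived as sketched below. The abstract does not state how the LOD was computed. This example assumes the widely used 3-sigma rule (three times the standard deviation of blank signals divided by the calibration slope). All numbers and function names are hypothetical.

```python
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def limit_of_detection(blank_signals, slope):
    """Assumed 3-sigma convention: LOD = 3 * SD(blank) / slope."""
    return 3.0 * statistics.stdev(blank_signals) / slope


def recovery_percent(measured_conc, spiked_conc):
    """Recovery of a spiked sample, in percent."""
    return 100.0 * measured_conc / spiked_conc


# Hypothetical calibration standards (mg L-1) and line intensities (a.u.).
conc = [0.0, 0.5, 1.0, 2.0]
intensity = [10.0, 60.0, 110.0, 210.0]
slope, intercept = fit_line(conc, intensity)

lod = limit_of_detection([9.0, 10.0, 11.0], slope)

# Invert the calibration for a sample spiked at 1.0 mg L-1.
measured = (105.0 - intercept) / slope
recovery = recovery_percent(measured, 1.0)
print(slope, intercept, lod, recovery)
```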
ABSTRACT
Fast detection of heavy metals is important to ensure the quality and safety of herbal medicines. In this study, laser-induced breakdown spectroscopy (LIBS) was applied to detect the heavy metal content (Cd, Cu, and Pb) in Fritillaria thunbergii. Quantitative prediction models were established using a back-propagation neural network (BPNN) optimized using the particle swarm optimization (PSO) algorithm and sparrow search algorithm (SSA), called PSO-BP and SSA-BP, respectively. The results revealed that the BPNN models optimized by PSO and SSA had better accuracy than the BPNN model without optimization. The performance evaluation metrics of the PSO-BP and SSA-BP models were similar. However, the SSA-BP model had two advantages: it was faster and had higher prediction accuracy at low concentrations. For the three heavy metals Cd, Cu and Pb, the prediction correlation coefficient (Rp2) values for the SSA-BP model were 0.972, 0.991 and 0.956; the prediction root mean square error (RMSEP) values were 5.553, 7.810 and 12.906 mg/kg; and the prediction relative percent deviation (RPD) values were 6.04, 10.34 and 4.94, respectively. Therefore, LIBS could be considered a constructive tool for the quantification of Cd, Cu and Pb contents in Fritillaria thunbergii.
Subjects
Fritillaria, Heavy Metals, Fritillaria/chemistry, Cadmium, Lead, Heavy Metals/analysis, Spectrum Analysis/methods, Algorithms, Lasers
ABSTRACT
Traditional Chinese herbal medicine (TCHM) plays an essential role in the international pharmaceutical industry due to its rich resources and unique curative properties. The flowers, stems, and leaves of Fritillaria contain a wide range of phytochemical compounds, including flavonoids, essential oils, saponins, and alkaloids, which may be useful for medicinal purposes. Fritillaria thunbergii Miq. bulbs are commonly used in traditional Chinese medicine as expectorants and antitussives. In this paper, a feasibility study is presented that examines the use of hyperspectral imaging (HSI) integrated with convolutional neural networks (CNN) to distinguish twelve Fritillaria varieties (n = 360). The performance of support vector machines (SVM) and partial least squares-discriminant analysis (PLS-DA) was compared with that of the CNN. Principal component analysis (PCA) was used to assess the presence of cluster trends in the spectral data. Cross-validation was used to optimize the performance of the models. Among all the discriminant models, CNN was the most accurate on the test set, with accuracies of 98.88% and 88.89% in the training and test sets, followed by PLS-DA (92.59% and 81.94%) and SVM (99.65% and 79.17%). The results of the present study revealed that HSI in conjunction with deep learning can be used to classify Fritillaria thunbergii varieties rapidly and non-destructively.
Subjects
Alkaloids, Antitussive Agents, Deep Learning, Chinese Herbal Drugs, Fritillaria, Volatile Oils, Saponins, Alkaloids/analysis, Chinese Herbal Drugs/chemistry, Expectorants, Flavonoids, Fritillaria/chemistry, Hyperspectral Imaging, Phytochemicals, Technology
ABSTRACT
The quick identification of heavy metals is of major importance and is beneficial for controlling the production process in the fertilizer industry. This work aimed to use visible and near-infrared spectroscopy (Vis-NIR), Boruta, and deep learning to establish rapid heavy metal screening methods. The Boruta algorithm was used to extract appropriate wavelengths, and a deep belief network (DBN) was used to determine the amounts of heavy metals such as chromium (Cr), cadmium (Cd), lead (Pb), and mercury (Hg) from both the full and the selected wavelengths. The coefficient of determination (R2), root mean squared error (RMSE), and residual prediction deviation (RPD) were used to assess the reliability of the model. The results for the selected wavelengths were excellent and much better than those for the full wavelengths, with R2p = 0.96, RMSEP = 0.2017 mg kg-1 and RPDpred = 5.0 for Cr; R2p = 0.91, RMSEP = 0.2832 mg kg-1 and RPDpred = 3.4 for Pb; and R2p = 0.90, RMSEP = 0.2992 mg kg-1 and RPDpred = 3.3 for Hg. A decent prediction was also obtained for Cd (R2p = 0.87, RMSEP = 0.3435 mg kg-1, and RPDpred = 2.7). To further assess the robustness of the DBN, it was compared with conventional machine learning methods such as support vector machine for regression (SVR), k-nearest neighbor (KNN), and partial least squares (PLS). The overall results indicated that the Vis-NIR technique coupled with Boruta and DBN could be reliable and accurate for screening heavy metals in organic fertilizers.
ABSTRACT
Organic fertilizer is a key component of agricultural sustainability and contributes significantly to the improvement of soil fertility. Nutrients such as organic matter and nitrogen in organic fertilizers positively affect plant growth but cause environmental problems when used in large amounts, hence the importance of fast detection of nitrogen (N) and organic matter (OM). This paper examines the feasibility of a framework that combines particle swarm optimization (PSO) and two multiple stacked generalizations to determine the amount of nitrogen and organic matter in organic fertilizer using visible and near-infrared spectroscopy (Vis-NIR). The first multiple stacked generalization for classification coupled with PSO (FSGC-PSO) was used for feature selection, while the second stacked generalization for regression (SSGR) improved the detection of nitrogen and organic matter. The root mean square error (RMSE) and the coefficient of determination (R2) for the calibration and prediction sets were used to gauge the different models. The FSGC-PSO subset combined with SSGR achieved significantly better prediction results than conventional methods such as Ridge, support vector machine (SVM), and partial least squares (PLS) for both nitrogen (R2p = 0.9989, root mean square error of prediction (RMSEP) = 0.031 and limit of detection (LOD) = 2.97) and organic matter (R2p = 0.9972, RMSEP = 0.051 and LOD = 2.97). Therefore, the proposed approach is a promising way to monitor and evaluate the amount of N and OM in organic fertilizer.
Subjects
Fertilizers, Near-Infrared Spectroscopy, Least-Squares Analysis, Nitrogen, Support Vector Machine
ABSTRACT
Tracking free proline (FP), an indicator substance of heavy metal stress in rice leaves, helps improve plant phenotype detection and offers important guidance for the precise management of rice production. Hyperspectral imaging was used for high-throughput screening of FP in rice leaves under cadmium (Cd) stress at five concentrations and four periods. The average spectra of rice leaves were used to show differences in optical properties. Partial least squares (PLS), least-squares support vector machine (LS-SVM) and extreme learning machine (ELM) models based on full spectra and effective wavelengths were established to detect FP content. Genetic algorithm (GA), competitive adaptive reweighted sampling (CARS) and PLS weighting regression coefficient (Bw) were compared to screen the most effective wavelengths. Distribution maps of the FP content in rice leaves were obtained to display the changes in leaf FP visually. The results illustrated that spectral differences increased with Cd stress time and that FP content increased with Cd stress concentration. The best result for FP detection was obtained by the ELM model based on 27 wavelengths selected by CARS, with an Rp of 0.9426. Hyperspectral imaging combined with chemometrics was thus a rapid, cost-effective and non-destructive technique for revealing changes of FP in rice leaves under Cd stress.
Subjects
Cadmium, Hyperspectral Imaging, Oryza, Cadmium/analysis, Cadmium/toxicity, High-Throughput Screening Assays, Least-Squares Analysis, Oryza/chemistry, Plant Leaves, Proline, Support Vector Machine
ABSTRACT
Panax notoginseng (P. notoginseng) is a valuable herbal medicine, as well as a dietary food supplement known for its satisfactory clinical efficacy in alleviating blood stasis, reducing swelling, and relieving pain. However, the ability of P. notoginseng to absorb and accumulate cadmium (Cd) poses a significant environmental pollution risk and potential health hazards to humans. In this study, we employed laser-induced breakdown spectroscopy (LIBS) for the rapid detection of Cd. It is important to note that signal uncertainty can impact the quantification performance of LIBS. Hence, we proposed the crater-spectrum feature fusion method, which comprises ablation crater morphology compensation and characteristic peak ratio correction (CPRC), to explore the feasibility of signal uncertainty reduction. The crater morphology compensation method, namely, adding variables using multiple linear regression (MLR) analysis, decreased the root-mean-square error of the prediction set (RMSEP) from 7.0233 µg/g to 5.4043 µg/g. The prediction results were achieved after CPRC pretreatment using the calibration curve model with an RMSEP of 3.4980 µg/g, a limit of detection of 1.92 µg/g, and a limit of quantification of 6.41 µg/g. The crater-spectrum feature fusion method reached the lowest RMSEP of 2.8556 µg/g, based on a least-squares support vector machine (LSSVM) model. The preliminary results suggest the effectiveness of the crater-spectrum feature fusion method for detecting Cd. Furthermore, this method has the potential to be extended to detect other toxic metals in addition to Cd, which significantly contributes to ensuring the quality and safety of agricultural production.
ABSTRACT
Objective. Acute hypotension episode (AHE) is one of the most critical complications in the intensive care unit (ICU). A timely and precise AHE prediction system can provide clinicians with sufficient time to respond with proper therapeutic measures, playing a crucial role in saving patients' lives. Recent studies have focused on utilizing more complex models to improve predictive performance. However, these models are not suitable for clinical application due to the limited computing resources of bedside monitors. Approach. To address this challenge, we propose an efficient lightweight dilated shuffle group network. It incorporates shuffling operations into grouped convolutions on the channel dimension and dilated convolutions on the temporal dimension, enhancing global and local feature extraction while reducing computational load. Main results. Our benchmarking experiments on the MIMIC-III and VitalDB datasets, comprising 6036 samples from 1304 patients and 2958 samples from 1047 patients, respectively, demonstrate that our model outperforms other state-of-the-art lightweight CNNs in terms of balancing parameters and computational complexity. Additionally, we found that the use of multiple physiological signals significantly improves the performance of AHE prediction. External validation on the MIMIC-IV dataset confirmed our findings, with prediction accuracy for AHE 5 min prior reaching 93.04% and 92.04% on the MIMIC-III and VitalDB datasets, respectively, and 89.47% in external verification. Significance. Our study demonstrates the potential of lightweight CNN architectures in clinical applications, providing a promising solution for real-time AHE prediction under resource constraints in ICU settings, thereby marking a significant step forward in improving patient care.
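The two operations named in the approach can be illustrated without any deep learning framework. The sketch below assumes a ShuffleNet-style channel shuffle (interleaving channels across groups so that information mixes between grouped convolutions) and a plain one-dimensional dilated convolution with no padding. The abstract does not specify the actual layer design, so both functions are illustrative assumptions.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (ShuffleNet-style, assumed here).

    Equivalent to viewing the channel list as (groups, per_group),
    transposing to (per_group, groups), and flattening.
    """
    if len(channels) % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


def dilated_conv1d(signal, kernel, dilation):
    """Valid (unpadded) 1-D convolution whose taps are `dilation` steps apart."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[t + k * dilation] for k in range(len(kernel)))
        for t in range(len(signal) - span)
    ]


# Two groups of three channels each, labelled by group for readability.
shuffled = channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2)

# A two-tap difference kernel with dilation 2 compares samples two steps apart.
filtered = dilated_conv1d([1, 2, 3, 4, 5, 6], kernel=[1, -1], dilation=2)
print(shuffled, filtered)
```

Dilation widens the temporal receptive field without adding weights, and shuffling lets cheap grouped convolutions exchange information, which matches the stated goal of reducing computational load.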
Subjects
Hospitalization, Hypotension, Intensive Care Units, Computer Neural Networks, Humans, Hypotension/physiopathology, Hypotension/diagnosis, Acute Disease
ABSTRACT
Rice stem is the sole conduit for cadmium translocation from underground to aboveground. The presence of cadmium can trigger responses of rice stem multi-phenotype, affecting metabolism, reducing yield, and altering composition, which is related to crop growth, food safety, and new energy utilization. Exploring the adversity response of plant phenotypes can provide a reliable assessment of growth status. However, the phytotoxicity and mechanism of cadmium stress on rice stem remain unclear. Here, we systematically revealed the response mechanisms of cadmium accumulation, adversity physiology, and morphological characteristic in rice stem under cadmium stress for the first time with concentration gradients of CK, 5, 25, 50, and 100 µM, and duration gradients of Day 5, Day 10, Day 15, and Day 20. The results indicated that cadmium stress led to a significant increase in cadmium accumulation, accompanied by the adversity response in stem phenotypes. Specifically, cadmium can cause fluctuations in soluble protein and disturbance of malondialdehyde (MDA), which reflects lipid peroxidation induced by cadmium accumulation. Lipid peroxidation inhibited rice growth by causing (1) a reduction in stem length, diameter, and weight, (2) suppression of air cavity, vascular bundle, parenchyma, and epidermal hair, and (3) disruption of cell structure. Furthermore, rapid detection of cadmium was realized based on the combination of laser-induced breakdown spectroscopy (LIBS) and machine learning, which took less than 3 min. The established qualitative model realized the precise discrimination of cadmium stress degrees with a prediction accuracy exceeding 92 %, and the quantitative model achieved the outstanding prediction effect of cadmium, with Rp of 0.9944. 
This work systematically revealed the phytotoxicity of cadmium on rice stem multi-phenotype from a novel perspective of lipid peroxidation and realized the rapid detection of cadmium in rice stem, which provided the technical tool and theoretical foundation for accurate prevention and efficient control of heavy metal risks in crops.
Subjects
Oryza, Soil Pollutants, Cadmium/analysis, Phenotype, Soil Pollutants/analysis
ABSTRACT
Rapid and accurate detection of chromium (Cr) in agricultural soil is of great significance for soil pollution assessment. Laser-induced breakdown spectroscopy (LIBS) could serve as a rapid and chemical-free method for hazardous metal analysis compared with conventional chemical methods. However, LIBS detection is affected by signal uncertainty and the matrix effect. In this study, an averaging strategy combined with a linear weighted network (LWNet) was proposed to reduce the uncertainty. An adaptive weighted normalization-LWNet (AWN-LWNet) framework was proposed to reduce the matrix effect in two soil types. The results indicated that LWNet outperformed traditional machine learning and achieved an average relative error (ARE) of 2.08% and 3.03% for yellow brown soil and lateritic red soil, respectively. Moreover, LWNet could effectively mine Cr feature peaks even at low spectral resolution. AWN-LWNet was the optimal model compared with commonly used models for reducing the matrix effect (ARE = 4.12%). In addition, AWN-LWNet greatly reduced the number of spectral variables used as model input (from 22016 to 72). By extracting Cr peaks from the models, the difference in Cr peak intensity could be observed intuitively, which served as a spectral interpretation of the matrix effect reduction. The two methods have the potential to realize the detection of hazardous metals in soil by LIBS.
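The averaging strategy and the ARE metric can be sketched as follows. This is a simplified illustration under stated assumptions: repeated spectra of one sample are averaged point by point to damp shot-to-shot fluctuation, and ARE is taken as the mean of absolute errors relative to the reference values, in percent. The function names and data are hypothetical, and the LWNet model itself is not reproduced.

```python
def average_spectra(spectra):
    """Point-by-point mean of repeated spectra from the same sample."""
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def average_relative_error(y_true, y_pred):
    """Mean of |prediction - reference| / reference, in percent."""
    ratios = [abs(p - t) / t for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Two hypothetical shots on one soil pellet, two wavelength channels each.
mean_spectrum = average_spectra([[1.0, 4.0], [3.0, 6.0]])

# Hypothetical reference and predicted Cr contents (mg kg-1).
are = average_relative_error([100.0, 200.0], [102.0, 194.0])
print(mean_spectrum, are)
```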
ABSTRACT
Environmental and health risks associated with heavy metal pollution are serious. Human health can be adversely affected by even small amounts of heavy metals. Spectral modeling requires the careful selection of variables. Hence, simple variables that have a low level of interference and a high degree of precision are required for fast analysis and online detection. This study used laser-induced breakdown spectroscopy (LIBS) coupled with variable selection and chemometrics to simultaneously analyze heavy metals (Cd, Cu and Pb) in Fritillaria thunbergii. Three machine learning algorithms were utilized: a gradient boosting machine (GBM), partial least squares regression (PLSR) and support vector regression (SVR). Three promising wavelength selection methods were evaluated for comparison, namely, competitive adaptive reweighted sampling (CARS), random frog (RF), and uninformative variable elimination (UVE). Compared to the full wavelengths, the selected wavelengths produced excellent results. Overall, RC2, RV2, RP2, RMSEC, RMSEV and RMSEP for the selected variables were 0.9967, 0.8899, 0.9403, 1.9853 mg kg-1, 11.3934 mg kg-1 and 8.5354 mg kg-1 for Cd; 0.9933, 0.9316, 0.9665, 5.9332 mg kg-1, 18.3779 mg kg-1 and 11.9356 mg kg-1 for Cu; and 0.9992, 0.9736, 0.9686, 1.6707 mg kg-1, 10.2323 mg kg-1 and 10.1224 mg kg-1 for Pb. Experimental results showed that all three methods could perform variable selection effectively, with GBM-UVE for Cd, SVR-RF for Pb, and GBM-CARS for Cu providing the best results. The results of the study suggest that LIBS coupled with wavelength selection can be used to detect heavy metals rapidly and accurately in Fritillaria by extracting only a few variables that contain useful information and eliminating non-informative variables.
ABSTRACT
BACKGROUND: The breeding of high-quality, high-yield, and disease-resistant varieties is closely related to food security. The investigation of breeding results relies on the evaluation of seed phenotype, which is a key step in the process of breeding. In the global digitalization trend, digital technology based on optical sensors can perform the digitization of seed phenotype in a non-contact, high throughput way, thus significantly improving breeding efficiency. AIM OF REVIEW: This paper provides a comprehensive overview of the principles, characteristics, data processing methods, and bottlenecks associated with three digital technique types based on optical sensors: spectroscopy, digital imaging, and three-dimensional (3D) reconstruction techniques. In addition, the applicability and adaptability of digital techniques based on the optical sensors of maize seed phenotype traits, namely external visible phenotype (EVP) and internal invisible phenotype (IIP), are investigated. Furthermore, trends in future equipment, platform, phenotype data, and processing algorithms are discussed. This review offers conceptual and practical support for seed phenotype digitization based on optical sensors, which will provide reference and guidance for future research. KEY SCIENTIFIC CONCEPTS OF REVIEW: The digital techniques based on optical sensors can perform non-contact and high-throughput seed phenotype evaluation. Due to the distinct characteristics of optical sensors, matching suitable digital techniques according to seed phenotype traits can greatly reduce resource loss, and promote the efficiency of seed evaluation as well as breeding decision-making. 
Future research in phenotype equipment and platform, phenotype data, and processing algorithms will make digital techniques better meet the demands of seed phenotype evaluation, and promote automatic, integrated, and intelligent evaluation of seed phenotype, further helping to lessen the gap between digital techniques and seed phenotyping.
ABSTRACT
The root is an important organ affecting cadmium accumulation in grains, but there is no comprehensive research on rice root phenotype under cadmium stress yet. To assess the effect of cadmium on root phenotypes, this paper investigated the response mechanism of phenotypic information including cadmium accumulation, adversity physiology, morphological parameters, and microstructure characteristics, and explored rapid detection methods for cadmium accumulation and adversity physiology. We found that cadmium had a "low-promotion and high-inhibition" effect on root phenotypes. In addition, the rapid detection of cadmium (Cd), soluble protein (SP), and malondialdehyde (MDA) was achieved based on spectroscopic technology and chemometrics, where the optimal prediction model was the least squares support vector machine (LS-SVM) based on the full spectrum (Rp=0.9958) for Cd, competitive adaptive reweighted sampling-extreme learning machine (CARS-ELM) (Rp=0.9161) for SP and CARS-ELM (Rp=0.9021) for MDA, all with Rp higher than 0.9. Notably, detection took only about 3 min, a reduction of more than 90% in detection time compared with laboratory analysis, demonstrating the excellent ability of spectroscopy for root phenotype detection. These results reveal the response mechanism to heavy metals and provide a rapid detection method for phenotypic information, which can substantially contribute to crop heavy metal control and food safety supervision.
Subjects
Oryza, Oryza/metabolism, Cadmium/metabolism, Spectrum Analysis, Phenotype, Least-Squares Analysis
ABSTRACT
Herbs have been used as natural remedies for disease treatment, prevention, and health care. Some herbs with functional properties are also used as food or food additives for culinary purposes. The quality and safety of herbs are influenced by various factors, which need to be assessed in each operation across the whole process of herb production. Traditional analysis methods are time-consuming and laborious and do not give a quick response, which limits industry development and digital detection. For reasons of efficiency and accuracy, faster, cheaper, and more environment-friendly techniques are needed to complement or replace the conventional chemical analysis methods. Infrared (IR) and Raman spectroscopy techniques have been applied to the quality control and safety inspection of herbs during the last several decades. In this paper, we summarize current applications of IR and Raman spectroscopy techniques across the whole process, from raw materials to patent herbal products. Challenges and remarks are presented at the end, serving as references for improving herb detection based on IR and Raman spectroscopy techniques and paving the way toward intelligent and automated herb product factories.
ABSTRACT
Laser-induced Breakdown Spectroscopy (LIBS) is becoming an increasingly popular analytical technique for characterizing and identifying various products; its multi-element analysis, fast response, remote sensing capability, minimal or no sample preparation, and low running costs can significantly accelerate the analysis of foods with medicinal properties (FMPs). A comprehensive overview of recent advances in LIBS is presented, along with its future trends, viewpoints, and challenges. Besides reviewing its applications in FMPs, this paper is intended to provide a concise description of the use of LIBS and chemometrics for the detection of FMPs, rather than a detailed description of the fundamentals of the technique, which others have already discussed. Finally, LIBS, like conventional approaches, has some limitations. However, it is a promising technique that may be employed as a routine analysis technique for FMPs when utilized effectively.
ABSTRACT
Soybean seed purity is a critical factor in agricultural production, standardization of seed quality, and food processing. In this study, laser-induced breakdown spectroscopy (LIBS) was successfully used to identify ten varieties of soybean seeds. We improved the traditional sample preparation scheme for LIBS. Instead of grinding and squashing, we propose a time-efficient method in which soybean seeds are pressed with a ruler into culture plates filled with rubber sand to ensure a relatively uniform surface height. In our experimental scheme, three LIBS spectra were collected for each soybean seed. A majority vote based on the three spectra was applied as the final decision on the variety of a single soybean seed. The results showed that the support vector machine (SVM) obtained the optimal identification accuracy of 90% in the prediction set. In addition, PCA-ResNet (propagation coefficient adaptive ResNet) and PCSA-ResNet (propagation coefficient synchronous adaptive ResNet) were designed based on the typical ResNet structure by changing the way the propagation coefficients self-adapt. Combined with a new form of input data called the spectral matrix, PCSA-ResNet obtained the optimal performance, with a discrimination accuracy of 91.75% in the prediction set. T-distributed stochastic neighbor embedding (t-SNE) was used to visualize the clustering process of the features extracted by PCSA-ResNet. To interpret the good performance of PCSA-ResNet coupled with the spectral matrix, saliency maps were further applied to visually show the pixel positions of the spectral matrix that had a significant influence on the discrimination results, indicating that the content and proportion of elements in soybean seeds could reflect the variety differences.
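The per-seed decision rule described above (a majority vote over the predictions for three spectra) can be sketched in a few lines. The abstract does not say how a three-way tie is resolved. This example assumes a fallback to the earliest prediction among the tied labels, and the variety labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among per-spectrum predictions.

    Assumption: on a tie, the tied label that appears first wins.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical classifier outputs for the three spectra of one seed.
seed_label = majority_vote(["V3", "V7", "V3"])
tie_label = majority_vote(["V1", "V2", "V3"])
print(seed_label, tie_label)
```

Voting over several spectra makes the seed-level decision less sensitive to a single noisy shot than any individual spectrum-level prediction.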
ABSTRACT
Millet is a primary food for people living in dry and semi-dry regions and is grown in most parts of Europe, Africa, and Asia. As part of the European Union (EU) efforts to establish food originality, there is a global need to create Protected Geographical Indication (PGI) and Protected Designation of Origin (PDO) labels for crops and agricultural products to ensure the integrity of the food supply. In the present work, Visible and Near-Infrared Spectroscopy (Vis-NIR) combined with machine learning techniques was used to discriminate 16 millet varieties (n = 480) originating from various regions of China. Five machine learning algorithms, namely, K-nearest neighbor (K-NN), Linear discriminant analysis (LDA), Logistic regression (LR), Random Forest (RF), and Support vector machine (SVM), were trained on the Vis-NIR spectra of these millet samples and assessed for their discrimination performance. Visible cluster trends were obtained from the Principal Component Analysis (PCA) of the spectral data. Cross-validation was used to optimize the performance of the models. Overall, the F-Score values were as follows: SVM with 99.5%, followed by RF with 99.5%, LDA with 99.5%, K-NN with 99.1%, and LR with 98.8%. Both the linear and non-linear algorithms yielded positive results, but the non-linear models appear slightly better. The study revealed that Vis-NIR spectroscopy assisted by machine learning can be an essential tool for tracing the origins of millet, contributing to a safe authentication method that is quick, relatively cheap, and non-destructive.
ABSTRACT
Accurate geographical origin identification is of great significance to ensure the quality of traditional Chinese medicine (TCM). Laser-induced breakdown spectroscopy (LIBS) was applied to achieve the fast geographical origin identification of wild Gentiana rigescens Franch (G. rigescens Franch). However, LIBS spectra with too many variables could increase the training time of models and reduce the discrimination accuracy. In order to solve the problems, we proposed two methods. One was reducing the number of variables through two consecutive variable selections. The other was transforming the spectrum into spectral matrix by spectrum segmentation and recombination. Combined with convolutional neural network (CNN), both methods could improve the accuracy of discrimination. For the underground parts of G. rigescens Franch, the optimal accuracy in the prediction set for the two methods was 92.19 and 94.01%, respectively. For the aerial parts, the two corresponding accuracies were the same with the value of 94.01%. Saliency map was used to explain the rationality of discriminant analysis by CNN combined with spectral matrix. The first method could provide some support for LIBS portable instrument development. The second method could offer some reference for the discriminant analysis of LIBS spectra with too many variables by the end-to-end learning of CNN. The present results demonstrated that LIBS combined with CNN was an effective tool to quickly identify the geographical origin of G. rigescens Franch.
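The second method, turning a long one-dimensional spectrum into a two-dimensional spectral matrix that a CNN can take as image-like input, can be sketched as below. The abstract only mentions "spectrum segmentation and recombination". The row-wise stacking of equal-length segments and the zero padding of the last row are assumptions made for illustration, and the function name is hypothetical.

```python
def spectrum_to_matrix(spectrum, n_cols, pad_value=0.0):
    """Cut a 1-D spectrum into segments of n_cols points and stack them as rows.

    Assumption: the last segment is padded with pad_value when the spectrum
    length is not a multiple of n_cols.
    """
    if n_cols <= 0:
        raise ValueError("n_cols must be positive")
    rows = []
    for start in range(0, len(spectrum), n_cols):
        segment = list(spectrum[start:start + n_cols])
        segment += [pad_value] * (n_cols - len(segment))
        rows.append(segment)
    return rows


# A hypothetical ten-point spectrum reshaped into rows of four points.
spectrum = [float(i) for i in range(1, 11)]
matrix = spectrum_to_matrix(spectrum, n_cols=4)
for row in matrix:
    print(row)
```

In this layout, neighboring rows place distant wavelength regions next to each other, so a two-dimensional convolution kernel can combine emission lines that would be far apart in the original one-dimensional spectrum.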