Structural Analysis and Classification of Low-Molecular-Weight Hyaluronic Acid by Near-Infrared Spectroscopy: A Comparison between Traditional Machine Learning and Deep Learning.
Molecules; 28(2), 2023 Jan 13.
Article in English | MEDLINE | ID: mdl-36677867
Confusing low-molecular-weight hyaluronic acid (LMWHA) produced by acid degradation with LMWHA produced by enzymatic hydrolysis (named LMWHA-A and LMWHA-E, respectively) can create health hazards and commercial risks. This work analyzes the structural differences between LMWHA-A and LMWHA-E and then classifies the two quickly and accurately using near-infrared (NIR) spectroscopy and machine learning. First, we combined nuclear magnetic resonance (NMR), Fourier transform infrared (FTIR) spectroscopy, two-dimensional correlated NIR spectroscopy (2DCOS), and aquaphotomics to analyze the structural differences between LMWHA-A and LMWHA-E. Second, we compared three dimensionality reduction methods: principal component analysis (PCA), kernel PCA (KPCA), and t-distributed stochastic neighbor embedding (t-SNE). Finally, we compared the classification performance of traditional machine learning methods, namely partial least squares-discriminant analysis (PLS-DA), support vector classification (SVC), and random forest (RF), with that of deep learning methods, namely a one-dimensional convolutional neural network (1D-CNN) and long short-term memory (LSTM). Genetic algorithm (GA)-SVC and RF performed best among the traditional machine learning methods, but their highest accuracy on the test dataset was 90%, whereas the 1D-CNN and LSTM models reached 100% accuracy on both the training and test datasets. These results show that the deep learning models classified LMWHA-A and LMWHA-E better than traditional machine learning did. Our research provides a new methodological reference for the rapid and accurate classification of biological macromolecules.
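The abstract does not give implementation details, so the following is only a minimal sketch of the PCA step it mentions, not the authors' pipeline. It assumes synthetic two-class "spectra" (placeholder data, not NIR measurements of LMWHA), and the function names `make_spectra` and `first_pc_scores` are hypothetical. To stay dependency-free it finds the first principal component by power iteration; in practice a library PCA would normally be used instead.

```python
import math
import random


def make_spectra(n_per_class, n_points, seed=0):
    """Generate synthetic two-class spectra (placeholder data, an assumption)."""
    rng = random.Random(seed)
    spectra, labels = [], []
    for label, center in ((0, 0.35), (1, 0.65)):
        for _ in range(n_per_class):
            row = []
            for j in range(n_points):
                x = j / (n_points - 1)
                # One Gaussian band whose position depends on the class, plus noise.
                band = math.exp(-((x - center) ** 2) / 0.01)
                row.append(band + rng.gauss(0.0, 0.05))
            spectra.append(row)
            labels.append(label)
    return spectra, labels


def first_pc_scores(spectra, n_iter=100, seed=1):
    """Return the scores on the first principal component (PC1).

    PC1 is estimated by power iteration on the mean-centered data,
    repeatedly applying v <- X^T (X v) and renormalizing.
    """
    n, p = len(spectra), len(spectra[0])
    # Mean-center each variable (column), as PCA requires.
    means = [sum(row[j] for row in spectra) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in spectra]

    # Random unit start vector, so it is not orthogonal to PC1 by construction.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(n_iter):
        xv = [sum(r[j] * v[j] for j in range(p)) for r in centered]
        w = [sum(centered[i][j] * xv[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Project every centered spectrum onto the converged direction.
    return [sum(r[j] * v[j] for j in range(p)) for r in centered]


if __name__ == "__main__":
    spectra, labels = make_spectra(n_per_class=20, n_points=50)
    scores = first_pc_scores(spectra)
    for label in (0, 1):
        group = [s for s, l in zip(scores, labels) if l == label]
        print(f"class {label}: PC1 mean score = {sum(group) / len(group):.3f}")
```

On this toy data the two classes fall on opposite sides of zero along PC1; the sign of the scores is arbitrary. Real NIR spectra would typically need preprocessing first, and whether a linear projection separates the classes is exactly what a comparison with KPCA and t-SNE is meant to probe.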
Database:
MEDLINE
Main subject:
Deep Learning
Language:
English
Journal:
Molecules
Journal subject:
Biology
Year:
2023
Document type:
Article
Affiliation country:
China
Country of publication:
Switzerland