ABSTRACT
Whole human saliva is a biofluid that can be considered a "mirror" reflecting the body's state of health. The "spectral mid-infrared fingerprint" is a snapshot of the intrinsic biomolecular composition of a saliva sample. It conveys multiple kinds of information about the patient and is likely to be related not only to their physiopathological status but also to their behavioral habits and even to current medical treatments. These patient-related characteristics are "confounding factors," which may strongly affect the infrared data of salivary samples and disrupt the search for specific salivary biomarkers in disease detection, especially for complex pathologies influenced by multiple risk factors, such as genetic and behavioral factors, and by other comorbidities. In this study, which deals with the processing of infrared saliva spectra from 56 patients, our aim was to highlight spectral features associated with three patient characteristics: tobacco smoking, periodontal diseases, and gender. Using multivariate statistical methods of feature selection (principal component analysis coupled with the Kruskal-Wallis test, and linear discriminant analysis coupled with the randfeatures function), we were able to identify the discriminant vibrations associated with a specific factor and to assess the related spectral variability. Building on the methodology demonstrated here, it could be very valuable in the future to develop processing methods aimed at neutralizing these sources of variability, in order to determine specific spectroscopic markers of a multifactorial disease for diagnostic or follow-up purposes.
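As an illustration of the first feature-selection strategy named above (principal component analysis coupled with the Kruskal-Wallis test), the following minimal sketch may help. It is not the authors' implementation: the data are synthetic, the helper names `pca_scores` and `discriminant_components`, the 0.05 threshold, and the position of the simulated discriminant band are all assumptions made for the example, and the linear discriminant analysis step is not covered.

```python
import numpy as np
from scipy.stats import kruskal


def pca_scores(X, n_components):
    """PCA of mean-centered spectra via SVD; returns (scores, loadings)."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components]
    return scores, loadings


def discriminant_components(scores, labels, alpha=0.05):
    """Kruskal-Wallis test on each component's scores across the groups.

    Returns the indices of components with p < alpha and all p-values.
    """
    groups = np.unique(labels)
    pvals = np.array([
        kruskal(*[scores[labels == g, k] for g in groups]).pvalue
        for k in range(scores.shape[1])
    ])
    return np.flatnonzero(pvals < alpha), pvals


# Synthetic stand-in for 56 spectra (hypothetical data, not from the study).
rng = np.random.default_rng(0)
n_samples, n_wavenumbers = 56, 200
labels = np.array([0] * 28 + [1] * 28)      # e.g. two groups for one factor
X = rng.normal(0.0, 0.1, size=(n_samples, n_wavenumbers))
X[labels == 1, 80:90] += 1.0                # assumed discriminant band

scores, loadings = pca_scores(X, n_components=5)
selected, pvals = discriminant_components(scores, labels)

# Spectral variables weighing most on the most discriminant component.
best_pc = int(np.argmin(pvals))
top_features = np.argsort(np.abs(loadings[best_pc]))[::-1][:10]
```

In this sketch, the components whose scores differ significantly between groups are retained, and the variables with the largest absolute loadings on the most discriminant component are read as candidate discriminant vibrations for the factor under study. With real spectra, the column indices would map to wavenumbers, and correction for multiple testing would likely be needed.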