Stable isotope and trace element analyses with non-linear machine-learning data analysis improved coffee origin classification and marker selection.
Sim, Joy; Mcgoverin, Cushla; Oey, Indrawati; Frew, Russell; Kebede, Biniam.
Affiliation
  • Sim J; Department of Food Science, University of Otago, Dunedin, New Zealand.
  • Mcgoverin C; Department of Physics, University of Auckland, Auckland, New Zealand.
  • Oey I; The Dodd-Walls Centre for Photonic and Quantum Technologies, Dunedin, New Zealand.
  • Frew R; Department of Food Science, University of Otago, Dunedin, New Zealand.
  • Kebede B; The Riddet Institute, Palmerston North, New Zealand.
J Sci Food Agric ; 103(9): 4704-4718, 2023 Jul.
Article in En | MEDLINE | ID: mdl-36924039
BACKGROUND: This study investigated the geographical origin classification of green coffee beans from continental to country and regional levels. An innovative approach combined stable isotope and trace element analyses with non-linear machine learning data analysis to improve coffee origin classification and marker selection. Specialty green coffee beans sourced from three continents, eight countries, and 22 regions were analyzed by measuring five isotope ratios (δ13C, δ15N, δ18O, δ2H, and δ34S) and 41 trace elements. Partial least squares discriminant analysis (PLS-DA) was applied to the integrated dataset for origin classification.

RESULTS: Origins were predicted well at the country level and showed promise at the regional level, with discriminating marker selection at all levels. However, PLS-DA predicted origin poorly at the continental and Central American regional levels. Non-linear machine learning techniques improved predictions and enabled the identification of a higher number of origin markers, and those that were identified were more relevant. The best predictive accuracy was found using ensemble decision trees, random forest and extreme gradient boost, with accuracies of up to 0.94 and 0.89 for continental and Central American regional models, respectively.

CONCLUSION: The potential for advanced machine learning models to improve origin classification and the identification of relevant origin markers was demonstrated. The decision-tree-based models were superior with their embedded variable identification features and visual interpretation.

© 2023 The Authors. Journal of The Science of Food and Agriculture published by John Wiley & Sons Ltd on behalf of Society of Chemical Industry.
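The embedded variable identification attributed to decision-tree-based models can be illustrated with a minimal sketch. It is not the authors' pipeline: the measurements, the marker names (`d18O`, `Sr`), the class labels, and the helper functions `gini` and `best_split` are all hypothetical and chosen only for illustration. The sketch shows a single Gini-impurity split, the building block that random forest and gradient-boosted trees repeat many times, and how the split itself points to the variable that best separates two origins.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature_names):
    """Return (feature, threshold, impurity_decrease) for the best single split."""
    parent = gini(labels)
    n = len(labels)
    best = (None, None, 0.0)
    for j, name in enumerate(feature_names):
        values = sorted(set(row[j] for row in rows))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            # Size-weighted impurity of the two child groups.
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if gain > best[2]:
                best = (name, t, gain)
    return best


# Hypothetical, made-up measurements (NOT data from the study).
feature_names = ["d18O", "Sr"]
rows = [
    (20.0, 1.0), (21.0, 5.0), (22.0, 2.0),  # samples labelled "origin_A"
    (27.0, 4.0), (28.0, 1.5), (29.0, 3.0),  # samples labelled "origin_B"
]
labels = ["origin_A"] * 3 + ["origin_B"] * 3

feature, threshold, gain = best_split(rows, labels, feature_names)
print(feature, threshold, round(gain, 3))
```

In this toy dataset the first variable separates the two groups cleanly while the second does not, so the split selects the first. Tree ensembles aggregate such impurity decreases over many splits and trees to rank candidate markers.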

Full text: 1 Database: MEDLINE Main subject: Machine Learning Type of study: Prognostic_studies Language: En Journal: J Sci Food Agric Year: 2023 Type: Article Affiliation country: New Zealand