The value of human data annotation for machine learning based anomaly detection in environmental systems.
Russo, Stefania; Besmer, Michael D; Blumensaat, Frank; Bouffard, Damien; Disch, Andy; Hammes, Frederik; Hess, Angelika; Lürig, Moritz; Matthews, Blake; Minaudo, Camille; Morgenroth, Eberhard; Tran-Khac, Viet; Villez, Kris.
Affiliations
  • Russo S; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; ETH Zürich, Ecovision Lab, Photogrammetry and Remote Sensing, Zürich, Switzerland. Electronic address: stefania.russo@geod.baug.ethz.ch.
  • Besmer MD; onCyt Microbiology AG, Zürich, Switzerland.
  • Blumensaat F; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; ETH Zürich, Institute of Environmental Engineering, 8093 Zürich, Switzerland.
  • Bouffard D; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland.
  • Disch A; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland.
  • Hammes F; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland.
  • Hess A; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; ETH Zürich, Institute of Environmental Engineering, 8093 Zürich, Switzerland.
  • Lürig M; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; Eawag, Department of Fish Ecology & Evolution, Centre for Ecology Evolution and Biogeochemistry, 79 Seestrasse, 6047, Luzern; Department of Biology, Lund University, 22362 Lund, Sweden.
  • Matthews B; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; Eawag, Department of Fish Ecology & Evolution, Centre for Ecology Evolution and Biogeochemistry, 79 Seestrasse, 6047, Luzern.
  • Minaudo C; École Polytechnique Fédérale de Lausanne, Physics of Aquatic Systems Laboratory, Margaretha Kamprad Chair, Lausanne, Switzerland.
  • Morgenroth E; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; ETH Zürich, Institute of Environmental Engineering, 8093 Zürich, Switzerland.
  • Tran-Khac V; INRAE, Université Savoie Mont Blanc, CARRTEL, 74200 Thonon-les-Bains, France.
  • Villez K; Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.
Water Res; 206: 117695, 2021 Nov 01.
Article in English | MEDLINE | ID: mdl-34626884
ABSTRACT
Anomaly detection is the process of identifying unexpected data samples in datasets. Automated anomaly detection is performed using either supervised machine learning models, which require a labelled dataset for their calibration, or unsupervised models, which do not require labels. While academic research has produced a vast array of tools and machine learning models for automated anomaly detection, the research community focused on environmental systems still lacks a comparative analysis that is simultaneously comprehensive, objective, and systematic. This knowledge gap is addressed for the first time in this study, where 15 different supervised and unsupervised anomaly detection models are evaluated on 5 different environmental datasets from engineered and natural aquatic systems. To this end, anomaly detection performance, labelling efforts, as well as the impact of model and algorithm tuning are taken into account. As a result, our analysis reveals the relative strengths and weaknesses of the different approaches in an objective manner without bias towards any particular paradigm in machine learning. Most importantly, our results show that expert-based data annotation is extremely valuable for anomaly detection based on machine learning.

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Data Curation / Machine Learning | Study type: Diagnostic_studies | Limit: Humans | Language: En | Journal: Water Res | Publication year: 2021 | Document type: Article
