Evaluating statistical model performance in water quality prediction.
Avila, Rodelyn; Horn, Beverley; Moriarty, Elaine; Hodson, Roger; Moltchanova, Elena.
Affiliation
  • Avila R; School of Mathematics and Statistics, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand; Institute of Environmental Science and Research, ESR, PO Box 29181, Christchurch 8540, New Zealand. Electronic address: rodelyn.avila@pg.canterbury.ac.nz.
  • Horn B; Institute of Environmental Science and Research, ESR, PO Box 29181, Christchurch 8540, New Zealand.
  • Moriarty E; Institute of Environmental Science and Research, ESR, PO Box 29181, Christchurch 8540, New Zealand.
  • Hodson R; Environment Southland, Private Bag 90116, Invercargill 9840, New Zealand.
  • Moltchanova E; School of Mathematics and Statistics, University of Canterbury, Private Bag 4800, Christchurch 8140, New Zealand.
J Environ Manage ; 206: 910-919, 2018 Jan 15.
Article in English | MEDLINE | ID: mdl-29207304
Exposure to contaminated water while swimming, boating or participating in other recreational activities can cause gastrointestinal and respiratory disease. It is not uncommon for water bodies to experience rapid fluctuations in water quality, and it is therefore vital to be able to predict them accurately and in time so as to minimise the population's exposure to pathogenic organisms. E. coli is commonly used as an indicator of water quality in freshwater, and higher counts of E. coli are associated with an increased risk of illness. In this case study, we compare the performance of a wide range of statistical models in predicting water quality via E. coli levels, using weekly data collected over the summer months from 2006 to 2014 at the recreational site on the Oreti River in Wallacetown, New Zealand. The models include a naive model, multiple linear regression, dynamic regression, a regression tree, a Markov chain, a classification tree, random forests, multinomial logistic regression, discriminant analysis and a Bayesian network. The results show that the Bayesian network was superior to all the other models. Overall, it had a leave-one-out and k-fold cross-validation error rate of 21%, while predicting the majority of instances of E. coli levels classified as unsafe by the Microbiological Water Quality Guidelines for Marine and Freshwater Recreational Areas 2003, New Zealand. Because Bayesian networks are also flexible in handling missing data and outliers and allow for continuous updating in real time, we have found them to be a promising tool and plan, in the future, to extend the analysis beyond the current case study site.
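As a rough illustration of what a leave-one-out or k-fold cross-validation error rate measures, the following minimal Python sketch computes both for a stand-in classifier. Everything in it is an assumption made for illustration: the majority-class predictor is not the study's naive model or Bayesian network, the interleaved fold assignment is only one possible scheme, and the labels are made-up toy values, not the Oreti River data.

```python
from collections import Counter


def majority_class(train_labels):
    """Return the most frequent label in the training fold (stand-in classifier)."""
    return Counter(train_labels).most_common(1)[0][0]


def kfold_error_rate(labels, k):
    """Misclassification rate under k-fold cross-validation.

    Setting k equal to len(labels) gives leave-one-out cross-validation.
    """
    n = len(labels)
    errors = 0
    for fold in range(k):
        # Interleaved assignment: every k-th observation is held out in this fold.
        test_idx = set(range(fold, n, k))
        train = [y for i, y in enumerate(labels) if i not in test_idx]
        prediction = majority_class(train)
        errors += sum(1 for i in test_idx if labels[i] != prediction)
    return errors / n


# Made-up weekly water-quality classes, for illustration only.
toy_labels = ["safe"] * 7 + ["unsafe"] * 3

loo_error = kfold_error_rate(toy_labels, k=len(toy_labels))
five_fold_error = kfold_error_rate(toy_labels, k=5)
print(f"leave-one-out error: {loo_error:.2f}, 5-fold error: {five_fold_error:.2f}")
```

A single overall error rate can hide how well the unsafe class is detected, which is why the abstract also reports that most unsafe instances were predicted; in this toy case the stand-in classifier misses every "unsafe" week.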

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Water Quality / Models, Statistical / Escherichia coli Type of study: Prognostic_studies / Risk_factors_studies Country/Region as subject: Oceania Language: English Journal: J Environ Manage Year: 2018 Document type: Article Country of publication: United Kingdom