ABSTRACT
Wastewater-based epidemiology has emerged as a viable tool for monitoring disease prevalence in a population. This paper details a time series machine learning (TSML) method for predicting COVID-19 cases from wastewater and environmental variables. The TSML method utilizes a number of techniques to create an interpretable, hypothesis-driven framework for machine learning that can handle different nowcast and forecast lengths. Some of the techniques employed include:
• Feature engineering to construct interpretable features, like site-specific lead times, hypothesized to be potential predictors of COVID-19 cases.
• Feature selection to identify features with the best predictive performance for the tasks of nowcasting and forecasting.
• Prequential evaluation to prevent data leakage while evaluating the performance of the machine learning algorithm.
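
To illustrate the prequential (test-then-train) idea named above, the following is a minimal sketch and not the paper's implementation. The function name `prequential_mae`, the ordinary least-squares model, and the `min_train` and `horizon` parameters are assumptions introduced only for this example. At each time step the model is fit solely on rows whose targets would already have been observed, and is then scored on the next unseen row, so no future information leaks into training.

```python
import numpy as np


def prequential_mae(X, y, min_train=10, horizon=0):
    """Prequential mean absolute error on time-ordered data.

    X: (n, p) feature matrix, rows in chronological order.
    y: (n,) targets aligned with the rows of X.
    min_train: hypothetical minimum number of rows used for the first fit.
    horizon: hypothetical gap (in rows) between the last training row and
        the scored row, for targets that only become known later.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = []
    for t in range(min_train + horizon, len(y)):
        train_end = t - horizon  # only rows [0, train_end) are used for fitting
        design = np.column_stack([np.ones(train_end), X[:train_end]])
        coef, *_ = np.linalg.lstsq(design, y[:train_end], rcond=None)
        pred = np.concatenate([[1.0], X[t]]) @ coef  # predict the unseen row t
        errors.append(abs(pred - y[t]))
    return float(np.mean(errors))


if __name__ == "__main__":
    # Synthetic stand-in data (not from the paper): one feature plus noise.
    rng = np.random.default_rng(0)
    signal = rng.normal(size=(60, 1))
    cases = 3.0 * signal[:, 0] + rng.normal(scale=0.1, size=60)
    print(prequential_mae(signal, cases, min_train=20, horizon=7))
```

Setting `horizon` to zero corresponds to scoring each row against a model trained on all earlier rows; a larger value mimics a forecast setting in which recent targets are not yet available at prediction time.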