ABSTRACT
Air pollution epidemiological studies have used a wide range of methods to develop concentration fields for health analyses, and the resulting fields differ considerably among methods. The reasons for these differences, and the relative strengths and limitations of each method for estimating exposures, remain under-investigated. Here, we applied nine methods to develop fields of eight pollutants (carbon monoxide (CO), nitrogen dioxide (NO2), sulfur dioxide (SO2), ozone (O3), fine particulate matter (PM2.5), and three speciated PM2.5 constituents: elemental carbon (EC), organic carbon (OC), and sulfate (SO4)) for the metropolitan Atlanta region over five years. The nine methods are Central Monitor (CM), Site Average (SA), Inverse Distance Weighting (IDW), Kriging (KRIG), Land Use Regression (LUR), satellite Aerosol Optical Depth (AOD), the CMAQ model, CMAQ with kriging adjustment (CMAQ-KRIG), and CMAQ-based data fusion (CMAQ-DF). Additionally, we applied an increasingly popular method, Random Forest (RF), and compared its results for NO2 and PM2.5 with those of the other methods. For statistical evaluation, we focused our discussion on the temporal coefficient of determination, although other metrics were also calculated. Raw output from the CMAQ model contains modeling biases and errors, which are partially mitigated by fusing observational data in the CMAQ-KRIG and CMAQ-DF methods. In analyses that simulated model responses to more limited input data, the RF model was more robust than LUR and outperformed it for PM2.5. These results suggest that RF may have potential in air pollution health studies, especially when limited measurement data are available. However, the RF method also has several important weaknesses, including relatively poor performance for NO2, diagnostic challenges, and high computational cost.
The results of this study will help to improve our understanding of the strengths and weaknesses of different methods for estimating air pollutant exposures in epidemiological studies.