Machine Learning Algorithm for Estimating Surface PM2.5 in Thailand
Abstract We fed NASA’s Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA2) reanalysis data of aerosols and meteorology into a machine learning algorithm (MLA) to estimate surface PM2.5 concentration in Thailand. One year of hourly data from 51 ground monitoring...
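Main Authors: | Pawan Gupta, Shanshan Zhan, Vikalp Mishra, Aekkapol Aekakkararungroj, Amanda Markert, Sarawut Paibong, Farrukh Chishtie |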
---|---|
Format: | Article |
Language: | English |
Published: | Springer 2021-09-01 |
Series: | Aerosol and Air Quality Research |
Subjects: | Thailand, MERRA2, PM2.5, Air quality, Machine learning |
Online Access: | https://doi.org/10.4209/aaqr.210105 |
_version_ | 1823862908712386560 |
---|---|
author | Pawan Gupta, Shanshan Zhan, Vikalp Mishra, Aekkapol Aekakkararungroj, Amanda Markert, Sarawut Paibong, Farrukh Chishtie |
author_facet | Pawan Gupta, Shanshan Zhan, Vikalp Mishra, Aekkapol Aekakkararungroj, Amanda Markert, Sarawut Paibong, Farrukh Chishtie |
author_sort | Pawan Gupta |
collection | DOAJ |
description | Abstract We fed NASA’s Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA2) reanalysis data of aerosols and meteorology into a machine learning algorithm (MLA) to estimate surface PM2.5 concentration in Thailand. One year of hourly data from 51 ground monitoring stations in Thailand was spatiotemporally collocated with MERRA2 fields. The integrated data were then used to train and validate a supervised MLA, a random forest, to estimate hourly and daily PM2.5 concentrations. The MLA is cross-validated using a 10-fold random sampling approach. The trained MLA can estimate PM2.5 with close to zero mean bias across the country. A correlation coefficient of 0.95, with slope and intercept values of 0.95 and 0.88, is achieved between observed and estimated PM2.5. The MLA also shows underestimation at the hourly scale under very clean conditions (PM2.5 < 10 µg m−3) and overestimation during high loading (PM2.5 > 80 µg m−3). The hourly data also demonstrate high skill in following the diurnal cycle during different seasons of the year. The daily mean PM2.5 (24-hour) values follow day-to-day variability very well (correlation coefficient of 0.98, RMSE = 3.14 µg m−3), showing high values during the winter months (November–February) and lower values during other seasons. The trained MLA has the potential to reprocess the MERRA2 time series for the region, and the bias-corrected data can be used in other applications such as long-term trend analysis and health exposure studies. The MLA can also be applied to GEOS forecast fields to generate bias-corrected air quality forecasts for the region. |
format | Article |
id | doaj-art-917cecd443d2486f8dd92c4fccc907a5 |
institution | Kabale University |
issn | 1680-8584 2071-1409 |
language | English |
publishDate | 2021-09-01 |
publisher | Springer |
record_format | Article |
series | Aerosol and Air Quality Research |
spelling | doaj-art-917cecd443d2486f8dd92c4fccc907a5; 2025-02-09T12:20:38Z; eng; Springer; Aerosol and Air Quality Research; 1680-8584; 2071-1409; 2021-09-01; 21; 11; 1; 13; 10.4209/aaqr.210105; Machine Learning Algorithm for Estimating Surface PM2.5 in Thailand; Pawan Gupta (Universities Space Research Association (USRA)); Shanshan Zhan (Earth System Science Center, The University of Alabama in Huntsville); Vikalp Mishra (Earth System Science Center, The University of Alabama in Huntsville); Aekkapol Aekakkararungroj (Asian Disaster Preparedness Center); Amanda Markert (Earth System Science Center, The University of Alabama in Huntsville); Sarawut Paibong (Thai Pollution Control Department); Farrukh Chishtie (Asian Disaster Preparedness Center); https://doi.org/10.4209/aaqr.210105; Thailand; MERRA2; PM2.5; Air quality; Machine learning |
spellingShingle | Pawan Gupta Shanshan Zhan Vikalp Mishra Aekkapol Aekakkararungroj Amanda Markert Sarawut Paibong Farrukh Chishtie Machine Learning Algorithm for Estimating Surface PM2.5 in Thailand Aerosol and Air Quality Research Thailand MERRA2 PM2.5 Air quality Machine learning |
title | Machine Learning Algorithm for Estimating Surface PM2.5 in Thailand |
title_full | Machine Learning Algorithm for Estimating Surface PM2.5 in Thailand |
title_fullStr | Machine Learning Algorithm for Estimating Surface PM2.5 in Thailand |
title_full_unstemmed | Machine Learning Algorithm for Estimating Surface PM2.5 in Thailand |
title_short | Machine Learning Algorithm for Estimating Surface PM2.5 in Thailand |
title_sort | machine learning algorithm for estimating surface pm2 5 in thailand |
topic | Thailand, MERRA2, PM2.5, Air quality, Machine learning |
url | https://doi.org/10.4209/aaqr.210105 |
work_keys_str_mv | AT pawangupta machinelearningalgorithmforestimatingsurfacepm25inthailand AT shanshanzhan machinelearningalgorithmforestimatingsurfacepm25inthailand AT vikalpmishra machinelearningalgorithmforestimatingsurfacepm25inthailand AT aekkapolaekakkararungroj machinelearningalgorithmforestimatingsurfacepm25inthailand AT amandamarkert machinelearningalgorithmforestimatingsurfacepm25inthailand AT sarawutpaibong machinelearningalgorithmforestimatingsurfacepm25inthailand AT farrukhchishtie machinelearningalgorithmforestimatingsurfacepm25inthailand |
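
The validation statistics quoted in the abstract (mean bias, correlation coefficient, slope, intercept, RMSE) and the 10-fold random sampling can be illustrated with a minimal Python sketch. This is not the authors' code: the function names kfold_indices and evaluation_stats are hypothetical, the sample values are made up, and regressing estimated on observed PM2.5 is an assumption, since the record does not say which variable was treated as the predictor.

```python
import math
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def evaluation_stats(observed, estimated):
    """Mean bias, RMSE, Pearson r, and least-squares slope and intercept.

    Assumption: the linear fit regresses estimated on observed values.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    pairs = list(zip(observed, estimated))
    s_oo = sum((o - mean_obs) ** 2 for o in observed)
    s_ee = sum((e - mean_est) ** 2 for e in estimated)
    s_oe = sum((o - mean_obs) * (e - mean_est) for o, e in pairs)
    slope = s_oe / s_oo
    return {
        "mean_bias": mean_est - mean_obs,
        "rmse": math.sqrt(sum((e - o) ** 2 for o, e in pairs) / n),
        "r": s_oe / math.sqrt(s_oo * s_ee),
        "slope": slope,
        "intercept": mean_est - slope * mean_obs,
    }


# Made-up hourly PM2.5 values (µg m^-3), for illustration only.
observed = [8.0, 15.0, 22.0, 40.0, 85.0]
estimated = [10.0, 14.0, 23.0, 39.0, 80.0]

stats = evaluation_stats(observed, estimated)
folds = kfold_indices(len(observed), k=5)  # the abstract reports k = 10
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In a cross-validation run, each fold would be held out in turn, the model trained on the remaining folds, and the statistics computed on the held-out predictions. The sketch keeps to the standard library so the bookkeeping stays visible.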