Prediction of Multidimensional Poverty Status With Machine Learning Classification at Household Level: Empirical Evidence From Tanzania
Over fifty percent of the population in Tanzania suffers from multidimensional poverty. Because of the high poverty rate and slow improvement, ending poverty by the year 2030 remains a challenging, empirically testable proposition and part of a shared challenge. The main purpose of this study is to...
Main Authors: | Ngong'Ho Bujiku Sende, Snehanshu Saha, Leon Ruganzu, Saibal Kar |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Access |
Subjects: | Multidimensional poverty; machine learning; zones |
Online Access: | https://ieeexplore.ieee.org/document/10869458/ |
_version_ | 1823859603892338688 |
---|---|
author | Ngong'Ho Bujiku Sende, Snehanshu Saha, Leon Ruganzu, Saibal Kar |
author_sort | Ngong'Ho Bujiku Sende |
collection | DOAJ |
description | Over fifty percent of the population in Tanzania suffers from multidimensional poverty. Because of the high poverty rate and slow improvement, ending poverty by the year 2030 remains a challenging, empirically testable proposition and part of a shared challenge. The main purpose of this study is to predict multidimensional poverty status with the best-performing supervised machine-learning algorithms. To achieve this objective, longitudinal data from the 2014/15 and 2020/21 surveys, sourced from the Tanzania National Bureau of Statistics (NBS), were analyzed. A variety of supervised machine-learning algorithms, namely SVM with RBF, linear, and polynomial kernels, Random Forest, Logistic Regression, Decision Tree, Gradient Boosting, K-Nearest Neighbours, Naïve Bayes, Artificial Neural Network, and an Ensemble Learning Model, were implemented to predict multidimensional poverty status for each dataset. This captured the dynamic changes from 2014 to 2021 at the national level. The study employed data pre-processing techniques, adjusted for class imbalance through weighted categorical cross-entropy, and used 5-fold cross-validation to mitigate the inefficiencies due to overfitting and underfitting of the algorithms. Additionally, dimensionality reduction was performed using Principal Component Analysis (PCA). With regard to the evaluation metrics, we show that the Ensemble Learning Model achieved the best performance in modelling both balanced and unbalanced datasets. Our policy recommendations draw on the results of the algorithmic predictions. |
format | Article |
id | doaj-art-41ffffd431054b7fa404f922f4725edd |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-41ffffd431054b7fa404f922f4725edd; 2025-02-11T00:01:19Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 23461-23471; DOI 10.1109/ACCESS.2025.3537807; IEEE Xplore document 10869458; Prediction of Multidimensional Poverty Status With Machine Learning Classification at Household Level: Empirical Evidence From Tanzania; Ngong'Ho Bujiku Sende (https://orcid.org/0009-0001-4185-3857; African Centre of Excellence in Data Science, University of Rwanda, Kigali, Rwanda); Snehanshu Saha (https://orcid.org/0000-0002-8458-604X; CSIS and APPCAIR, Birla Institute of Technology and Science Pilani Goa Campus, Zuarinagar, Goa, India); Leon Ruganzu (African Centre of Excellence in Data Science, University of Rwanda, Kigali, Rwanda); Saibal Kar (https://orcid.org/0000-0001-8134-1517; Centre for Studies in Social Sciences, Calcutta, India, and IZA, Bonn, Germany); https://ieeexplore.ieee.org/document/10869458/; Multidimensional poverty; machine learning; zones |
title | Prediction of Multidimensional Poverty Status With Machine Learning Classification at Household Level: Empirical Evidence From Tanzania |
topic | Multidimensional poverty; machine learning; zones |
url | https://ieeexplore.ieee.org/document/10869458/ |
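
The description above mentions adjusting for class imbalance through weighted categorical cross-entropy and evaluating with 5-fold cross-validation. The record gives no implementation details, so the following is only a minimal sketch of those two ideas in plain Python. The inverse-frequency weighting formula, the stratified fold assignment, the function names, and the toy labels are all assumptions made for illustration, not the authors' method or data.

```python
import math
import random
from collections import Counter, defaultdict


def balanced_class_weights(labels):
    """Assumed scheme: weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(labels, probs, weights):
    """Mean of -w[y] * log p(y); probs[i] maps each class to its predicted probability."""
    eps = 1e-12  # guards against log(0)
    total = sum(-weights[y] * math.log(max(p[y], eps)) for y, p in zip(labels, probs))
    return total / len(labels)


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with class proportions kept roughly equal per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Toy labels (hypothetical): 1 = multidimensionally poor, 0 = non-poor.
labels = [1] * 30 + [0] * 70
weights = balanced_class_weights(labels)
splits = list(stratified_kfold(labels, k=5))
```

With these weights, a misclassified household from the minority class contributes more to the loss than one from the majority class, which is the usual motivation for weighting the cross-entropy on unbalanced data. Each of the five folds serves once as the held-out set while the remaining four are used for fitting.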