Prediction of Multidimensional Poverty Status With Machine Learning Classification at Household Level: Empirical Evidence From Tanzania

Over fifty percent of the population in Tanzania suffers from multidimensional poverty. Because of the high poverty rate and slow improvement, ending poverty by the year 2030 remains a challenging, empirically testable proposition and part of a shared challenge. The main purpose of this study is to...

Bibliographic Details
Main Authors: Ngong'Ho Bujiku Sende, Snehanshu Saha, Leon Ruganzu, Saibal Kar
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Multidimensional poverty; machine learning; zones
Online Access: https://ieeexplore.ieee.org/document/10869458/
collection DOAJ
description Over fifty percent of the population in Tanzania suffers from multidimensional poverty. Because of the high poverty rate and slow improvement, ending poverty by the year 2030 remains a challenging, empirically testable proposition and part of a shared challenge. The main purpose of this study is to predict multidimensional poverty status with the help of the best-performing supervised machine-learning algorithms. To achieve this objective, longitudinal data from the 2014/15 and 2020/21 surveys, sourced from the Tanzania National Bureau of Statistics (NBS), were analyzed. A variety of supervised machine-learning algorithms, such as RBF Kernel in SVM, Linear Kernel in SVM, Polynomial Kernel in SVM, Random Forest, Logistic Regression Classifier, Decision Tree, Gradient Boosting, K-Nearest Neighbours Classifier, Naïve Bayes Classifier, Artificial Neural Network and Ensemble Learning Model, were implemented to predict multidimensional poverty status for each dataset. This captured the dynamic changes from 2014 to 2021 at the national level. The study employed data pre-processing techniques and adjusted for class imbalance through weighted categorical cross-entropy and 5-fold cross-validation to mitigate the inefficiencies due to overfitting and underfitting of the algorithms. Additionally, dimensionality reduction was performed using Principal Component Analysis (PCA). With regard to the evaluation metrics, we show that the Ensemble Learning Model achieved the best performance in modelling both the balanced and unbalanced datasets. Our policy recommendations draw on the results of the algorithmic predictions.
id doaj-art-41ffffd431054b7fa404f922f4725edd
institution Kabale University
issn 2169-3536
spelling
Record ID: doaj-art-41ffffd431054b7fa404f922f4725edd
Record timestamp: 2025-02-11T00:01:19Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Publication date: 2025-01-01
Volume: 13
Pages: 23461-23471
DOI: 10.1109/ACCESS.2025.3537807
IEEE document number: 10869458
Title: Prediction of Multidimensional Poverty Status With Machine Learning Classification at Household Level: Empirical Evidence From Tanzania
Authors and affiliations:
Ngong'Ho Bujiku Sende (https://orcid.org/0009-0001-4185-3857), African Centre of Excellence in Data Science, University of Rwanda, Kigali, Rwanda
Snehanshu Saha (https://orcid.org/0000-0002-8458-604X), CSIS and APPCAIR, Birla Institute of Technology and Science Pilani Goa Campus, Zuarinagar, Goa, India
Leon Ruganzu, African Centre of Excellence in Data Science, University of Rwanda, Kigali, Rwanda
Saibal Kar (https://orcid.org/0000-0001-8134-1517), Centre for Studies in Social Sciences, Calcutta, India, and IZA, Bonn, Germany
URL: https://ieeexplore.ieee.org/document/10869458/
Keywords: Multidimensional poverty; machine learning; zones
topic Multidimensional poverty
machine learning
zones
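
The abstract in this record mentions adjusting for class imbalance through weighted categorical cross-entropy and evaluating with 5-fold cross-validation. The following is a minimal sketch of those two ideas, not the authors' implementation. Everything in it is an assumption made for illustration: the helper names, the inverse-frequency weighting rule w_c = N / (K * n_c), the normalization of the loss by the sum of the weights, the contiguous unshuffled folds, and the toy labels and probabilities.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights, w_c = N / (K * n_c) (assumed rule)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * n_c) for c, n_c in counts.items()}


def weighted_cross_entropy(labels, probs, weights):
    """Class-weighted average of -log p(true class).

    Dividing by the sum of the weights is one common convention;
    the paper's exact normalization is not stated in this record.
    """
    total = sum(weights[y] * -math.log(p[y]) for y, p in zip(labels, probs))
    return total / sum(weights[y] for y in labels)


def k_fold_indices(n_samples, k=5):
    """Yield (train, validation) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


# Toy labels (made up): 1 = multidimensionally poor, 0 = not poor.
labels = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
weights = balanced_class_weights(labels)

# Hypothetical predicted class probabilities from some classifier.
probs = [{0: 0.2, 1: 0.8}] * 8 + [{0: 0.4, 1: 0.6}] * 2
loss = weighted_cross_entropy(labels, probs, weights)

# Five validation folds of two samples each for the ten toy rows.
folds = list(k_fold_indices(len(labels), k=5))
```

With these weights the two minority-class rows contribute as much to the loss as the eight majority-class rows, which is the intent of class weighting. In practice the folds would usually be shuffled or stratified by class so that each validation fold contains both labels.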