Assessing the diagnostic accuracy of machine learning algorithms for identification of asthma in United States adults based on NHANES dataset

Abstract Asthma diagnosis poses challenges due to underreporting of symptoms, misdiagnoses, and limitations in existing diagnostic tests. Machine learning (ML) offers a promising avenue for addressing these challenges by leveraging demographic and clinical data. In this study, we aim to compare diff...

Bibliographic Details
Main Authors:	Omid Kohandel Gargari, Mobina Fathi, Shahryar Rajai Firouzabadi, Ida Mohammadi, Mohammad Hossein Mahmoudi, Mehran Sarmadi, Arman Shafiee
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-02-01
Series:	Scientific Reports
Subjects:	Asthma; Machine learning; Support vector machine; Bronchitis
Online Access:	https://doi.org/10.1038/s41598-025-88345-1

collection	DOAJ
description	Abstract Asthma diagnosis poses challenges due to underreporting of symptoms, misdiagnoses, and limitations in existing diagnostic tests. Machine learning (ML) offers a promising avenue for addressing these challenges by leveraging demographic and clinical data. In this study, we aim to compare different ML diagnostic models and identify the most valuable features for asthma diagnosis using data from the National Health and Nutrition Examination Survey (NHANES). A total of 8,888 participants with available asthma diagnosis data from the 2017–2018 NHANES cycle were included. After careful selection of variables related to asthma, various ML algorithms including Support Vector Machine (SVM), Random Forest (RF), AdaBoost (ADA), XGBoost (XGB), K-Nearest Neighbors (KNN), Naive Bayes (NB), and Multi-Layer Perceptron (MLP) were evaluated. SVM and ADA emerged as top performers with the highest area under the curve (AUC) scores of 0.72 and 0.71, respectively. RF exhibited high accuracy but low precision. Feature interpretation using SHapley Additive exPlanations (SHAP) values identified significant predictors such as a history of asthma in a close relative, dietary fat intake, and chronic bronchitis. Feature reduction experiments showed no significant loss in predictive performance. Our findings demonstrate the diagnostic potential of ML algorithms, particularly SVM and ADA, for asthma when diverse clinical and demographic factors are incorporated. In addition, a history of asthma in a close relative, dietary fat intake, and chronic bronchitis emerge as valuable features for asthma diagnosis. These results may support earlier diagnosis of asthma.
format	Article
id	doaj-art-7e99891b423545ceac8e4c8a1b71261e
institution	Kabale University
issn	2045-2322
language	English
publishDate	2025-02-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-7e99891b423545ceac8e4c8a1b71261e; 2025-02-09T12:33:04Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-02-01; 151111; 10.1038/s41598-025-88345-1; Omid Kohandel Gargari (Alborz Artificial Intelligence Association, Alborz University of Medical Sciences); Mobina Fathi (Advanced Diagnostic and Interventional Radiology Research Center (ADIR)); Shahryar Rajai Firouzabadi (School of Medicine, Shahid Beheshti University of Medical Sciences); Ida Mohammadi (School of Medicine, Shahid Beheshti University of Medical Sciences); Mohammad Hossein Mahmoudi (Industrial Engineering Department, Sharif University of Technology); Mehran Sarmadi (Computer Engineering Department, Sharif University of Technology); Arman Shafiee (Alborz Artificial Intelligence Association, Alborz University of Medical Sciences)
title	Assessing the diagnostic accuracy of machine learning algorithms for identification of asthma in United States adults based on NHANES dataset
topic	Asthma; Machine learning; Support vector machine; Bronchitis
url	https://doi.org/10.1038/s41598-025-88345-1

Similar Items