Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation.

Bibliographic Details
Main Authors: Ahmed Akib Jawad Karim, Kazi Hafiz Md Asad, Md Golam Rabiul Alam
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access:https://doi.org/10.1371/journal.pone.0315829
author Ahmed Akib Jawad Karim
Kazi Hafiz Md Asad
Md Golam Rabiul Alam
collection DOAJ
description This work examines the efficiency of knowledge distillation in producing a lightweight yet capable BERT-based model for natural language processing (NLP) applications. We then applied the resulting model, LastBERT, to a real-world task: classifying the severity levels of Attention Deficit Hyperactivity Disorder (ADHD)-related concerns in social media text. In LastBERT, a customized student BERT model, the parameter count was reduced from 110 million (BERT base) to 29 million, yielding a model approximately 73.64% smaller. On the General Language Understanding Evaluation (GLUE) benchmark, which spans paraphrase identification, sentiment analysis, and text classification, the student model maintained strong performance across many tasks despite this reduction. On a real-world ADHD dataset, the model achieved 85% accuracy, F1 score, precision, and recall. Compared with DistilBERT (66 million parameters) and ClinicalBERT (110 million parameters), LastBERT performed comparably, with DistilBERT slightly ahead at 87% and ClinicalBERT at 86% on the same metrics. These findings highlight LastBERT's ability to classify degrees of ADHD severity accurately, making it a useful tool for mental health professionals assessing user-generated content on social networking platforms. The study underscores the potential of knowledge distillation to produce efficient models suited to resource-limited settings, advancing both NLP and mental health diagnosis. The considerable reduction in model size without appreciable performance loss also lowers the computational resources needed for training and deployment, broadening applicability, especially with readily available tools such as Google Colab and Kaggle Notebooks. Overall, the study demonstrates the accessibility and usefulness of advanced NLP methods in practical, real-world applications. (A minimal, illustrative distillation sketch follows at the end of this record.)
format Article
id doaj-art-0f77aca1f46f4bf2a1976f2b1ada27cb
institution Kabale University
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling PLoS ONE, Vol. 20, Iss. 2, article e0315829 (2025-01-01). doi:10.1371/journal.pone.0315829
title Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation.
url https://doi.org/10.1371/journal.pone.0315829
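The abstract above describes a standard knowledge-distillation setup (a large BERT teacher guiding a much smaller student) but gives no implementation details. The sketch below is a minimal, hypothetical illustration of that idea using PyTorch and Hugging Face Transformers; the student configuration, label count, temperature, and loss weighting are assumptions chosen for demonstration and are not the authors' actual LastBERT recipe.

```python
# Minimal knowledge-distillation sketch (Hinton-style soft targets + hard labels).
# Assumes PyTorch and Hugging Face Transformers; all sizes and hyperparameters
# below are illustrative stand-ins, not the published LastBERT configuration.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, BertConfig, BertForSequenceClassification

NUM_LABELS = 3      # assumed severity classes, e.g. mild / moderate / severe
TEMPERATURE = 2.0   # softening temperature for the distillation term
ALPHA = 0.5         # weight between distillation loss and hard-label loss

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Teacher: standard BERT base (~110M parameters); in practice it would first be
# fine-tuned on the target task, then frozen for distillation.
teacher = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_LABELS
).eval()

# Student: a much smaller BERT; layer/width choices here are hypothetical.
student_config = BertConfig(
    num_hidden_layers=6, hidden_size=384, num_attention_heads=6,
    intermediate_size=1536, num_labels=NUM_LABELS,
)
student = BertForSequenceClassification(student_config)
optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)

def distillation_step(texts, labels):
    """One training step: KL on softened logits plus cross-entropy on labels."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    labels = torch.tensor(labels)
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    student_logits = student(**batch).logits
    # KL divergence between temperature-softened distributions, scaled by T^2.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / TEMPERATURE, dim=-1),
        F.softmax(teacher_logits / TEMPERATURE, dim=-1),
        reduction="batchmean",
    ) * (TEMPERATURE ** 2)
    hard_loss = F.cross_entropy(student_logits, labels)
    loss = ALPHA * soft_loss + (1 - ALPHA) * hard_loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

print(f"Student parameters: {sum(p.numel() for p in student.parameters()):,}")
```

Softening both teacher and student logits with a temperature and scaling the KL term by T^2 is the usual formulation from Hinton et al.; the hard-label cross-entropy keeps the student anchored to ground truth when the teacher errs. A smaller student of this kind is what makes training and deployment feasible on free-tier environments such as Google Colab and Kaggle Notebooks, as the abstract notes.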