Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation.

Bibliographic Details
Main Authors: Ahmed Akib Jawad Karim, Kazi Hafiz Md Asad, Md Golam Rabiul Alam
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access:https://doi.org/10.1371/journal.pone.0315829
author Ahmed Akib Jawad Karim
Kazi Hafiz Md Asad
Md Golam Rabiul Alam
collection DOAJ
description This work examines the efficiency of knowledge distillation in producing a lightweight yet capable BERT-based model for natural language processing (NLP) applications. We then applied the resulting model, LastBERT, to a real-world task: classifying the severity levels of Attention Deficit Hyperactivity Disorder (ADHD)-related concerns in social media text. In LastBERT, a customized student BERT model, the parameter count was reduced from 110 million (BERT base) to 29 million, yielding a model approximately 73.64% smaller. On the General Language Understanding Evaluation (GLUE) benchmark, which spans paraphrase identification, sentiment analysis, and text classification, the student model maintained strong performance across many tasks despite this reduction. On a real-world ADHD dataset, the model achieved 85% accuracy, F1 score, precision, and recall. Compared with DistilBERT (66 million parameters) and ClinicalBERT (110 million parameters), LastBERT performed comparably, with DistilBERT slightly ahead at 87% and ClinicalBERT at 86% on the same metrics. These findings highlight LastBERT's ability to classify degrees of ADHD severity accurately, making it a useful tool for mental health professionals assessing user-generated content on social networking platforms. The study underscores the potential of knowledge distillation to produce efficient models suited to resource-limited settings, advancing both NLP and mental health diagnosis. The considerable reduction in model size without appreciable performance loss also lowers the computational resources needed for training and deployment, broadening applicability, especially with readily available tools such as Google Colab and Kaggle Notebooks. Overall, the study demonstrates the accessibility and usefulness of advanced NLP methods in practical, real-world applications. (A minimal, illustrative distillation sketch follows at the end of this record.)
format Article
id doaj-art-0f77aca1f46f4bf2a1976f2b1ada27cb
institution Kabale University
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling PLoS ONE, Vol. 20, Iss. 2, article e0315829 (2025-01-01). doi:10.1371/journal.pone.0315829
title Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation.
url https://doi.org/10.1371/journal.pone.0315829
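The abstract above describes a standard knowledge-distillation setup (a large BERT teacher guiding a much smaller student) but gives no implementation details. The sketch below is a minimal, hypothetical illustration of that idea using PyTorch and Hugging Face Transformers; the student configuration, label count, temperature, and loss weighting are assumptions chosen for demonstration and are not the authors' actual LastBERT recipe.

```python
# Minimal knowledge-distillation sketch (Hinton-style soft targets + hard labels).
# Assumes PyTorch and Hugging Face Transformers; all sizes and hyperparameters
# below are illustrative stand-ins, not the published LastBERT configuration.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, BertConfig, BertForSequenceClassification

NUM_LABELS = 3      # assumed severity classes, e.g. mild / moderate / severe
TEMPERATURE = 2.0   # softening temperature for the distillation term
ALPHA = 0.5         # weight between distillation loss and hard-label loss

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Teacher: standard BERT base (~110M parameters); in practice it would first be
# fine-tuned on the target task, then frozen for distillation.
teacher = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=NUM_LABELS
).eval()

# Student: a much smaller BERT; layer/width choices here are hypothetical.
student_config = BertConfig(
    num_hidden_layers=6, hidden_size=384, num_attention_heads=6,
    intermediate_size=1536, num_labels=NUM_LABELS,
)
student = BertForSequenceClassification(student_config)
optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)

def distillation_step(texts, labels):
    """One training step: KL on softened logits plus cross-entropy on labels."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    labels = torch.tensor(labels)
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    student_logits = student(**batch).logits
    # KL divergence between temperature-softened distributions, scaled by T^2.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / TEMPERATURE, dim=-1),
        F.softmax(teacher_logits / TEMPERATURE, dim=-1),
        reduction="batchmean",
    ) * (TEMPERATURE ** 2)
    hard_loss = F.cross_entropy(student_logits, labels)
    loss = ALPHA * soft_loss + (1 - ALPHA) * hard_loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

print(f"Student parameters: {sum(p.numel() for p in student.parameters()):,}")
```

Softening both teacher and student logits with a temperature and scaling the KL term by T^2 is the usual formulation from Hinton et al.; the hard-label cross-entropy keeps the student anchored to ground truth when the teacher errs. A smaller student of this kind is what makes training and deployment feasible on free-tier environments such as Google Colab and Kaggle Notebooks, as the abstract notes.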