Enhancing Arabic text-to-speech synthesis for emotional expression in visually impaired individuals using the artificial hummingbird and hybrid deep learning model

Bibliographic Details
Main Authors: Mahmoud M. Selim, Mohammed S. Assiri
Format: Article
Language: English
Published: Elsevier 2025-04-01
Series: Alexandria Engineering Journal
Online Access: http://www.sciencedirect.com/science/article/pii/S1110016825001784
Description
Summary: Depression is one of the most dangerous mental health conditions, often leading to suicide, which is the fourth leading cause of death in the Middle East. Egypt in particular has the highest suicide rate in the region, making it crucial to recognize depression and suicidal thoughts early. In Arab culture, awareness of mental health issues is limited, but in recent years people have increasingly expressed their feelings on social media platforms. This shift presents an opportunity for mental health intervention through digital means. Furthermore, while facial expressions are not accessible to blind and visually impaired people, voice signals can convey emotional nuances, offering an alternative channel for detecting mental health states. Natural Language Processing (NLP) and machine learning (ML) techniques provide powerful tools for analysing social media text data, helping to detect emotional distress and provide timely support. By applying these technologies, AI-driven solutions can assist in understanding and addressing mental health concerns more inclusively. This study designs an Arabic Mood Changing and Depression Detection using the Artificial Hummingbird Optimization Algorithm with Deep Learning (AMCDD-AHODL) technique for visually impaired individuals. The AMCDD-AHODL technique detects different kinds of emotions and depression in Arabic tweets. After pre-processing, word embeddings are generated using the AraBERT model. The technique then uses a hybrid LSTM+BiGRU model for recognition and classification. To improve the performance of the hybrid LSTM+BiGRU model, the AMCDD-AHODL technique includes an AHO-based hyperparameter tuning process. Finally, the WaveNet model enhances the naturalness and clarity of text-to-speech synthesis, delivering high-quality, human-like audio output. The AMCDD-AHODL approach is evaluated on the Modern Standard Arabic dataset containing 1229 records. Performance validation showed that the AMCDD-AHODL approach achieves a superior accuracy of 95.80% compared with existing ML and DL models. Therefore, the AMCDD-AHODL technique can be applied to the early identification of various kinds of depression, which can decrease the distress caused by the illness and the stigma related to mental health problems.
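Note: the summary mentions AHO-based hyperparameter tuning without giving details. The following is a minimal sketch of how a hummingbird-inspired population search over hyperparameters might look. It is not the authors' implementation. It loosely follows the guided foraging, territorial foraging, and migration moves of the artificial hummingbird algorithm, but it picks a random target bird instead of using a visit table and always moves in all dimensions. The function names, the two hyperparameters, and the toy loss that stands in for the validation loss of the LSTM+BiGRU classifier are all hypothetical.

```python
import random


def hummingbird_search(objective, bounds, n_birds=8, n_iters=40, seed=0):
    """Minimise `objective` inside the box `bounds` (needs n_birds >= 2).

    Simplified, hummingbird-inspired search for illustration only.
    """
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) bound.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def random_point():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    birds = [random_point() for _ in range(n_birds)]
    scores = [objective(b) for b in birds]
    best_i = min(range(n_birds), key=scores.__getitem__)
    best_x, best_score = list(birds[best_i]), scores[best_i]
    history = [best_score]  # best score after initialisation and each iteration

    for it in range(1, n_iters + 1):
        for i in range(n_birds):
            step = rng.gauss(0.0, 1.0)
            if rng.random() < 0.5:
                # Guided foraging: head for another bird's food source.
                j = rng.choice([k for k in range(n_birds) if k != i])
                cand = [t + step * (x - t) for x, t in zip(birds[i], birds[j])]
            else:
                # Territorial foraging: perturb around the current position.
                cand = [x + step * x for x in birds[i]]
            cand = clip(cand)
            cand_score = objective(cand)
            if cand_score < scores[i]:  # greedy replacement
                birds[i], scores[i] = cand, cand_score
                if cand_score < best_score:
                    best_x, best_score = list(cand), cand_score
        if it % (2 * n_birds) == 0:
            # Migration: the worst bird relocates to a random position.
            worst = max(range(n_birds), key=scores.__getitem__)
            birds[worst] = random_point()
            scores[worst] = objective(birds[worst])
            if scores[worst] < best_score:
                best_x, best_score = list(birds[worst]), scores[worst]
        history.append(best_score)
    return best_x, best_score, history


def toy_validation_loss(params):
    # Hypothetical stand-in for training the classifier and returning its
    # validation loss; the optimum (0.3, 0.7) is arbitrary.
    dropout, lr_scaled = params
    return (dropout - 0.3) ** 2 + (lr_scaled - 0.7) ** 2


bounds = [(0.0, 1.0), (0.0, 1.0)]
best_params, best_loss, history = hummingbird_search(toy_validation_loss, bounds)
print(best_params, best_loss)
```

In a real setting, `toy_validation_loss` would be replaced by a function that trains the model with the candidate hyperparameters and returns a validation metric, which makes each evaluation far more expensive than in this toy example.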
ISSN:1110-0168