TepiSense: A Social Computing-Based Real-Time Epidemic Surveillance System Using Artificial Intelligence
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10858732/ |
Summary: | Artificial Intelligence (AI) technologies have enabled researchers to develop tools that monitor real-world events and user behavior on social media platforms. Twitter is particularly useful for gathering information about diseases and public health to build real-time disease surveillance systems. Such systems offer a cost-effective and efficient alternative to the passive, expensive, and time-consuming process of using data from healthcare organizations and hospitals. In this paper, we propose TepiSense, a novel system that automatically performs surveillance of epidemic-prone diseases. Our system classifies tweets related to diseases and further identifies ‘indication’ tweets that highlight the presence of patients. It consists of four distinct modules: a pre-processor, a feature extractor, a classifier, and an evaluator. TepiSense compares the performance of 3 feature extraction techniques, 9 machine/deep learning models, and 3 Large Language Models (LLMs). To test the performance of our system, we build the Twitter Epidemic Surveillance Corpus (TESC), a dataset containing 23.9K English and 13K labelled Urdu tweets related to six diseases: COVID-19, hepatitis, malaria, flu, dengue, and HIV/AIDS. Our results show that the mBERT LLM achieves the highest F-measure values of 0.96 and 0.83 for topic and indication tweet classification, respectively. Furthermore, to test the system's efficacy on COVID-19, we compute the correlation between the signals it generates and real-world cases, and find that real-world cases have a correlation of 0.58-0.63 with the indication-category tweets. Finally, we develop an interactive and user-friendly dashboard to disseminate the analytics of our system. Overall, our system offers a powerful tool for real-time disease surveillance using social media, with potential implications for public health policy and decision-making. |
---|---|
ISSN: | 2169-3536 |
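
The Summary reports a correlation of 0.58-0.63 between indication-category tweets and real-world COVID-19 cases. The record does not say which coefficient was used or how the two series were aligned, so the following is only a minimal sketch that assumes a Pearson correlation over two equal-length series binned by the same time period. The function name `pearson` and both data lists are hypothetical placeholders for illustration, not data or results from the article.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy series (NOT from the article): counts per time period of
# tweets classified as 'indication' and of officially reported cases.
indication_tweets = [12, 18, 25, 40, 38, 30]
reported_cases = [100, 150, 210, 390, 420, 310]

r = pearson(indication_tweets, reported_cases)
print(f"Pearson r = {r:.2f}")
```

A value near 1 would mean the tweet signal rises and falls with reported cases; the moderate range quoted in the Summary suggests the signal tracks cases only partially.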