TepiSense: A Social Computing-Based Real-Time Epidemic Surveillance System Using Artificial Intelligence

Artificial Intelligence (AI) technologies have enabled researchers to develop tools to monitor real-world events and user behavior using social media platforms. Twitter is particularly useful for gathering invaluable information related to diseases and public health to build real-time disease survei...

Full description

Saved in:
Bibliographic Details
Main Authors: Bilal Tahir, Muhammad Amir Mehmood
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/10858732/
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Artificial Intelligence (AI) technologies have enabled researchers to develop tools to monitor real-world events and user behavior using social media platforms. Twitter is particularly useful for gathering invaluable information related to diseases and public health to build real-time disease surveillance systems. Such systems offer a cost-effective and efficient alternative to the passive, expensive, and time-consuming process of using data from healthcare organizations and hospitals. In this paper, we propose a novel system of TepiSense to automatically perform disease surveillance of epidemic-prone diseases. Our system classifies tweets related to diseases and further identifies ‘indication’ tweets that highlight the presence of patients. Our system consists of four distinct modules of pre-processor, feature extractor, classifier, and evaluator. TepiSense compares the performance of 3 feature extraction techniques, 9 machine/deep learning models, and 3 Large Language Models (LLMs). To test the performance of our system, we build a dataset of Twitter Epidemic Surveillance Corpus (TESC) containing 23.9K English and 13K labelled Urdu tweets related to six diseases: COVID19, hepatitis, malaria, flu, dengue, and HIV/AIDS. Our results show that mBERT LLM achieves the highest F-measure values of 0.96 and 0.83 for topic and indication tweets classification, respectively. Furthermore, we compute the correlation of signals generated by our system with real-world cases to test the efficacy on COVID19 disease. We notice that real-world cases have a correlation of 0.58-0.63 with the indication category tweets. Finally, we develop an interactive and user-friendly dashboard to disseminate the analytics of our system. Overall, our system offers a powerful tool for real-time disease surveillance using social media with potential implications for public health policy and decision-making.
ISSN:2169-3536