Perceptions in 3.6 Million Web-Based Posts of Online Communities on the Use of Cancer Immunotherapy: Data Mining Using BERTopic

Bibliographic Details
Main Authors: Xingyue Wu, Chun Sing Lam, Ka Ho Hui, Herbert Ho-fung Loong, Keary Rui Zhou, Chun-Kit Ngan, Yin Ting Cheung
Format: Article
Language: English
Published: JMIR Publications 2025-02-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2025/1/e60948
collection DOAJ
description Background: Immunotherapy has become a game changer in cancer treatment. Patients use the internet as a platform to share personal experiences and seek medical guidance. Despite the increased use of immunotherapy in clinical practice, few studies have investigated perceptions of its use by analyzing social media data.
Objective: This study aims to use BERTopic (a topic modeling technique that extends the Bidirectional Encoder Representations from Transformers machine learning model) to explore the perceptions of online cancer communities regarding immunotherapy.
Methods: A total of 4.9 million posts were extracted from Facebook, Twitter, Reddit, and 16 online cancer-related forums. The textual data were preprocessed using natural language processing. BERTopic modeling was performed to identify topics from the posts. The effectiveness of isolating topics from the posts was evaluated using 3 metrics: topic diversity, coherence, and quality. Sentiment analysis was performed to determine the polarity of each topic and categorize it as positive or negative. Based on the topics generated through topic modeling, thematic analysis was conducted to identify themes associated with immunotherapy.
Results: After data cleaning, 3.6 million posts remained for modeling. The highest overall topic quality achieved by BERTopic was 70.47% (topic diversity: 87.86%; topic coherence: 80.21%). BERTopic generated 14 topics related to the perceptions of immunotherapy. A sentiment score of around 0.3 across the 14 topics suggested generally positive sentiments toward immunotherapy within the online communities. Six themes were identified, primarily covering (1) hopeful prospects offered by immunotherapy, (2) perceived effectiveness of immunotherapy, (3) complementary therapies or self-treatments, (4) financial and mental impact of undergoing immunotherapy, (5) impact on lifestyle and time schedules, and (6) side effects due to treatment.
Conclusions: This study provides an overview of the multifaceted considerations essential for the application of immunotherapy as a therapeutic intervention. The topics and themes identified can serve as supporting information to facilitate physician-patient communication and the decision-making process. Furthermore, this study demonstrates the effectiveness of BERTopic in analyzing large amounts of data to identify perceptions underlying social media and online communities.
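The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state the formulas, so two assumptions are made here: topic diversity is taken as the share of unique words among the top words of all topics, and topic quality is taken as the product of diversity and coherence, which is consistent with the reported figures (0.8786 × 0.8021 ≈ 0.7047). The function names and the toy topic words are hypothetical and are not taken from the study.

```python
from typing import Sequence


def topic_diversity(topics: Sequence[Sequence[str]], top_k: int = 10) -> float:
    """Share of unique words among the top_k words of every topic.

    Assumed definition; the abstract does not give the formula.
    """
    words = [word for topic in topics for word in topic[:top_k]]
    if not words:
        raise ValueError("no topic words supplied")
    return len(set(words)) / len(words)


def topic_quality(diversity: float, coherence: float) -> float:
    """Overall quality as diversity times coherence (assumed definition)."""
    return diversity * coherence


# Toy topics for illustration only (not from the study):
# 6 words in total, 5 of them unique, because "hope" appears twice.
toy_topics = [
    ["immunotherapy", "hope", "trial"],
    ["cost", "insurance", "hope"],
]
toy_diversity = topic_diversity(toy_topics, top_k=3)

# Check the assumed product against the figures reported in the abstract.
reported_quality = topic_quality(0.8786, 0.8021)
print(f"toy diversity: {toy_diversity:.4f}")
print(f"quality from reported diversity and coherence: {reported_quality:.2%}")
```

Under these assumptions the product of the reported diversity and coherence reproduces the reported 70.47% quality to two decimal places, which suggests, but does not confirm, that the authors used this definition.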
id doaj-art-478f5cbe83ff42a792d07741c8293d31
institution Kabale University
issn 1438-8871
spelling doaj-art-478f5cbe83ff42a792d07741c8293d31; 2025-02-10T21:01:20Z; eng; JMIR Publications; Journal of Medical Internet Research; ISSN 1438-8871; 2025-02-01; volume 27, e60948; DOI 10.2196/60948
Xingyue Wu https://orcid.org/0009-0007-7717-530X
Chun Sing Lam https://orcid.org/0000-0002-6478-6706
Ka Ho Hui https://orcid.org/0000-0003-0950-8524
Herbert Ho-fung Loong https://orcid.org/0000-0002-6607-1106
Keary Rui Zhou https://orcid.org/0000-0001-5054-4290
Chun-Kit Ngan https://orcid.org/0000-0003-2151-0459
Yin Ting Cheung https://orcid.org/0000-0001-9874-8938