Perceptions in 3.6 Million Web-Based Posts of Online Communities on the Use of Cancer Immunotherapy: Data Mining Using BERTopic
Background: Immunotherapy has become a game changer in cancer treatment. The internet has been used by patients as a platform to share personal experiences and seek medical guidance. Despite the increased utilization of immunotherapy in clinical practice, few studies have inves...
Main Authors: | Xingyue Wu, Chun Sing Lam, Ka Ho Hui, Herbert Ho-fung Loong, Keary Rui Zhou, Chun-Kit Ngan, Yin Ting Cheung |
---|---|
Format: | Article |
Language: | English |
Published: | JMIR Publications, 2025-02-01 |
Series: | Journal of Medical Internet Research |
Online Access: | https://www.jmir.org/2025/1/e60948 |
_version_ | 1823859754892525568 |
---|---|
author | Xingyue Wu, Chun Sing Lam, Ka Ho Hui, Herbert Ho-fung Loong, Keary Rui Zhou, Chun-Kit Ngan, Yin Ting Cheung |
author_sort | Xingyue Wu |
collection | DOAJ |
description |
Background: Immunotherapy has become a game changer in cancer treatment. The internet has been used by patients as a platform to share personal experiences and seek medical guidance. Despite the increased utilization of immunotherapy in clinical practice, few studies have investigated the perceptions about its use by analyzing social media data.
Objective: This study aims to use BERTopic (a topic modeling technique that is an extension of the Bidirectional Encoder Representation from Transformers machine learning model) to explore the perceptions of online cancer communities regarding immunotherapy.
Methods: A total of 4.9 million posts were extracted from Facebook, Twitter, Reddit, and 16 online cancer-related forums. The textual data were preprocessed by natural language processing. BERTopic modeling was performed to identify topics from the posts. The effectiveness of isolating topics from the posts was evaluated using 3 metrics: topic diversity, coherence, and quality. Sentiment analysis was performed to determine the polarity of each topic and categorize them as positive or negative. Based on the topics generated through topic modeling, thematic analysis was conducted to identify themes associated with immunotherapy.
Results: After data cleaning, 3.6 million posts remained for modeling. The highest overall topic quality achieved by BERTopic was 70.47% (topic diversity: 87.86%; topic coherence: 80.21%). BERTopic generated 14 topics related to the perceptions of immunotherapy. The sentiment score of around 0.3 across the 14 topics suggested generally positive sentiments toward immunotherapy within the online communities. Six themes were identified, primarily covering (1) hopeful prospects offered by immunotherapy, (2) perceived effectiveness of immunotherapy, (3) complementary therapies or self-treatments, (4) financial and mental impact of undergoing immunotherapy, (5) impact on lifestyle and time schedules, and (6) side effects due to treatment.
Conclusions: This study provides an overview of the multifaceted considerations essential for the application of immunotherapy as a therapeutic intervention. The topics and themes identified can serve as supporting information to facilitate physician-patient communication and the decision-making process. Furthermore, this study also demonstrates the effectiveness of BERTopic in analyzing large amounts of data to identify perceptions underlying social media and online communities. |
format | Article |
id | doaj-art-478f5cbe83ff42a792d07741c8293d31 |
institution | Kabale University |
issn | 1438-8871 |
language | English |
publishDate | 2025-02-01 |
publisher | JMIR Publications |
record_format | Article |
series | Journal of Medical Internet Research |
spelling | doaj-art-478f5cbe83ff42a792d07741c8293d31; 2025-02-10T21:01:20Z; eng; JMIR Publications; Journal of Medical Internet Research; ISSN 1438-8871; 2025-02-01; volume 27; article e60948; DOI 10.2196/60948; Perceptions in 3.6 Million Web-Based Posts of Online Communities on the Use of Cancer Immunotherapy: Data Mining Using BERTopic; Xingyue Wu (https://orcid.org/0009-0007-7717-530X); Chun Sing Lam (https://orcid.org/0000-0002-6478-6706); Ka Ho Hui (https://orcid.org/0000-0003-0950-8524); Herbert Ho-fung Loong (https://orcid.org/0000-0002-6607-1106); Keary Rui Zhou (https://orcid.org/0000-0001-5054-4290); Chun-Kit Ngan (https://orcid.org/0000-0003-2151-0459); Yin Ting Cheung (https://orcid.org/0000-0001-9874-8938); https://www.jmir.org/2025/1/e60948 |
title | Perceptions in 3.6 Million Web-Based Posts of Online Communities on the Use of Cancer Immunotherapy: Data Mining Using BERTopic |
url | https://www.jmir.org/2025/1/e60948 |
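The abstract names three evaluation metrics (topic diversity, coherence, and quality) without defining them. The reported figures are consistent with topic quality being the product of diversity and coherence (0.8786 × 0.8021 ≈ 0.7047). The Python sketch below illustrates that relationship, assuming a commonly used definition of topic diversity (the share of unique words among the top words of all topics). The function names and toy topics are hypothetical and are not taken from the article.

```python
from typing import Sequence


def topic_diversity(topics: Sequence[Sequence[str]], top_k: int = 10) -> float:
    """Share of unique words among the top_k words of every topic.

    Assumed definition; the abstract does not state which one was used.
    """
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        raise ValueError("no topic words supplied")
    return len(set(top_words)) / len(top_words)


def topic_quality(diversity: float, coherence: float) -> float:
    """Assumed relationship: quality = diversity * coherence."""
    return diversity * coherence


# Hypothetical toy topics, for illustration only.
toy_topics = [
    ["immunotherapy", "hope", "trial", "response"],
    ["cost", "insurance", "stress", "hope"],
]
toy_diversity = topic_diversity(toy_topics, top_k=4)  # 7 unique words out of 8

# Diversity and coherence as reported in the abstract.
reported_quality = topic_quality(0.8786, 0.8021)
print(f"toy diversity: {toy_diversity:.3f}")
print(f"quality from reported figures: {reported_quality:.2%}")
```

Under these assumptions, the product of the reported diversity and coherence rounds to the 70.47% quality figure given in the Results.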