Wesly M Sihombing (1), Triyanna Widiyaningtyas (2)
Background: Public health challenges in Indonesia continue to expand across areas such as mental health, chronic diseases, vaccination debates, and environmental issues. Specific background: The rapid growth of platform X has produced a large-scale body of public discourse that can be analyzed to understand health discussions in real time. Knowledge gap: Few studies integrate advanced sentiment analysis and topic modeling tailored to the informal Indonesian used on social media. Aim: This study analyzes public health conversations on platform X using IndoBERTweet for sentiment classification and BERTopic for topic extraction. Results: Across 6,740 processed tweets, neutral sentiment dominated public discussion, while topic modeling produced 44 themes, with mental well-being, vaccination debates, chronic disease concerns, and regional disease reports emerging as key issues. IndoBERTweet performed reliably (weighted F1-score of 0.7822), and BERTopic produced coherent and diverse topics. Novelty: This research combines IndoBERTweet and BERTopic to generate a contextual, adaptive, and real-time mapping of public health discourse in Indonesia. Implications: The findings support data-driven health policymaking, enabling authorities to monitor public perceptions, strengthen communication strategies, and design region-specific interventions.
• Public conversations emphasize mental health and lifestyle-related issues.
• Topic modeling identifies diverse clusters, including vaccination debates and endemic diseases.
• Integrated sentiment–topic analysis enables real-time mapping of health discussions in Indonesia.
Public Health, Social Media Analysis, IndoBERTweet, BERTopic, Sentiment Classification
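The abstract reports model quality as a weighted F1-score. As a reading aid, the sketch below shows how that metric is commonly computed: a per-class F1, averaged with weights proportional to each class's share of the true labels. The function name `weighted_f1` and the toy labels are illustrative assumptions, not the authors' evaluation code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0

    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true examples of this class that were missed

        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall)
            else 0.0
        )

        # Weight each class by its share of the true labels.
        score += (n / total) * f1

    return score


# Hypothetical three-class sentiment labels, for illustration only.
toy_true = ["negative", "neutral", "neutral", "positive"]
toy_pred = ["negative", "neutral", "positive", "positive"]
toy_score = weighted_f1(toy_true, toy_pred)
```

Because the weights follow class frequency, a dominant class such as neutral contributes most to the final score, which is worth keeping in mind when reading a single weighted figure.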