Twitter Topic Modelling Using Latent Dirichlet Allocation Approach

  • Uce Indahyanti, Faculty of Science and Technology, Universitas Muhammadiyah Sidoarjo, Indonesia
  • Yulian Findawati, Faculty of Science and Technology, Universitas Muhammadiyah Sidoarjo, Indonesia
  • Achmad Ariansyah, Faculty of Science and Technology, Universitas Muhammadiyah Sidoarjo, Indonesia
  • Endah Asmawati, Faculty of Engineering, Universitas Surabaya, Indonesia
Keywords: topic modeling, Twitter data, the Kanjuruhan tragedy, LDA, text mining

Abstract

This study applies topic modeling to Twitter data about the Kanjuruhan tragedy, a fatal incident that occurred after a football match at Kanjuruhan Stadium in Malang, Indonesia, and that became a trending topic. The analysis uses Latent Dirichlet Allocation (LDA), a text mining method that finds patterns in a collection of documents by grouping their words into a set of distinct topics. The data consist of 1480 pre-processed tweets in Indonesian. The model produced 5 main topics related to the Kanjuruhan tragedy, covering the PSSI (Indonesian Football Association) investigation, suspects, the Itaewon tragedy, Korean netizens (Knetz), and tear gas. The results provide information about the comments and expectations of Twitter users regarding the Kanjuruhan tragedy and also offer considerations for stakeholders.
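
As a rough illustration of how LDA assigns the words of pre-processed tweets to topics, the following is a minimal sketch of a collapsed Gibbs sampler written with only the Python standard library. It is not the authors' implementation: the abstract does not name the software used, and the function names (`fit_lda`, `top_words`), the hyperparameters (`alpha`, `beta`, `iterations`), and the toy token lists are all assumptions made up for this example. The study itself used 1480 tweets and 5 topics; the sketch uses six tiny invented documents and 2 topics to stay short.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling. `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                               # tokens assigned to topic k
    assignments = []                                             # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequent words of each topic."""
    result = []
    for row in topic_word:
        ranked = sorted(range(len(vocab)), key=lambda i: row[i], reverse=True)
        result.append([vocab[i] for i in ranked[:n]])
    return result


# Invented, already tokenised toy "tweets" (not data from the study).
docs = [
    ["pssi", "investigation", "suspects", "pssi"],
    ["investigation", "suspects", "police", "pssi"],
    ["tear", "gas", "stadium", "tear", "gas"],
    ["stadium", "gas", "tear", "police"],
    ["pssi", "suspects", "investigation"],
    ["gas", "stadium", "tear"],
]

vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
topics = top_words(vocab, topic_word, n=3)
for k, words in enumerate(topics):
    print(f"Topic {k}: {', '.join(words)}")
```

Each printed line lists the most frequent words assigned to one topic; interpreting and naming a topic (for example "tear gas") is then a manual step. Because the sampler is stochastic, the grouping depends on the seed and the number of iterations, so a real analysis would normally rely on an established library and compare several topic counts.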

Published
2023-09-07
How to Cite
Uce Indahyanti, Yulian Findawati, Achmad Ariansyah, & Endah Asmawati. (2023). Twitter Topic Modelling Using Latent Dirichlet Allocation Approach. Central Asian Journal of Theoretical and Applied Science, 4(9), 20-27. Retrieved from https://cajotas.centralasianstudies.org/index.php/CAJOTAS/article/view/1274