Twitter Topic Modelling Using Latent Dirichlet Allocation Approach

  • Uce Indahyanti, Faculty of Science and Technology, Universitas Muhammadiyah Sidoarjo, Indonesia
  • Yulian Findawati, Faculty of Science and Technology, Universitas Muhammadiyah Sidoarjo, Indonesia
  • Achmad Ariansyah, Faculty of Science and Technology, Universitas Muhammadiyah Sidoarjo, Indonesia
  • Endah Asmawati, Faculty of Engineering, Universitas Surabaya, Indonesia
Keywords: topic modeling, Twitter data, Kanjuruhan tragedy, LDA, text mining


This study applies topic modeling to Twitter data about the Kanjuruhan tragedy, which became a trending topic after a fatal incident following a football match at Kanjuruhan Stadium in Malang, Indonesia. The research uses Latent Dirichlet Allocation (LDA), a text mining method that uncovers latent patterns in a collection of documents by representing them as a set of distinct topics. The data consist of 1,480 pre-processed tweets in the Indonesian language. The modeling produced five main topics related to the Kanjuruhan tragedy: the PSSI (Indonesian Football Association) investigation, suspects, the Itaewon tragedy, Korean netizens (Knetz), and tear gas. The implication of this research is not only to provide information about the comments and expectations of Twitter users regarding the Kanjuruhan tragedy but also to provide considerations for stakeholders.
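
To illustrate how LDA assigns words to topics, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the implementation used in this study, which the abstract does not specify. The function names `fit_lda` and `top_words`, the hyperparameter values, and the toy tokenized documents are all assumptions made for illustration; the toy tokens are hypothetical and are not drawn from the 1,480 tweets.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling.

    alpha and beta are symmetric Dirichlet priors (assumed values).
    Returns the vocabulary, document-topic counts, and topic-word counts.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    ranked = []
    for row in topic_word:
        order = sorted(range(len(vocab)), key=lambda v: -row[v])
        ranked.append([vocab[v] for v in order[:n]])
    return ranked


# Hypothetical, already pre-processed toy documents (not the study's data).
docs = [
    ["pssi", "investigation", "stadium", "pssi"],
    ["investigation", "suspect", "pssi"],
    ["teargas", "stadium", "police", "teargas"],
    ["police", "teargas", "suspect"],
]

vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
for topic_index, words in enumerate(top_words(vocab, topic_word)):
    print(topic_index, words)
```

In practice the number of topics (five in this study) is a modeling choice, and the top-ranked words per topic are what a researcher inspects to assign labels such as "tear gas" or "suspects".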


