SMU Data Science Review

Content-Based Unsupervised Fake News Detection on Ukraine-Russia War

Yucheol Shin, Southern Methodist University
Yvan Sojdehei, Southern Methodist University
Limin Zheng, Southern Methodist University
Brad Blanchard, Southern Methodist University

Abstract

The Ukrainian-Russian war has drawn significant attention worldwide, and fake news about it spreads false information and obstructs the formation of public opinion. This paper explores the use of unsupervised learning methods and Bidirectional Encoder Representations from Transformers (BERT) to detect fake news in news articles from various sources. BERT topic modeling is applied to cluster the articles by topic, followed by summarization, after which similarity scores are measured. The hypothesis is that topics with larger variances are more likely to contain fake news. The proposed method was evaluated using a dataset of approximately 1000 labeled news articles related to the Syrian war. The study found that unsupervised content clustering with topic similarity was insufficient to detect fake news, but that it demonstrated the prevalence of fake news content and its potential for clustering by topic.
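
The variance hypothesis in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the exact variance statistic, so the sketch assumes pairwise cosine similarity between summary embeddings and the population variance of those scores within each topic. The function names and the toy vectors are hypothetical stand-ins, not the authors' implementation; in practice the vectors would come from a BERT-based model and the topic labels from the topic-modeling step.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt
from statistics import pvariance


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_variance_by_topic(topics, vectors):
    """Map each topic to the variance of its pairwise cosine similarities.

    Assumption: "variance" means the population variance of the pairwise
    similarity scores among the articles assigned to one topic.
    """
    grouped = defaultdict(list)
    for topic, vec in zip(topics, vectors):
        grouped[topic].append(vec)

    variances = {}
    for topic, vecs in grouped.items():
        sims = [cosine(u, v) for u, v in combinations(vecs, 2)]
        # A topic with a single article has no pairs to compare; skip it.
        if sims:
            variances[topic] = pvariance(sims)
    return variances


# Toy stand-ins for summary embeddings (hypothetical data).
topics = ["a", "a", "a", "b", "b", "b"]
vectors = [
    [1.0, 0.0], [1.0, 0.0], [1.0, 0.0],  # topic "a": identical summaries
    [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],  # topic "b": divergent summaries
]

variances = similarity_variance_by_topic(topics, vectors)
# Topics with the largest variance come first; under the hypothesis these
# would be the ones to inspect for fake news.
ranked = sorted(variances, key=variances.get, reverse=True)
print(ranked)
```

Ranking topics by this variance only orders them for inspection. As the abstract reports, a score of this kind was not sufficient on its own to detect fake news.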

Recommended Citation

Shin, Yucheol; Sojdehei, Yvan; Zheng, Limin; and Blanchard, Brad (2023) "Content-Based Unsupervised Fake News Detection on Ukraine-Russia War," SMU Data Science Review: Vol. 7: No. 1, Article 3.
Available at: https://scholar.smu.edu/datasciencereview/vol7/iss1/3