Bilingual COVID-19 Fake News Detection Based on LDA Topic Modeling and BERT Transformer
Oral Presentation
Authors
1 Faculty of Electrical Engineering, K. N. Toosi University of Technology, Tehran, Iran; Adak Vira Iranian Rahjoo Company, Tehran, Iran
2 School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran; Adak Vira Iranian Rahjoo Company, Tehran, Iran
3 School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
Abstract
The spread of fake news has become more prevalent with the
popularity of social media and the volume of news that
circulates on it, which makes it crucial to distinguish real
news from fake news. During the COVID-19 pandemic, numerous
tweets, posts, and news items about the illness have appeared
on social media and electronic media worldwide. This research
presents a bilingual model that combines Latent Dirichlet
Allocation (LDA) topic modeling with the BERT transformer to
detect COVID-19 fake news in both Persian and English. First,
a dataset is prepared in Persian and English; the proposed
method is then applied to detect COVID-19 fake news in that
dataset. Finally, the proposed model is evaluated using
metrics such as accuracy, precision, recall, and F1-score.
The approach achieves 92.18% accuracy, which shows that adding
topic information to the pre-trained contextual
representations produced by BERT significantly improves the
handling of domain-specific instances. The results also show
that the proposed approach outperforms previous
state-of-the-art methods.
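
The two ideas in the abstract, adding topic information to a contextual representation and scoring the classifier with accuracy, precision, recall, and F1, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the LDA and BERT outputs are combined, so plain concatenation is assumed here, and the function names `fuse_features` and `classification_metrics`, the toy vectors, and the label convention (1 = fake, 0 = real) are all hypothetical.

```python
from typing import Dict, List, Sequence


def fuse_features(bert_vector: Sequence[float],
                  topic_dist: Sequence[float]) -> List[float]:
    """Join a contextual embedding and an LDA topic distribution.

    Concatenation is an assumption made for illustration only; the
    abstract does not state how the two representations are combined.
    """
    return list(bert_vector) + list(topic_dist)


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels.

    Assumed label convention: 1 = fake news, 0 = real news.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake, flagged fake
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # real, flagged real
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # real, flagged fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake, flagged real

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy placeholder values, not data from the paper.
    features = fuse_features([0.1, 0.2], [0.7, 0.3])
    print(len(features))
    print(classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In a real pipeline the first argument would be a sentence-level BERT embedding and the second the per-document topic distribution from a fitted LDA model; the fused vector would then feed a classifier head.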
Keywords
Proceeding Title [Persian]
Bilingual COVID-19 Fake News Detection Based on LDA Topic Modeling and BERT Transformer
Authors [Persian]
Abstract [Persian]
Keywords [Persian]
BERT transformer, topic modeling, fake news detection, COVID-19