Detecting Spam Tweets Using Machine Learning and Effective Preprocessing

Kardas, Berk; Bayar, Ismail Erdem; Ozyer, Tansel; Alhajj, Reda

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/8605

Title:	Detecting Spam Tweets Using Machine Learning and Effective Preprocessing
Authors:	Kardas, Berk; Bayar, Ismail Erdem; Ozyer, Tansel; Alhajj, Reda
Keywords:	Twitter; Spam Detection; Machine Learning; Preprocessing; Social Media
Publisher:	Association for Computing Machinery
Source:	Kardaş, B., Bayar, İ. E., Özyer, T., & Alhajj, R. (2021, November). Detecting spam tweets using machine learning and effective preprocessing. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (pp. 393-398).
Series/Report no.:	Proceedings of the IEEE-ACM International Conference on Advances in Social Networks Analysis and Mining
Abstract:	With the rapid increase in popularity of online social networks (OSNs), these platforms have become ideal places for spammers, who can easily exploit them to publish malicious content and advertise phishing scams. Effective identification and filtering of spam tweets would therefore benefit both OSNs and their users. However, the sheer volume of posts makes it increasingly difficult to check and eliminate spam tweets. Motivated by these observations, in this paper we propose an approach for detecting spam tweets using machine learning and effective preprocessing techniques. The approach examines the benefits of preprocessing and identifies which preprocessing techniques are the most effective. To compare these techniques, the UtkML Twitter spam dataset is used for testing. Once the most effective methods are determined, they are combined to further improve the detection accuracy for spam tweets. We have evaluated our solution with four machine learning algorithms: Naive Bayes classifier, neural network, logistic regression, and support vector machine (SVM). With the SVM classifier, we achieve an accuracy of 93.02%. Experimental results show that our approach can effectively improve the performance of spam tweet classification.
URI:	https://doi.org/10.1145/3487351.3490968
ISBN:	9781450391283
ISSN:	2473-9928; 2473-991X
Appears in Collections:	Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Scopus citations:	10 (checked on Sep 13, 2025)
Web of Science citations:	2 (checked on Sep 6, 2025)
Page views:	196 (checked on Sep 15, 2025)
