Detecting malicious tweets in trending topics using a statistical analysis of language

Martínez Romo, Juan; Araujo Serna, M. Lourdes

dc.contributor.author	Martínez Romo, Juan
dc.contributor.author	Araujo Serna, M. Lourdes
dc.date.accessioned	2024-05-21T13:03:28Z
dc.date.available	2024-05-21T13:03:28Z
dc.date.issued	2013-06-01
dc.description.abstract	Twitter spam detection is a recent area of research in which most previous work has focused on the identification of malicious user accounts and on honeypot-based approaches. In this paper, however, we present a methodology based on two new aspects: the detection of spam tweets in isolation, without prior information about the user; and the application of a statistical analysis of language to detect spam in trending topics. Trending topics capture the emerging Internet trends and topics of discussion that are on everybody's lips. This growing microblogging phenomenon therefore allows spammers to disseminate malicious tweets quickly and massively. We present the first work that tries to detect spam tweets in real time using language as the primary tool. We first collected and labeled a large dataset with 34 K trending topics and 20 million tweets. We then propose a reduced set of features that are hard for spammers to manipulate. In addition, we have developed a machine learning system with some orthogonal features that can be combined with other sets of features in order to analyze emergent characteristics of spam in social networks. We have also conducted an extensive evaluation showing that our system obtains an F-measure at the same level as the best state-of-the-art systems based on the detection of spam accounts. Our system can thus be applied to Twitter spam detection in trending topics in real time, mainly because it analyzes tweets instead of user accounts.	es
dc.description.version	published version
dc.identifier.doi	http://doi.org/10.1016/j.eswa.2012.12.015
dc.identifier.issn	0957-4174
dc.identifier.uri	https://hdl.handle.net/20.500.14468/19982
dc.language.iso	en
dc.publisher	Elsevier
dc.relation.center	E.T.S. de Ingeniería Informática
dc.relation.department	Lenguajes y Sistemas Informáticos
dc.rights	info:eu-repo/semantics/openAccess
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0
dc.subject.keywords	spam detection
dc.subject.keywords	social network
dc.subject.keywords	statistical natural language processing
dc.subject.keywords	machine learning
dc.title	Detecting malicious tweets in trending topics using a statistical analysis of language	es
dc.type	journal article	en
dspace.entity.type	Publication
relation.isAuthorOfPublication	91b7e317-2a30-494f-98e9-3a0e026747b1
relation.isAuthorOfPublication	77c4023e-4374-442a-9dfb-b9d4b609c31e
relation.isAuthorOfPublication.latestForDiscovery	91b7e317-2a30-494f-98e9-3a0e026747b1

Files

Original bundle

Name: Detecting_malicious.pdf
Size: 642.73 KB
Format: Adobe Portable Document Format

Collections

Articles and papers