Automatic detection of trends in time-stamped sequences : an evolutionary approach

Araujo, Lourdes and Merelo, Juan Julián (2009). Automatic detection of trends in time-stamped sequences : an evolutionary approach. Soft Computing 01/2010, 14:211-227. DOI: 10.1007/s00500-008-0395-8

Files
Name Description MIME type Size
Documento.pdf PDF of the document application/pdf

Title Automatic detection of trends in time-stamped sequences : an evolutionary approach
Author(s) Araujo, Lourdes
Merelo, Juan Julián
Abstract This paper presents an evolutionary algorithm for modeling the arrival dates in time-stamped data sequences such as newscasts, e-mails, IRC conversations, scientific journal articles or weblog postings. These models are applied to the detection of buzz (i.e., terms that occur with a higher-than-normal frequency) in such sequences, a task that has attracted a lot of interest in the online world with the increasing number of periodic content producers. For this reason we have used this kind of online sequence to test our system, though it is also valid for other types of event sequences. The algorithm assigns frequencies (number of events per time unit) to time intervals so that it produces an optimal fit to the data. The optimization procedure is a trade-off between accurately fitting the data and avoiding too many frequency changes, thus overcoming the noise inherent in these sequences. This process has traditionally been performed using dynamic programming algorithms, which are limited by memory and efficiency requirements. This limitation can be a problem when dealing with long sequences, and suggests the application of alternative search methods with some degree of uncertainty to achieve tractability, such as the evolutionary algorithm proposed in this paper. This algorithm is able to reach the same solution quality as those classical dynamic programming algorithms, but in a shorter time. We also test different cost functions and propose a new one that yields better fits than the one originally proposed by Kleinberg on real-world data. Finally, several distributions of states for the finite-state automata are tested, with the result that a uniform distribution produces much better fits than the geometric distribution also proposed by Kleinberg. We also present a variant of the evolutionary algorithm, which achieves a fast fit of a sequence extended with new data, by taking advantage of the fit obtained for the original subsequence.
Keywords evolutionary algorithms
event tracking
data time-stamped sequences
burst detection
Publisher Springer-Verlag
Date 2009-01-14
Format application/pdf
Identifier http://e-spacio.uned.es/fez/view/bibliuned:DptoLSI-ETSI-MA2VICMR-1085
bibliuned:DptoLSI-ETSI-MA2VICMR-1085
DOI - identifier 10.1007/s00500-008-0395-8
Published in journal Soft Computing 01/2010, 14:211-227. DOI: 10.1007/s00500-008-0395-8
Language eng
Publication version publishedVersion
Related project: info:eu-repo/grantAgreement/S2009/TIC-1542
Resource type Article
Access rights and license http://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Access type Open access

 
Access statistics: 630 views, 376 downloads
Created: Wed, 26 Nov 2014, 15:19:27 CET