Evaluation of unsupervised clustering algorithms for variable stars data

Herrera Plaza, Pilar. (2008). Evaluation of unsupervised clustering algorithms for variable stars data. Master's thesis, Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial.

Files
Name Description MIME type Size
Herrera_Plaza_Pilar_TFM.pdf Herrera_Plaza_Pilar_TFM.pdf application/pdf 2.50MB

Title Evaluation of unsupervised clustering algorithms for variable stars data
Author(s) Herrera Plaza, Pilar
Abstract This master's thesis assesses the validity of unsupervised clustering algorithms for classifying variable-star data for the Gaia mission. These techniques identify natural clusters without using any prior information about the classes or their distribution and can therefore reveal new classes of objects. To this end, we evaluate two probabilistic algorithms: Autoclass, in which each cluster is characterized by a parametric distribution, and HMAC (Hierarchical Mode Association Clustering), in which each cluster is characterized by a nonparametric distribution within a hierarchical clustering. Both methods are evaluated against the same criteria (reproducibility, computation time, sensitivity to new classes, and interpretability) on datasets that can grow up to 10^8 instances. These criteria are a first step in assessing whether applying an algorithm is feasible, but they are not sufficient to evaluate the quality of the clustering results. Despite the widespread use of unsupervised clustering techniques, evaluating clustering performance remains an open question: it involves determining how many clusters are actually present and how real the clustering itself is. Our clustering evaluation starts by applying expert knowledge and using a labeled dataset, which allows some clusters to be matched with some variable-star types, but this is not enough to reach the objective of identifying every cluster. A review of existing indices for evaluating clustering with objective criteria is included. Clusters and data are then analyzed to understand how the results obtained with each method are biased by the method itself. A method that combines the clusterings produced by the two algorithms is also tested, as a technique that optimizes multiple objective functions and tries to avoid some limitations of both algorithms.
Additional notes Master's thesis (Trabajo de Fin de Máster). Máster Universitario en I.A. Avanzada: Fundamentos, Métodos y Aplicaciones. UNED
Subject(s) Computer Engineering
Keywords Unsupervised clustering
Autoclass
HMAC
model-based clustering
hierarchical clustering
validation indices
Publisher(s) Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial.
Advisor Sarro Baro, Luis M.
Date 2008-09
Format application/pdf
Identifier bibliuned:master-ETSInformatica-IAA-Pherrera
http://e-spacio.uned.es/fez/view/bibliuned:master-ETSInformatica-IAA-Pherrera
Language eng
Publication version acceptedVersion
Access level and license http://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Resource type master Thesis
Access type acceptedVersion

 
Created: Thu, 01 Jul 2021, 22:04:02 CET