Publication:
Evaluation of unsupervised clustering algorithms for variable stars data

dc.contributor.author: Herrera Plaza, Pilar
dc.date.accessioned: 2024-05-20T12:22:59Z
dc.date.available: 2024-05-20T12:22:59Z
dc.date.issued: 2008-09
dc.description.abstract: The aim of this master's thesis is to assess the validity of unsupervised clustering algorithms for classifying variable star data for the Gaia mission. These techniques identify natural clusters without any prior information about the classes and their distribution, and can therefore discover new classes of objects. With this objective, we evaluate two probabilistic algorithms: Autoclass, in which each cluster is characterized by a parametric distribution, and HMAC (Hierarchical Mode Association Clustering), in which each cluster is characterized by a nonparametric distribution within a hierarchical clustering. Both methods are evaluated against the same criteria (reproducibility, computation time, sensitivity to new classes, and interpretability) on datasets that can grow up to 10^8 instances. These criteria are a first step in assessing whether the algorithms can feasibly be applied, but they are not enough to evaluate the quality of the clustering results. Despite the widespread use of unsupervised clustering techniques, evaluating clustering performance remains an open question: it involves knowing how many clusters are actually present and how real the clustering itself is. Our clustering evaluation starts by applying expert knowledge and using a labeled dataset, which allows some clusters to be matched with some variable star types, but this is not enough to reach the objective of identifying each cluster. A review of existing indices for evaluating clustering with objective criteria is included. Clusters and data are then analyzed to understand the results obtained with both methods, which are biased by the method itself. A method for combining the clusterings of these two algorithms is also tested as a technique that optimizes multiple objective functions and tries to avoid some limitations of both algorithms. [en]
dc.description.version: final version
dc.identifier.uri: https://hdl.handle.net/20.500.14468/14090
dc.language.iso: en
dc.publisher: Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial.
dc.relation.center: E.T.S. de Ingeniería Informática
dc.relation.department: Inteligencia Artificial
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0
dc.subject.keywords: Unsupervised clustering
dc.subject.keywords: Autoclass
dc.subject.keywords: HMAC
dc.subject.keywords: model-based clustering
dc.subject.keywords: hierarchical clustering
dc.subject.keywords: validation indices
dc.title: Evaluation of unsupervised clustering algorithms for variable stars data [es]
dc.type: tesis de maestría [es]
dc.type: master thesis [en]
dspace.entity.type: Publication
Files
Original bundle
Name: Herrera_Plaza_Pilar_TFM.pdf
Size: 2.51 MB
Format: Adobe Portable Document Format