Publication:
A refinement of the well-founded Information Content models with a very detailed experimental survey on WordNet.

dc.contributor.author: Lastra-Díaz, Juan J.
dc.contributor.author: García Serrano, Ana Mª
dc.date.accessioned: 2024-05-20T07:46:18Z
dc.date.available: 2024-05-20T07:46:18Z
dc.description.abstract: In a recent paper, we introduce a new family of Information Content (IC) models based on the estimation of the conditional probability between child and parent concepts. This work is encouraged by the finding of two drawbacks in the computational method of our aforementioned family of IC models, as well as two other gaps in the literature. The first gap is that two of our cognitive IC models do not satisfy the axiom that constrains the sum of probabilities on the leaf nodes to be 1, whilst some ontologies with multiple inheritance could prevent the IC model from satisfying the growing monotonicity axiom in concepts with multiple parents. The second gap is the lack of a complete and updated experimental survey including a pairwise statistical significance analysis between most IC models and ontology-based similarity measures. Finally, a third gap is the lack of replication and confirmation of previous methods and results in most works. The latter two gaps are especially significant in the current state of the problem, in which there is no convincing winner within the family of intrinsic IC-based similarity measures and the performance margin is very narrow. In order to bridge the aforementioned gaps, this paper introduces the following contributions: (1) a refinement of our recent family of well-founded Information Content (IC) models; (2) eight new intrinsic IC models and one new corpus-based IC model; and (3) a very detailed experimental survey of ontology-based similarity measures and Information Content (IC) models on WordNet, including the evaluation and statistical significance analysis on the five most significant datasets of most ontology-based similarity measures and all WordNet-based IC models reported in the literature, with the only exception of the IC models recently introduced by Harispe et al. (2015a) and Ben Aouicha et al. (2016b).
The evaluation is entirely based on a Java software library called HESML, which has been developed by the authors in order to replicate all methods evaluated herein. The new IC models obtain results rivaling the state-of-the-art methods and improve on our previous models, whilst the experimental survey allows a detailed and conclusive image of the state of the problem to be drawn by setting the new state of the art and quantifying the main achievements of the last three decades. [en]
dc.identifier.uri: https://hdl.handle.net/20.500.14468/9886
dc.language.iso: en
dc.publisher: Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Lenguajes y Sistemas Informáticos
dc.relation.center: E.T.S. de Ingeniería Informática
dc.relation.department: Lenguajes y Sistemas Informáticos
dc.rights: info:eu-repo/semantics/openAccess
dc.subject.keywords: Intrinsic Information Content models
dc.subject.keywords: ontology-based semantic similarity measures
dc.subject.keywords: IC-based similarity measures
dc.subject.keywords: word similarity benchmark
dc.subject.keywords: semantic similarity
dc.subject.keywords: concept similarity model
dc.subject.keywords: experimental survey
dc.title: A refinement of the well-founded Information Content models with a very detailed experimental survey on WordNet. [es]
dc.type: report [en]
dc.type: informe [es]
dspace.entity.type: Publication
relation.isAuthorOfPublication: 170ac137-4953-41fe-ad27-14eff0a57df5
relation.isAuthorOfPublication.latestForDiscovery: 170ac137-4953-41fe-ad27-14eff0a57df5
Files
Original bundle
Name: Refinement_final_Espace_LastraGarcia.pdf
Size: 585.27 KB
Format: Adobe Portable Document Format