Publication:
Knowledge capture and textual inference

Date
2014-02-27
Access rights
info:eu-repo/semantics/openAccess
Publisher
Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Lenguajes y Sistemas Informáticos
Abstract
The present and future information needs of society rely on the ability of computers to understand and manage knowledge. The lack of such an ability explains why knowledge-driven systems struggle to perform tasks such as question answering and machine reading effectively. One of the biggest bottlenecks is automatic knowledge acquisition. At the current stage of development, it seems clear that only semi-supervised or unsupervised techniques can scale to large natural language corpora such as the Web. The trend has evolved from populating a predefined ontology to expressing knowledge through either unconstrained relations or propositions. The arrival of new deep language processing technologies suggests that we can annotate large collections of text with accurate predicates, which can then be used to extract knowledge from text without tying it to any predefined logical schema. However, it is not clear which tasks can harness this knowledge or how. This master’s thesis proposes a new method of knowledge capture and textual inference built on three cornerstones: (1) a procedure that turns plain text into a graph-based representation, taking advantage of existing tools; (2) a proposition extraction system; and (3) an unsupervised method for correcting appositive dependencies, studied as an example of the textual inferences that the generated proposition store enables. In addition, we generate two resources useful for future natural language processing tasks: a corpus of 7 million documents represented as semantically enriched graphs, and a proposition store of semantic classes with 8 million instances of entity-class relations.
Center
E.T.S. de Ingeniería Informática
Department
Lenguajes y Sistemas Informáticos