Person:
Ghajari Espinosa, Adrián

Search results

Showing 1 - 1 of 1
  • Publication
    Querying the Depths: Unveiling the Strengths and Struggles of Large Language Models in SPARQL Generation
    (Sociedad Española para el procesamiento del lenguaje natural, 2024) Ghajari Espinosa, Adrián; Ros Muñoz, Salvador; Pérez Pozo, Álvaro; Fresno Fernández, Víctor Diego
    In the quest to democratize access to databases and knowledge graphs, the ability to express queries in natural language and obtain the requested information becomes paramount, particularly for individuals lacking formal training in query languages. This situation affects SPARQL, the standard for querying ontology-based knowledge graphs, posing a significant barrier to many and hindering their ability to leverage these rich resources for research and analysis. To address this gap, our research explores harnessing the power of Large Language Models (LLMs) to generate SPARQL queries directly from natural language descriptions. To this end, we have examined the most popular prompt engineering techniques, a powerful tool for crafting prompts that help generative AI models understand a task and produce specific or generalized outputs depending on prompt quality, without the need for additional training. By integrating few-shot learning (FSL), Chain-of-Thought (CoT) reasoning, and Retrieval-Augmented Generation (RAG), we devise prompts that streamline the creation of effective SPARQL queries, facilitating more straightforward access to ontology knowledge graphs. Our analysis evaluated prompts across three distinct LLMs: DeepSeek-Code 6.7b, CodeLlama-13b and GPT 3.5 TURBO. The comparative results revealed marginal variations in accuracy among these models, with FSL emerging as the most effective technique. Our results highlight the potential of LLMs to make knowledge graphs more accessible to a broader audience, but also show that much more research is needed to reach results comparable to human performance.
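    The abstract names few-shot learning (FSL) as the most effective prompting technique for SPARQL generation. A minimal sketch of what such a prompt looks like is shown below; the helper name, the example questions, and the DBpedia-style predicates (`dbo:`, `dbr:`) are illustrative assumptions, not the paper's actual benchmark or code.

    ```python
    # Hypothetical sketch: assembling a few-shot (FSL) prompt that asks an LLM
    # to translate a natural-language question into a SPARQL query.

    # Worked (question, query) pairs shown to the model before the new question.
    FEW_SHOT_EXAMPLES = [
        (
            "List all films directed by Ridley Scott.",
            "SELECT ?film WHERE { ?film dbo:director dbr:Ridley_Scott . }",
        ),
        (
            "Which rivers flow through Paris?",
            "SELECT ?river WHERE { ?river dbo:city dbr:Paris . }",
        ),
    ]

    def build_fsl_prompt(question: str) -> str:
        """Build the prompt: an instruction, the worked examples, then the new question."""
        parts = ["Translate the natural-language question into a SPARQL query.\n"]
        for nl, sparql in FEW_SHOT_EXAMPLES:
            parts.append(f"Question: {nl}\nSPARQL: {sparql}\n")
        # The trailing "SPARQL:" cues the model to complete with a query.
        parts.append(f"Question: {question}\nSPARQL:")
        return "\n".join(parts)

    prompt = build_fsl_prompt("Who wrote Don Quixote?")
    print(prompt)
    ```

    The resulting string would be sent to any of the evaluated models; the in-context examples steer the output toward well-formed SPARQL without any fine-tuning.
    
    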