Publication: Acronym disambiguation in Spanish medical literature
Date
2021-09-01
Authors
Access rights
Attribution-NonCommercial-NoDerivatives 4.0 International
info:eu-repo/semantics/openAccess
Publisher
Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Lenguajes y Sistemas Informáticos
Abstract
Biomedical literature is rife with abbreviations and acronyms, many of which are ambiguous. In Natural Language Processing tasks involving such texts, identifying and interpreting these words, and hence the document itself, is a major challenge for any system. Over the last decade, extensive research has addressed disambiguating acronyms in medical literature based on the context of the document. However, the challenge has always been the computational cost of training a model on texts from a specific domain. Recent advances have come from language models based on attention mechanisms, called Transformers, especially those already pre-trained on large corpora, such as BERT (Bidirectional Encoder Representations from Transformers). Over the last three years these models have been applied to acronym disambiguation in medical literature, mostly in English. This work proposes adapting them to Spanish medical literature.
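The task described in the abstract, choosing among candidate expansions of an ambiguous acronym given its surrounding context, can be illustrated with a deliberately simple baseline sketch. The sense inventory, example sentences, and overlap scoring below are all invented for illustration; the thesis itself relies on contextual Transformer models (BERT/BETO), which would replace this bag-of-words overlap with learned contextual representations.

```python
# Toy sketch of acronym disambiguation as context matching.
# The acronym "TC" and its two candidate expansions are a hypothetical
# example; real systems score contexts with Transformer embeddings.

def disambiguate(acronym, context, sense_inventory):
    """Pick the expansion whose definition words overlap most with the context."""
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for expansion, definition in sense_inventory[acronym].items():
        overlap = len(context_words & set(definition.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = expansion, overlap
    return best_sense

# Hypothetical sense inventory for the ambiguous Spanish acronym "TC".
senses = {
    "TC": {
        "tomografía computarizada": "imagen radiológica tomografía escáner",
        "tronco celíaco": "arteria aorta abdominal vascular",
    }
}

print(disambiguate("TC", "El escáner de tomografía mostró la lesión", senses))
# → tomografía computarizada
```

A Transformer-based system keeps this same task framing (context in, expansion out) but learns the scoring function by fine-tuning a pre-trained model instead of counting word overlaps.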
Keywords
acronym, disambiguation, biomedicine, Transformers, attention mechanisms, BERT, BETO
Center
Faculties and schools::E.T.S. de Ingeniería Informática
Department
Inteligencia Artificial