Browsing by Author "Velasco Arribas, Maria"

Showing 1 - 1 of 1
    Publication
    Early diagnosis of HIV cases by means of text mining and machine learning models on clinical notes
    (ELSEVIER, 2024) Morales Sánchez, Rodrigo; Montalvo Herranz, Soto; Riaño Martínez, Adrián; Martínez Unanue, Raquel; Velasco Arribas, Maria. ORCID: https://orcid.org/0000-0001-8158-7939; https://orcid.org/0009-0004-8755-255X; https://orcid.org/0000-0001-6554-2095
    Undiagnosed and untreated human immunodeficiency virus (HIV) infection increases morbidity in the HIV-positive person and allows onward transmission of the virus. Minimizing missed opportunities for HIV diagnosis when a patient visits a healthcare facility is essential to restraining the epidemic and working toward its eventual elimination. Most state-of-the-art proposals employ machine learning (ML) methods and structured data to enhance HIV diagnosis; however, there is a dearth of recent proposals that use unstructured textual data from Electronic Health Records (EHRs). In this work, we propose to use only the unstructured text of the clinical notes as evidence for classifying patients as suspected or not suspected of having HIV. For this purpose, we first compile a dataset of real clinical notes from a hospital, with patients classified as suspected or not suspected of having HIV. Then, we evaluate the effectiveness of two types of classification models for identifying patients suspected of being infected with the virus: classical ML algorithms and two Large Language Models (LLMs) from the biomedical domain in Spanish. The results show that both LLMs outperform classical ML algorithms in the two settings we explore: a balanced version of the dataset, containing an equal number of suspected and non-suspected patients, and an unbalanced version that reflects the real distribution of patients in the hospital. We obtain an F-score of 94.7 with both LLMs in the unbalanced setting, while in the balanced setting the RoBERTa model outperforms the other LLM with an F-score of 95.7. The findings indicate that leveraging unstructured text with LLMs in the biomedical domain yields promising outcomes in reducing missed opportunities for HIV diagnosis. A tool based on our system could assist a doctor in deciding whether a patient in consultation should undergo a serological test.
Contact

Phone: 913988766 / 6633 / 7891 / 6172

Email: repositoriobiblioteca@adm.uned.es