Speak2Subs: Evaluating State-of-the-Art Speech Recognition Models and Compliant Subtitle Generation

Fresneda García, Julio. (2024). Speak2Subs: Evaluating State-of-the-Art Speech Recognition Models and Compliant Subtitle Generation. Master's thesis, Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática.

Files
Name | Description | MIME type | Size
Fresneda_Garcia_Julio_Antonio_TFM.pdf | Fresneda Garcia_Julio Antonio_TFM.pdf | application/pdf | 2.85MB

Title Speak2Subs: Evaluating State-of-the-Art Speech Recognition Models and Compliant Subtitle Generation
Author(s) Fresneda García, Julio
Abstract With recent advances in large language models, the evolution of speech-to-text tasks has been exponential. While state-of-the-art automatic speech recognition (ASR) models have taken a big step in speech transcription, creating quality subtitles still requires human intervention. This project has two main aspects: evaluating cutting-edge ASR models for speech-to-text, and developing a package that uses these ASR models to generate high-quality, compliant subtitles. ASR models do not inherently provide results suitable for subtitles. Therefore, one of the primary objectives of this package is to use and enhance the output generated by ASR models to create subtitles of a quality that requires minimal human modification. This enhancement is necessary because ASR models alone are incapable of producing subtitles that meet the required standards of quality. Speak2Subs has achieved this goal, being a tool that produces high-quality subtitles with minimal human interaction.
Additional notes Master's thesis (Trabajo de Fin de Máster), Máster Universitario en Ingeniería y Ciencia de Datos. UNED
Subject(s) Computer Engineering
Keywords ASR
LLM
Speech-To-Text
Subtitle
Publisher(s) Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática.
Supervisor/Tutor Pérez Martín, Jorge
Rodrigo Yuste, Álvaro
Date 2024-02
Format application/pdf
Identifier bibliuned:master-ETSInformatica-ICD-Jfresneda
http://e-spacio.uned.es/fez/view/bibliuned:master-ETSInformatica-ICD-Jfresneda
Language eng
Publication version acceptedVersion
Access level and license http://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Resource type master Thesis
Access type Open access

 
Access statistics: 59 views, 37 downloads
Created: Fri, 15 Mar 2024, 21:32:38 CET