Speak2Subs: Evaluating State-of-the-Art Speech Recognition Models and Compliant Subtitle Generation

Fresneda García, Julio

Date

2024-02

Access rights

info:eu-repo/semantics/openAccess

Publisher

Universidad Nacional de Educación a Distancia (Spain). Escuela Técnica Superior de Ingeniería Informática.

Abstract

With recent advances in large language models, speech-to-text technology has progressed rapidly. Although state-of-the-art automatic speech recognition (ASR) models have markedly improved speech transcription, creating quality subtitles still requires human intervention. This project has two main parts: evaluating cutting-edge ASR models on speech-to-text tasks, and developing a package that uses these ASR models to generate high-quality, compliant subtitles. ASR models alone do not produce output that meets the quality standards required of subtitles, so one of the package's primary objectives is to use and enhance that output until the resulting subtitles require only minimal human modification. Speak2Subs achieves this goal: it is a tool that produces high-quality subtitles with minimal human interaction.

Keywords

ASR, LLM, Speech-To-Text, Subtitle

School

E.T.S. de Ingeniería Informática

Department

Inteligencia Artificial

Handle

https://hdl.handle.net/20.500.14468/22596

Collections

Master's theses (TFM)
