Generative Adversarial Networks for text-to-face synthesis & generation: A quantitative–qualitative analysis of Natural Language Processing encoders for Spanish

Yauri-Lozano, Eduardo, Castillo-Cara, Manuel, Orozco-Barbosa, Luis and García-Castro, Raúl. (2024) Generative Adversarial Networks for text-to-face synthesis & generation: A quantitative–qualitative analysis of Natural Language Processing encoders for Spanish. Information Processing and Management

Files
Name Description MIME type Size
Castillo_Cara_Jose_Manuel_GANs.pdf Castillo Cara_Jose Manuel_GANs.pdf application/pdf 1.73MB

Title Generative Adversarial Networks for text-to-face synthesis & generation: A quantitative–qualitative analysis of Natural Language Processing encoders for Spanish
Author(s) Yauri-Lozano, Eduardo
Castillo-Cara, Manuel
Orozco-Barbosa, Luis
García-Castro, Raúl
Subject(s) Computer Science
Abstract In recent years, the development of Natural Language Processing (NLP) text-to-face encoders and Generative Adversarial Networks (GANs) has enabled the synthesis and generation of facial images from textual descriptions. However, most encoders have been developed for the English language. This work presents the first study of three text-to-face encoders, namely, the RoBERTa pre-trained model and the Sent2Vec and RoBERTa models, trained with the CelebA dataset in Spanish. It then introduces customised and fine-tuned conditional Deep Convolutional Generative Adversarial Networks (cDCGANs) trained with the CelebA dataset for text-to-face generation in Spanish. To validate the results obtained, a qualitative evaluation was carried out with a visual analysis and a quantitative evaluation based on the IS, FID and LPIPS metrics. Our findings show promising results with respect to the literature, improving the numerical metrics of FID and LPIPS by 5% and 37%, respectively. Our results also show, through a quantitative–qualitative comparison of the cDCGAN training epochs, that the IS metric is not a reliable objective metric to be considered in the evaluation of similar works.
Keywords Image synthesis
CelebA dataset
RoBERTa transformer
Spanish
cDCGAN
Text-to-face generation
Text-to-image synthesis
Publisher(s) Elsevier
Date 2024-01
Format application/pdf
Identifier bibliuned:557-Jmcastillo-0001
http://e-spacio.uned.es/fez/view/bibliuned:557-Jmcastillo-0001
DOI - identifier https://doi.org/10.1016/j.ipm.2024.103667
ISSN - identifier 0306-4573 eISSN 1873-5371
Journal name Information Processing and Management
Volume number 61
Issue number 3
Published in journal Information Processing and Management
Language spa
Publication version publishedVersion
Resource type Article
Access rights and license http://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Access type Open access
Additional notes The registered version of this article, first published in Information Processing and Management, is available online at the publisher's website: Elsevier https://doi.org/10.1016/j.ipm.2024.103667

Created: Wed, 28 Feb 2024, 00:12:57 CET