Publication:
MIDI-Conditional Text-to-Audio Synthesis Using ControlNet on AudioLDM

dc.contributor.author: Ibáñez Martínez, Laura
dc.contributor.director: Cuadra Troncoso, José Manuel
dc.date.accessioned: 2024-09-18T13:01:24Z
dc.date.available: 2024-09-18T13:01:24Z
dc.date.issued: 2023-09
dc.description.abstract: Text-to-audio systems have gained attention in recent months, achieving impressive results in general audio synthesis. However, they often lack fine-grained control over the musical output, because note-level details cannot be specified through text. In this work, we present MIDI-AudioLDM, which adds MIDI conditioning to AudioLDM by means of ControlNet. This enables MIDI-conditional text-to-audio synthesis, which extends AudioLDM’s existing capabilities: direct text-to-audio synthesis, audio style transfer, and inpainting. Like AudioLDM, the model uses contrastive language-audio pretraining (CLAP) latents and is trained on audio embeddings, while using text embeddings for inference. In contrast to unconditional audio synthesis, MIDI-AudioLDM offers detailed control over various musical aspects such as notes, genre, mood, and timbre, which makes it a more valuable tool for the music production process. A demo is available at https://huggingface.co/spaces/lauraibnz/midi-audioldm. [en]
dc.identifier.citation: Ibáñez Martínez, Laura (2023). MIDI-Conditional Text-to-Audio Synthesis Using ControlNet on AudioLDM. Master's thesis. Universidad de Educación a Distancia (UNED)
dc.identifier.uri: https://hdl.handle.net/20.500.14468/23782
dc.language.iso: en
dc.relation.center: Faculties and schools::E.T.S. de Ingeniería Informática
dc.relation.degree: Máster universitario en Investigación en Inteligencia Artificial (University Master's Degree in Artificial Intelligence Research)
dc.relation.department: Artificial Intelligence
dc.rights: info:eu-repo/semantics/openAccess
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es
dc.subject: 12 Mathematics::1203 Computer science::1203.04 Artificial intelligence
dc.subject: 12 Mathematics::1203 Computer science::1203.17 Informatics
dc.subject.keywords: audio synthesis [en]
dc.subject.keywords: MIDI conditioning [en]
dc.subject.keywords: text-to-audio systems [en]
dc.subject.keywords: AudioLDM [en]
dc.subject.keywords: ControlNet [en]
dc.title: MIDI-Conditional Text-to-Audio Synthesis Using ControlNet on AudioLDM [es]
dc.type: tesis de maestría [es]
dc.type: master thesis [en]
dspace.entity.type: Publication
Files
Original bundle
Name: Ibanez_Martinez_Laura_TFM.pdf
Size: 1.88 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 3.62 KB
Format: Item-specific license agreed to upon submission