Publication: MIDI-Conditional Text-to-Audio Synthesis Using ControlNet on AudioLDM
dc.contributor.author | Ibáñez Martínez, Laura | |
dc.contributor.director | Cuadra Troncoso, José Manuel | |
dc.date.accessioned | 2024-09-18T13:01:24Z | |
dc.date.available | 2024-09-18T13:01:24Z | |
dc.date.issued | 2023-09 | |
dc.description.abstract | Text-to-audio systems have gained attention in recent months, achieving impressive results in general audio synthesis. However, they often lack fine-grained control over the musical output, because note-level details cannot be specified through text. In this work, we present MIDI-AudioLDM, which adds MIDI conditioning to AudioLDM by means of ControlNet. This enables MIDI-conditional text-to-audio synthesis, which extends AudioLDM’s existing capabilities, including direct text-to-audio synthesis as well as audio style transfer and inpainting. Like AudioLDM, the model uses contrastive language-audio pretraining (CLAP) latents and is trained on audio embeddings, while using text embeddings for inference. In contrast to unconditional audio synthesis, MIDI-AudioLDM offers detailed control over various musical aspects such as notes, genre, mood, and timbre, which makes it a more valuable tool for the music production process. A demo is available at https://huggingface.co/spaces/lauraibnz/midi-audioldm. | en |
dc.identifier.citation | Ibáñez Martínez, Laura (2023) MIDI-Conditional Text-to-Audio Synthesis Using ControlNet on AudioLDM. Trabajo Fin de Máster. Universidad de Educación a Distancia (UNED) | |
dc.identifier.uri | https://hdl.handle.net/20.500.14468/23782 | |
dc.language.iso | en | |
dc.relation.center | Facultades y escuelas::E.T.S. de Ingeniería Informática | |
dc.relation.degree | Máster universitario en Investigación en Inteligencia Artificial | |
dc.relation.department | Inteligencia Artificial | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es | |
dc.subject | 12 Mathematics::1203 Computer sciences::1203.04 Artificial intelligence | |
dc.subject | 12 Mathematics::1203 Computer sciences::1203.17 Informatics | |
dc.subject.keywords | audio synthesis | en |
dc.subject.keywords | MIDI conditioning | en |
dc.subject.keywords | text-to-audio systems | en |
dc.subject.keywords | AudioLDM | en |
dc.subject.keywords | ControlNet | en |
dc.title | MIDI-Conditional Text-to-Audio Synthesis Using ControlNet on AudioLDM | es |
dc.type | tesis de maestría | es |
dc.type | master thesis | en |
dspace.entity.type | Publication |
Files
Original bundle
- Name: Ibanez_Martinez_Laura_TFM.pdf
- Size: 1.88 MB
- Format: Adobe Portable Document Format