Publication:
RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects

Date
2023-09-04
Access rights
info:eu-repo/semantics/openAccess
Publisher
IEEE Xplore
Abstract
We present RGB-D-Fusion, a multi-modal conditional denoising diffusion probabilistic model to generate high-resolution depth maps from low-resolution monocular RGB images of humanoid subjects. Accurately representing the human body in 3D is a very active research field given its wide variety of applications. Most 3D reconstruction algorithms rely on depth maps, coming either from low-resolution consumer-level depth sensors or from monocular depth estimation on standard images. While many modern frameworks use VAEs or GANs for monocular depth estimation, we leverage recent advances in the field of denoising diffusion probabilistic models. We implement a multi-stage conditional diffusion model that first generates a low-resolution depth map conditioned on an image and then upsamples the depth map conditioned on a low-resolution RGB-D image. We further introduce a novel augmentation technique, depth noise augmentation, to increase the robustness of our super-resolution model. Lastly, we show how our method performs on a wide variety of humans with different body types, clothing and poses.
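To make the two-stage pipeline described in the abstract concrete, the sketch below outlines a plausible conditional DDPM sampler: stage 1 generates a low-resolution depth map conditioned on the RGB image, stage 2 super-resolves it conditioned on the low-resolution RGB-D pair. The noise schedule, the 64-to-256 resolutions, the stand-in denoisers (base_denoiser, sr_denoiser) and the exact form of depth_noise_augmentation are illustrative assumptions, not the authors' released implementation.

    # Hedged sketch of a two-stage conditional DDPM pipeline (assumed details).
    import numpy as np

    T = 1000                                   # number of diffusion steps (assumed)
    betas = np.linspace(1e-4, 0.02, T)         # linear noise schedule (assumed)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    def ddpm_sample(denoiser, shape, cond, rng):
        """Ancestral DDPM sampling of a depth map conditioned on `cond`."""
        x = rng.standard_normal(shape)         # start from pure Gaussian noise
        for t in reversed(range(T)):
            eps = denoiser(x, t, cond)         # predicted noise eps_theta(x_t, t, cond)
            coef = (1.0 - alphas[t]) / np.sqrt(1.0 - alpha_bars[t])
            mean = (x - coef * eps) / np.sqrt(alphas[t])
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(shape) if t > 0 else mean
        return x

    def depth_noise_augmentation(depth_lr, rng, max_std=0.05):
        """Perturb the low-resolution depth condition; applied during
        super-resolution training to improve robustness (assumed form)."""
        std = rng.uniform(0.0, max_std)
        return depth_lr + std * rng.standard_normal(depth_lr.shape)

    # Hypothetical stand-in denoisers; a real model would be a conditional U-Net.
    def base_denoiser(x, t, rgb_lr):
        return np.zeros_like(x)                # placeholder: predicts zero noise

    def sr_denoiser(x, t, rgbd_lr):
        return np.zeros_like(x)

    rng = np.random.default_rng(0)
    rgb_lr = rng.random((3, 64, 64))           # low-resolution RGB input (assumed size)

    # Stage 1: low-resolution depth map conditioned on the RGB image.
    depth_lr = ddpm_sample(base_denoiser, (1, 64, 64), rgb_lr, rng)

    # Depth noise augmentation would be applied to this condition during SR training.
    augmented_depth = depth_noise_augmentation(depth_lr, rng)

    # Stage 2: super-resolve, conditioned on the low-resolution RGB-D pair.
    rgbd_cond = np.concatenate([rgb_lr, depth_lr], axis=0)
    depth_hr = ddpm_sample(sr_denoiser, (1, 256, 256), rgbd_cond, rng)
    print(depth_hr.shape)                      # -> (1, 256, 256)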
Description
This is the accepted manuscript of the article. The registered version was first published in IEEE Access, vol. 11, pp. 99111-99129, 2023, and is available online at the publisher's website: https://doi.org/10.1109/ACCESS.2023.3312017
Keywords
solid modeling, data models, estimation, three-dimensional displays, noise reduction, cameras, probabilistic logic, diffusion processes, deep learning, superresolution, augmented reality, virtual reality, diffusion models, generative deep learning, monocular depth estimation, depth super-resolution, multi-modal, augmented-reality, virtual-reality
Citation
S. Kirch, V. Olyunina, J. Ondřej, R. Pagés, S. Martín and C. Pérez-Molina, "RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects," in IEEE Access, vol. 11, pp. 99111-99129, 2023, doi: 10.1109/ACCESS.2023.3312017.
Center
Faculties and schools::E.T.S. de Ingenieros Industriales
Department
Ingeniería Eléctrica, Electrónica, Control, Telemática y Química Aplicada a la Ingeniería