Publication: Heterogeneous gradient computing optimization for scalable deep neural networks
Date
2022
Authors
Sergio Moreno-Álvarez, Mercedes E. Paoletti, Juan A. Rico-Gallego, Juan M. Haut
Access rights
info:eu-repo/semantics/openAccess
Journal title
The Journal of Supercomputing
Publisher
Springer
Abstract
Nowadays, data processing applications based on neural networks must cope with the growth in the amount of data to be processed and with the increase in both the depth and complexity of neural network architectures, and hence in the number of parameters to be learned. High-performance computing platforms provide fast computing resources, including multi-core processors and graphics processing units, to manage the computational burden of deep neural network applications. A common optimization technique is to distribute the workload among the processes deployed on the resources of the platform, an approach known as data parallelism. Each process, known as a replica, trains its own copy of the model on a disjoint data partition. Nevertheless, the heterogeneity of the computational resources composing the platform requires distributing the workload unevenly among the replicas, according to their computational capabilities, to optimize overall execution performance. Since the amount of data to be processed differs between replicas, the influence of the gradients computed by each replica on the global parameter update should also differ. This work proposes a modification of the gradient computation method that takes into account the different speeds of the replicas, and hence the amount of data assigned to each. Experiments were conducted on heterogeneous high-performance computing platforms for a wide range of models and datasets, showing an improvement in the final accuracy with respect to current techniques, with comparable performance.
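The following minimal Python sketch illustrates the general idea described in the abstract: combining per-replica gradients with weights proportional to each replica's data share instead of a plain average. It is not the article's exact formulation; the function name, the linear weighting scheme, and the sample counts are illustrative assumptions.

import numpy as np

def weighted_gradient_aggregation(grads, samples_per_replica):
    # grads: list of per-replica gradient arrays (same shape).
    # samples_per_replica: number of samples each replica processed this step.
    # Weight each replica's gradient by its share of the total data,
    # so faster replicas with larger partitions influence the update more.
    total = sum(samples_per_replica)
    weights = [n / total for n in samples_per_replica]
    return sum(w * g for w, g in zip(weights, grads))

# Example: three replicas with an uneven workload split (hypothetical values).
grads = [np.array([1.0, 2.0]), np.array([0.5, 1.5]), np.array([2.0, 0.0])]
samples = [600, 300, 100]
update = weighted_gradient_aggregation(grads, samples)
print(update)  # gradient used for the global parameter update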
Description
The registered version of this article, first published in "The Journal of Supercomputing, 78, 2022", is available online at the publisher's website: Springer, https://doi.org/10.1007/s11227-022-04399-2
Keywords
deep learning, deep neural networks, high-performance computing, heterogeneous platforms, distributed training
Citation
Sergio Moreno-Álvarez, Mercedes E. Paoletti, Juan A. Rico-Gallego, Juan M. Haut. "Heterogeneous gradient computing optimization for scalable deep neural networks". The Journal of Supercomputing, 78(11), 19 March 2022, pp. 13455–13469.
Center
Faculties and schools::E.T.S. de Ingeniería Informática
Department
Lenguajes y Sistemas Informáticos