Publication:
Enhancing Distributed Neural Network Training Through Node-Based Communications

dc.contributor.author: Moreno Álvarez, Sergio
dc.contributor.author: Paoletti, Mercedes Eugenia
dc.contributor.author: Cavallaro, Gabriele
dc.contributor.author: Haut, Juan M.
dc.contributor.orcid: https://orcid.org/0000-0003-1030-3729
dc.contributor.orcid: https://orcid.org/0000-0002-3239-9904
dc.contributor.orcid: https://orcid.org/0000-0001-6701-961X
dc.date.accessioned: 2024-11-20T08:07:09Z
dc.date.available: 2024-11-20T08:07:09Z
dc.date.issued: 2023
dc.description: The registered version of this article, first published in "IEEE Transactions on Neural Networks and Learning Systems, 2023", is available online at the publisher's website: IEEE, https://doi.org/10.1109/TNNLS.2023.3309735
dc.description.abstract: The amount of data needed to effectively train modern deep neural architectures has grown significantly, leading to increased computational requirements. These intensive computations are tackled by combining latest-generation computing resources, such as accelerators, with classic processing units. Nevertheless, gradient communication remains the major bottleneck, limiting efficiency despite the runtime improvements obtained through data parallelism strategies. Data parallelism involves all processes in a global exchange of a potentially large amount of data, which may prevent the desired speedup and introduce noticeable delays or bottlenecks. As a result, communication latency poses a significant challenge that profoundly impacts performance on distributed platforms. This research presents node-based optimization steps that significantly reduce the gradient exchange between model replicas whilst ensuring model convergence. The proposal serves as a versatile communication scheme, suitable for integration into a wide range of general-purpose deep neural network (DNN) algorithms. The optimization takes into account the specific location of each replica within the platform. To demonstrate its effectiveness, different neural network approaches and datasets with disjoint properties are used. In addition, multiple types of applications are considered to demonstrate the robustness and versatility of the proposal. The experimental results show a global training time reduction whilst slightly improving accuracy. Code: https://github.com/mhaut/eDNNcomm (en)
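The abstract's node-aware idea, averaging gradients within each node first so that only one message per node crosses the inter-node network, can be sketched as a minimal NumPy simulation. The function name and the node-assignment map below are illustrative assumptions, not taken from the released eDNNcomm code:

```python
import numpy as np

def node_based_average(grads, node_of):
    """Two-level gradient averaging for data-parallel training.

    grads   -- one gradient array per replica (process)
    node_of -- node_of[rank] gives the node hosting that replica

    Stage 1 averages over fast intra-node links; stage 2 then needs
    only one (weighted) message per node instead of one per replica.
    """
    # Stage 1: group replicas by node and average locally.
    per_node = {}
    for rank, g in enumerate(grads):
        per_node.setdefault(node_of[rank], []).append(g)
    node_means = {n: np.mean(gs, axis=0) for n, gs in per_node.items()}
    counts = {n: len(gs) for n, gs in per_node.items()}

    # Stage 2: weight each node's mean by its replica count so the
    # result equals the exact global mean over all replicas.
    total = sum(counts.values())
    return sum(node_means[n] * counts[n] for n in node_means) / total

# Four replicas spread over two nodes: the global exchange involves
# two node-level messages instead of four replica-level ones.
grads = [np.array([1., 2.]), np.array([3., 4.]),
         np.array([5., 6.]), np.array([7., 8.])]
result = node_based_average(grads, node_of=[0, 0, 1, 1])
```

With this weighting, the two-stage result matches a flat all-replica average exactly, so convergence behaviour is unchanged while inter-node traffic shrinks with the number of replicas per node.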
dc.description.version: published version
dc.identifier.citation: S. Moreno-Álvarez, M. E. Paoletti, G. Cavallaro and J. M. Haut, "Enhancing Distributed Neural Network Training Through Node-Based Communications," in IEEE Transactions on Neural Networks and Learning Systems, doi: 10.1109/TNNLS.2023.3309735
dc.identifier.doi: https://doi.org/10.1109/TNNLS.2023.3309735
dc.identifier.issn: 2162-237X | eISSN 2162-2388
dc.identifier.uri: https://hdl.handle.net/20.500.14468/24438
dc.journal.title: IEEE Transactions on Neural Networks and Learning Systems
dc.language.iso: en
dc.page.final: 15
dc.page.initial: 1
dc.publisher: IEEE
dc.relation.center: Faculties and Schools::E.T.S. de Ingeniería Informática
dc.relation.department: Lenguajes y Sistemas Informáticos
dc.rights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.es
dc.subject: 12 Mathematics::1203 Computer Science::1203.17 Informatics
dc.subject.keywords: Training (en)
dc.subject.keywords: Computational modeling (en)
dc.subject.keywords: Data models (en)
dc.subject.keywords: Distributed databases (en)
dc.subject.keywords: Parallel processing (en)
dc.subject.keywords: Costs (en)
dc.subject.keywords: Optimization (en)
dc.title: Enhancing Distributed Neural Network Training Through Node-Based Communications (en)
dc.type: article (es)
dc.type: journal article (en)
dspace.entity.type: Publication
relation.isAuthorOfPublication: 3482d7bc-e120-48a3-812e-cc4b25a6d2fe
relation.isAuthorOfPublication.latestForDiscovery: 3482d7bc-e120-48a3-812e-cc4b25a6d2fe
Files
Original bundle
Name: MorenoAlvarez_Sergio_2023EnhancingDistributed.pdf
Size: 3.03 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 3.62 KB
Format: Item-specific license agreed to upon submission