Person: Moreno Álvarez, Sergio
Email address
ORCID
0000-0002-1858-9920
Date of birth
Research projects
Organizational units
Job title
Last name
Moreno Álvarez
First name
Sergio
Name
22 results
Search results
Showing 1 - 10 of 22
Publication Evaluación de Rendimiento del Entrenamiento Distribuido de Redes Neuronales Profundas en Plataformas Heterogéneas (Universidad de Extremadura, 2019) Moreno Álvarez, Sergio; Paoletti, Mercedes Eugenia; Haut, Juan Mario; Rico Gallego, Juan Antonio; Plaza, Javier; Díaz Martín, Juan Carlos; Vega Rodriguez, Miguel Ángel; Plaza Miguel, Antonio J.; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0001-6701-961X; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0002-8908-1606; https://orcid.org/0000-0002-8435-3844
Asynchronous stochastic gradient descent is an optimization technique commonly used in the distributed training of deep neural networks. In distributions based on data partitioning, a replica of the model is trained on each processing unit of the platform, using sets of samples called mini-batches. This is an iterative process in which, at the end of each mini-batch, the replicas combine the computed gradients to update their local copy of the parameters. However, with asynchrony, differences in per-iteration training time between replicas give rise to staleness: the replicas progress at different speeds, and each replica trains on an outdated version of the parameters. A high degree of staleness has a negative impact on the accuracy of the resulting model. Moreover, high-performance computing platforms are usually heterogeneous, composed of CPUs and GPUs of different capabilities, which aggravates the staleness problem. This work proposes applying computational load-balancing techniques, well known in the field of High-Performance Computing, to the distributed training of deep models. Each replica is assigned a number of mini-batches in proportion to its relative speed. Experimental results obtained on a heterogeneous platform show that, while accuracy remains constant, training performance increases considerably or, from another point of view, higher accuracy in the model's estimates is reached in the same training time. We discuss the causes of this performance increase and propose next steps for future research.

Publication Correlation-Aware Averaging for Federated Learning in Remote Sensing Data Classification (IEEE, 2024) Moreno Álvarez, Sergio; Han, Lirong; Paoletti, Mercedes Eugenia; Haut, Juan Mario; https://orcid.org/0000-0002-8613-7037; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0001-6701-961X
The increasing volume of remote sensing (RS) data offers substantial benefits for the extraction and interpretation of features from these scenes. Indeed, the detection of distinguishing features among captured materials and objects is crucial for classification purposes, such as in environmental monitoring applications. In these algorithms, the classes characterized by lower correlation often exhibit more distinct and discernible features, facilitating their differentiation in a straightforward manner. Nevertheless, the rise of Big Data provides a wide range of data acquired through multiple decentralized devices, whose susceptibility to being shared among various users or clients presents challenges in safeguarding privacy. Meanwhile, global features for similar classes must be learned for generalization purposes in the classification process. To address this, federated learning (FL) emerges as a privacy-efficient decentralized solution. Firstly, in such scenarios, proprietary data is held by individual clients participating in the training of a global model. Secondly, clients may encounter challenges in identifying features that are more distinguishable within the data distributions of other clients.
In this study, to handle these challenges, a novel methodology is proposed that considers the least correlated classes (LCCs) included in each client data distribution. This strategy exploits the distinctive features between classes, thereby enhancing performance and generalization ability in a secure and private environment.

Publication Optimizing Distributed Deep Learning in Heterogeneous Computing Platforms for Remote Sensing Data Classification (IEEE, 2022) Moreno Álvarez, Sergio; Paoletti, Mercedes Eugenia; Rico Gallego, Juan Antonio; Cavallaro, Gabriele; Haut, Juan M.; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0002-3239-9904; https://orcid.org/0000-0001-6701-961X
Applications from Remote Sensing (RS) have unveiled unique challenges to Deep Learning (DL) due to the high volume and complexity of their data. On the one hand, deep neural network architectures have the capability to automatically extract informative features from RS data. On the other hand, these models have massive amounts of tunable parameters, requiring high computational capabilities. Distributed DL with data parallelism on High-Performance Computing (HPC) systems has proved necessary in dealing with the demands of DL models. Nevertheless, a single HPC system can already be highly heterogeneous and include different computing resources with uneven processing power. In this context, a standard data parallelism strategy does not partition the data efficiently according to the available computing resources. This paper proposes an alternative approach to compute the gradient, which guarantees that the contribution to the gradient calculation is proportional to the processing speed of each DL model's replica.
The experimental results are obtained on a heterogeneous HPC system with RS data and demonstrate that the proposed approach provides a significant training speedup and a gain in global accuracy compared to a state-of-the-art distributed DL framework.

Publication Deep Attention-Driven HSI Scene Classification Based on Inverted Dot-Product (Institute of Electrical and Electronics Engineers Inc., 2022) Paoletti, Mercedes Eugenia; Tao, Xuanwen; Han, Lirong; Wu, Zhaoyue; Moreno Álvarez, Sergio; Haut, Juan M.; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0003-1093-0079; https://orcid.org/0000-0002-8613-7037; https://orcid.org/0000-0002-6797-2440; https://orcid.org/0000-0001-6701-961X
Capsule networks have been a breakthrough in the field of automatic image analysis, opening a new frontier in the art for image classification. Nevertheless, these models were initially designed for RGB images, and naively applying these techniques to remote sensing hyperspectral images (HSI) may lead to sub-optimal behaviour, blowing up the number of parameters needed to train the model or not correctly modeling the spectral relations between the different layers of the scene. To overcome this drawback, this work implements a new capsule-based architecture with an attention mechanism to improve HSI data processing. The attention mechanism is applied during the concurrent iterative routing procedure through an inverted dot-product attention

Publication Deep Robust Hashing Using Self-Distillation for Remote Sensing Image Retrieval (IEEE, 2024) Han, Lirong; Paoletti, Mercedes Eugenia; Moreno Álvarez, Sergio; Haut, Juan Mario; Plaza, Antonio; https://orcid.org/0000-0002-8613-7037; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0001-6701-961X; https://orcid.org/0000-0002-9613-1659
This paper presents a novel self-distillation based deep robust hash for fast remote sensing (RS) image retrieval.
Specifically, there are two primary processes in our proposed model: teacher learning (TL) and student learning (SL). Two transformed samples are produced from one sample image through nuanced and signalized transformations, respectively. Transformed samples are fed into both the TL and the SL flows. To reduce discrepancies in the processed samples and guarantee a consistent hash code, the parameters are shared by the two modules during the training stage. Then, a resilient module is employed to enhance the image features in order to ensure more dependable hash code production. Lastly, a three-component loss function is developed to train the entire model. Comprehensive experiments are conducted on two common RS datasets: UCMerced and AID. The experimental results validate that the proposed method has competitive performance against other RS image hashing methods.

Publication Heterogeneous model parallelism for deep neural networks (ELSEVIER, 2021-06-21) Moreno Álvarez, Sergio; Haut, Juan M.; Paoletti, Mercedes Eugenia; Rico Gallego, Juan Antonio; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0002-4264-7473
Deep neural networks (DNNs) have transformed computer vision, establishing themselves as the current state-of-the-art for image processing. Nevertheless, the training of current large DNN models is one of the main challenges to be solved. In this sense, data-parallelism has been the most widespread distributed training strategy since it is easy to program and can be applied to almost all cases. However, this solution suffers from several limitations, such as its high communication requirements and the memory constraints when training very large models. To overcome these limitations, model-parallelism has been proposed, solving the most substantial problems of the former strategy. However, describing and implementing the parallelization of the training of a DNN model across a set of processes deployed on several devices is a challenging task.
Currently proposed solutions assume a homogeneous distribution, which is impractical when working with devices of different computational capabilities, a situation quite common on high performance computing platforms. To address these shortcomings, this work proposes a novel model-parallelism technique considering heterogeneous platforms, where a load balancing mechanism between uneven devices of an HPC platform has been implemented. Our proposal takes advantage of Google Brain’s Mesh-TensorFlow for convolutional networks, splitting computing tensors across the filter dimension in order to balance the computational load of the available devices. Conducted experiments show an improvement in the exploitation of heterogeneous computational resources, enhancing the training performance. The code is available on: https://github.com/mhaut/HeterogeneusModelDNN.

Publication Parameter-Free Attention Network for Spectral–Spatial Hyperspectral Image Classification (IEEE, 2023) Paoletti, Mercedes Eugenia; Tao, Xuanwen; Han, Lirong; Wu, Zhaoyue; Moreno Álvarez, Sergio; Kumar Roy, Swalpa; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0003-1093-0079; https://orcid.org/0000-0002-8613-7037; https://orcid.org/0000-0002-6797-2440; https://orcid.org/0000-0002-6580-3977
Hyperspectral images (HSIs) comprise plenty of information in the spatial and spectral domain, which is highly beneficial for performing classification tasks in a very accurate way. Recently, attention mechanisms have been widely used in HSI classification due to their ability to extract relevant spatial and spectral features. Notwithstanding their positive results, most attentional strategies introduce a significant number of parameters to be trained, making the models more complex and increasing the computational load. In this article, we develop a new parameter-free attention network for HSI classification.
The main advantage of our model is that it does not add parameters to the original network (as opposed to other state-of-the-art approaches) while providing higher classification accuracies. Extensive experimental validations and quantitative comparisons are conducted, using different benchmark HSIs, to illustrate these advantages. The code is available on https://github.com/mhaut/Free2Resnet

Publication Distributed Deep Learning for Remote Sensing Data Interpretation (IEEE, 2021-03-15) Haut, Juan Mario; Paoletti, Mercedes Eugenia; Moreno Álvarez, Sergio; Plaza, Javier; Rico Gallego, Juan Antonio; Plaza, Antonio; https://orcid.org/0000-0001-6701-961X; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0002-2384-9141; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0002-9613-1659
As a newly emerging technology, deep learning (DL) is a very promising field in big data applications. Remote sensing often involves huge data volumes obtained daily by numerous in-orbit satellites. This makes it a perfect target area for data-driven applications. Nowadays, technological advances in terms of software and hardware have a noticeable impact on Earth observation applications, more specifically in remote sensing techniques and procedures, allowing for the acquisition of data sets with greater quality at higher acquisition ratios. This results in the collection of huge amounts of remotely sensed data, characterized by their large spatial resolution (in terms of the number of pixels per scene), and very high spectral dimensionality, with hundreds or even thousands of spectral bands. As a result, remote sensing instruments on spaceborne and airborne platforms are now generating data cubes with extremely high dimensionality, imposing several restrictions in terms of both processing runtimes and storage capacity.
In this article, we provide a comprehensive review of the state of the art in DL for remote sensing data interpretation, analyzing the strengths and weaknesses of the most widely used techniques in the literature, as well as an exhaustive description of their parallel and distributed implementations (with a particular focus on those conducted using cloud computing systems). We also provide quantitative results, offering an assessment of a DL technique in a specific case study (source code available: https://github.com/mhaut/cloud-dnn-HSI). This article concludes with some remarks and hints about future challenges in the application of DL techniques to distributed remote sensing data interpretation problems. We emphasize the role of the cloud in providing a powerful architecture that is now able to manage vast amounts of remotely sensed data due to its implementation simplicity, low cost, and high efficiency compared to other parallel and distributed architectures, such as grid computing or dedicated clusters.

Publication Deep shared proxy construction hashing for cross-modal remote sensing image fast target retrieval (ELSEVIER, 2024) Han, Lirong; Paoletti, Mercedes Eugenia; Moreno Álvarez, Sergio; Haut, Juan M.; Plaza, Antonio; https://orcid.org/0000-0002-8613-7037; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0001-6701-961X; https://orcid.org/0000-0002-9613-1659
The diversity of remote sensing (RS) image modalities has expanded alongside advancements in RS technologies. A plethora of optical, multispectral, and hyperspectral RS images offer rich geographic class information. The ability to swiftly access multiple RS image modalities is crucial for fully harnessing the potential of RS imagery. In this work, an innovative method, called Deep Shared Proxy Construction Hashing (DSPCH), is introduced for cross-modal hyperspectral scene target retrieval using accessible RS images such as optical and sketch images.
Initially, a shared proxy hash code is generated in the hash space for each land use class. Subsequently, an end-to-end deep hash network is built to generate hash codes for hyperspectral pixels and accessible RS images. Furthermore, a proxy hash loss function is designed to optimize the proposed deep hashing network, aiming to generate hash codes that closely resemble the corresponding proxy hash code. Finally, two benchmark datasets are established for cross-modal hyperspectral and accessible RS image retrieval, allowing us to conduct extensive experiments with these datasets. Our experimental results validate that the novel DSPCH method can efficiently and effectively achieve RS image cross-modal target retrieval, opening up new avenues in the field of cross-modal RS image retrieval.

Publication Hashing for Retrieving Long-Tailed Distributed Remote Sensing Images (IEEE, 2024) Han, Lirong; Paoletti, Mercedes Eugenia; Moreno Álvarez, Sergio; Haut, Juan M.; Pastor Vargas, Rafael; Plaza, Antonio; https://orcid.org/0000-0002-8613-7037; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0001-6701-961X; https://orcid.org/0000-0002-4089-9538; https://orcid.org/0000-0002-9613-1659
The widespread availability of remotely sensed datasets establishes a cornerstone for comprehensive image retrieval within the realm of remote sensing (RS). In response, the investigation into hashing-driven retrieval methods gains significance, enabling efficient image retrieval at such extensive data scales. Nevertheless, the datasets used in practical applications are invariably less ideal and follow a long-tailed distribution. The primary hurdle pertains to the substantial discrepancy in class volumes. Moreover, commonly utilized RS datasets for hashing tasks encompass approximately two to three dozen classes. However, real-world datasets exhibit a randomized number of classes, introducing a challenging variability.
This article proposes a new centripetal intensive attention hashing (CIAH) mechanism based on intensive attention features for long-tailed distribution RS image retrieval. Specifically, an intensive attention module (IAM) is adopted to enhance the significant features and facilitate the subsequent generation of representative hash codes. Furthermore, to deal with the inherent imbalance of long-tailed distributed datasets, a centripetal loss function is introduced. This is the first effort toward long-tailed distributed RS image retrieval. In pursuit of this objective, a collection of long-tail datasets is curated from four widely recognized RS datasets and subsequently released as benchmark datasets. The selected base datasets contain 7, 25, 38, and 45 land-use classes to mimic different real RS datasets. Conducted experiments demonstrate that the proposed methodology surpasses existing methodologies.