Person:
Moreno Álvarez, Sergio

ORCID

0000-0002-1858-9920

Surname

Moreno Álvarez

First name

Sergio

Search results

Showing 1 - 10 of 17

Heterogeneous gradient computing optimization for scalable deep neural networks
(Springer, 2022) Moreno Álvarez, Sergio; Paoletti, Mercedes Eugenia; Rico Gallego, Juan Antonio; Haut, Juan M.; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0001-6701-961X
Nowadays, data processing applications based on neural networks cope with growth in the amount of data to be processed and with the increasing depth and complexity of neural network architectures, and hence in the number of parameters to be learned. High-performance computing platforms provide fast computing resources, including multi-core processors and graphics processing units, to manage the computational burden of deep neural network applications. A common optimization technique is to distribute the workload between the processes deployed on the resources of the platform. This approach is known as data parallelism. Each process, known as a replica, trains its own copy of the model on a disjoint data partition. Nevertheless, the heterogeneity of the computational resources composing the platform requires distributing the workload unevenly between the replicas, according to their computational capabilities, in order to optimize overall execution performance. Since the amount of data to be processed differs between replicas, the influence of the gradients computed by each replica on the global parameter update should also differ. This work proposes a modification of the gradient computation method that considers the different speeds of the replicas and, hence, the amount of data assigned to each. Experiments were conducted on heterogeneous high-performance computing platforms for a wide range of models and datasets, showing an improvement in final accuracy with respect to current techniques, with comparable performance.
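The idea in the abstract above, that replicas holding more data should have more influence on the global update, can be illustrated with a small sketch. It assumes that each replica's gradient is weighted by its share of the processed samples; the abstract does not give the paper's actual formula, so this weighting, the function name `combine_gradients`, and the sample counts are illustrative assumptions only.

```python
def combine_gradients(gradients, samples_per_replica):
    """Combine per-replica gradients, weighting each by its share of the data.

    Assumption for illustration: weights are proportional to the number of
    samples each replica processed. This is not taken from the paper.
    """
    total = sum(samples_per_replica)
    weights = [n / total for n in samples_per_replica]
    size = len(gradients[0])
    # Weighted sum, computed component by component.
    return [sum(w * g[i] for w, g in zip(weights, gradients)) for i in range(size)]


# Hypothetical setup: a fast replica processed 96 samples, a slow one 32.
grads = [[1.0, 2.0], [3.0, -2.0]]
combined = combine_gradients(grads, [96, 32])
print(combined)  # [1.5, 1.0]
```

With equal sample counts the function reduces to the plain average used in homogeneous data-parallel training.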
Performance evaluation of model-driven partitioning algorithms for data-parallel kernels on heterogeneous platforms
(Wiley, 2019) Rico Gallego, Juan Antonio; Díaz Martín, Juan Carlos; Moreno Álvarez, Sergio; Calvo Jurado, Carmen; García Zapata, Juan Luis; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0002-8435-3844; https://orcid.org/0000-0001-9842-081X; https://orcid.org/0000-0003-1419-1672
Data-parallel applications running on heterogeneous high-performance computing platforms require a nonuniform distribution of the workload between available processes. Data partitioning algorithms are formulated as an optimization problem. Starting from the computational performance models of the processes, the goal is to find the partition that minimizes the communication cost. Traditionally, communication volume is the metric used to guide the partitioning. This metric, however, is unable to capture the complexity of current heterogeneous systems, which show uneven communication channels and execute applications with different communication patterns. In this paper, we discuss the role of analytical communication performance models as a metric in partitioning algorithms. First, we describe a method to programmatically predict the communication cost of a data-parallel kernel based on the τ-Lop analytical model. We show that this figure better captures the communication features of applications and platforms. We present results showing that this approach builds partitions that equal or improve the performance of data-parallel applications on heterogeneous platforms with respect to previous volume-based strategies.
Cloud-Based Analysis of Large-Scale Hyperspectral Imagery for Oil Spill Detection
(IEEE, 2024) Haut, Juan M.; Moreno Álvarez, Sergio; Pastor Vargas, Rafael; Pérez García, Ámbar; Paoletti, Mercedes Eugenia; https://orcid.org/0000-0001-6701-961X; https://orcid.org/0000-0002-4089-9538; https://orcid.org/0000-0002-2943-6348; https://orcid.org/0000-0003-1030-3729
Spectral indices are of fundamental importance in providing insights into the distinctive characteristics of oil spills, making them indispensable tools for effective action planning. The normalized difference oil index (NDOI) is a reliable metric and suitable for the detection of coastal oil spills, effectively leveraging the visible and near-infrared (VNIR) spectral bands offered by commercial sensors. The present study explores the calculation of NDOI with a primary focus on leveraging remotely sensed imagery with rich spectral data. This undertaking necessitates a robust infrastructure to handle and process large datasets, thereby demanding significant memory resources and ensuring scalability. To overcome these challenges, a novel cloud-based approach is proposed in this study to conduct the distributed implementation of the NDOI calculation. This approach offers an accessible and intuitive solution, empowering developers to harness the benefits of cloud platforms. The evaluation of the proposal is conducted by assessing its performance using the scene acquired by the airborne visible infrared imaging spectrometer (AVIRIS) sensor during the 2010 oil rig disaster in the Gulf of Mexico. The catastrophic nature of the event and the subsequent challenges underscore the importance of remote sensing (RS) in facilitating decision-making processes. In this context, cloud-based approaches have emerged as a prominent technological advancement in the RS field. The experimental results demonstrate noteworthy performance by the proposed cloud-based approach and pave the path for future research for fast decision-making applications in scalable environments.
Self-Supervised Learning on Small In-Domain Datasets Can Overcome Supervised Learning in Remote Sensing
(IEEE, 2024) Sanchez-Fernandez, Andres J.; Moreno Álvarez, Sergio; Rico Gallego, Juan Antonio; Tabik, Siham; https://orcid.org/0000-0001-6743-3570; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0003-4093-5356
The availability of high-resolution satellite images has accelerated the creation of new datasets designed to tackle broader remote sensing (RS) problems. Although popular tasks, such as scene classification, have received significant attention, the recent release of the Land-1.0 RS dataset marks the initiation of endeavors to estimate land-use and land-cover (LULC) fraction values per RGB satellite image. This challenging problem involves estimating LULC composition, i.e., the proportion of different LULC classes from satellite imagery, with major applications in environmental monitoring, agricultural/urban planning, and climate change studies. Currently, supervised deep learning models—the state-of-the-art in image classification—require large volumes of labeled training data to provide good generalization. To face the challenges posed by the scarcity of labeled RS data, self-supervised learning (SSL) models have recently emerged, learning directly from unlabeled data by leveraging the underlying structure. This is the first article to investigate the performance of SSL in LULC fraction estimation on RGB satellite patches using in-domain knowledge. We also performed a complementary analysis on LULC scene classification. Specifically, we pretrained Barlow Twins, MoCov2, SimCLR, and SimSiam SSL models with ResNet-18 using the Sentinel2GlobalLULC small RS dataset and then performed transfer learning to downstream tasks on Land-1.0. Our experiments demonstrate that SSL achieves competitive or slightly better results when trained on a smaller high-quality in-domain dataset of 194 877 samples compared to the supervised model trained on ImageNet-1k with 1 281 167 samples. This outcome highlights the effectiveness of SSL using in-distribution datasets, demonstrating efficient learning with fewer but more relevant data.
Training deep neural networks: a static load balancing approach
(Springer, 2020-03-02) Moreno Álvarez, Sergio; Haut, Juan Mario; Paoletti, Mercedes Eugenia; Rico Gallego, Juan Antonio; Díaz Martín, Juan Carlos; Plaza, Javier; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0002-8435-3844; https://orcid.org/0000-0002-8908-1606
Deep neural networks are currently trained under data-parallel setups on high-performance computing (HPC) platforms, so that a replica of the full model is assigned to each computational resource and trained on non-overlapping subsets of the data known as batches. Replicas combine the computed gradients to update their local copies at the end of each batch. However, differences in the performance of the resources assigned to replicas in current heterogeneous platforms induce waiting times when synchronously combining gradients, leading to an overall performance degradation. Although asynchronous communication of gradients has been proposed as an alternative, it suffers from the so-called staleness problem. This is because the training in each replica is computed using a stale version of the parameters, which negatively impacts the accuracy of the resulting model. In this work, we study the application of well-known HPC static load balancing techniques to the distributed training of deep models. Our approach assigns a different batch size to each replica, proportional to its relative computing capacity, hence minimizing the staleness problem. Our experimental results (obtained in the context of a remotely sensed hyperspectral image processing application) show that, while the classification accuracy is kept constant, the training time substantially decreases with respect to unbalanced training. This is illustrated using heterogeneous computing platforms made up of CPUs and GPUs with different performance.
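The static load balancing described above, with a batch size per replica proportional to its relative computing capacity, can be sketched as follows. The function name `balanced_batch_sizes`, the relative speed values, and the rule for distributing rounding leftovers are assumptions made for illustration. They are not taken from the paper.

```python
def balanced_batch_sizes(global_batch, relative_speeds):
    """Split a global batch across replicas in proportion to their speeds.

    Assumption for illustration: speeds are positive relative throughput
    figures, measured beforehand (for example, samples per second).
    """
    total = sum(relative_speeds)
    # Round down first so the sizes never exceed the global batch.
    sizes = [int(global_batch * s / total) for s in relative_speeds]
    # Give the samples lost to rounding to the fastest replicas first.
    leftover = global_batch - sum(sizes)
    fastest_first = sorted(
        range(len(sizes)), key=lambda i: relative_speeds[i], reverse=True
    )
    for i in fastest_first[:leftover]:
        sizes[i] += 1
    return sizes


# Hypothetical platform: one fast GPU, one slower GPU, and two CPUs.
sizes = balanced_batch_sizes(128, [4.0, 2.0, 1.0, 1.0])
print(sizes)  # [64, 32, 16, 16]
```

Because the split is computed once before training, every replica should finish its batch at roughly the same time, which is what removes the waiting at the synchronous gradient combination step.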
Deep mixed precision for hyperspectral image classification
(Springer, 2021-02-03) Paoletti, Mercedes Eugenia; X. Tao; Haut, Juan Mario; Moreno Álvarez, Sergio; Plaza, Antonio; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0001-6701-961X; https://orcid.org/0000-0002-9613-1659
Hyperspectral images (HSIs) record scenes at different wavelength channels, providing detailed spatial and spectral information. How to store and process this high-dimensional data plays a vital role in many practical applications, where classification technologies have emerged as excellent processing tools. However, their high computational complexity and energy requirements bring some challenges. Low-power architectures and deep learning (DL) approaches must provide acceptable computing capabilities without compromising accuracy. However, most DL architectures employ single precision (FP32) to train models, and some large DL architectures are limited by memory and computation resources. This can negatively affect the network learning process. This letter addresses these challenges by introducing mixed precision into DL architectures for HSI classification to speed up the training process and reduce memory consumption and access. The proposed models are evaluated on four widely used data sets. In addition, low- and high-power devices are compared, considering NVIDIA Jetson Xavier and Titan RTX GPUs, to evaluate the viability of the proposal for on-board processing devices. The results demonstrate the efficiency and effectiveness of these models in the HSI classification task on both devices. Source code: https://github.com/mhaut/CNN-MP-HSI.
Estimación Automática del Coste de Comunicación de Aplicaciones Paralelas en Plataformas Heterogéneas
(Universidad Extremadura, 2018) Moreno Álvarez, Sergio; Rico Gallego, Juan A.; Díaz Martín, Juan Carlos; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0002-8435-3844
Optimizing the execution time of parallel applications on heterogeneous high-performance platforms is a complex problem. These scientific applications are typically composed of kernels that implement algorithms such as matrix multiplication, partial differential equations, or Fourier transforms. The kernels are executed by the processes deployed on the different computing resources of a platform, for example multi-core processors or accelerators (GPUs, Xeon Phis, etc.). The kernel's data volume is distributed among the processes in proportion to their computing capacity, so that the overall computational load is balanced. This non-homogeneous load balancing has a significant impact on the cost of communications. Optimizing the communication cost of these applications is usually addressed through exhaustive tests on the target platform. However, these tests consume resources and time, and they often rely on extrapolating the results obtained by running a reduced version of the application on the platform. Analytical Communication Performance Models offer a feasible and promising alternative in this respect. These models represent the communication cost of a kernel on a heterogeneous platform, providing an accurate estimate of its communication time in a non-invasive way, that is, without using HPC computing resources for the estimation. This work contributes an estimation tool that makes it possible to represent and evaluate communication cost expressions that follow the τ-Lop model. It also allows the communication cost calculation to be included automatically in partitioning and communication optimization algorithms. This document provides both basic and advanced usage examples.
Three example cases of communication modeling in representative kernels using the tool are included: the solution of a differential equation using the finite element technique, a parallel dense matrix multiplication algorithm, and an N-Body simulation. These kernels use different communication patterns and different partitionings of the data space.
A tool to assess the communication cost of parallel kernels on heterogeneous platforms
(Springer, 2020) Rico Gallego, Juan Antonio; Moreno Álvarez, Sergio; Díaz Martín, Juan Carlos; Lastovetsky, Alexey L.; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0002-8435-3844; https://orcid.org/0000-0001-9460-3897
Ensuring that applications achieve efficient resource usage and fast execution times on today's complex heterogeneous high-performance computing platforms is a paramount problem. Essential efforts toward this goal are the optimal partitioning of the data space between the processes composing a typical task/data-parallel application, and their right mapping and deployment on the platform. Computational and communication performance modeling that describes the behavior of the platform and the application is an increasingly recognized approach. This paper discusses the utility of the τ-Lop analytic communication performance model in facing these issues and contributes a practical symbolic computation tool that represents, manipulates and accurately evaluates the formal communication cost expression derived from a hybrid kernel. We identify a set of scenarios where the tool could be applied, provide both basic and advanced usage examples, and evaluate the tool on real-life kernels.
Federated learning meets remote sensing
(ELSEVIER, 2024-12-01) Moreno Álvarez, Sergio; Paoletti, Mercedes Eugenia; Sanchez Fernandez, Andres J.; Rico Gallego, Juan Antonio; han, lirong; Haut, Juan M.; https://orcid.org/0000-0003-1030-3729; https://orcid.org/0000-0001-6743-3570; https://orcid.org/0000-0002-4264-7473; https://orcid.org/0000-0002-8613-7037; https://orcid.org/0000-0001-6701-961X
Remote sensing (RS) imagery provides invaluable insights into characterizing the Earth’s land surface within the scope of Earth observation (EO). Technological advances in capture instrumentation, coupled with the rise in the number of EO missions aimed at data acquisition, have significantly increased the volume of accessible RS data. This abundance of information has alleviated the challenge of insufficient training samples, a common issue in the application of machine learning (ML) techniques. In this context, crowd-sourced data play a crucial role in gathering diverse information from multiple sources, resulting in heterogeneous datasets that enable applications to harness a more comprehensive spatial coverage of the surface. However, the sensitive nature of RS data requires ensuring the privacy of the complete collection. Consequently, federated learning (FL) emerges as a privacy-preserving solution, allowing collaborators to combine such information from decentralized private data collections to build efficient global models. This paper explores the convergence between the FL and RS domains, specifically in developing data classifiers. To this aim, an extensive set of experiments is conducted to analyze the properties and performance of novel FL methodologies. The main emphasis is on evaluating the influence of such heterogeneous and disjoint data among collaborating clients. Moreover, scalability is evaluated for a growing number of clients, and resilience is assessed against Byzantine attacks. Finally, the work concludes with future directions and serves as the opening of a new research avenue for developing efficient RS applications under the FL paradigm. The source code is publicly available at https://github.com/hpc-unex/FLmeetsRS.
Hyperspectral Image Analysis Using Cloud-Based Support Vector Machines
(Springer, 2024) Haut, Juan M.; Franco Valiente, José M.; Paoletti, Mercedes Eugenia; Moreno Álvarez, Sergio; Pardo-Diaz, Alfonso; https://orcid.org/0000-0001-6701-961X; https://orcid.org/0000-0002-3880-6697; https://orcid.org/0000-0003-1030-3729
Hyperspectral image processing techniques involve time-consuming calculations due to the large volume and complexity of the data. Indeed, hyperspectral scenes contain a wealth of spatial and spectral information thanks to the hundreds of narrow and continuous bands collected across the electromagnetic spectrum. Predictive models, particularly supervised machine learning classifiers, take advantage of this information to predict the pixel categories of images through a training set of real observations. Most notably, the Support Vector Machine (SVM) has demonstrated impressive accuracy results for image classification. Notwithstanding the performance offered by SVMs, dealing with such a large volume of data is computationally challenging. In this paper, a scalable and high-performance cloud-based approach for distributed training of SVMs is proposed. The proposal addresses the overwhelming amount of remote sensing (RS) data through a parallel training allocation. The implementation is performed over a memory-efficient Apache Spark distributed environment. Experiments are performed on a benchmark of real hyperspectral scenes to show the robustness of the proposal. The results demonstrate efficient classification whilst optimising data processing in terms of training times.
