Prediction of the noise pollution in Barcelona and model explainability using SHAP values

Jordà Mascaró, Marc. (2022). Prediction of the noise pollution in Barcelona and model explainability using SHAP values. Master's thesis, Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial

Files
Name Description MIME type Size
JordaMascaro_Marc_TFM.pdf JordaMascaro_Marc_TFM.pdf application/pdf 10.64MB

Title Prediction of the noise pollution in Barcelona and model explainability using SHAP values
Author(s) Jordà Mascaró, Marc
Abstract Noise pollution is the second most important environmental risk factor for health in Western Europe. It affects a large number of people, can cause a wide range of serious illnesses, and is estimated to cause 12,000 premature deaths in Europe every year. Barcelona is above the 75th percentile of European cities exposed to harmful road traffic noise levels, and it is one of the cities most affected by nighttime leisure noise. Several initiatives have recently been developed to address this problem, following the European regulations on this matter. The city operates a network of sensors that collect noise data every minute across its territory. We use noise data from 2017 to 2021 from a significant point of Barcelona. We process this information into an appropriate input for machine learning models, handling missing values with the Prophet algorithm. Our multivariate time series problem is the following: predicting the hourly noise values of the next 10 hours from the previous 48 hourly noise values and the values of weather and seasonal variables from the last hour. We compare different modelling approaches, each introduced with a theoretical framework. On the one hand, we use AutoML tools, such as TPOT and Keras, to determine optimal models for our problem. On the other hand, we manually tune Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs) and Gated Recurrent Units (GRUs), which are designed to perform well on long sequences of data. A manually tuned neural network combining RNN, LSTM and GRU layers outperformed all the other approaches, with an average test RMSE of 3.412 dB(A) over all prediction horizons. Neural networks, though, are often considered black boxes, because they are so complex that it is very hard for developers to justify the decisions they make.
Therefore, this work includes a theoretical introduction to the explainability of machine learning and deep learning models, focused on SHAP (SHapley Additive exPlanations) values. The Deep SHAP method is used to calculate the importance of the features for the predictions of the RNN-LSTM-GRU model. The feature with the highest contribution to the output is a seasonal variable indicating the hour range of the day, followed by the noise in the three most recent hours.
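The forecasting setup described in the abstract (48 lagged hourly noise values plus the weather and seasonal variables of the last hour as input, the next 10 hourly values as output) can be illustrated with a minimal NumPy sketch. This is not the thesis code: the function names `make_windows` and `mean_rmse_over_horizons`, the array layout, and the synthetic data are assumptions made for illustration, and averaging the per-horizon RMSE is only one plausible reading of "average test RMSE over all prediction horizons".

```python
import numpy as np

N_LAGS = 48      # previous hourly noise values used as input (from the abstract)
N_HORIZONS = 10  # following hourly values to predict (from the abstract)


def make_windows(noise, exog, n_lags=N_LAGS, n_horizons=N_HORIZONS):
    """Build (X, y) pairs from an hourly noise series (hypothetical helper).

    noise: shape (T,), hourly noise levels in dB(A).
    exog:  shape (T, k), weather/seasonal variables aligned with `noise`.
    Each X row holds the lagged noise values followed by the exogenous
    values of the last observed hour; each y row holds the next hours.
    """
    noise = np.asarray(noise, dtype=float)
    exog = np.asarray(exog, dtype=float)
    X, y = [], []
    for t in range(n_lags, len(noise) - n_horizons + 1):
        lags = noise[t - n_lags:t]         # hours t-48 .. t-1
        last_hour_exog = exog[t - 1]       # covariates of hour t-1 only
        X.append(np.concatenate([lags, last_hour_exog]))
        y.append(noise[t:t + n_horizons])  # hours t .. t+9
    return np.array(X), np.array(y)


def mean_rmse_over_horizons(y_true, y_pred):
    """Compute the RMSE for each horizon (column), then average them."""
    per_horizon = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    return per_horizon.mean()


# Synthetic stand-in data; the real sensor and weather data are not used here.
rng = np.random.default_rng(0)
noise = 60 + 5 * rng.standard_normal(200)
exog = rng.standard_normal((200, 3))
X, y = make_windows(noise, exog)
print(X.shape, y.shape)  # (143, 51) (143, 10)
```

Flattening each window into one row suits tabular learners; a recurrent model would more likely take the 48 lags as a sequence tensor, with the last-hour variables supplied separately.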
Additional notes Master's thesis for the Máster Universitario en Ingeniería y Ciencia de Datos (University Master's Degree in Data Science and Engineering). UNED
Subject(s) Computer Engineering
Keywords noise pollution
Time Series
Barcelona
AutoML
LSTM
GRU
SHAP
Publisher(s) Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial
Supervisor/Tutor Aznarte Mellado, José Luis
Date 2022-09-18
Format application/pdf
Identifier bibliuned:master-ETSInformatica-ICD-Mjorda
http://e-spacio.uned.es/fez/view/bibliuned:master-ETSInformatica-ICD-Mjorda
Language eng
Publication version acceptedVersion
Access level and license http://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Resource type master Thesis
Access type Open access

 
Access statistics: 333 views, 297 downloads
Created: Mon, 24 Oct 2022, 19:36:38 CET