Prediction of the noise pollution in Barcelona and model explainability using SHAP values

Jordà Mascaró, Marc

Jordà Mascaró, Marc. (2022). Prediction of the noise pollution in Barcelona and model explainability using SHAP values. Master's thesis, Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial.

Files
Name	Description	MIME type	Size
JordaMascaro_Marc_TFM.pdf	JordaMascaro_Marc_TFM.pdf	application/pdf	10.64 MB

Title	Prediction of the noise pollution in Barcelona and model explainability using SHAP values
Author(s)	Jordà Mascaró, Marc
Abstract	Noise pollution is the second most important environmental risk factor for health in Western Europe. It affects a large number of people, can cause a wide range of serious illnesses, and is estimated to cause 12,000 premature deaths in Europe every year. Barcelona is above the 75th percentile of European cities exposed to harmful road traffic noise levels, and it is one of the cities most affected by nighttime leisure noise. Several initiatives have recently been developed to address this problem, following the European regulations on the matter. The city provides a network of sensors that collect noise data every minute across its territory. We use noise data from 2017 to 2021 from a significant location in Barcelona. We process this data into an appropriate input for machine learning models, handling missing values with the Prophet algorithm. Our multivariate time series problem is to predict the hourly noise values for the following 10 hours from the previous 48 hourly noise values and the values of weather and seasonal variables from the last hour. We compare different modelling approaches, each introduced with its theoretical framework. On the one hand, we use AutoML tools, such as TPOT and Keras, to determine optimal models for our problem. On the other hand, we manually tune Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs) and Gated Recurrent Units (GRUs), designed to perform well on long sequences of data. A manually tuned neural network combining RNN, LSTM and GRU layers outperformed all the other approaches, with an average test RMSE of 3.412 dB(A) over all prediction horizons. Neural networks, though, are often considered black boxes, because they are so complex that it is very hard for developers to justify the decisions they make. 
Therefore, this work includes a theoretical introduction to the explainability of machine learning and deep learning models, focused on SHAP (SHapley Additive exPlanations) values. The Deep SHAP method is used to calculate the importance of the features for the predictions of the RNN-LSTM-GRU model. The feature with the highest contribution to the output is a seasonal variable indicating the hour range of the day, followed by the noise in the three most recent hours.
Additional notes	Master's thesis (Trabajo de Fin de Máster) for the Máster Universitario en Ingeniería y Ciencia de Datos. UNED
Subject(s)	Computer Engineering
Keywords	noise pollution; time series; Barcelona; AutoML; LSTM; GRU; SHAP
Publisher(s)	Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial
Supervisor	Aznarte Mellado, José Luis
Date	2022-09-18
Format	application/pdf
Identifier	bibliuned:master-ETSInformatica-ICD-Mjorda http://e-spacio.uned.es/fez/view/bibliuned:master-ETSInformatica-ICD-Mjorda
Language	eng
Publication version	acceptedVersion
Access level and license	http://creativecommons.org/licenses/by-nc-nd/4.0 info:eu-repo/semantics/openAccess
Resource type	Master's thesis
Access type	Open access

Document type:	Master's thesis
Collections:	Máster Universitario en Ingeniería y Ciencia de Datos (UNED); OpenAIRE set; Master's thesis items set

Access statistics:	333 views, 297 downloads
Created:	Mon, 24 Oct 2022, 19:36:38 CET
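
The forecasting setup stated in the abstract (48 previous hourly noise values plus the weather and seasonal variables of the last hour as input, the next 10 hourly noise values as output, scored by RMSE per prediction horizon) can be illustrated with a minimal sketch. This is not the thesis code: the function names `make_windows` and `rmse_per_horizon`, the flat feature layout, the synthetic data and the naive baseline are all assumptions made for illustration, and the thesis may arrange its inputs differently for the recurrent layers.

```python
import numpy as np

N_LAGS = 48      # previous hourly noise values used as input (from the abstract)
N_HORIZONS = 10  # number of future hours to predict (from the abstract)


def make_windows(noise, exog, n_lags=N_LAGS, n_horizons=N_HORIZONS):
    """Turn an hourly series into supervised samples (hypothetical helper).

    noise: array of shape (T,), hourly noise levels in dB(A)
    exog:  array of shape (T, k), weather/seasonal variables per hour
    Returns X of shape (n_samples, n_lags + k) and y of shape (n_samples, n_horizons).
    """
    noise = np.asarray(noise, dtype=float)
    exog = np.asarray(exog, dtype=float)
    n_samples = len(noise) - n_lags - n_horizons + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X, y = [], []
    for start in range(n_samples):
        end = start + n_lags                  # index of the first hour to predict
        lags = noise[start:end]               # the 48 most recent noise values
        last_hour = exog[end - 1]             # exogenous variables of the last observed hour
        X.append(np.concatenate([lags, last_hour]))
        y.append(noise[end:end + n_horizons])
    return np.array(X), np.array(y)


def rmse_per_horizon(y_true, y_pred):
    """RMSE computed separately for each of the prediction horizons."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


# Synthetic stand-in data (NOT the Barcelona sensor data): 30 days of hourly values.
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
noise = 60 + 5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, hours.size)
exog = np.column_stack([hours % 24, rng.normal(15, 5, hours.size)])

X, y = make_windows(noise, exog)

# Naive baseline (an assumption, not a model from the thesis):
# repeat the last observed noise value for every horizon.
y_pred = np.repeat(X[:, N_LAGS - 1:N_LAGS], N_HORIZONS, axis=1)
per_horizon = rmse_per_horizon(y, y_pred)
print(per_horizon, per_horizon.mean())
```

Averaging the per-horizon values gives a single figure comparable in kind to the "average test RMSE over all prediction horizons" reported in the abstract, although the number printed here comes from synthetic data and says nothing about the thesis results.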
