Prediction of the noise pollution in Barcelona and model explainability using SHAP values

Jordà Mascaró, Marc

dc.contributor.author	Jordà Mascaró, Marc
dc.contributor.director	Aznarte, José L.
dc.date.accessioned	2024-05-20T12:25:11Z
dc.date.available	2024-05-20T12:25:11Z
dc.date.issued	2022-09-18
dc.description.abstract	Noise pollution is the second most important environmental risk factor for health in Western Europe. It affects a large number of people, can cause a wide range of serious illnesses, and is estimated to be responsible for 12,000 premature deaths in Europe every year. Barcelona is above the 75th percentile of European cities exposed to harmful road traffic noise levels, and it is one of the cities most affected by nightly leisure noise. Several initiatives have recently been developed to address this problem, following the European regulations on this matter. The city provides a network of sensors that collect noise data every minute across the territory. We use noise data from 2017 to 2021 from a significant location in Barcelona. We process this information to transform it into an appropriate input for machine learning models, handling the missing values with the Prophet algorithm. We frame a multivariate time series problem: predicting the hourly noise values of the following 10 hours based on the previous 48 hourly noise values and the values of weather and seasonal variables from the last hour. We compare different modelling approaches, all of them introduced with a theoretical framework. On the one hand, we use AutoML tools, such as TPOT and Keras, to determine optimal models for our problem. On the other hand, we manually tune Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs) and Gated Recurrent Units (GRUs), which are designed to perform well on long sequences of data. A manually tuned neural network combining RNN, LSTM and GRU layers outperformed all the other approaches, with an average test RMSE of 3.412 dB(A) over all prediction horizons. Neural networks, though, are often considered black boxes, because they are so complex that it is very hard for developers to justify the decisions the models make. This work therefore includes a theoretical introduction to the explainability of machine learning and deep learning models, focused on SHAP (SHapley Additive exPlanations) values. The Deep SHAP method is used to calculate the importance of the features for the predictions of the RNN-LSTM-GRU model. The feature with the highest contribution to the output is a seasonal variable indicating the hour range of the day, followed by the noise in the three most recent hours.	en
dc.description.version	final version
dc.identifier.uri	https://hdl.handle.net/20.500.14468/14198
dc.language.iso	en
dc.publisher	Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial
dc.relation.center	E.T.S. de Ingeniería Informática
dc.relation.degree	Máster universitario en Ingeniería y Ciencia de Datos
dc.relation.department	Inteligencia Artificial
dc.rights	info:eu-repo/semantics/openAccess
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es
dc.subject.keywords	noise pollution
dc.subject.keywords	Time Series
dc.subject.keywords	Barcelona
dc.subject.keywords	AutoML
dc.subject.keywords	LSTM
dc.subject.keywords	GRU
dc.subject.keywords	SHAP
dc.title	Prediction of the noise pollution in Barcelona and model explainability using SHAP values	es
dc.type	tesis de maestría	es
dc.type	master thesis	en
dspace.entity.type	Publication
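
The forecasting setup stated in the abstract (48 past hourly noise values plus the latest weather and seasonal variables as input, the next 10 hourly noise values as target) can be illustrated with a minimal sketch. This is not the thesis code: the function names `make_windows` and `rmse_per_horizon`, the array layout, and the choice to take the exogenous variables from the same hour as the most recent noise value are all assumptions made for illustration.

```python
import numpy as np


def make_windows(noise, exog, n_lags=48, horizon=10):
    """Turn hourly series into supervised samples (hypothetical helper).

    noise: array of shape (T,) with hourly noise levels in dB(A).
    exog:  array of shape (T, k) with weather/seasonal variables per hour.
    For each hour t, the input is the n_lags noise values ending at t plus
    exog[t] (assumed meaning of "the last hour"); the target is the noise
    at hours t+1 .. t+horizon.
    """
    noise = np.asarray(noise, dtype=float)
    exog = np.asarray(exog, dtype=float)
    x_lags, x_exog, y = [], [], []
    for t in range(n_lags - 1, len(noise) - horizon):
        x_lags.append(noise[t - n_lags + 1 : t + 1])
        x_exog.append(exog[t])
        y.append(noise[t + 1 : t + 1 + horizon])
    return np.array(x_lags), np.array(x_exog), np.array(y)


def rmse_per_horizon(y_true, y_pred):
    """RMSE for each of the forecast steps (one value per horizon)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


# Synthetic data only, to show the resulting shapes.
demo_noise = np.arange(100, dtype=float)
demo_exog = np.zeros((100, 3))
X_lags, X_exog, Y = make_windows(demo_noise, demo_exog)
```

Averaging the output of `rmse_per_horizon` over its 10 entries gives a single figure comparable in kind to the "average test RMSE over all prediction horizons" reported in the abstract.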

Files

Original bundle

Name: JordaMascaro_Marc_TFM.pdf
Size: 10.64 MB
Format: Adobe Portable Document Format

Collections

Master's theses (Trabajos de fin de máster, TFM)
