Publication:
Prediction of the noise pollution in Barcelona and model explainability using SHAP values

dc.contributor.author: Jordà Mascaró, Marc
dc.date.accessioned: 2024-05-20T12:25:11Z
dc.date.available: 2024-05-20T12:25:11Z
dc.date.issued: 2022-09-18
dc.description.abstract: Noise pollution is the second most important environmental risk factor for health in Western Europe. It affects a large number of people, can cause a wide range of serious illnesses, and is estimated to be the reason for 12,000 premature deaths in Europe every year. Barcelona is above the 75th percentile of European cities exposed to harmful road traffic noise levels, and it is one of the cities most affected by night-time leisure noise. Several initiatives have recently been developed to address this problem, following the European regulations on the matter. The city operates a network of sensors that collect noise data every minute across its territory. We use noise data from 2017 to 2021 from a significant location in Barcelona. We process this information to turn it into a suitable input for machine learning models, handling the missing values with the Prophet algorithm. Our multivariate time series problem is to predict the hourly noise values for the next 10 hours from the previous 48 hourly noise values and the values of the weather and seasonal variables from the last hour. We compare different modelling approaches, each introduced with its theoretical framework. On the one hand, we use AutoML tools, such as TPOT and Keras, to determine optimal models for our problem. On the other hand, we manually tune Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs) and Gated Recurrent Units (GRUs), which are designed to perform well on long sequences of data. A manually tuned neural network combining RNN, LSTM and GRU layers outperformed all the other approaches, with an average test RMSE of 3.412 dB(A) over all prediction horizons. Neural networks, though, are often considered black boxes: they are so complex that it is very hard for developers to justify the decisions the models make.
This work therefore includes a theoretical introduction to the explainability of machine learning and deep learning models, focused on SHAP (SHapley Additive exPlanations) values. The Deep SHAP method is used to calculate the importance of the features for the predictions of the RNN-LSTM-GRU model. The feature with the highest contribution to the output is a seasonal variable indicating the hour range of the day, followed by the noise in the three most recent hours. [en]
dc.description.version: final version
dc.identifier.uri: https://hdl.handle.net/20.500.14468/14198
dc.language.iso: en
dc.publisher: Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial
dc.relation.center: Faculties and schools::E.T.S. de Ingeniería Informática
dc.relation.degree: Máster universitario en Ingeniería y Ciencia de Datos
dc.relation.department: Inteligencia Artificial
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0
dc.subject.keywords: noise pollution
dc.subject.keywords: Time Series
dc.subject.keywords: Barcelona
dc.subject.keywords: AutoML
dc.subject.keywords: LSTM
dc.subject.keywords: GRU
dc.subject.keywords: SHAP
dc.title: Prediction of the noise pollution in Barcelona and model explainability using SHAP values [es]
dc.type: tesis de maestría [es]
dc.type: master thesis [en]
dspace.entity.type: Publication
Files
Original bundle
Name: JordaMascaro_Marc_TFM.pdf
Size: 10.64 MB
Format: Adobe Portable Document Format
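
The forecasting setup described in the abstract (the previous 48 hourly noise values plus the last hour's weather and seasonal variables as input, the next 10 hourly values as target) can be illustrated with a minimal sketch. This is not the thesis code: the function names `make_windows` and `mean_rmse_over_horizons`, the flat input layout, and the choice to compute one RMSE per horizon and then average them are all assumptions made here for illustration.

```python
import math

N_LAGS = 48      # previous hourly noise values used as input (from the abstract)
N_HORIZONS = 10  # future hourly values to predict (from the abstract)


def make_windows(noise, exog, n_lags=N_LAGS, n_horizons=N_HORIZONS):
    """Build (X, y) samples from an hourly series (hypothetical helper).

    noise: list of hourly noise levels in dB(A), length T
    exog:  list of T lists with weather/seasonal variables for each hour
    Each row of X holds the last `n_lags` noise values followed by the
    exogenous variables of the most recent hour (assumed flat layout);
    each row of y holds the next `n_horizons` noise values.
    """
    X, y = [], []
    for t in range(n_lags, len(noise) - n_horizons + 1):
        lags = list(noise[t - n_lags:t])           # hours t-48 .. t-1
        last_hour = list(exog[t - 1])              # exogenous values at hour t-1
        X.append(lags + last_hour)
        y.append(list(noise[t:t + n_horizons]))    # hours t .. t+9
    return X, y


def mean_rmse_over_horizons(y_true, y_pred):
    """Average of the per-horizon RMSE values (assumed aggregation)."""
    n_samples = len(y_true)
    n_horizons = len(y_true[0])
    rmses = []
    for h in range(n_horizons):
        squared_error = sum(
            (y_true[i][h] - y_pred[i][h]) ** 2 for i in range(n_samples)
        )
        rmses.append(math.sqrt(squared_error / n_samples))
    return sum(rmses) / n_horizons
```

A recurrent model would typically take the 48 lagged values as a sequence rather than a flat vector; the flat layout is used here only to keep the windowing logic visible.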