Automated IoT vulnerability classification using Deep Learning

Sernández Iglesias, Daniel; Fernández Morales, Enrique; Garcia Merino, Jose Carlos; Tobarra Abad, María de los Llanos; Pastor Vargas, Rafael; Robles Gómez, Antonio; Sarraipa, Joao

2025-02-21; 2025-02-21; 2025-07

Sernández, D., Fernández, E., García, J.C., Tobarra, Ll., Pastor-Vargas, R., Robles-Gómez, A., Sarraipa, J. (2025); Automated IoT vulnerability classification using Deep Learning; Congreso MadeAI 2025 - Modelling, Data Analytics and AI in Engineering. Oporto

https://hdl.handle.net/20.500.14468/25957

This article will be published in the Proceedings of the MadeAI 2025 Conference - Modelling, Data Analytics and AI in Engineering that will take place on 7-11 July 2025, in Porto, Portugal.

Technological advancements in the development of low-power chips have enabled everyday objects to connect to the Internet, giving rise to the concept known as the Internet of Things (IoT). It is currently estimated that there are approximately 16 billion IoT connections worldwide, a figure expected to double by 2030. However, this rapid growth of the IoT ecosystem has introduced new vulnerabilities that could be exploited by malicious actors. Since many IoT devices handle personal and sensitive information, threats to these devices can have severe consequences. Moreover, a series of cybersecurity incidents could undermine public trust in IoT technology, potentially delaying its widespread adoption across various sectors.

Common Vulnerabilities and Exposures (CVE) is a public cataloging system designed to identify and list known security vulnerabilities in software and hardware products. This system is developed and maintained by MITRE with the support of the cybersecurity community and sponsored by the U.S.
Department of Homeland Security (DHS) through the Cybersecurity and Infrastructure Security Agency (CISA). CVE provides a reference database that enables security researchers, manufacturers, and organizational security managers to identify and address security issues more effectively.

In our study, we have focused exclusively on CVEs related to IoT systems, with the aim of analyzing the main vulnerabilities detected from 2010 to the present as a basis for identifying the main attack vectors in IoT systems. As part of this effort, we have created a dataset. CVE records include various metrics, such as:
- Common Weakness Enumeration (CWE), mainly focused on the technical classification of vulnerabilities.
- Common Vulnerability Scoring System (CVSS), which reports metrics such as the attack vector, the severity of the vulnerability, and the impact level of its exploitation. This is one of the most informative metrics.
- Stakeholder-Specific Vulnerability Categorization (SSVC), intended to help cybersecurity teams handle the vulnerability properly.

These metrics allow security teams to prioritize vulnerabilities within their security program and to evaluate the effort required to mitigate them. However, according to our analysis of the dataset, around 14% of CVE records do not contain any metric. Around 83% of CVE records contain a CWE metric (an ID or its textual description). As explained above, this metric only describes the type of vulnerability from a technical point of view. Only 10% of CVE records contain SSVC metrics, and CVSS, in its different versions, appears in only 40% of the records studied. Additionally, most of the records studied include metrics retrospectively, several weeks or months after the vulnerability is disclosed.
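As an illustration of how coverage figures of this kind could be computed, the following minimal sketch counts how many records carry each metric family. The record layout (a dict with an `id` and a flat `metrics` list), the `metric_coverage` helper, and the sample entries are hypothetical and are not taken from the study's dataset; real CVE records are nested differently.

```python
from collections import Counter

# Metric families named in the text.
METRIC_FAMILIES = ("CWE", "CVSS", "SSVC")


def metric_coverage(records):
    """Return the share of records carrying each metric family, plus 'none'.

    Each record is assumed (hypothetically) to expose a flat 'metrics' list.
    """
    names = (*METRIC_FAMILIES, "none")
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in names}

    counts = Counter()
    for record in records:
        # Keep only the metric families we know about.
        present = set(record.get("metrics", ())) & set(METRIC_FAMILIES)
        counts.update(present)
        if not present:
            counts["none"] += 1
    return {name: counts[name] / total for name in names}


# Made-up sample records, for illustration only.
sample = [
    {"id": "EXAMPLE-1", "metrics": ["CWE", "CVSS"]},
    {"id": "EXAMPLE-2", "metrics": ["CWE"]},
    {"id": "EXAMPLE-3", "metrics": []},
    {"id": "EXAMPLE-4", "metrics": ["CWE", "CVSS", "SSVC"]},
]

coverage = metric_coverage(sample)
for name, share in coverage.items():
    print(f"{name}: {share:.0%}")
```

Because a single record can carry several metric families at once, the shares are not expected to sum to 100%, which is consistent with the percentages reported above.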
Thus, cybersecurity teams must rely on their prior knowledge to distinguish which vulnerabilities are relevant and which are not.

To tackle this situation, our proposal focuses on applying Deep Learning techniques to classify the severity of CVE records from their textual description. The textual description is a mandatory field that is present in all CVE records. To achieve this objective, we trained a BiLSTM model using the CVE records that include CVSS metrics, together with their description field, and performed a comparative study of different hyperparameter configurations to find the optimal one. The model evaluation metrics studied are accuracy, loss and F1-score.

en
info:eu-repo/semantics/closedAccess
12 Mathematics :: 1203 Computer Science :: 1203.17 Informatics
33 Technological Sciences
conference proceedings
Vulnerabilities; natural language processing; cybersecurity; Internet of Things
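To make the labelling and evaluation steps mentioned in the abstract more concrete, the sketch below derives a severity label from a CVSS base score and computes accuracy and F1-score for a set of predictions. It is a sketch under stated assumptions and not the authors' pipeline: the abstract does not say which CVSS version or label set was used, so the standard CVSS v3.x qualitative rating scale is assumed; macro-averaging of F1 is also an assumption; and the predictions are hypothetical values standing in for the output of a trained model, which is not shown.

```python
def cvss_v3_severity(score):
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen (assumed averaging)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up base scores and hypothetical model predictions, for illustration only.
y_true = [cvss_v3_severity(s) for s in (9.8, 7.5, 5.3, 3.1)]
y_pred = ["CRITICAL", "HIGH", "HIGH", "LOW"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro-F1={f1:.3f}")
```

Macro-averaging weights every severity class equally, which can matter when some classes are much rarer than others; a weighted or micro average would be an equally plausible choice.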