Fabregat Marcos, Hermenegildo; Duque Fernández, Andrés; Araujo Serna, M. Lourdes; Martínez Romo, Juan
2024-06-11
2024-06-11
2021
0933-3657
https://doi.org/10.1016/j.artmed.2021.102177
https://hdl.handle.net/20.500.14468/22409
Background and objectives: The 10th revision of the International Classification of Diseases (ICD-10) coding system has been widely adopted by the health systems of many countries, including Spain. However, manually assigning codes to Electronic Health Records (EHRs) is a complex and time-consuming task that requires a large number of specialised staff. Several machine learning approaches have therefore been proposed to assist with code assignment. In this work we present an alternative system for automatically recommending ICD-10 codes to be assigned to EHRs. Methods: Our proposal characterises each ICD-10 code by a set of keyphrases that represent it. These keyphrases include not only those that appear literally in some EHR to which the code was assigned, but also others obtained through a statistical process that captures the expressions that led annotators to assign the code. Results: The result is an information model that allows codes to be recommended efficiently for a new EHR based on its textual content. The approach proves competitive with other state-of-the-art approaches and can be combined with them to optimise results. Conclusions: In addition to being effective, the method produces recommendations that are easily interpretable, since the phrases in an EHR that led to the recommendation of an ICD-10 code are known. Moreover, the keyphrases associated with each ICD-10 code can be a valuable additional source of information for other approaches, such as machine learning techniques.
en
info:eu-repo/semantics/openAccess
A keyphrase-based approach for interpretable ICD-10 code classification of Spanish medical reports
journal article
Medical records; ICD-10 codes; Keyphrase extraction; Interpretability
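
The idea summarised in the abstract, recommending a code when one of its keyphrases occurs in the report and keeping the matched phrases as the explanation, can be illustrated with a minimal sketch. This is not the authors' implementation: the `KEYPHRASES` table, the `recommend` function, the whole-word matching, and the ranking by number of matched phrases are all assumptions made for illustration, and the statistical process used to obtain the keyphrases is not reproduced here.

```python
import re
from collections import defaultdict

# Hypothetical keyphrase model: ICD-10 code -> keyphrases.
# The phrases are illustrative and are not taken from the paper.
KEYPHRASES = {
    "J45": {"asma", "crisis asmática"},
    "I10": {"hipertensión arterial", "hta"},
    "E11": {"diabetes mellitus tipo 2", "dm2"},
}


def recommend(text, model=KEYPHRASES):
    """Return (code, matched_phrases) pairs, most matched phrases first."""
    # Lowercase and collapse whitespace so multi-word phrases match reliably.
    normalized = " ".join(text.lower().split())
    hits = defaultdict(list)
    for code, phrases in model.items():
        for phrase in sorted(phrases):
            # Assumption: match whole words only, so "hta" does not fire
            # inside a longer token.
            pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
            if re.search(pattern, normalized):
                hits[code].append(phrase)
    # Rank by number of matched keyphrases, then by code for a stable order.
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    report = "Paciente con HTA y crisis asmática. Antecedentes de hipertensión arterial."
    for code, phrases in recommend(report):
        print(code, phrases)
```

Because each recommendation carries the phrases that triggered it, a reviewer can see why a code was suggested, which is the interpretability property the abstract highlights.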