Person: Castillo-Cara, Manuel
ORCID
0000-0002-2990-7090
Last name
Castillo-Cara
First name
Manuel
Search results
Showing 1 - 6 of 6

Publication: A multimodal approach using fundus images and text meta-data in a machine learning classifier with embeddings to predict years with self-reported diabetes – an exploratory analysis
(Elsevier, 2024-05-22) Carrillo Larco, Rodrigo M.; Bravo Rocca, Gusseppe; Castillo-Cara, Manuel; Xu, Xiaolin; Bernabé Ortiz, Antonio
Aims: Machine learning models can use image and text data to predict the number of years since diabetes diagnosis; such a model can be applied to new patients to predict, approximately, how long the new patient may have lived with diabetes unknowingly. We aimed to develop a model to predict self-reported diabetes duration. Methods: We used the Brazilian Multilabel Ophthalmological Dataset. The unit of analysis was the fundus image and its meta-data, regardless of the patient. We included people aged 40+ years and fundus images without diabetic retinopathy. Fundus images and meta-data (sex, age, comorbidities and taking insulin) were passed to the MedCLIP model to extract the embedding representation. The embedding representation was passed to an Extra Tree Classifier to predict: 0–4, 5–9, 10–14 and 15+ years with self-reported diabetes. Results: There were 988 images from 563 people (mean age = 67 years; 64% were women). Overall, the F1 score was 57%. The group with 15+ years of self-reported diabetes had the highest precision (64%) and F1 score (63%), while the highest recall (69%) was observed in the group 0–4 years. The proportion of correctly classified observations was 55% for the group 0–4 years, 51% for 5–9 years, 58% for 10–14 years, and 64% for 15+ years with self-reported diabetes. Conclusions: The machine learning model had acceptable accuracy and F1 score, and correctly classified more than half of the patients according to diabetes duration.
Using large foundational models to extract image and text embeddings seems a feasible and efficient approach to predict years living with self-reported diabetes.

Publication: Government plans in the 2016 and 2021 Peruvian presidential elections: A natural language processing analysis of the health chapters
(Taylor and Francis, F1000Research, 2022-10-25) Carrillo Larco, Rodrigo M.; Castillo-Cara, Manuel; Lovón Melgarejo, Jesús
Background: While clinical medicine has exploited electronic health records for Natural Language Processing (NLP) analyses, public health and health policy research have not yet adopted these algorithms. We aimed to dissect the health chapters of the government plans of the 2016 and 2021 Peruvian presidential elections, and to compare different NLP algorithms. Methods: From the government plans (18 in 2016; 19 in 2021) we extracted each sentence from the health chapters. We used five NLP algorithms to extract keywords and phrases from each plan: Term Frequency–Inverse Document Frequency (TF-IDF), Latent Dirichlet Allocation (LDA), TextRank, Keywords Bidirectional Encoder Representations from Transformers (KeyBERT), and Rapid Automatic Keywords Extraction (Rake). Results: In 2016 we analysed 630 sentences, whereas in 2021 there were 1,685 sentences. The TF-IDF algorithm showed that in 2016, 26 terms appeared with a frequency of 0.08 or greater, while in 2021, 27 terms met this criterion. The LDA algorithm defined two groups. The first included terms related to things the population would receive (e.g., 'insurance'), while the second included terms about the health system (e.g., 'capacity'). In 2021, most of the government plans belonged to the second group. The TextRank analysis provided keywords showing that 'universal health coverage' appeared frequently in 2016, while in 2021 keywords about the COVID-19 pandemic were often found. The KeyBERT algorithm provided keywords based on the context of the text.
These keywords identified some underlying characteristics of the political party (e.g., position on the political spectrum, such as left-wing). The Rake algorithm delivered phrases, in which we found 'universal health coverage' in 2016 and 2021. Conclusion: The NLP analysis could be used to inform on the underlying priorities in each government plan. NLP analysis could also be included in research on health policies and politics during general elections and provide informative summaries for the general population.

Publication: On the Significance of Graph Neural Networks With Pretrained Transformers in Content-Based Recommender Systems for Academic Article Classification
(Wiley, 2025-05-27) Liu, Jiayun; Castillo-Cara, Manuel; García Castro, Raúl; CYTED Ciencia y Tecnología para el Desarrollo and Comunidad de Madrid.
Recommender systems are tools for interacting with large and complex information spaces by providing a personalised view of such spaces, prioritising items that are likely to be of interest to the user. In addition, they serve as a significant tool in academic research, helping authors select the most appropriate journals for their academic articles. This paper presents a comprehensive study of various journal recommender systems, focusing on the synergy of graph neural networks (GNNs) with pretrained transformers for enhanced text classification. Furthermore, we propose a content-based journal recommender system that combines a pretrained Transformer with a Graph Attention Network (GAT) using title, abstract and keywords as input data. The proposed architecture enhances text representation by forming graphs from the Transformers' hidden states and attention matrices, excluding padding tokens. Our findings highlight that this integration improves the accuracy of the journal recommendations and reduces the transformer oversmoothing problem, with RoBERTa outperforming BERT models. Furthermore, excluding padding tokens from graph construction reduces training time by 8%–15%.
Furthermore, we offer a publicly available dataset comprising 830,978 articles.

Publication: Frost forecasting through machine learning algorithms
(Springer, 2025-01-17) Pérez Tárraga, Javier; Castillo-Cara, Manuel; Arias Antúnez, Enrique; Dujovne, Diego
Agriculture continues to be one of the world's main sources of income and provides great environmental, territorial and social value. However, frost is a recurring problem for farmers each year, representing a significant threat to agricultural production. In a matter of hours, temperatures below the freezing point can result in the loss of nearly a producer's entire crop. In this article, we have analyzed and compared the application of a set of machine learning algorithms to predict the occurrence of frost events in the next 24 hours. The prediction process covers several challenges, such as data capture, processing, extracting each relevant parameter and finally building different prediction models to compare their performance. Furthermore, we have employed the Synthetic Minority Oversampling Technique (SMOTE) methodology to address the issue of imbalanced datasets, given the natural scarcity of frost events during the data sampling period. Our results show that among the machine learning algorithms we compared, the most efficient in terms of Recall score is K-Nearest Neighbor (KNN), while using the Area Under Curve (AUC) criterion, the highest score belongs to the Extra Trees algorithm, with 0.9909.
Moreover, by applying the SMOTE balancing process, the AUC score of our models increased by 13%, while the Recall score increased from 55% to 82%.

Publication: MIMO-Based Indoor Localisation With Hybrid Neural Networks: Leveraging Synthetic Images From Tidy Data for Enhanced Deep Learning
(Institute of Electrical and Electronics Engineers (IEEE), 2025-03-31) Castillo-Cara, Manuel; Martínez-Gómez, Jesús; Ballesteros-Jerez, Javier; García-Varea, Ismael; García-Castro, Raúl; Orozco-Barbosa, Luis
Indoor localization determines an object's position within enclosed spaces, with applications in navigation, asset tracking, robotics, and context-aware computing. Technologies range from WiFi and Bluetooth to advanced systems like Massive Multiple Input-Multiple Output (MIMO). MIMO, initially designed to enhance wireless communication, is now key in indoor positioning due to its spatial diversity and multipath propagation. This study integrates MIMO-based indoor localization with Hybrid Neural Networks (HyNN), converting structured datasets into synthetic images using TINTO. This research marks the first application of HyNNs using synthetic images for MIMO-based indoor localization. Our key contributions include: (i) adapting TINTO for regression problems; (ii) using synthetic images as input data for our model; (iii) designing a novel HyNN with a Convolutional Neural Network branch for synthetic images and a MultiLayer Perceptron branch for tidy data; and (iv) demonstrating improved results and metrics compared to prior literature.
These advancements highlight the potential of HyNNs in enhancing the accuracy and efficiency of indoor localization systems.

Publication: Identification of antibiotic resistance profiles in diabetic foot infections: A machine learning proof‑of‑concept analysis
(Springer, 2025-04-12) Carrillo Larco, Rodrigo M.; Mori Orrillo, Edmundo de Elvira; Castillo-Cara, Manuel; García, Raúl; Yovera-Aldana, Marlon; Bernabe-Ortiz, Antonio
BACKGROUND: Diabetic foot infections (DFIs) are a prevalent diabetes-related complication. Managing DFIs requires timely antibiotic treatment, but identifying the best antibiotic often depends on microbiological cultures, which can take days and may be unavailable or prohibitively expensive in resource-limited settings. We aimed to develop a classification model that uses readily available clinical and laboratory data to differentiate between DFIs that are Gram+ resistant, Gram- resistant, or neither. METHODS: We used retrospective data from patients treated for DFIs at a hospital in Lima, Peru. Gram+ multidrug-resistant bacteria (MDRB) included MDR species of Staphylococcus aureus, other Staphylococcus and Enterococcus, whereas Gram- MDRB included MDR species of Enterobacteriaceae, Pseudomonas, and Acinetobacter. Twenty clinical (e.g., Wagner classification) and laboratory (e.g., HbA1c) variables were used as predictors in an XGBoost model, which was internally validated. RESULTS: There were 147 patients, predominantly male (75.1%), with a mean age of 59.7 years. Of these, 19.7% had no MDRB, 34.0% had Gram+ MDRB, and 46.3% had Gram- MDRB. The model achieved an overall F1 score of 83.9%. The highest precision (91.8%) was observed for the Gram- class; the highest recall (93.3%) was observed for the Gram+ class. The Gram+ class was correctly classified 75% of the time; the Gram- class had a correct classification rate of 90%. CONCLUSIONS: Our work suggests it is possible to distinguish between DFIs that are non-MDR, Gram+ MDR, or Gram- MDR using readily available information.
Although further validation is required, this model offers promising evidence for a digital bedside tool to guide empirical antibiotic treatment for DFIs.
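
Several of the abstracts listed here report per-class precision, recall and F1 score for multi-class classifiers. As a reading aid, the following minimal Python sketch shows how such figures are commonly derived from true and predicted labels. The helper `per_class_metrics` and the toy labels are hypothetical and chosen only for illustration; they are not data from any of the studies, and the abstracts do not state how the authors computed their metrics.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen.

    Hypothetical helper for illustration only.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    metrics = {}
    for label in labels:
        predicted = tp[label] + fp[label]   # times the class was predicted
        actual = tp[label] + fn[label]      # times the class truly occurred
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Toy labels (hypothetical, not study data).
y_true = ["none", "gram+", "gram-", "gram-", "gram+", "none", "gram-", "gram+"]
y_pred = ["none", "gram+", "gram-", "gram+", "gram+", "gram-", "gram-", "gram+"]

metrics = per_class_metrics(y_true, y_pred)
for label, (precision, recall, f1) in metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under these definitions, recall for a class is the share of its true members that were classified correctly, so a high precision for one class and a high recall for another can coexist in the same model.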