Thermographic Breast Cancer Detection. Deep Learning with a Small Dataset

Safont Andreu, Anna (2020). Thermographic Breast Cancer Detection. Deep Learning with a Small Dataset. Master's thesis, Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial.

Files
Name Description MIME type Size
Safont_Andreu_Anna_TFM.pdf Safont_Andreu_Anna_TFM.pdf application/pdf 1.61MB

Title Thermographic Breast Cancer Detection. Deep Learning with a Small Dataset
Author(s) Safont Andreu, Anna
Abstract According to the World Health Organization (WHO), breast carcinoma is the cancer with the highest prevalence among women, with 2.1 million new diagnoses every year. Given the risk of death associated with metastasis in the late stages of the disease, early detection is the best strategy for reducing mortality. Among the numerous tests that can be used in breast cancer screening, thermography is non-invasive, painless, and free of ionizing radiation. The research group in which I carried out this work is interested in applying artificial intelligence to the analysis of thermographic images for breast cancer screening. Given that the project that this group intends to carry out in collaboration with HM Hospitales has not yet begun, we have used in this master's thesis the Database for Mastology Research (DMR), developed at the Visual Lab of the Universidade Federal Fluminense, in Brazil, which is the only publicly available dataset of breast thermograms. It contains 216 patients, with up to 25 images per patient. It has been studied in dozens of research works, most of them using statistical feature extraction and machine learning algorithms for classification. Unfortunately, this database has important flaws, such as two different patients having exactly the same image (pixel by pixel), which have not been mentioned in previous works. For this reason we devoted significant effort to cleaning the dataset, which reduced it to only 188 images. We then tried several deep learning models for image classification. We first built from scratch several Convolutional Neural Networks (CNNs), each consisting of n pairs of convolutional-maxpool layers, a flatten layer, and n dense layers, for different values of n. All the CNNs gave poor results: the highest accuracy, obtained for n = 4, was 75%, and the largest area under the ROC curve (AUC), obtained for n = 5, was 0.70.
We also took into account that a false positive, which may cause anxiety and discomfort to the patient and lead to a biopsy, is not as serious as a false negative, which may delay the detection of cancer, thus requiring more aggressive and expensive treatments and drastically reducing the survival rate. After consulting with a radiologist at the HM Montepríncipe hospital, we estimated that the relative cost of a false negative is at least 20 times higher than that of a false positive, and we defined a metric in which a false negative weighs the same as 20 false positives. In our study, the CNN with n = 5 has by far the smallest weighted error, so we selected this network as the reference for the next phases of our study. In the second group of experiments we used three of the most popular pre-trained CNNs available in Keras (VGG16, VGG19, and ResNet50) and optimized their parameters for our dataset; this process is usually called transfer learning. Contrary to other results published in the literature, all these re-trained CNNs performed worse than the optimal network built from scratch, i.e., the one with n = 5. Finally, we built several hybrid models by replacing the top m layers of the optimal CNN with either a Support Vector Machine (SVM) or a Sum-Product Network (SPN), for different values of m. Again, the performance was lower than for the optimal pure CNN. The conclusion is that when the dataset contains a relatively small number of images, large CNNs tend to overfit, thus leading to poor AUCs, contrary to the case of large datasets, for which very deep networks usually perform much better than shallow ones.
An additional reason why transfer learning did not work in our study is that the above-mentioned networks were trained on color images, whereas in a thermogram each pixel represents not a red-green-blue (RGB) color but a temperature. For this reason, in our case the networks built from scratch (at least some of them) performed better than the re-trained CNNs.
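The weighted error mentioned in the abstract, in which one false negative counts as much as 20 false positives, can be illustrated with a minimal sketch. The exact formula used in the thesis is not given in this record, so the unnormalized cost FP + 20 × FN below is an assumption, and the function names, labels, and toy predictions are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 = cancer and 0 = healthy."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def weighted_error(y_true, y_pred, fn_weight=20.0):
    """Assumed cost: each false positive costs 1, each false negative costs fn_weight."""
    _, fp, _, fn = confusion_counts(y_true, y_pred)
    return fp + fn_weight * fn


# Toy data (hypothetical): two patients with cancer, four healthy patients.
y_true = [1, 1, 0, 0, 0, 0]
model_a = [1, 0, 0, 0, 0, 0]  # misses one cancer: 1 false negative, 5/6 correct
model_b = [1, 1, 1, 1, 0, 0]  # two false alarms: 2 false positives, 4/6 correct

print(weighted_error(y_true, model_a))  # cost dominated by the missed cancer
print(weighted_error(y_true, model_b))  # lower cost despite lower accuracy
```

Under this assumed cost, the model with the lower accuracy has the lower weighted error, which is consistent with the abstract's observation that the most accurate network (n = 4) was not the one with the smallest weighted error (n = 5).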
Additional notes Master's thesis (Trabajo de Fin de Máster). Máster Universitario en I.A. Avanzada: Fundamentos, Métodos y Aplicaciones. UNED
Subject(s) Computer Engineering
Publisher(s) Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial.
Supervisor(s) Sánchez Cauce, Raquel
Díez Vegas, Francisco Javier
Date 2020-03-06
Format application/pdf
Identifier bibliuned:master-ETSInformatica-IAA-Asafont
Language eng
Publication version acceptedVersion
Access level and license
Resource type master Thesis
Access type Open access

Created: Mon, 20 Sep 2021, 20:03:06 CET