Modification of information reduction processes in Convolutional Neural Networks

  1. Rodríguez Martínez, Iosu
Supervised by:
  1. Humberto Bustince Sola Supervisor
  2. Francisco Herrera Triguero Supervisor
  3. Zdenko Takáč Supervisor

Defending university: Universidad Pública de Navarra

Defence date: 11 October 2024

Type: Thesis

Abstract

Over the last decade, deep artificial neural networks have established themselves as the state-of-the-art solution for complex tasks such as image processing, time-series forecasting, and natural language processing. One of the most studied families of artificial neural networks is that of Convolutional Neural Networks (CNNs), which exploit the local information of data sources such as images by automatically extracting increasingly complex features in a hierarchical manner. Plenty of work has been dedicated to introducing more complex (or more efficient) CNN architectures, to solving the optimisation problems they face and accelerating training convergence, and to interpreting their inner workings and explaining their predictions. One key aspect of these models is nevertheless sometimes overlooked: feature fusion. Feature fusion appears in many forms in CNNs. Feature downsampling is necessary to compress the intermediate representations generated by the model while preserving the most relevant information, a process which also makes models robust to small shifts in the inputs. Combining different sources of data or different feature representations is also a recurrent problem in neural networks, and is usually handled by simply allowing the model to learn additional transformations in a supervised manner, which increases its parameter count. In this dissertation, we study the application of solutions from the Information Fusion field to better tackle these problems. In particular, we explore the use of aggregation functions, which replace a set of input values with a single suitable representative. We study the most important properties of these functions in the context of CNN feature reduction, and present novel pooling and global pooling proposals inspired by our findings.
We also test the suitability of our proposals for the detection of COVID-19 patients, presenting an end-to-end pipeline which automatically analyses chest X-ray images.
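
As an illustration of the idea of pooling with aggregation functions, the following minimal Python sketch applies an arbitrary aggregation function to each non-overlapping window of a single feature map, and to the whole map for global pooling. It is not taken from the thesis: the names `pool2d`, `global_pool` and `owa` are hypothetical, and the ordered weighted averaging (OWA) operator is used only as one well-known example of an aggregation function, not necessarily one of the proposals the dissertation presents.

```python
from statistics import mean
from typing import Callable, List, Sequence

Matrix = List[List[float]]
Aggregation = Callable[[Sequence[float]], float]


def owa(weights: Sequence[float]) -> Aggregation:
    """Build an ordered weighted averaging (OWA) operator (illustrative choice).

    The weights are applied to the inputs sorted in decreasing order, so
    (1, 0, ..., 0) recovers the maximum and uniform weights recover the mean.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")

    def aggregate(values: Sequence[float]) -> float:
        if len(values) != len(weights):
            raise ValueError("expected one weight per input value")
        ordered = sorted(values, reverse=True)
        return sum(w * v for w, v in zip(weights, ordered))

    return aggregate


def pool2d(feature_map: Matrix, aggregation: Aggregation, window: int = 2) -> Matrix:
    """Replace each non-overlapping window x window patch by a single value."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - window + 1, window):
        pooled_row = []
        for c in range(0, cols - window + 1, window):
            # Flatten the patch so any aggregation over a list of values applies.
            patch = [feature_map[r + i][c + j]
                     for i in range(window) for j in range(window)]
            pooled_row.append(aggregation(patch))
        pooled.append(pooled_row)
    return pooled


def global_pool(feature_map: Matrix, aggregation: Aggregation) -> float:
    """Reduce a whole feature map to one representative value."""
    return aggregation([v for row in feature_map for v in row])


feature_map = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, 13, 14],
    [11, 12, 15, 16],
]

max_pooled = pool2d(feature_map, max)    # standard max pooling
avg_pooled = pool2d(feature_map, mean)   # standard average pooling
owa_pooled = pool2d(feature_map, owa([0.5, 0.5, 0.0, 0.0]))  # mean of the two largest values
```

Under these assumptions, max pooling and average pooling are the special cases obtained by passing `max` or `mean`, while other aggregation functions can be swapped in without changing the pooling loop itself.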