Evolutionary computation for multitask and meta reinforcement learning: new methods and perspectives towards general-purpose Artificial Intelligence

Author:
  1. Martínez Quintana, Aritz David

Supervised by:
  1. Javier del Ser Lorente, Co-supervisor
  2. Francisco Herrera Triguero, Co-supervisor

Defense university: Universidad de Granada

Defense date: 14 April 2023

Examination committee:
  1. Óscar Cordón García, Chair
  2. Daniel Molina Cabrera, Secretary
  3. María José del Jesús Díaz, Member
  4. Antonio Jesús Nebro Urbaneja, Member
  5. Miren Nekane Bilbao Maron, Member

Type: Thesis

Abstract

Currently, Big Data techniques and Deep Learning are changing the way humankind interacts with technology. From content recommendation to technologies capable of creating art, the ubiquity of neural networks is evident today and is expected to grow in the medium to long term. Given the diversity of fields where Deep Learning is applied nowadays, it is appealing to extrapolate or “reuse” the knowledge generated in one problem to solve other related problems proficiently, efficiently and quickly. This procedure, known as Transfer Learning, is widely used in modeling tasks that rely on Deep Learning models. A paradigm in which knowledge transfer between tasks has proven very effective is Reinforcement Learning. Indeed, Transfer Learning addresses several inherent weaknesses in the learning process of an agent: the sample efficiency when exploring the environment to be solved, or the possibility that the agent’s training may get stuck in sub-optimal policies. Besides techniques traditionally used to alleviate these drawbacks, such as the use of multiple agents or mechanisms to induce behavioral curiosity, evolutionary computation has been shown to give rise to efficient hybrid training procedures for developing Reinforcement Learning agents suited to challenging environments.

In this context, this Thesis studies how evolutionary computation can help Reinforcement Learning models based on Deep Learning to adapt quickly to new scenarios through the reuse of knowledge generated in previous modeling problems. For this purpose, the research focus is placed on a recently appeared branch of evolutionary computation known as multi-factorial algorithms. Techniques belonging to this family of evolutionary optimization methods solve several problem instances simultaneously, taking advantage of possible synergies between their search spaces and/or solutions. The Thesis starts from the observation that the training process of a Reinforcement Learning model based on Deep Learning can be formulated as an optimization problem, and can therefore be tackled with evolutionary computation. This observation paves the way towards the possibility that, in multitask Reinforcement Learning scenarios, the aforementioned multi-factorial algorithms can automate the exchange of the knowledge modeled for each task among the agents addressing those tasks.

This first research hypothesis is complemented by a second idea: the generation of knowledge generalizable to new Reinforcement Learning tasks from the simultaneous training of agents on previous Reinforcement Learning tasks. In particular, the Thesis focuses on the zero-shot assumption, under which nothing can be known beforehand about the new tasks to be addressed, nor can the model be updated with information collected from these tasks at inference time. This scenario, also tackled through evolutionary computation and multi-factorial algorithms, represents a step forward towards Artificial Intelligence models able to generate knowledge that allows them to adapt autonomously and efficiently to new tasks, advancing steadily towards a new paradigm: General-Purpose Artificial Intelligence (GPAI).
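To make the first observation concrete, that training a Deep Reinforcement Learning policy can be cast as an optimization problem amenable to evolutionary computation, the following minimal sketch evolves the weights of a toy linear policy, treating the episodic return as the fitness function. The 1-D environment, the (mu, lambda)-style loop and all constants are illustrative assumptions, not the actual methods or benchmarks of the Thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def episodic_return(weights, horizon=50):
    """One episode of a toy 1-D environment under a linear policy."""
    w, b = weights
    state, total = -1.0, 0.0
    for _ in range(horizon):
        action = np.tanh(w * state + b)               # action in [-1, 1]
        state = np.clip(state + 0.1 * action, -2.0, 2.0)
        total += -abs(state)                          # reward: stay near 0
    return total

# Simple (mu, lambda)-style loop: the genome is the policy's parameters,
# fitness is the return of one rollout.
population = rng.normal(size=(20, 2))                 # 20 genomes, 2 params
for generation in range(100):
    fitness = np.array([episodic_return(g) for g in population])
    parents = population[np.argsort(fitness)[-5:]]    # elitist selection
    population = np.repeat(parents, 4, axis=0)        # 5 parents x 4 children
    population += 0.1 * rng.normal(size=population.shape)  # Gaussian mutation

best = max(population, key=episodic_return)
print("best parameters:", np.round(best, 3), "return:", episodic_return(best))
```

Because fitness only requires running the policy, this formulation sidesteps gradient estimation entirely, which is what makes hybridizing evolutionary search with Reinforcement Learning training attractive.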
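The multi-factorial algorithms mentioned in the abstract keep a single population in a unified search space, assigning each individual a skill factor (the task it addresses); crossover between individuals with different skill factors then acts as an implicit knowledge-transfer channel. The sketch below is a heavily simplified MFEA-style illustration of that idea on two toy quadratic tasks; the random mating probability rmp, the tasks and the selection scheme are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
tasks = [lambda x: -np.sum((x - 0.2) ** 2),   # task 0: optimum near  0.2
         lambda x: -np.sum((x + 0.5) ** 2)]   # task 1: optimum near -0.5
rmp, pop_size, dim = 0.3, 40, 3               # rmp: random mating probability

pop = rng.uniform(-1, 1, size=(pop_size, dim))   # unified search space
skill = rng.integers(0, 2, size=pop_size)        # skill factor of each genome

for generation in range(200):
    kids, kid_skill = [], []
    while len(kids) < pop_size:
        i, j = rng.choice(len(pop), size=2, replace=False)
        if skill[i] == skill[j] or rng.random() < rmp:
            # Arithmetic crossover; across skill factors this is the
            # knowledge-transfer channel between tasks.
            a = rng.random()
            kids.append(a * pop[i] + (1 - a) * pop[j])
        else:
            kids.append(pop[i] + 0.05 * rng.normal(size=dim))  # mutation only
        # Vertical cultural transmission: the child inherits a parent's task.
        kid_skill.append(skill[i] if rng.random() < 0.5 else skill[j])
    merged = np.vstack([pop, kids])
    merged_skill = np.concatenate([skill, kid_skill])
    # Each genome is evaluated only on its own task; survivors are selected
    # per task so that neither task's subpopulation dies out.
    fit = np.array([tasks[s](x) for x, s in zip(merged, merged_skill)])
    keep = np.concatenate(
        [np.flatnonzero(merged_skill == t)
         [np.argsort(fit[merged_skill == t])][-pop_size // 2:]
         for t in (0, 1)])
    pop, skill = merged[keep], merged_skill[keep]

for t in (0, 1):
    best = max((x for x, s in zip(pop, skill) if s == t), key=tasks[t])
    print(f"task {t}: best solution {np.round(best, 3)}")
```

In the canonical MFEA formulation, survivor selection uses a scalar fitness derived from per-task factorial ranks; the per-task selection above is a simplification with the same intent.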
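Finally, the zero-shot scenario can be illustrated as follows: a policy is evolved on a set of training tasks, then frozen and scored on an unseen task without any parameter update at inference time. The goal-reaching toy tasks and the multitask fitness (the sum of returns over the training tasks) are, again, assumptions made for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(2)

def episode_return(weights, goal, horizon=50):
    """Score a linear policy on a 1-D 'move to the goal' task."""
    w, b = weights
    state, total = 0.0, 0.0
    for _ in range(horizon):
        state += 0.1 * np.tanh(w * (goal - state) + b)  # move toward goal
        total += -abs(goal - state)                     # reward: proximity
    return total

train_goals = [-1.0, 0.0, 1.0]   # tasks seen during evolution
test_goal = 0.5                  # unseen task, addressed zero-shot

def multitask_fitness(g):
    # Fitness aggregates the returns over all training tasks.
    return sum(episode_return(tuple(g), goal) for goal in train_goals)

population = rng.normal(size=(20, 2))
for generation in range(100):
    fitness = np.array([multitask_fitness(g) for g in population])
    parents = population[np.argsort(fitness)[-5:]]
    population = np.repeat(parents, 4, axis=0) + 0.1 * rng.normal(size=(20, 2))

best = max(population, key=multitask_fitness)
# Zero-shot evaluation: the frozen policy is scored on the unseen goal,
# with no parameter updates at inference time.
print("zero-shot return on unseen task:", episode_return(tuple(best), test_goal))
```

The point of the toy setup is that the policy conditions on the distance to the goal rather than on any one goal, so knowledge acquired on the training tasks carries over to the unseen one with no adaptation, which is the behavior the zero-shot assumption demands.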