Self-learning robot navigation with deep reinforcement learning techniques

Pintos Gómez de las Heras, Borja. (2022). Self-learning robot navigation with deep reinforcement learning techniques. Master's Thesis, Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial.

Files
Name                                    Description                             MIME Type        Size
Pintos_Gomez_delasHeras_Borja_TFM.pdf   Pintos_Gomez_delasHeras_Borja_TFM.pdf   application/pdf  4.27 MB

Title Self-learning robot navigation with deep reinforcement learning techniques
Author(s) Pintos Gómez de las Heras, Borja
Abstract Autonomous driving has always been a challenging task. A large number of sensors mounted on the vehicle analyze the surroundings and provide useful information to the autonomous driving algorithm, such as the relative distances from the vehicle to the different obstacles. Some robotic paradigms, like the reactive paradigm, use this sensory input to directly generate an action for the actuators. This makes the reactive paradigm capable of reacting to unpredictable scenarios with relatively low computational resources. However, it lacks robot motion planning, which can lead to longer and less comfortable trajectories than those of the hierarchical/deliberative paradigm, which includes a motion planning module over a predefined horizon. Although local optimization of the robot trajectory is then possible under static scenarios, the motion planning module comes at a high cost in terms of memory and computational power. The hybrid paradigm combines the reactive and hierarchical/deliberative paradigms to solve even more complex scenarios, such as dynamic scenarios, but the memory and computational resources required are still high. This work presents the sense-think-act-learn robotic paradigm, which aims to inherit the advantages of the reactive, hierarchical/deliberative and hybrid paradigms at a reasonable computational cost. The proposed methodology uses reinforcement learning techniques to learn a policy by trial and error, much as the human brain does. On the one hand, there is no motion planning module, so computational power can be kept low, as in the reactive paradigm. On the other hand, local planning and optimization of the robot trajectory still take place, as in the hierarchical/deliberative and hybrid paradigms. This planning is based on the experience stored during the learning process.
Reactions to sensory inputs are learnt automatically from well-defined reward functions, which map directly to the safety, legal, comfort and task-oriented requirements of the autonomous driving problem. Since motion planning is based on experience, the proposed algorithm is not bound to any embedded model of the vehicle or environment. Instead, it learns directly from the environment (real or simulated) and is therefore unaffected by uncertainties in embedded models or estimators that try to reproduce the dynamics of the vehicle or robot. Additionally, the policy is learnt automatically: state-of-the-art approaches invest many engineering hours in developing a policy or algorithm that fulfils all given requirements, whereas the method proposed in this work saves these costs and this engineering time. Another interesting advantage of the proposed algorithm is its capability to adapt its logic to unknown scenarios. To that end, an online learning process is implemented, although the memory and computational power it requires are high.
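The keywords below list Q-learning among the techniques used. As an illustration only, not the thesis's actual implementation, a minimal tabular Q-learning update driven by a composite reward of the kind the abstract describes (safety, comfort and task-oriented terms) might look like the following sketch; the reward terms, weights, and state/action sizes are all assumptions chosen for the example:

```python
# Hypothetical composite reward: a weighted sum of safety, comfort and
# task-progress terms, echoing the abstract's mapping of driving
# requirements to reward functions. All weights are illustrative.
def composite_reward(collision: bool, jerk: float, progress: float) -> float:
    r_safety = -100.0 if collision else 0.0  # heavy penalty on collision
    r_comfort = -0.1 * abs(jerk)             # penalize abrupt motion
    r_task = 1.0 * progress                  # reward progress toward the goal
    return r_safety + r_comfort + r_task

# Minimal tabular Q-learning update: trial-and-error policy learning
# with no embedded vehicle model, only observed transitions.
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    td_target = r + gamma * max(Q[s_next])   # bootstrapped return estimate
    Q[s][a] += alpha * (td_target - Q[s][a]) # move Q(s,a) toward the target
    return Q

# Toy example: 4 states, 2 actions, one observed transition.
Q = [[0.0, 0.0] for _ in range(4)]
r = composite_reward(collision=False, jerk=2.0, progress=0.5)
Q = q_update(Q, s=0, a=1, r=r, s_next=1)
```

In a deep variant such as the deep deterministic policy gradient also listed in the keywords, the table `Q` would be replaced by neural-network function approximators, but the reward shaping idea is the same.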
Additional notes Master's thesis (Trabajo de Fin de Máster), Máster Universitario en I.A. Avanzada: Fundamentos, Métodos y Aplicaciones, UNED
Subject(s) Computer Engineering
Keywords deep reinforcement learning
self-learning
autonomous driving
deep deterministic policy gradient
Q-learning
dynamic environment
Publisher Universidad Nacional de Educación a Distancia (España). Escuela Técnica Superior de Ingeniería Informática. Departamento de Inteligencia Artificial
Supervisor/Tutor Martínez Tomás, Rafael
Cuadra Troncoso, José Manuel
Date 2022-09-01
Format application/pdf
Identifier bibliuned:master-ETSInformatica-IAA-Bpintos
http://e-spacio.uned.es/fez/view/bibliuned:master-ETSInformatica-IAA-Bpintos
Language eng
Publication version acceptedVersion
Access level and license http://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
Resource type Master's thesis
Access type Open access

 
Access statistics: 287 visits, 58 downloads
Created: Fri, 15 Sep 2023, 17:45:32 CET