Vision models for video surveillance tasks on embedded systems

Author:
  1. Fernández Sánchez, Enrique Jaime

Supervised by:
  1. Eduardo Ros Vidal (Supervisor)
  2. Javier Díaz Alonso (Co-supervisor)

Defended at: Universidad de Granada

Defense date: 28 June 2013

Examination committee:
  1. Fernando Vargas Martín (Chair)
  2. Manuel Rodríguez Álvarez (Secretary)
  3. Ignacio Bravo Muñoz (Member)
  4. Antonio Martínez Álvarez (Member)
  5. Bernd Porr (Member)

Type: Doctoral thesis

Abstract

This dissertation presents our work on computer vision models applied to video surveillance tasks. The work focuses on key stages of a video analytics system, such as video segmentation and object tracking, with a special focus on embedded devices. The dissertation is structured in four parts.

In the first part, we review the state of the art, paying special attention to background subtraction and object tracking. After a general review, we analyze in more detail the existing literature on embedded hardware, sensor fusion, and multi-camera object tracking.

In the second part, we focus on background subtraction algorithms on embedded hardware. We describe two algorithms implemented on FPGA and assess the two proposed architectures, comparing them with previous alternatives in the literature. We evaluate performance with respect to real-time constraints, hardware resource usage, and energy consumption, as well as segmentation quality.

The third part covers our work on sensor fusion applied to background subtraction. We study the integration of depth information, provided by multi-camera stereo vision systems and active depth sensors, into a background subtraction model. Different fusion methods are studied and evaluated on new datasets. The proposed approaches achieve considerable improvement over previous algorithms.

The fourth part addresses multi-camera object tracking using a smartphone network. We describe a prototype architecture that performs object tracking with two cameras, exchanging information between devices in real time. The information shared across the network is used to obtain a basic calibration of the scene, and to resolve occlusions and recover lost tracks by globally aggregating information from every device. We also describe an attention model, included in the architecture, that detects and selects objects of interest.
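To make the background subtraction stage described above concrete, the following is a minimal sketch of a pixel-wise running-average background model, a common baseline in this family of algorithms. It is an illustration only, not the thesis's FPGA implementation; the function names, the update rate `alpha`, and the threshold value are assumptions.

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    # Exponential running average: the model adapts slowly to scene
    # changes, so transient foreground objects do not corrupt it quickly.
    return (1.0 - alpha) * bg + alpha * frame

def segment(bg, frame, threshold=25.0):
    # A pixel is foreground when it deviates from the background
    # model by more than a fixed intensity threshold.
    return np.abs(frame - bg) > threshold

# Illustrative usage on a tiny synthetic grayscale frame.
bg = np.full((4, 4), 100.0)
frame = bg.copy()
frame[1, 1] = 200.0        # a bright "object" appears at one pixel
mask = segment(bg, frame)  # True only where the object appeared
bg = update_background(bg, frame)
```

Per-pixel update and thresholding of this kind map naturally onto FPGA pipelines, since each pixel is processed independently with a fixed amount of arithmetic.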
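The sensor-fusion part studies several ways of combining color and depth cues. As one simplified example of such a fusion rule (not the specific methods evaluated in the thesis), a per-pixel mask combination might trust depth where the sensor returned a valid measurement and fall back to color elsewhere; active depth sensors typically report invalid pixels near edges and on reflective surfaces. The function name and the rule itself are illustrative assumptions.

```python
import numpy as np

def fuse_masks(color_mask, depth_mask, depth_valid):
    # Where the depth sensor produced a valid measurement, use the
    # depth-based foreground decision; elsewhere use the color-based one.
    return np.where(depth_valid, depth_mask, color_mask)

# Illustrative usage: 2x2 foreground masks from each modality.
color = np.array([[True, False], [False, True]])
depth = np.array([[False, True], [False, False]])
valid = np.array([[True, True], [False, False]])  # row 1: depth missing
fused = fuse_masks(color, depth, valid)
```

A rule like this exploits the main advantage of depth over color, namely robustness to shadows and camouflage, while degrading gracefully where depth data is missing.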