Work in Progress: 20200318

Implementation of an Automatic Camera Operator using Deep Reinforcement Learning

(gl) Implementación dun operador de cámara automático usando Deep Reinforcement Learning
(es) Implementación de un operador de cámara automático usando Deep Reinforcement Learning

Student

Adrián Rodríguez Louzán

Supervision

Luis Omar Álvarez Mures (Cinfo, UDC)
Francisco Javier Taibo Pena (UDC)
Emilio José Padrón González (UDC)

Brief description

Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. The problem, due to its generality, is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics and genetic algorithms. In the operations research and control literature, reinforcement learning is called approximate dynamic programming, or neuro-dynamic programming [1] [2].
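As a minimal sketch of what "cumulative reward" can mean in practice, the snippet below computes a discounted return from a list of per-step rewards. The discount factor `gamma` and the function name `discounted_return` are assumptions introduced for illustration; the proposal does not fix a particular reward formulation.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum over t of gamma**t * rewards[t].

    gamma is an assumed discount factor (not specified in the proposal).
    The sum is accumulated from the last step backwards.
    """
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three steps with reward 1.0 each, discounted by 0.5 per step.
example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

An agent that maximizes this quantity prefers rewards that arrive sooner, which is one common way to formalize the objective described above.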

The problems of interest in reinforcement learning have also been studied in the theory of optimal control, which is concerned mostly with the existence and characterization of optimal solutions, and algorithms for their exact computation, and less with learning or approximation, particularly in the absence of a mathematical model of the environment. Reinforcement learning algorithms that incorporate deep learning have beaten world champions at the game of Go and human experts at numerous Atari video games. Games may seem a trivial benchmark, but these results are a vast improvement over earlier accomplishments, and the state of the art is progressing rapidly.

A game engine is a software-development environment designed for people to build video games. Developers use game engines to construct games for consoles, mobile devices, and personal computers. The core functionality typically provided by a game engine includes a rendering engine ("renderer") for 2D or 3D graphics, a physics engine or collision detection (and collision response), sound, scripting, animation, artificial intelligence, networking, streaming, memory management, threading, localization support and a scene graph, and it may also include video support for cinematics. Implementers often economize on the process of game development by reusing or adapting, in large part, the same game engine to produce different games [3] or to aid in porting games to multiple platforms. Since agents need an accurate virtual representation of the task at hand, 3D engines are well suited to creating simulated environments in which many agents can be trained in parallel. This lets us make full use of the available computational resources to accelerate training.
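The idea of training against several copies of a simulated environment can be sketched as follows. Everything here is hypothetical: `ToyPanEnv` is a one-dimensional stand-in for a scene (a target drifts, a camera pans), not the 3D environment the project will build, and the copies are stepped one after another in a loop rather than in separate processes or engine instances.

```python
import random


class ToyPanEnv:
    """Hypothetical 1D stand-in for a sports scene: a target drifts, a camera pans."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.target = self.rng.uniform(-1.0, 1.0)
        self.camera = 0.0
        return (self.target, self.camera)

    def step(self, action):
        # The action is a pan velocity, clipped to [-0.1, 0.1].
        self.camera += max(-0.1, min(0.1, action))
        self.target += self.rng.uniform(-0.05, 0.05)
        # Assumed reward: the closer the camera is to the target, the higher.
        reward = -abs(self.target - self.camera)
        return (self.target, self.camera), reward


def step_all(envs, policy):
    """Step every environment copy once with the same policy; return the rewards."""
    rewards = []
    for env in envs:
        observation = (env.target, env.camera)
        _, reward = env.step(policy(observation))
        rewards.append(reward)
    return rewards


def follow_target(observation):
    """Hand-written baseline policy: pan toward the target."""
    target, camera = observation
    return target - camera


envs = [ToyPanEnv(seed) for seed in range(4)]
rewards = step_all(envs, follow_target)
```

Each copy has its own random seed, so one policy gathers experience from several different trajectories per step. In a real setup the copies would run concurrently, which is where the extra computational resources are used.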

In this project, we will leverage Deep Reinforcement Learning and game engines to model a typical sports scene and teach an agent to capture the action on camera. First, a scene which resembles a sports match will be created using the chosen game engine. Next, we will test different Deep Reinforcement Learning algorithms (Discrete-Action DQN, Parametric-Action DQN, Double DQN, Dueling DQN, Dueling Double DQN, Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC)) to see which one fits this problem best.
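To illustrate how two of the listed variants differ, the sketch below contrasts the bootstrap target of standard DQN with that of Double DQN for a single transition. The function names, the plain-list representation of Q-values and the default `gamma` are assumptions made for this example; an actual implementation would operate on batched tensors in the chosen framework.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for a next state with three discrete actions.
next_q_online = [0.0, 3.0, 1.0]
next_q_target = [2.0, 1.0, 4.0]

standard = dqn_target(1.0, next_q_target, gamma=0.5)
double = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5)
```

With these values the standard target bootstraps from the largest target-network estimate (action 2), while Double DQN bootstraps from the target-network estimate of the action preferred by the online network (action 1). Decoupling selection from evaluation in this way is intended to reduce overestimation of Q-values.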

Potential application domains include broadcasting, games and surveillance. The solutions developed in this project will be integrated into an open source game engine, probably Godot [4].

[1] Reinforcement Learning and Markov Decision Processes.
Martijn van Otterlo, Marco Wiering.
In: Wiering M., van Otterlo M. (eds) Reinforcement Learning. Adaptation, Learning, and Optimization,
vol 12. Springer, Berlin, Heidelberg. 2012.
DOI: 10.1007/978-3-642-27645-3_1

[2] Reinforcement Learning: A Survey.
Leslie P. Kaelbling, Michael L. Littman, Andrew W. Moore.
Journal of Artificial Intelligence Research 4, pp. 237-285. 1996.
DOI: 10.1613/jair.301

[3] 3D Game Engine Programming (Game Development Series).
Stefan Zerbst, Oliver Düvel.
Course Technology PTR; 1st edition (June 30, 2004).
ISBN-10: 1592003516

[4] Godot Game Engine.
https://godotengine.org

Specific objectives

  • The main objective of this project is to develop the described environment in a game engine and train an RL agent to solve the aforementioned task.

  • The student will explore innovative Deep Learning methods.

  • The implemented Automatic Camera Operator will be integrated into an open source game engine.

Methodology

An Agile development method will guide the project, with relatively short sprints to complete the different tasks, following a preliminary phase of study and documentation.

Development steps

  • Analysis of requirements and project scheduling, according to the student's availability.

  • Study and documentation.

    • Game engines.
    • TensorFlow, Horizon.

  • Incremental, iterative work sequences (sprints) to develop a 3D environment that models a sport.

  • Incremental, iterative work sequences (sprints) to develop a Deep Reinforcement Learning approach for imitating a camera operator.

Material

  • Personal computer with GPU and internet access.