22/02: Call for a student!

Crop Models for Reinforcement Learning at Scale

(gl) Modelos de cultivo para aprendizaxe automática por reforzo a gran escala
(es) Modelos de cultivo para aprendizaje automático por refuerzo a gran escala

Student

  • Available as a final-year project: BSc or MSc in Computer Science
    (TFG Grao en Enxeñería Informática or TFM Mestrado en Enxeñería Informática)

Supervision

Romain Gautron (PhD student, SequeL Team, INRIA Lille)
Bruno Raffin (DataMove Team, Inria Rhône-Alpes, Univ. Grenoble Alpes)
Emilio José Padrón González (UDC)

Project already in progress

Main repository: https://gitlab.inria.fr/rgautron/gym_dssat_pdi
Documentation: https://gitlab.inria.fr/rgautron/gym-dssat-docs

Different TFG/TFM proposals are possible in this project context.

Brief description

Reinforcement Learning (RL) is a branch of Machine Learning dealing with sequential decision-making under uncertainty. In an RL problem, an agent has to learn to maximize a score that depends on the sequence of actions it performs. RL algorithms are based on trial and error. An agent observes some measurements from an environment, makes a decision based on those measurements and observes the consequences of that decision. The effects of actions are mostly unpredictable, so decision-making is uncertain. Depending on how good the consequences are, the agent later adjusts its behaviour to make better decisions. [1, 2]
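The trial-and-error loop described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: there are just two actions, no observations, and the reward means, the noise level and the epsilon-greedy rule are made up for the example. It is not one of the algorithms used in the project.

```python
import random


def run_bandit(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a two-action toy problem with noisy rewards."""
    rng = random.Random(seed)
    true_means = [0.2, 0.8]   # hypothetical expected reward of each action
    estimates = [0.0, 0.0]    # the agent's running reward estimate per action
    counts = [0, 0]           # how often each action has been tried

    for _ in range(steps):
        # Decide: explore at random sometimes, otherwise use the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: estimates[a])

        # Observe an uncertain consequence of the decision.
        reward = rng.gauss(true_means[action], 0.1)

        # Adjust behaviour: incremental mean of the rewards seen for this action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


estimates, counts = run_bandit()
best_action = max(range(2), key=lambda a: estimates[a])
print("estimated rewards:", estimates, "preferred action:", best_action)
```

After enough trials the agent ends up choosing the action with the higher average reward most of the time, even though every single reward is noisy.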

Crop management is the series of cultural operations performed in a field to grow a given crop with given characteristics, such as yield and quality criteria. It is an example of sequential decision-making with uncertain results. An agent learning to perform crop management would carry out a series of operations (e.g. sowing, fertilization, irrigation) with uncertain consequences (depending, for instance, on weather and pest events) and observe at the end a score, for instance crop yield. RL is based on trial and error and thus requires many failures. Using RL is of great interest to support highly uncertain crop-management decisions, such as the choice of planting date, especially for smallholder farmers in Sub-Saharan Africa. Nevertheless, crop cycles are long and field experiments are costly, which limits the number of training examples.

Process-based crop models (PBM) are crop growth simulators based on causal models from plant physiology and plant science, allowing robust simulations. The idea is to use crop growth simulators to pre-train and evaluate RL agents learning crop management. RL requires interactive communication with the process-based crop model: reading the field's parameters, making a decision and getting the field's next parameters at each time step of the simulation. Crop models are not directly usable in this configuration: we need to “pause” the simulator to read variables in memory and send data on the fly (e.g. fertilization, irrigation).
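To make the idea of "pausing" a simulator concrete, here is a minimal sketch that uses a Python generator as a stand-in for the crop model. Everything in it is hypothetical: `toy_crop_simulator`, its state variables and its growth rule are invented for illustration and have nothing to do with DSSAT's actual equations or with how the real coupling is implemented.

```python
def toy_crop_simulator(n_days=5):
    """Hypothetical stand-in for a crop model; this is not DSSAT.

    Each `yield` pauses the simulation, exposes the current field state,
    and resumes when the caller sends back that day's fertilizer amount.
    """
    biomass = 0.0
    soil_nitrogen = 10.0
    for day in range(n_days):
        state = {"day": day, "biomass": biomass, "soil_nitrogen": soil_nitrogen}
        fertilizer = yield state          # pause here until a decision arrives
        soil_nitrogen += fertilizer
        uptake = min(soil_nitrogen, 5.0)  # made-up daily uptake cap
        soil_nitrogen -= uptake
        biomass += 2.0 * uptake           # made-up growth rule
    return biomass


def run_season(policy, n_days=5):
    """Drive the simulator one day at a time with a decision function."""
    sim = toy_crop_simulator(n_days)
    state = next(sim)                     # run up to the first pause
    history = [state]
    try:
        while True:
            state = sim.send(policy(state))  # send a decision, get the next state
            history.append(state)
    except StopIteration as finished:
        return finished.value, history    # the generator's return value


no_input, _ = run_season(lambda state: 0.0)
fertilized, states = run_season(lambda state: 5.0)
print("final biomass without fertilizer:", no_input)
print("final biomass with daily fertilizer:", fertilized)
```

The simulator keeps its internal variables between pauses, and the caller can read them and inject a decision at each time step, which is the interaction pattern an RL agent needs.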

We use a Fortran crop model called DSSAT [3]. DSSAT code is sequential. In order to have a widespread, formalized and easy-to-manipulate RL environment, we use the OpenAI Gym framework [4], which is Python-based. The communication between DSSAT and the Gym environment is carried out by the PDI Data Interface [5].
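As a rough picture of what a Gym-style environment looks like from the agent's side, the sketch below follows the classic Gym calling convention, where `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`. The class `ToyCropEnv`, its action set, its weather rule and its reward are assumptions made up for this example; it does not import Gym, wrap DSSAT or use PDI.

```python
import random


class ToyCropEnv:
    """Hypothetical environment using the classic Gym calling convention."""

    def __init__(self, season_length=10, seed=0):
        self.season_length = season_length
        self.rng = random.Random(seed)

    def reset(self):
        """Start a new season and return the first observation."""
        self.day = 0
        self.biomass = 0.0
        return (self.day, self.biomass)

    def step(self, action):
        """Apply one daily decision: 1 = irrigate, 0 = do nothing (made up)."""
        rain = self.rng.random() < 0.3            # uncertain weather
        watered = rain or action == 1
        self.biomass += 1.0 if watered else 0.2   # made-up growth rule
        self.day += 1
        done = self.day >= self.season_length
        reward = -0.1 * action                    # small irrigation cost
        if done:
            reward += self.biomass                # harvest is scored at the end
        return (self.day, self.biomass), reward, done, {"rain": rain}


def run_episode(env, policy):
    """Play one full season and return the accumulated reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total += reward
    return total


always_irrigate = run_episode(ToyCropEnv(), lambda obs: 1)
never_irrigate = run_episode(ToyCropEnv(), lambda obs: 0)
print("always irrigate:", always_irrigate, "never irrigate:", never_irrigate)
```

Because any agent only needs `reset()` and `step()`, the same learning code can be run against a toy environment like this one or against a real simulator exposed through the same interface.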

[TODO] Deploy large-scale distributed RL simulations with RLlib [6], an RL library included in Ray, a Python framework for building distributed applications.

[TODO] New RL algorithms and further improvements.

References:

[1] Lapan, M. (2018). Deep Reinforcement Learning Hands-On: Apply modern RL methods, with deep Q-networks, value iteration, policy gradients, TRPO, AlphaGo Zero and more. Packt Publishing Ltd.

[2] Sutton, R. S., Barto, A. G., et al. (1998). Introduction to Reinforcement Learning, volume 135. MIT Press, Cambridge.

[3] Hoogenboom, G., Porter, C., Boote, K., Shelia, V., Wilkens, P., Singh, U., White, J., Asseng, S., Lizaso, J., Moreno, L., et al. (2019). The DSSAT crop modeling ecosystem. Advances in Crop Modelling for a Sustainable Agriculture, pages 173–216.

[4] OpenAI’s Gym: https://gym.openai.com

[5] The PDI Data Interface: https://pdi.dev/master

[6] Ray and RLlib: https://ray.io

Specific objectives

  • Leverage High Performance Computing (HPC) by running large-scale distributed RL simulations
