Multi-objective Model-based Policy Search for Data-efficient Learning with Sparse Rewards

Rituraj Kaushik, Konstantinos Chatzilygeroudis, Jean-Baptiste Mouret

Paper, 2018-06-25, English

machine learning, finance, arXiv

Description

The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. However, the current algorithms lack an effective exploration strategy to deal with sparse or misleading reward scenarios: if they do not experience any state with a positive reward during the initial random exploration, it i...