DIP-RL: Demonstration-Inferred Preference Learning in Minecraft

Ellen Novoseller, Vinicius G. Goecks, David Watkins, Josh Miller, Nicholas Waytowich

Paper, 2023-07-22, English

machine learning, finance, arxiv

Description

In machine learning for sequential decision-making, an algorithmic agent learns to interact with an environment while receiving feedback in the form of a reward signal. However, in many unstructured real-world settings, such a reward signal is unknown and humans cannot reliably craft a reward signal that correctly captures desired behavior. To solve tasks in such unstructured and open-ended environments, we present Demonstration-Inferred Preference Reinforcement Learning (DIP-RL), an algorithm t...