The Optimal Reward Problem: Designing Effective Reward for Bounded Agents
In an autonomous agent design scenario, the agent designer has goals which must be translated into goals for the agent. In the Reinforcement Learning (RL) framework, goals are represented using reward functions, and typically a designer provides the agent with his or her own reward function. However, I show that if an agent is bounded, that is, limited in its ability to maximize the designer's expected reward, the designer may benefit by assigning the agent a reward function of its own, distinct from the designer's. Thus, a designer faces the Optimal Reward Problem (ORP): choose the agent reward function that leads to the greatest expected reward received by the designer.
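One way to write the ORP down as an optimization problem is sketched below. The notation is an assumption introduced here for illustration and does not appear in the abstract: \mathcal{R} is the set of candidate agent reward functions, r_D is the designer's reward function, A(r) is the bounded agent when given reward function r, M is the environment, h is a history generated by their interaction, and U_{r_D}(h) is the designer's cumulative reward over h.

```latex
r^{*} \;=\; \arg\max_{r \in \mathcal{R}} \;
  \mathbb{E}\!\left[\, U_{r_D}(h) \;\middle|\; h \sim \langle A(r),\, M \rangle \,\right]
```

Under this reading, an unbounded agent would make r^{*} = r_D a safe choice, while for a bounded agent the maximizer can differ from r_D.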
Agents are subject to many kinds of limitations arising from their limited computational resources, such as bounds on planning depth or on the size of their state or model representation. Good reward functions are chosen by assessing how an agent's limitations interact with the environment. This dissertation proposes novel reward features such as the model-inaccuracy reward feature, which motivates an agent to avoid areas it has trouble modeling. I also derive the variance-based reward bonus, which produces provably efficient exploration in agents that maintain knowledge in the form of a Bayesian posterior but have bounded planning resources.
For empirical optimization, I develop the Policy Gradient for Reward Design (PGRD) algorithm, a convergent gradient ascent algorithm that learns good reward functions online during a planning agent's lifetime. I apply PGRD to UCT, a large-scale planning algorithm, in Othello, a two-player game with a long history in AI. The Othello experiments and others demonstrate that the optimal reward approach outperforms the common practice of applying an evaluation function to the leaf-states of the planning tree. These experiments also demonstrate that reward functions from the popular class of potential-based shaping reward functions are not always optimal. Reward design is a promising, general approach for improving agents with limited computational resources.