Planning With Uncertain Specifications (PUnS)
attributed to: Ankit Shah, Shen Li, Julie Shah
Reward engineering is crucial to high performance in reinforcement learning
systems. Prior research into reward design has largely focused on Markovian
reward functions. While there has been research into expressing non-Markovian
rewards as linear temporal logic (LTL) formulas, that work has focused on task
specifications directly defined by the user. However, in many
real-world applications, task specifications are ambiguous, and can only be
expressed as a belief over LTL formulas. In this paper, we introduce planning
with uncertain specifications (PUnS), a novel formulation that addresses the
challenge posed by non-Markovian specifications expressed as beliefs over LTL
formulas. We present four criteria that capture the semantics of satisfying a
belief over specifications for different applications, and analyze the
qualitative implications of these criteria within a synthetic domain. We
demonstrate the existence of an equivalent Markov decision process (MDP) for
any instance of PUnS. Finally, we demonstrate our approach on the real-world
task of setting a dinner table automatically with a robot that inferred task
specifications from human demonstrations.
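
To make the idea of a belief over specifications concrete, the following is a minimal sketch, not the authors' implementation. It assumes a finite trace of proposition sets and stands in for LTL formulas with three hypothetical trace predicates (`eventually`, `globally_not`, `before`). The four criteria are not spelled out in this abstract, so the sketch does not reproduce them; it only computes the probability mass of the formulas that a trace satisfies, one quantity such a criterion could plausibly build on.

```python
from typing import Callable, List, Set, Tuple

# Assumption: a trace is a finite list of sets of propositions true at each step.
Trace = List[Set[str]]
# Assumption: a "formula" is modelled as a plain predicate over finite traces,
# a stand-in for a real LTL formula and its semantics.
Formula = Callable[[Trace], bool]
# Assumption: a belief is a list of (formula, probability) pairs summing to 1.
Belief = List[Tuple[Formula, float]]


def eventually(p: str) -> Formula:
    """Hypothetical helper: p holds at some step of the trace."""
    return lambda trace: any(p in step for step in trace)


def globally_not(p: str) -> Formula:
    """Hypothetical helper: p never holds in the trace."""
    return lambda trace: all(p not in step for step in trace)


def before(p: str, q: str) -> Formula:
    """Hypothetical helper: p occurs, and q does not occur strictly before it."""
    def check(trace: Trace) -> bool:
        for step in trace:
            if p in step:
                return True
            if q in step:
                return False
        return False  # p never occurred
    return check


def satisfied_mass(belief: Belief, trace: Trace) -> float:
    """Total probability of the formulas in the belief that the trace satisfies."""
    return sum(prob for formula, prob in belief if formula(trace))


# Illustrative belief for a table-setting task (probabilities are made up).
belief: Belief = [
    (eventually("plate"), 0.5),
    (before("plate", "fork"), 0.25),
    (globally_not("spill"), 0.25),
]

fork_first: Trace = [{"fork"}, {"plate"}]
plate_first: Trace = [{"plate"}, {"fork"}]

print(satisfied_mass(belief, fork_first))
print(satisfied_mass(belief, plate_first))
```

Because satisfaction depends on the whole trace rather than on the current state alone, a reward defined this way is non-Markovian. That is the property motivating the equivalent MDP construction mentioned in the abstract.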
Vulnerabilities & Strengths