Adversarial Policies: Attacking Deep Reinforcement Learning
Authors: Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell
Deep reinforcement learning (RL) policies are known to be vulnerable to
adversarial perturbations to their observations, similar to adversarial
examples for classifiers. However, an attacker is not usually able to directly
modify another agent's observations. This might lead one to wonder: is it
possible to attack an RL agent simply by choosing an adversarial policy acting
in a multi-agent environment so as to create natural observations that are
adversarial? We demonstrate the existence of adversarial policies in zero-sum
games between simulated humanoid robots with proprioceptive observations,
against state-of-the-art victims trained via self-play to be robust to
opponents. The adversarial policies reliably win against the victims but
generate seemingly random and uncoordinated behavior. We find that these
policies are more successful in high-dimensional environments, and induce
substantially different activations in the victim policy network than when the
victim plays against a normal opponent. Videos are available at
https://adversarialpolicies.github.io/.
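The threat model described above (an attacker who controls only an opponent agent, not the victim's observations directly) can be illustrated with a minimal sketch. This is not the authors' code: the humanoid environments are reduced to a toy zero-sum matrix game, the victim is a frozen mixed strategy, and the attacker trains an adversarial policy against it with plain REINFORCE. All names and the payoff matrix are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the threat model, NOT the paper's implementation:
# the victim's policy is frozen, and the attacker trains an adversarial
# policy with standard RL to maximise its own (zero-sum) payoff.

rng = np.random.default_rng(0)

# Illustrative payoff matrix for the ADVERSARY (rows) vs the victim
# (columns); the game is zero-sum, so the victim receives the negation.
payoff = np.array([[ 1.0, -1.0,  0.0],
                   [-1.0,  1.0,  0.0],
                   [ 0.5,  0.5, -1.0]])

victim_policy = np.array([0.4, 0.4, 0.2])  # frozen victim (mixed strategy)
theta = np.zeros(3)                        # adversary's softmax logits

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.1
for _ in range(2000):
    pi = softmax(theta)
    a = rng.choice(3, p=pi)             # adversary samples an action
    b = rng.choice(3, p=victim_policy)  # frozen victim responds
    r = payoff[a, b]                    # adversary's reward this round
    # REINFORCE update: grad of log pi(a) w.r.t. theta is onehot(a) - pi
    grad = -pi
    grad[a] += 1.0
    theta += lr * r * grad

pi = softmax(theta)
expected = pi @ payoff @ victim_policy  # adversary's expected payoff
print(pi, expected)
```

Against this victim, action 2 is the unique best response (expected payoff 0.2 versus 0 for the others), so training concentrates the adversary's policy on it. The paper's setting replaces the matrix game with continuous-control robotics and the softmax policy with a neural network, but the attack structure, RL against a frozen victim, is the same.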
Vulnerabilities & Strengths