5. Agent/environment symmetry. Let us turn the tables on the agent: how should we play if we were the bandit? A moment of reflection reveals that the analysis is symmetrical. An agent that does not attempt to maximize the payoff, or cannot do so due to limited reasoning power, will pick its strategy in a way that is indifferent to our placement of the reward. In contrast, a more effective agent will react to our choice, seemingly anticipating it. Furthermore, the agent will appear friendly if our goal is to maximize the payoff and adversarial if our goal is to minimize it.
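The symmetry can be illustrated with a small numerical sketch. It is not taken from the text and rests on several assumptions: a two-armed bandit with a unit reward placed in exactly one arm, and a softmax choice rule whose inverse temperature `beta` serves as a hypothetical stand-in for the agent's reasoning power. The names `agent_policy` and `bandit_value` are likewise illustrative.

```python
import math


def agent_policy(reward_arm, beta, n_arms=2):
    """Choice probabilities of the agent, given where the bandit put the reward.

    Assumption: limited reasoning power is modelled by a softmax rule.
    beta = 0 gives a uniform choice; large beta approaches payoff maximization.
    """
    payoffs = [1.0 if arm == reward_arm else 0.0 for arm in range(n_arms)]
    weights = [math.exp(beta * u) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def bandit_value(reward_arm, beta, goal):
    """Expected payoff as seen by the bandit.

    goal = +1: the bandit wants the payoff maximized.
    goal = -1: the bandit wants the payoff minimized.
    """
    expected_payoff = agent_policy(reward_arm, beta)[reward_arm]
    return goal * expected_payoff


if __name__ == "__main__":
    for beta in (0.0, 1.0, 10.0):
        for reward_arm in (0, 1):
            probs = agent_policy(reward_arm, beta)
            friendly = bandit_value(reward_arm, beta, goal=+1)
            adversarial = bandit_value(reward_arm, beta, goal=-1)
            print(
                f"beta={beta:5.1f} reward_arm={reward_arm} "
                f"policy={[round(p, 3) for p in probs]} "
                f"value(max goal)={friendly:+.3f} "
                f"value(min goal)={adversarial:+.3f}"
            )
```

Under these assumptions, the policy at `beta = 0` is uniform whichever arm holds the reward, so the agent is indifferent to our placement. As `beta` grows, the policy concentrates on the rewarded arm and therefore follows our choice. That same behavior gives the bandit a value near +1 when its goal is to maximize the payoff and near -1 when its goal is to minimize it.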