P. A. Ortega and D. A. Braun. Information, utility and bounded rationality. In International Conference on Artificial General Intelligence. Springer Berlin Heidelberg, 2011.  D. H. Wolpert, M. Harré, E. Olbrich, N. Bertschinger, and J. Jost. Hysteresis effects of changing the parameters of noncooperative games. Physical Review E, 85, 2012.  S. Bubeck and A. Slivkins. The best of both worlds: stochastic and adversarial bandits. In In Proceedings ofthe International Conference on Computational Learning Theory (COLT), 2012.  Y. Seldin and A. Silvkins. One practical algorithm for both stochastic and adversarial bandits. In 31 st International Conference on Machine Learning, 2014.  P. Auer and C. Chao-Kai. An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits. In 29th Annual Conference on Learning Theory, 2016.  M. L. Littman. Friend-or-Foe Q-Learning in General-Sum Games. In Proceedings of the International Conference on Machine Learning (ICML), 2001.  R. Powers and Y. Shoham. New criteria and a new algorithm for learning in multi-agent systems. In Advances in neural information processing systems, pages 1089–1096, 2005.  A. Greenwald and K. Hall. Correlated Q-Learning. In Proceedings of the 22nd Conference on Artificial Intelligence, pages 242–249, 2003.  J. W. Crandall and M. A. Goodrich. Learning to compete, coordinate, and cooperate in repeated games using reinforcement learning. Machine Learning, 82(3):281–314, 2011.  P. Hernandez-Leal and M. Kaisers. Learning against sequential opponents in repeated stochastic games. In The 3rd Multi-disciplinary Conference on Reinforcement Learning and Decision Making, Ann Arbor, 2017.  W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25:285–294, 1933.  O. Chappelle and L. Li. An empirical evaluation of Thompson Sampling. In Advances in neural information processing systems, 2011.  C. K. Ling, F. Fang, and J. Z. Kolter. What game are we playing? end-to-end learning in normal and extensive form games. arXiv preprint arXiv:1805.02777, 2018.  C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.  I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.