# Information, utility and bounded rationality

\[20] P. A. Ortega and D. A. Braun. Information, utility and bounded rationality. In International Conference on Artificial General Intelligence. Springer Berlin Heidelberg, 2011.

\[21] D. H. Wolpert, M. Harré, E. Olbrich, N. Bertschinger, and J. Jost. Hysteresis effects of changing the parameters of noncooperative games. Physical Review E, 85, 2012.

\[22] S. Bubeck and A. Slivkins. The best of both worlds: stochastic and adversarial bandits. In Proceedings of the International Conference on Computational Learning Theory (COLT), 2012.

\[23] Y. Seldin and A. Slivkins. One practical algorithm for both stochastic and adversarial bandits. In 31st International Conference on Machine Learning, 2014.

\[24] P. Auer and C.-K. Chiang. An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits. In 29th Annual Conference on Learning Theory, 2016.

\[25] M. L. Littman. Friend-or-Foe Q-Learning in General-Sum Games. In Proceedings of the International Conference on Machine Learning (ICML), 2001.

\[26] R. Powers and Y. Shoham. New criteria and a new algorithm for learning in multi-agent systems. In Advances in Neural Information Processing Systems, pages 1089–1096, 2005.

\[27] A. Greenwald and K. Hall. Correlated Q-Learning. In Proceedings of the 22nd Conference on Artificial Intelligence, pages 242–249, 2003.

\[28] J. W. Crandall and M. A. Goodrich. Learning to compete, coordinate, and cooperate in repeated games using reinforcement learning. Machine Learning, 82(3):281–314, 2011.

\[29] P. Hernandez-Leal and M. Kaisers. Learning against sequential opponents in repeated stochastic games. In The 3rd Multi-disciplinary Conference on Reinforcement Learning and Decision Making, Ann Arbor, 2017.

\[30] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25:285–294, 1933.

\[31] O. Chapelle and L. Li. An empirical evaluation of Thompson Sampling. In Advances in Neural Information Processing Systems, 2011.

\[32] C. K. Ling, F. Fang, and J. Z. Kolter. What game are we playing? End-to-end learning in normal and extensive form games. arXiv preprint arXiv:1805.02777, 2018.

\[33] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.

\[34] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
