# Information, utility and bounded rationality

\[20] P. A. Ortega and D. A. Braun. Information, utility and bounded rationality. In International Conference on Artificial General Intelligence. Springer Berlin Heidelberg, 2011.

\[21] D. H. Wolpert, M. Harré, E. Olbrich, N. Bertschinger, and J. Jost. Hysteresis effects of changing the parameters of noncooperative games. Physical Review E, 85, 2012.

\[22] S. Bubeck and A. Slivkins. The best of both worlds: stochastic and adversarial bandits. In Proceedings of the International Conference on Computational Learning Theory (COLT), 2012.

\[23] Y. Seldin and A. Slivkins. One practical algorithm for both stochastic and adversarial bandits. In 31st International Conference on Machine Learning, 2014.

\[24] P. Auer and C.-K. Chiang. An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits. In 29th Annual Conference on Learning Theory, 2016.

\[25] M. L. Littman. Friend-or-Foe Q-Learning in General-Sum Games. In Proceedings of the International Conference on Machine Learning (ICML), 2001.

\[26] R. Powers and Y. Shoham. New criteria and a new algorithm for learning in multi-agent systems. In Advances in Neural Information Processing Systems, pages 1089–1096, 2005.

\[27] A. Greenwald and K. Hall. Correlated Q-Learning. In Proceedings of the 22nd Conference on Artificial Intelligence, pages 242–249, 2003.

\[28] J. W. Crandall and M. A. Goodrich. Learning to compete, coordinate, and cooperate in repeated games using reinforcement learning. Machine Learning, 82(3):281–314, 2011.

\[29] P. Hernandez-Leal and M. Kaisers. Learning against sequential opponents in repeated stochastic games. In The 3rd Multi-disciplinary Conference on Reinforcement Learning and Decision Making, Ann Arbor, 2017.

\[30] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25:285–294, 1933.

\[31] O. Chapelle and L. Li. An empirical evaluation of Thompson Sampling. In Advances in Neural Information Processing Systems, 2011.

\[32] C. K. Ling, F. Fang, and J. Z. Kolter. What game are we playing? End-to-end learning in normal and extensive form games. arXiv preprint arXiv:1805.02777, 2018.

\[33] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.

\[34] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.

