Learning against sequential opponents in repeated stochastic games

[29] P. Hernandez-Leal and M. Kaisers. Learning against sequential opponents in repeated stochastic games. In The 3rd Multi-disciplinary Conference on Reinforcement Learning and Decision Making, Ann Arbor, 2017.
