AGI University
Information theory of decisions and actions
[10] N. Tishby and D. Polani. Information theory of decisions and actions. In Perception-Action Cycle, pages 601–636. Springer New York, 2011.
Last modified
5yr ago