# Modeling AGI Safety Frameworks with Causal Influence Diagrams

**Ramana Kumar, DeepMind**\
**Translated by Xiaohu Zhu, University AI**\
\
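**We have written a paper that represents various frameworks for designing safe artificial general intelligence (AGI), such as reinforcement learning with reward modeling, cooperative inverse reinforcement learning (CIRL), and debate, as causal influence diagrams (CIDs). The goal is to help us compare the frameworks and better understand the corresponding agent incentives.**\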
\
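**We would love to receive comments, especially on the following questions:**\
**1. Are the frameworks we present represented accurately?**\
**2. Are the CID representations useful?**\
**3. Would it be useful to model frameworks we have not included in this way?**\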
\
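**Abstract of the paper: Proposals for safe AGI systems are typically made at the level of frameworks, specifying how the components of the proposed system should be trained and how they should interact with each other. In this paper, we model and compare the most promising AGI safety frameworks using causal influence diagrams. The diagrams show the optimization objective and the causal assumptions of each framework. The unified representation permits easy comparison of frameworks and their assumptions. We hope that the diagrams will serve as an accessible and visual introduction to the main AGI safety frameworks.**\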
\
[**Alignment Forum post for this article**](https://www.alignmentforum.org/posts/HE5DL6XeomYxFab74/modeling-agi-safety-frameworks-with-causal-influence-1?fbclid=IwAR3dppEJjDITl-PpAAXdJnlSjrj9xXUB6b0faxXJypnQLI0M3F2lYYCiSNU)<br>

