Summary:

Keenan Pepper proposes a research direction for AI safety and alignment that focuses on embedded agents. An embedded agent exists within its environment, so the environment can affect the agent's cognition and expose it to manipulation. Studying environments that contain embedded, intelligent agents is valuable both for understanding potential dangers and for exploring interpretability. Keenan highlights the need for a safe sandbox in which to study embedded agents before a superintelligent system is itself embedded. Interpretability is crucial, especially in games where success depends on revealing internal states to opponents. Finally, Keenan suggests creating an environment, possibly with a game-like setup, in which agents can be trained to perform embedded tasks while revealing their internal states.

Keenan Pepper | Environments for Empirical Embedded Agency Research @ IC Workshop

Presenter

Keenan Pepper, Salesforce
