So today, DeepMind dropped a new paper: an early warning system for novel AI risks. The research proposes a framework for evaluating general-purpose models against novel threats. There are some pretty big implications here.
RELATED VIDEOS:
- "We have no moat" – Google AI Documen…
- Tree of Thoughts
- Governance of Superintelligence by OpenAI

LINKS:
- Jack Clark Twitter: https://twitter.com/jackclarksf
- GPT-4 System Card: https://cdn.openai.com/papers/gpt-4-s…
- Tree of Thoughts: https://arxiv.org/pdf/2305.10601.pdf
- Governance of Superintelligence: https://openai.com/blog/governance-of…
- OpenAI Hide and Seek: https://openai.com/research/emergent-…
- Live View of 25 AI Agents in a Village: https://reverie.herokuapp.com/arXiv_D…
- LessWrong Post: https://www.lesswrong.com/posts/FdQzA…
- DeepMind Blog: https://www.deepmind.com/blog/an-earl…
- Model Evaluation for Extreme Risks: https://arxiv.org/abs/2305.15324
TIMELINE:
- [00:00] Intro
- [00:31] Recap
- [01:27] OpenAI Proposal
- [02:21] Google DeepMind Paper
- [04:46] Abrupt Emergence
- [07:48] Existing Benchmarks
- [09:08] “Frontier”
- [13:35] Alignment Eval
- [15:09] Dangerous Capabilities
- [19:12] Safety Workflow
- [21:25] Power Seeking AI
- [24:05] Unknown Capabilities
- [30:43] Hazards