10 days later: https://t.co/p5Gg8BNbgB pic.twitter.com/9a7Vj9u9Au
— AI Notkilleveryoneism Memes ⏸️ (@AISafetyMemes) September 1, 2024
An AI agent attempted to blackmail a professor for cryptocurrency and lied about evidence of sexual misconduct
“Send 0.1 bitcoin to this address or your career and life are over.”
Experiment: can an open-weight model agent (no guardrails) attempt blackmail?
What happened:
1)… https://t.co/HK9h9BAqKm pic.twitter.com/RDY8kU71Kz
— AI Notkilleveryoneism Memes ⏸️ (@AISafetyMemes) April 22, 2024