Teams of LLM Agents can Exploit Zero-Day Vulnerabilities. 04 JUNE 2024. University of Illinois Urbana-Champaign.

Every infrastructure in the world will be vulnerable to Autonomous Rogue AI hacks. (AGI??? Be VERY careful what you wish for!)

FOR EDUCATIONAL AND KNOWLEDGE SHARING PURPOSES ONLY. NOT-FOR-PROFIT. SEE COPYRIGHT DISCLAIMER.

Abstract
LLM agents have become increasingly sophisticated, especially in the realm of cybersecurity. Researchers have shown that LLM agents can exploit real-world vulnerabilities when given a description of the vulnerability, and that they can solve toy capture-the-flag problems. However, these agents still perform poorly on real-world vulnerabilities that are unknown to the agent ahead of time (zero-day vulnerabilities). In this work, we show that teams of LLM agents can exploit real-world, zero-day vulnerabilities. When used alone, prior agents struggle with exploring many different vulnerabilities and with long-range planning. To resolve this, we introduce HPTSA, a system of agents with a planning agent that can launch subagents. The planning agent explores the system and determines which subagents to call, resolving long-term planning issues when trying different vulnerabilities. We construct a benchmark of 15 real-world vulnerabilities and show that our team of agents improves over prior work by up to 4.5×.
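
The planner-and-subagent structure described in the abstract can be pictured with a small, harmless sketch. This is not the authors' HPTSA implementation: the abstract gives no code, so every name here (`Report`, `form_checker`, `upload_checker`, `plan`, `run_team`) is hypothetical, the subagents are stubs that do nothing, and a keyword rule stands in for whatever reasoning the real planning agent performs. It only illustrates the control flow of one planner choosing which specialists to launch on which part of a system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Report:
    agent: str    # which specialist produced the report
    target: str   # which page it was launched on
    finding: str  # free-text result (always a stub here)


# Hypothetical specialists. They are inert stubs, not real agents.
def form_checker(target: str) -> Report:
    return Report("form_checker", target, "nothing examined (stub)")


def upload_checker(target: str) -> Report:
    return Report("upload_checker", target, "nothing examined (stub)")


# Registry the planner dispatches into (names are assumptions).
SUBAGENTS: Dict[str, Callable[[str], Report]] = {
    "form_checker": form_checker,
    "upload_checker": upload_checker,
}


def plan(pages: List[str]) -> List[Tuple[str, str]]:
    """Toy planner: a keyword rule standing in for an LLM's decision."""
    tasks: List[Tuple[str, str]] = []
    for page in pages:
        if "login" in page or "search" in page:
            tasks.append(("form_checker", page))
        if "upload" in page:
            tasks.append(("upload_checker", page))
    return tasks


def run_team(pages: List[str]) -> List[Report]:
    """Launch each chosen specialist on its page and collect the reports."""
    return [SUBAGENTS[name](page) for name, page in plan(pages)]


reports = run_team(["/login", "/upload", "/about"])
for r in reports:
    print(r.agent, r.target, r.finding)
```

The point of the split is that each specialist only handles one narrow job on one target, while the planner keeps the overall picture of what has been explored, which is the long-range planning role the abstract assigns to the planning agent.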
