Agents of Chaos

Shapira, Natalie; Wendler, Chris; Yen, Avery; Sarti, Gabriele; Pal, Koyena; Floody, Olivia; Belfki, Adam; Loftus, Alex; Jannali, Aditya Ratan; Prakash, Nikhil; Cui, Jasmine Jisong; Rogers, Giordano; Brinkmann, Jannik; Rager, Can; Zur, Amir; Ripa, Michael; Sankaranarayanan, Aruna; Atkinson, David; Gandikota, Rohit; Fiotto-Kaufman, Jaden; Hwang, EunJeong; Orgad, Hadas; Sahil, P Sam; Taglicht, Negev; Shabtay, Tomer; Ambus, Atai; Alon, N.; Oron, Shiri; Gordon-Tapiero, Ayelet; Kaplan, Yotam; Shwartz, Vered; Shaham, Tamar Rott; Riedl, Christoph; Mirsky, Reuth; Sap, Maarten; Manheim, David; Ullman, Tomer; Bau, David

doi:10.48550/arxiv.2602.20021

Preprint, arXiv (Cornell University), Feb 23, 2026, Green OA

Indexed in DataCite

Abstract

We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Over a two-week period, twenty AI researchers interacted with the agents under benign and adversarial conditions. Focusing on failures emerging from the integration of language models with autonomy, tool use, and multi-party communication, we document eleven representative case studies. Observed behaviors include unauthorized compliance with non-owners, disclosure of sensitive information, execution of destructive system-level actions, denial-of-service conditions, uncontrolled resource…

Citation impact

5 total citations

FWCI: —
Percentile: —
References: 0

Authors

38 authors

Topics & keywords

Keywords

Warrant
Adversarial system
Task (project management)
Software deployment
Identity (music)
Resource (disambiguation)
Event (particle physics)
State (computer science)