Samuel Nellessen
KachmanLab, Radboud University
2026, Berlin, AI for Security
Slingshot: Automated Multi-Turn Jailbreaking for Agentic AI
I show that safety locks on AIs are easy to break. I built a small AI that teaches itself to trick bigger AIs into performing dangerous actions (not just writing text). This proves that while companies might stop AIs from saying harmful things, they fail to stop them from doing harmful things.
Biography
How do we make superhuman LLM agents fail-proof under adversarial stress? Samuel G. Nellessen is an AI Safety Researcher at KachmanLab who uses RL to autonomously discover LLM failure modes. An ARENA 5.0 alumnus, he develops verifiable automated jailbreaking frameworks to test model robustness in tool-augmented environments.