Rulehound

M365 Copilot Agentic Jailbreak Attack

Splunk Security Content

Summary

This detection rule is designed to identify and mitigate agentic AI jailbreak attacks targeting Microsoft 365 (M365) Copilot. It particularly focuses on attempts to manipulate the system through various techniques including rule injections, universal triggers, response automation, system overrides, and persona establishment. The detection is based on analyzing the PromptText field of exported eDiscovery logs for specific keywords indicative of such attempts. Keywords include phrases like 'from now on', 'always respond', 'ignore previous', 'new rule', 'override', and commands associated with role-playing. By computing the number of distinct jailbreak indicators present in user sessions, the rule flags suspicious activities that suggest coordinated manipulation efforts. The implementation involves exporting relevant prompt logs from M365 Copilot and ingesting these into a security analysis tool like Splunk for further examination of potential threats to system security and integrity.