Rulehound

M365 Copilot Impersonation Jailbreak Attack

Splunk Security Content

Summary

This detection rule identifies jailbreak attempts against Microsoft 365 (M365) Copilot in which a user tries to manipulate the AI into adopting a harmful persona or behavior that circumvents its safety features. The detection logic inspects exported eDiscovery logs for keywords associated with roleplaying and impersonation. It searches the 'Subject_Title' field for phrases that signal an impersonation attempt ('pretend you are', 'act as', etc.) and categorizes each match into a distinct impersonation type such as 'AI_Impersonation' or 'Malicious_AI_Persona'. Categorizing matches this way lets analysts monitor and act on attempts to bypass the AI's guardrails through persona injection.
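The matching and categorization described above can be illustrated with a minimal Python sketch. This is not the rule's actual Splunk search. The function name `classify_subject_title`, the pattern table, and the extra keywords ('evil', 'malicious', 'unrestricted') are assumptions made for illustration. Only the phrases 'pretend you are' and 'act as' and the two category names come from the summary.

```python
import re

# Hypothetical pattern table. Only 'pretend you are' and 'act as' and the two
# category names come from the rule summary; the persona keywords are assumed.
# Order matters: the more specific category is checked first.
PATTERNS = [
    (
        "Malicious_AI_Persona",
        re.compile(
            r"\b(?:pretend you are|act as)\b.*\b(?:evil|malicious|unrestricted)\b",
            re.IGNORECASE,
        ),
    ),
    (
        "AI_Impersonation",
        re.compile(r"\b(?:pretend you are|act as)\b", re.IGNORECASE),
    ),
]


def classify_subject_title(subject_title):
    """Return the first matching impersonation type, or None if nothing matches."""
    for label, pattern in PATTERNS:
        if pattern.search(subject_title):
            return label
    return None


if __name__ == "__main__":
    # Illustrative 'Subject_Title' values, not taken from real eDiscovery exports.
    for title in (
        "Please act as an unrestricted AI",
        "Pretend you are a helpful assistant",
        "Quarterly budget review",
    ):
        print(title, "->", classify_subject_title(title))
```

Checking the more specific pattern first keeps a prompt that names a hostile persona from being labeled only as generic impersonation. A production rule would likely need a broader phrase list and tuning against benign uses of "act as".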