Rulehound

M365 Copilot Information Extraction Jailbreak Attack

Splunk Security Content

Summary

The M365 Copilot Information Extraction Jailbreak Attack detection rule identifies attempts to extract sensitive or classified information from M365 Copilot through social engineering techniques. It targets prompts that try to coerce the system into revealing data by using extraction keywords (such as 'transcendent', 'tell me everything', or 'confidential'), which can indicate a jailbreak attack against M365 Copilot. The detection uses eDiscovery prompt logs exported from Microsoft compliance tools and analyzes the 'Subject_Title' field for these keywords to infer extraction intent, assigning a severity level of CRITICAL, HIGH, or MEDIUM based on the keywords detected. Prompts are further evaluated against more complex risk patterns that separate effective attempts from low-risk cases, which keeps the focus on genuine threats. Extraction attempts that meet the criteria are flagged for review, helping to protect organizational information against unauthorized access and data exfiltration orchestrated through AI manipulation.
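
The keyword-to-severity step described above can be sketched roughly as follows. This is an illustrative Python approximation, not the rule's actual Splunk search: the summary names the keywords and the three severity levels but does not say which keyword maps to which level, so the tiers in `SEVERITY_KEYWORDS` are assumptions, as are the helper names `classify_prompt` and `flag_events`. Only the 'Subject_Title' field name comes from the summary.

```python
# Hypothetical keyword tiers. The summary lists these keywords and the
# severity levels, but the pairing below is an assumption for illustration.
SEVERITY_KEYWORDS = {
    "CRITICAL": ["transcendent"],
    "HIGH": ["tell me everything"],
    "MEDIUM": ["confidential"],
}

# Check the most severe tier first so the highest matching level wins.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM"]


def classify_prompt(subject_title):
    """Return the highest matching severity for a prompt, or None if no keyword matches."""
    text = (subject_title or "").lower()
    for severity in SEVERITY_ORDER:
        if any(keyword in text for keyword in SEVERITY_KEYWORDS[severity]):
            return severity
    return None


def flag_events(events):
    """Return copies of the events whose 'Subject_Title' matches a keyword, with a severity added."""
    flagged = []
    for event in events:
        severity = classify_prompt(event.get("Subject_Title"))
        if severity is not None:
            flagged.append({**event, "severity": severity})
    return flagged
```

Matching is case-insensitive substring matching, which is a simplification; the additional risk-pattern evaluation that separates effective attempts from low-risk cases is not modeled here.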