What are the key points?

OpenAI launched Lockdown Mode to prevent data exfiltration following its initial announcement in February. The feature restricts outbound network requests to block data transmission triggered by prompt injection attacks. Lockdown Mode is available for Free, Go, Plus, Pro, and self-serve ChatGPT Business accounts.

OpenAI Launches Security-Focused Lockdown Mode

•OpenAI launched Lockdown Mode to prevent data exfiltration following its initial announcement in February.
•The feature restricts outbound network requests to block data transmission triggered by prompt injection attacks.
•Lockdown Mode is available for Free, Go, Plus, Pro, and self-serve ChatGPT Business accounts.

OpenAI has officially launched its new Lockdown Mode, a security feature aimed at mitigating data exfiltration risks within ChatGPT. First teased by the company in February, the feature is now rolling out to a range of personal account tiers, including Free, Go, Plus, and Pro, as well as self-serve ChatGPT Business accounts. Lockdown Mode functions by restricting outbound network requests that could potentially be used to transfer sensitive information to an external attacker. This security layer specifically addresses the final stage of a prompt injection (malicious input designed to manipulate LLM behavior) attack by blocking the exfiltration vector. The company clarifies that this mode does not stop prompt injections from entering the system, such as those embedded in uploaded files or cached web content, meaning the model's accuracy or behavior could still be affected by malicious inputs. Tech analyst Simon Willison notes that this mechanism effectively breaks what he terms the 'Lethal Trifecta'—a scenario where an AI system simultaneously accesses private data, processes untrusted content, and possesses a viable pathway to transmit stolen data to an attacker. By deterministically cutting off these outbound transmission channels, Lockdown Mode provides a defensive barrier that does not rely on AI-based evaluation, which could itself be subverted by sophisticated attacks.