Anthropic keeps new AI model private after it uncovers thousands of vulnerabilities in external software
Anthropic’s decision to keep a highly capable Claude model private after it discovered thousands of vulnerabilities in external software demonstrates a disciplined approach to frontier AI. The initiative, referred to in industry chatter as Project Glasswing, highlights a broader trend toward responsible disclosure, security-by-design, and staged exposure of powerful AI capabilities. Pausing access until the vulnerabilities are addressed reflects a mature governance posture: safety, transparency, and accountability take precedence over speed to market when potential exploits could harm users and ecosystems. The choice also signals to the AI community what an acceptable release pace for the most capable models looks like, especially when the risk surface spans operating systems, browsers, and enterprise environments.
From a risk-management perspective, the move invites greater collaboration across the AI ecosystem, highlighting the need for shared vulnerability databases, coordinated disclosure, and standardized incident-response protocols. It also raises policy questions about model licensing, safe-use guidelines, and how to ensure end users understand the limitations and safety boundaries of advanced models. For developers, the model's private status underscores the importance of building with secure defaults, robust testing, and clear governance criteria that determine when and how new capabilities are exposed to customers. The story captures the ongoing tension between openness and security in frontier AI, a tension that will shape future governance frameworks, product roadmaps, and regulatory expectations across the industry.