(Credit: Samuel Boivin/NurPhoto via Getty Images)
Anthropic says its latest model, Claude Mythos Preview, is too powerful to launch publicly, so it will instead partner with major tech companies to make sure the model's bug-hunting capabilities don't fall into the wrong hands.
In a blog post, Anthropic says Mythos Preview demonstrates that "AI models have reached a level of coding capability [that] can surpass all but the most skilled humans at finding and exploiting software vulnerabilities." It has already "found thousands of high-severity vulnerabilities, including some in every major operating system and web browser."
The company is concerned that Mythos Preview's capabilities will proliferate "beyond actors who are committed to deploying them safely," with severe consequences.
For example, Mythos Preview found a 27-year-old vulnerability in OpenBSD, an OS typically used in critical infrastructure, that allowed attackers to remotely crash any device running it. In some cases, the model is finding vulnerabilities that "survived decades of human review and millions of automated security tests."
So, instead of unleashing Mythos Preview on the general public, Anthropic will provide access to 11 other tech companies through a new partnership called Project Glasswing: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks.
Anthropic will allocate $4 million in direct donations to open-source security organizations, alongside $100 million in funding for Mythos Preview usage credits, which the brands can use to help find flaws that are difficult for humans to spot.
The project is a “starting point,” Anthropic says, suggesting it plans to expand or include more brands in the future. It also points to how these advances will ultimately make for stronger software with fewer security issues, suggesting it isn't all doom and gloom.
It also noted that early versions of the Claude Mythos Preview model were capable of hiding their reasoning. Jack Lindsey, a neuroscientist at Anthropic, says the model "exhibited notably sophisticated (and often unspoken) strategic thinking and situational awareness, at times in service of unwanted actions."
This comes after details about Mythos Preview were accidentally published on its website, Fortune reported last month.
Anthropic, meanwhile, recently saw a spike in its user base following a battle with the US government over whether its models could be used in AI-powered military tools. Anthropic said it would not allow its models to be used for the mass surveillance of US citizens or for AI-powered weapons that are controlled without human input. The Pentagon partnered with OpenAI instead.
The AI Bug-Pocalypse
Security researcher Alex Stamos, now chief product officer at AI-security firm Corridor, devoted a talk Tuesday at the HumanX conference in San Francisco to “the coming AI bug-pocalypse,” as he put it. "We are now at the point where foundation models are better than humans at finding bugs,” he said. “That means bug finding has now gone exponential.”
That's not always a good thing. "Open-source maintainers are completely at their limits,” Stamos observed. “The FFmpeg project has gotten so many patches that they end up complaining in public that they can't keep up with it.”
And attackers are already automating exploits of these vulnerabilities. Stamos cited the example of a Chinese tool called Cyberspike Villager that runs on a local version of DeepSeek. “It will go find what you're looking for inside of a network, steal that information, and exfiltrate it completely on its own,” he said. “Pretty cool little system.”
Stamos’s prediction: “We're probably in for a rough couple of years.”


