Microsoft is responding to the rise of AI programs that can hunt down security vulnerabilities by introducing “MDASH,” a system that harnesses over 100 AI agents to find software bugs.
Microsoft used MDASH to uncover 16 new vulnerabilities related to Windows, “including four Critical remote code execution flaws in components such as the Windows kernel TCP/IP stack and the IKEv2 service,” the company says.
MDASH also outperformed other AI models, including Anthropic’s cybersecurity-focused Claude Mythos and OpenAI’s GPT 5.5, Microsoft says, achieving a leading 88.45% score on the CyberGym benchmark, which evaluates AI agents' ability to find software bugs.
“The strategic implication is clear: AI vulnerability discovery has crossed from research curiosity into production-grade defense at enterprise scale,” the company writes in the announcement. In addition, Microsoft says its system shows a “durable advantage” by efficiently leveraging multiple models, rather than relying on a single one.
Microsoft doesn’t offer specifics on the AI models it used. But it developed over 100 AI agents, each specialized in finding specific software bugs using a collection of cutting-edge AI models and more efficient, smaller models.
“No single model is best at every stage. The multi-model agentic scanning harness runs a configurable panel of models,” Microsoft adds.
A key component of the system: the AI agents scan computer code for vulnerabilities, then debate one another to see whether their findings align. “Disagreement between models is itself a signal: when an auditor flags something as suspect and the debater can’t refute it, that finding’s posterior credibility goes up,” Microsoft says.
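Microsoft has not published how that credibility score is computed. As a rough illustration only, the idea of a finding gaining "posterior credibility" when a debater fails to refute it can be sketched with Bayes' rule. The function name and all three probabilities below are hypothetical, chosen for the example rather than taken from MDASH.

```python
def posterior_credibility(prior, p_survive_if_real, p_survive_if_false):
    """Bayes' rule: probability a finding is real, given that the
    debater tried and failed to refute it.

    prior              -- assumed credibility before the debate
    p_survive_if_real  -- assumed chance a real bug survives the debate
    p_survive_if_false -- assumed chance a false alarm survives the debate
    """
    evidence = prior * p_survive_if_real + (1 - prior) * p_survive_if_false
    if evidence == 0:
        raise ValueError("the observed outcome has zero probability")
    return prior * p_survive_if_real / evidence


# Illustrative numbers only (not from Microsoft): an auditor's flag starts
# at 30% credibility; real bugs survive debate 90% of the time, false
# alarms 20% of the time.
prior = 0.30
posterior = posterior_credibility(prior, 0.90, 0.20)
print(f"before debate: {prior:.2f}, after failed refutation: {posterior:.2f}")
```

Under these assumed numbers, the finding's credibility roughly doubles after the debater fails to knock it down. The update only helps when false alarms are meaningfully easier to refute than real bugs; if both survive debate equally often, the score does not move.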
Microsoft’s security engineering teams have been using MDASH, as has a “small set of customers as part of a limited private preview.” The limited rollout is likely meant to prevent misuse; Microsoft notes that MDASH “can approximate professional offensive researchers.” Still, the company is opening up access to select enterprise customers that apply.
MDASH arrives as hackers have been using AI models to find serious flaws in software or to help orchestrate attacks. As a result, the cybersecurity industry is entering an arms race: although AI tools have the potential to bolster defenses, the same models might also fall into the wrong hands and be used to devastating effect. The big question is whether AI can fortify software systems enough to withstand AI-driven attacks.


