Microsoft AI Employee Accidentally Leaks 38TB of Data

A software repository on GitHub dedicated to supplying open-source code and AI models for image recognition was left open to manipulation by bad actors thanks to an insecure URL.

By Michael Kan, Principal Reporter

A misconfigured link accidentally exposed 38TB of Microsoft data and could have let attackers inject malicious code into the company's AI models.

The finding comes from cloud security provider Wiz, which recently scanned the internet for exposed storage accounts. It found a software repository on Microsoft-owned GitHub dedicated to supplying open-source code and AI models for image recognition. 

On the affected GitHub page, a Microsoft employee had created a URL, enabling visitors to the software repository to download AI models from an Azure storage container. “However, this URL allowed access to more than just open-source models,” Wiz said in its report. “It was configured to grant permissions on the entire storage account, exposing additional private data by mistake.” 

Scans from Wiz Research also indicated the Azure storage container held 38TB of data, including “passwords to Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees.”

The URL also embedded an overly permissive “Shared Access Signature,” or SAS token, which gave anyone visiting the link, including potential attackers, the ability to view, delete, or overwrite those files.
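To see why a single link can carry that much power, it helps to look at how a SAS URL encodes its grant in the query string. The sketch below is illustrative only: the account name, dates, and signature are hypothetical, not the values from the leaked link. It relies on the standard SAS query fields `sp` (permissions), `se` (expiry), and `srt` (resource types, present on account-level tokens).

```python
from urllib.parse import urlsplit, parse_qs

# Common permission letters found in the `sp` field of a SAS token.
PERMISSION_NAMES = {
    "r": "read",
    "w": "write",
    "d": "delete",
    "l": "list",
    "a": "add",
    "c": "create",
}


def describe_sas(url: str) -> dict:
    """Summarize what a SAS URL grants, based only on its query string."""
    query = parse_qs(urlsplit(url).query)
    permissions = query.get("sp", [""])[0]
    return {
        # Unknown letters are passed through unchanged.
        "permissions": [PERMISSION_NAMES.get(ch, ch) for ch in permissions],
        "expires": query.get("se", [None])[0],
        # `srt` appears on account-level tokens, which are not
        # limited to a single container or blob.
        "account_level": "srt" in query,
    }


# Hypothetical URL for illustration; every value here is made up.
EXAMPLE_URL = (
    "https://example.blob.core.windows.net/models"
    "?sv=2020-08-04&ss=b&srt=sco&sp=rwdl"
    "&se=2050-01-01T00:00:00Z&sig=REDACTED"
)

print(describe_sas(EXAMPLE_URL))
```

A token like this hypothetical one, with write and delete permissions, account-level scope, and an expiry decades away, is the kind of grant that turns a download link into a way to tamper with files.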

“This is particularly interesting considering the repository’s original purpose: providing AI models for use in training code,” Wiz said. “Meaning, an attacker could have injected malicious code into all the AI models in this storage account, and every user who trusts Microsoft’s GitHub repository would’ve been infected by it.”   

Wiz reported this to Microsoft in June, and the company promptly plugged the leak. “No customer data was exposed, and no other internal services were put at risk because of this issue,” Microsoft said in its own report.

The company also said the exposed storage container contained backups and internal Microsoft Teams messages belonging to two former Microsoft employees. To prevent any further leaks, Microsoft has been scanning for SAS tokens “that may have overly permissive expirations or privileges” on GitHub.

“This system detected the specific SAS URL identified by Wiz in the ‘robust-models-transfer’ repo, but the finding was incorrectly marked as a false positive,” Microsoft said. “The root cause issue for this has been fixed and the system is now confirmed to be detecting and properly reporting on all over-provisioned SAS tokens.”  
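Microsoft has not published how its scanner works, but the general idea of flagging SAS tokens with overly permissive expirations or privileges can be sketched in a few lines. Everything below is an assumption for illustration: the URL pattern, the seven-day lifetime threshold, the set of "risky" permissions, and the sample text are all hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit, parse_qs

# Rough pattern for URLs that carry a SAS signature (illustrative, not exhaustive).
SAS_URL_PATTERN = re.compile(r"https://[^\s\"'<>]+[?&]sig=[^\s\"'<>]+")

MAX_LIFETIME = timedelta(days=7)   # hypothetical policy threshold
RISKY_PERMISSIONS = set("wdac")    # write, delete, add, create


def find_risky_sas(text: str, now: datetime) -> list:
    """Return (url, reasons) pairs for SAS URLs that look over-provisioned."""
    findings = []
    for match in SAS_URL_PATTERN.finditer(text):
        url = match.group(0)
        query = parse_qs(urlsplit(url).query)
        reasons = []

        risky = RISKY_PERMISSIONS & set(query.get("sp", [""])[0])
        if risky:
            reasons.append("grants risky permissions: " + "".join(sorted(risky)))

        expiry_raw = query.get("se", [None])[0]
        if expiry_raw is not None:
            try:
                expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
                if expiry.tzinfo is None:
                    # Date-only values carry no zone; treat them as UTC.
                    expiry = expiry.replace(tzinfo=timezone.utc)
                if expiry - now > MAX_LIFETIME:
                    reasons.append("expires more than 7 days out")
            except ValueError:
                reasons.append("unparseable expiry")

        if reasons:
            findings.append((url, reasons))
    return findings


# Hypothetical repository text containing two made-up SAS URLs.
sample = """
weights: https://example.blob.core.windows.net/models?sv=2020-08-04&sp=rl&se=2024-01-03T00:00:00Z&sig=AAA
backup: https://example.blob.core.windows.net/?sv=2020-08-04&ss=b&srt=sco&sp=rwdl&se=2050-01-01T00:00:00Z&sig=BBB
"""
now = datetime(2024, 1, 1, tzinfo=timezone.utc)

for url, reasons in find_risky_sas(sample, now):
    print(url, reasons)
```

In this sketch, the first link (read and list only, expiring in two days) passes, while the second is flagged on both counts. A real scanner would also need a review step that does not dismiss true hits, which is where Microsoft says its own system fell short.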

Still, the incident is a reminder to securely configure access to cloud storage accounts, especially those housing large data sets. “As data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards,” Wiz added.

The company’s report goes on to detail some of the alleged security pitfalls with SAS tokens on an Azure account. But Microsoft says, “like any key-based authentication mechanism, a SAS can be revoked at any time by rotating the parent key. In addition, SAS supports fine-grained revocation at the container level, without having to rotate storage account keys.”
