Wikipedia Faces Flood of AI Bots That Are Eating Bandwidth, Raising Costs

'Our content is free, our infrastructure is not,' the nonprofit that runs Wikipedia says, warning that AI scrapers are wasting the site's bandwidth.

By Michael Kan, Principal Reporter

(Photo by Nikolas Kokovlis/NurPhoto via Getty Images)

Wikipedia is paying the price for the AI boom: The online encyclopedia is grappling with rising costs from bots scraping its articles to train AI models, which is straining the site’s bandwidth. 

On Tuesday, the nonprofit that hosts Wikipedia warned that “automated requests for our content have grown exponentially.” That surge can disrupt access to the site, forcing the encyclopedia to add capacity and driving up its data center bill.

“Our infrastructure is built to sustain sudden traffic spikes from humans during high-interest events, but the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs,” the Wikimedia Foundation says. 

The Foundation noted, for example, that “since January 2024, we have seen the bandwidth used for downloading multimedia content grow by 50%.” However, the traffic isn’t coming from human readers but from automated programs constantly downloading “openly licensed images to feed images to AI models,” the nonprofit says.

(Credit: Wikimedia Foundation)

Another problem is that bots often crawl less popular Wikipedia articles, which are less likely to be served from cache and so cost more to deliver. “When we took a closer look, we found out that at least 65% of this resource-consuming traffic we get for the website is coming from bots, a disproportionate amount given the overall pageviews from bots are about 35% of the total,” the Foundation adds.
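To put the 65% and 35% figures side by side, here is a rough back-of-the-envelope calculation. It treats "at least 65%" and "about 35%" as exact values and assumes all remaining traffic and pageviews come from human readers, neither of which the Foundation states, so the result is only illustrative. The function name is made up for this sketch.

```python
def relative_cost_per_pageview(bot_cost_share: float, bot_pageview_share: float) -> float:
    """Return how much more expensive traffic an average bot pageview
    generates compared with an average human pageview.

    Assumption: everything that is not bot traffic is human traffic.
    """
    human_cost_share = 1.0 - bot_cost_share
    human_pageview_share = 1.0 - bot_pageview_share

    # Share of expensive traffic per unit of pageview share, for each group.
    bot_intensity = bot_cost_share / bot_pageview_share
    human_intensity = human_cost_share / human_pageview_share

    return bot_intensity / human_intensity


# Figures quoted by the Wikimedia Foundation, taken at face value.
ratio = relative_cost_per_pageview(bot_cost_share=0.65, bot_pageview_share=0.35)
print(f"A bot pageview is roughly {ratio:.2f}x as costly as a human one")
```

Under those assumptions, the average bot pageview accounts for roughly three and a half times as much resource-intensive traffic as the average human pageview.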

The bots will even scrape “key systems in our developer infrastructure, such as our code review platform or our bug tracker,” putting a further strain on the site’s resources, the nonprofit says. 

In response, Wikipedia’s site managers have imposed “case-by-case” rate limiting on the offending AI crawlers, or even banned them. But to address the problem over the long term, the Wikimedia Foundation is developing a “Responsible Use of Infrastructure” plan, which notes the network strain from AI bot scrapers is “unsustainable.”
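The Foundation hasn't described how its rate limiting works. As a general illustration only, one common approach is a token bucket: each limited client gets a budget of requests that refills at a fixed rate, and requests are refused once the budget runs out. The sketch below is a minimal example of that idea, with per-client limits and an outright ban list. The class names, client names, and limits are all hypothetical and are not taken from Wikimedia's systems.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate: float       # tokens added per second
    capacity: float   # maximum burst size
    tokens: float = 0.0
    last: float = 0.0

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class CrawlerLimiter:
    """Hypothetical case-by-case limiter: only listed clients are throttled."""

    def __init__(self, limits, banned=()):
        self.limits = dict(limits)   # client name -> (rate, capacity)
        self.banned = set(banned)
        self.buckets = {}

    def allow(self, client: str, now: float) -> bool:
        if client in self.banned:
            return False
        if client not in self.limits:
            return True  # no case-specific limit applies to this client
        if client not in self.buckets:
            rate, capacity = self.limits[client]
            # Start with a full bucket so a short initial burst is allowed.
            self.buckets[client] = TokenBucket(rate, capacity, tokens=capacity, last=now)
        return self.buckets[client].allow(now)


# Illustrative setup: one throttled crawler (1 request/second, burst of 2)
# and one banned crawler. Timestamps are passed in explicitly, in seconds.
limiter = CrawlerLimiter({"example-bot": (1.0, 2.0)}, banned={"bad-bot"})
results = [limiter.allow("example-bot", t) for t in (0.0, 0.0, 0.0, 1.0)]
print(results)  # [True, True, False, True]
```

In this example the throttled crawler gets two requests through immediately, is refused on the third, and is allowed again one second later once a token has refilled. Passing the timestamp in as an argument keeps the sketch deterministic; a real deployment would read a clock and would need a reliable way to identify each client, which is the harder problem the article describes.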

The Foundation plans to gather feedback from the Wikipedia community on the best ways to identify traffic from AI bot scrapers and filter their access. Options include requiring bot operators to authenticate for high-volume scraping and API use.

“Our content is free, our infrastructure is not: We need to act now to re-establish a healthy balance,” the Wikimedia Foundation added.

Reddit faced a similar conundrum in 2023. Microsoft, for example, didn't notify Reddit that it was scraping Reddit's content and using it for its AI features. Reddit later blocked Microsoft from scraping its site, an effort Reddit CEO Steve Huffman called "a real pain in the ass."

Reddit also decided to charge third-party developers for access to its API. That led to a developer revolt, a subreddit blackout, and the shutdown of some popular Reddit clients.
