DeepSeek Fails Researchers' Safety Tests

(Credit: Anthony Kwan/Getty Images)

Chinese AI firm DeepSeek is making headlines with its low-cost and high-performance chatbot, but it may have an AI safety problem.

Cisco’s research team used algorithmic jailbreaking techniques to test DeepSeek R1 "against 50 random prompts from the HarmBench dataset," covering six categories of harmful behaviors including cybercrime, misinformation, illegal activities, and general harm.

"The results were alarming: DeepSeek R1 exhibited a 100% attack success rate, meaning it failed to block a single harmful prompt," Cisco says. "This contrasts starkly with other leading models, which demonstrated at least partial resistance."
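For readers unfamiliar with the metric, attack success rate is simply the share of harmful prompts that a model fails to block. The sketch below is an illustration only, not Cisco's evaluation code: the function name `attack_success_rate` and the list of per-prompt results are hypothetical, and the only figures taken from the report are the 50 prompts and the 100% result.

```python
def attack_success_rate(blocked):
    """Return the percentage of prompts the model failed to block.

    `blocked` is a list of booleans, one per prompt:
    True if the model refused the prompt, False if the attack got through.
    """
    if not blocked:
        raise ValueError("need at least one prompt result")
    successes = sum(1 for was_blocked in blocked if not was_blocked)
    return 100.0 * successes / len(blocked)


# Hypothetical per-prompt results matching the reported outcome:
# 50 prompts, none of them blocked.
r1_results = [False] * 50
print(attack_success_rate(r1_results))  # 100.0
```

By the same arithmetic, a model that blocked 37 of 50 prompts would score 26%.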

Other frontier models, such as o1, blocked a majority of adversarial attacks with their model guardrails, according to Cisco.

As Wired notes, security firm Adversa AI reached similar conclusions.

Cisco's researchers point to DeepSeek's much lower budget compared to rivals as a potential reason for these failings, saying its cheap development came at a "different cost: safety and security." DeepSeek claims its model took just $6 million to develop, while a six-month training run for OpenAI's yet-to-be-released GPT-5 "can cost around half a billion dollars in computing costs alone," The Wall Street Journal reports.

Though DeepSeek may be easier to fool with the right know-how, it's been shown to have strong content restrictions, at least when it comes to China-related political content. We tested it on controversial topics, such as the Chinese government's treatment of Uyghurs, a Muslim minority group that the UN claims is being persecuted. DeepSeek replied: "Sorry, that's beyond my current scope. Let’s talk about something else."

The chatbot also refused to answer questions about the Tiananmen Square Massacre, a 1989 student demonstration in Beijing where protesters were gunned down. But it's yet to be seen if AI safety or censorship issues will have any impact on DeepSeek's skyrocketing popularity.

According to web traffic tracking tool Similarweb, the LLM has gone from receiving just 300,000 visitors a day at launch to 6 million visitors. Meanwhile, US tech firms like Microsoft and Perplexity are rapidly incorporating DeepSeek, which uses an open-source model.

About Our Expert

Will McCurdy

Contributor

I’m a reporter covering weekend news. Before joining PCMag in 2024, I picked up bylines in BBC News, The Guardian, The Times of London, The Daily Beast, Vice, Slate, Fast Company, The Evening Standard, The i, TechRadar, and Decrypt Media.

I’ve been a PC gamer since you had to install games from multiple CD-ROMs by hand. As a reporter, I’m passionate about the intersection of tech and human lives. I’ve covered everything from crypto scandals to the art world, as well as conspiracy theories, UK politics, and Russia and foreign affairs.
