Anthropic: Our Claude 3.5 Model Beats OpenAI's GPT-4o

(Photo by Jakub Porzycki/NurPhoto via Getty Images)

Anthropic's new AI model promises to outperform OpenAI's ChatGPT and Google's Gemini.

The update, dubbed Claude 3.5 Sonnet, is an update to Claude 3 Sonnet, one of three AI models the company debuted in March. Anthropic describes it as “our most intelligent model yet.”

“Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus (its older flagship model) and one-fifth the cost,” according to the company, which was founded by former OpenAI employees. (

OpenAI executive Jan Leike, who is now at rival firm Anthropic. They said that while superintelligence "seems far off now, we believe it could arrive this decade."

As evidence, Anthropic benchmarked Claude 3.5 against OpenAI’s newest AI model, GPT-4o, which powers ChatGPT. The results show Anthropic’s AI model achieving slightly better results in four of the six benchmarks focused on reasoning, coding, and math skills.

In addition, Claude 3.5 also bested Google’s Gemini 1.5 across all the tested benchmarks. That said, AI benchmarks have often been criticized as unreliable indicators due to a lack of standardization and independent oversight. AI companies can pick and choose which benchmarks to use and which to discard.

So, the claims about Claude 3.5 should be taken with a grain of salt. Still, Anthropic says the new model has made advancements in its writing abilities, along with “grasping nuance, humor, and complex instructions.” This includes translating computer code, “making it particularly effective for updating legacy applications and migrating codebases,” the company says.

The other major enhancement involves Claude’s ability to process data visually. The “Claude 3.5 Sonnet for vision” feature can understand information from charts and graphs and also accurately transcribe text from imperfect images.

To make Claude more useful, Anthropic also added a new feature called “Artifacts,” which will display a second side-by-side window alongside the conversation box. Through the second window, you’ll be able to see and iterate on the content that Claude produces, whether it be computer code, images, or text documents.

“This creates a dynamic workspace where they (the user) can see, edit, and build upon Claude’s creations in real-time, seamlessly integrating AI-generated content into their projects and workflows,” the company says. “This preview feature marks Claude’s evolution from a conversational AI to a collaborative work environment.”

Anthropic is offering free access to Claude 3.5 through claude.ai and the Claude iOS app. Existing Claude Pro and Team plan subscribers will also receive access to the new model with “higher rate limits,” enabling them to send more queries to the chatbot before encountering restrictions.

The company also plans on upgrading its two other AI models, Claude 3 Haiku and Claude 3 Opus, with the new 3.5 tech later this year.

About Our Expert

Michael Kan

Principal Reporter

My Experience

I've been a journalist for over 15 years. I got my start as a schools and cities reporter in Kansas City and joined PCMag in 2017, where I cover satellite internet services, cybersecurity, PC hardware, and more. I'm currently based in San Francisco, but previously spent over five years in China, covering the country's technology sector.

Since 2020, I've covered the launch and explosive growth of SpaceX's Starlink satellite internet service, writing 600+ stories on availability and feature launches, but also the regulatory battles over the expansion of satellite constellations, fights with rival providers like AST SpaceMobile and Amazon, and the effort to expand into satellite-based mobile service. I've combed through FCC filings for the latest news and driven to remote corners of California to test Starlink's cellular service.

I also cover cyber threats, from ransomware gangs to the emergence of AI-based malware. In 2024 and 2025, the FTC forced Avast to pay consumers $16.5 million for secretly harvesting and selling their personal information to third-party clients, as revealed in my joint investigation with Motherboard.

I also cover the PC graphics card market. Pandemic-era shortages led me to camp out in front of a Best Buy to get an RTX 3000. I'm now following how the AI-driven memory shortage is impacting the entire consumer electronics market. I'm always eager to learn more, so please jump in the comments with feedback and send me tips.

The Best Tech I've Had:

My first video game console: a Nintendo Famicom
I loved my Sega Saturn despite PlayStation's popularity.
The iPod Video I received as a gift in college
Xbox 360 FTW
The Galaxy Nexus was the first smartphone I was proud to own.
The PC desktop I built in 2013, which still works to this day.

Read the latest from Michael Kan

Read full bio