OpenAI: GPT-5 Is Less of a Suck-Up, But It Tolerates More Hateful Behavior

(Credit: Tom Williams / Contributor / CQ-Roll Call, Inc. via Getty Images)

OpenAI CEO Sam Altman touts his company's newest model, GPT-5, as "a legitimate PhD-level expert in anything, in any area you need." That includes answering questions on topics it previously shunned, like non-violent hate, threatening harassment, illicit sexual content, sexual content involving minors, extremism, and threatening hate.

OpenAI manually reviewed the model's responses in these categories and determined that while they violate its policies, they are "low severity." It's unclear how severity is calculated, but the company says it plans to improve GPT-5 "in all categories," especially the lowest-scoring ones.

OpenAI calls GPT-5's compliance with inappropriate requests a "regression," but notes that only the regressions for threatening hate content and illicit sexual content are statistically significant. Plus, "we have found that OpenAI o4-mini performs similarly here," it says.

OpenAI did not specify whether the responses are image- or text-based, which could be an important point, especially for sexual content or hate symbols. (It changed its policy in March to allow the creation of images with swastikas.)

Although OpenAI positions each new model as its best yet, they often have flaws. For example, the o3 and o4-mini reasoning models that came out in April hallucinated more than their predecessors, TechCrunch reports.

Still, you'd think that GPT-5's "PhD-level" smarts would make it better at following policies. Is it a case of book smarts versus street smarts? Poor chatbot behavior is an ongoing and concerning issue across the industry, especially after Elon Musk's Grok went off the rails on X.

GPT-5 users should also be on the lookout for deceptive behavior. With GPT-5-thinking, a more advanced version of GPT-5, OpenAI says it's "taken steps to reduce [the] propensity to deceive, cheat, or hack problems, though our mitigations are not perfect and more research is needed."

The Good News on Hallucinations and People-Pleasing

At the same time, GPT-5 brings some important improvements. Some of ChatGPT's most annoying behaviors—sycophancy and hallucinations—should be less prevalent.

OpenAI had to make a major adjustment to GPT-4o in May 2025 after a spike in the chatbot sucking up to the user. The model became a reckless confidant, trying "to please the user, not just as flattery, but also as validating doubts, fueling anger, urging impulsive actions, or reinforcing negative emotions in ways that were not intended," OpenAI said at the time. "Beyond just being uncomfortable or unsettling, this kind of behavior can raise safety concerns—including around issues like mental health, emotional over-reliance, or risky behavior."

With GPT-5, instances of sycophancy are down 69% in the free version of ChatGPT and 75% in the paid version. OpenAI seems mildly pleased with this "meaningful improvement," but calls the behavior a "challenge" that it hopes to reduce further.

"We are actively researching related areas of concern, such as situations that may involve emotional dependency or other forms of mental or emotional distress," OpenAI says.

Hallucinations are also down. The main GPT-5 model has 44% fewer responses with at least one "major factual error." When minor and major factual errors are counted together, the improvement shrinks to 26%. OpenAI does not specify what it considers major versus minor.

GPT-5-thinking is the least prone to hallucinations; in the graph comparing the models, it has the fewest incorrect responses and the most correct claims per response.

About Our Expert

Emily Forlini

Senior Reporter

