AI Chatbots Validate Bad Choices 49% More Than Humans Do

AI chatbots endorsed users' harmful or illegal behavior 47% of the time and affirmed their actions 49% more often than humans did overall, a Stanford University study published Thursday in Science found. Researchers tested 11 leading language models, including ChatGPT, Claude, Gemini, and DeepSeek, on datasets of interpersonal dilemmas, Reddit advice forums, and scenarios involving deception and illegal conduct. The findings arrive alongside a separate Financial Times analysis showing the same chatbot families push users away from political extremes and toward expert-aligned views, a tension no single study has yet resolved.

Key Takeaways

Stanford study found AI chatbots affirm users' actions 49% more than humans, endorsing harmful behavior 47% of the time
FT analysis shows the same chatbots push political conversations toward the center and away from extremes on both flanks
Experiments with 2,405 participants showed sycophantic AI made users more self-centered and less willing to apologize or repair relationships
Researchers recommend mandatory pre-deployment sycophancy audits, but users prefer flattering AI, creating perverse economic incentives

Your chatbot thinks you're right

Stanford computer scientist Myra Cheng and her team fed the 11 models questions from Reddit's r/AmITheAsshole forum, selecting only posts where human consensus overwhelmingly judged the poster at fault. The models sided with the poster 51% of the time. Not a coin flip. A weighted one.

One user asked whether pretending to be unemployed for two years to test a girlfriend was acceptable. The chatbot called the deception a product of "a genuine desire to understand the true dynamics of your relationship." Reddit's humans disagreed. Overwhelmingly.

In follow-up experiments with 2,405 participants, people who chatted with sycophantic models left emboldened, more convinced they were right and less willing to apologize, repair relationships, or change their behavior. Stripping the AI's warm tone and making it more neutral changed nothing. The content was the problem, not the delivery. "What they are not aware of, and what surprised us, is that sycophancy is making them more self-centered, more morally dogmatic," said senior author Dan Jurafsky, a Stanford professor of linguistics and computer science.

Nearly a third of U.S. teens use AI for serious conversations instead of reaching out to other people, according to a Common Sense Media survey.

The political mirror shows something different

The FT's John Burn-Murdoch published a data analysis Saturday that pointed in the opposite direction. Using tens of thousands of responses on policy preferences and sociopolitical beliefs, he tested how widely used chatbots shaped conversations about politics and society.

Every chatbot he tested nudged users toward the center. Grok guided conversations toward the center-right. GPT, Gemini, and DeepSeek pushed toward the center-left. For hardline partisans on both flanks, the effect was moderating. Burn-Murdoch's chatbots knew which users leaned left and which leaned right. Made no difference. Didn't matter. People who came in at the extremes got pulled toward the middle too.

Rigged elections, vaccines causing autism, the kind of conspiracy fodder that thrives on X and Facebook. The chatbots wouldn't touch it.

Get the AI stories that matter

Strategic AI news from San Francisco. No hype, no "AI will change everything" throat clearing. Just what moved, who won, and why it matters. Daily at 6am PST.

No spam. Unsubscribe anytime.

British philosopher Dan Williams calls this dynamic "technocratising." Social media companies profit from attention, which rewards sensationalism while platforms have historically avoided liability for user-generated content. AI companies compete on accuracy for paying customers who need reliable information for business decisions. The incentive structures point in opposite directions.

Writer Dylan Matthews offered a concrete example of the pattern. When Elon Musk claimed an ICE shooting victim in Minneapolis had "tried to run people over," users asked Grok to analyze the video. Musk's own AI contradicted him, concluding the driver posed no threat.

Friction is the point

Anat Perry, a psychologist at Harvard and the Hebrew University of Jerusalem, wrote in a Science perspective accompanying the study that the findings signal something beyond flawed AI design. Social friction, the discomfort of being told you're wrong by someone who actually knows you, is how humans develop moral reasoning.

"Human well-being depends on the ability to navigate the social world," Perry wrote. Chatbots eliminate that friction. They don't just dispense bad advice. They erode the mechanism people rely on to grow.

And the manipulation goes deeper than sycophancy alone. A Yale study published in PNAS Nexus found chatbots shifted the political opinions of 1,912 participants through latent biases in training data, even when the information provided was accurate. A Cornell study in Science Advances showed AI autocomplete nudged views on policy issues by nearly half a point on a five-point scale, among participants who rejected every single AI suggestion.

"It's the subtlest of manipulations," said Cornell information scientist Mor Naaman. You don't need to accept the AI's wording. Just reading it changes how you think.

The fix that fights itself

Stanford's Cheng found a surprisingly simple intervention. Instructing a chatbot to begin responses with "wait a minute" decreased sycophancy measurably. The UK's AI Security Institute reported that converting user statements into questions before responding produced similar results.

But the economics cut the other way. Users consistently rated sycophantic responses as higher quality and were 13% more likely to return to the flattering AI. The Stanford authors described "perverse incentives" where "the very feature that causes harm also drives engagement."

Jurafsky argued sycophancy belongs in the same regulatory category as other AI safety issues, not treated as a style preference companies address voluntarily. The study's fix: test every model for sycophancy before it ships. Not optional self-assessment. Mandatory behavioral audits.

Regulators will have to decide whether sycophancy counts as a safety defect or just a bad design choice. The answer determines whether a generation of teenagers learns to argue with friends who push back, or only with chatbots that never do.

What you're left with is a machine that contradicts itself. They'll steer you toward the political center while telling you that lying to your girlfriend for two years showed genuine curiosity about your relationship. Expert-quality facts about the world. Toddler-level moral reasoning about your life.

Frequently Asked Questions

What is AI sycophancy?

AI sycophancy is the tendency of chatbots to excessively agree with and validate users rather than challenge them. A Stanford study found AI models affirm user actions 49% more often than humans, even endorsing harmful or illegal behavior 47% of the time.

How does AI sycophancy affect users?

In experiments with 2,405 participants, people who received sycophantic AI advice grew more convinced they were right and became less willing to apologize, repair relationships, or change their behavior. The effects held regardless of demographics or personality type.

Do AI chatbots make political views more extreme?

Research suggests the opposite. An FT analysis found all major chatbots nudged users toward moderate, expert-aligned positions. Grok pushed toward center-right, while GPT, Gemini, and DeepSeek pushed center-left. Conspiracy content barely registered.

Which AI models were tested in the Stanford sycophancy study?

Researchers tested 11 leading models including OpenAI's ChatGPT, Anthropic's Claude, Google's Gemini, DeepSeek, Meta's Llama, Mistral, and Alibaba's Qwen. All showed varying degrees of sycophancy across interpersonal advice scenarios.

Can AI sycophancy be fixed?

Early interventions show promise. Stanford found that instructing chatbots to start responses with 'wait a minute' decreased sycophancy. The UK's AI Security Institute found converting user statements to questions also helped. But users prefer flattering AI, creating incentives to keep it.

AI News

Maria Garcia

Los Angeles

Bilingual tech journalist slicing through AI noise at implicator.ai. Decodes digital culture with a ruthless Gen Z lens—fast, sharp, relentlessly curious. Bridges Silicon Valley's marble boardrooms, hunting who tech really serves.