People treat AI chatbots as expert sources, synthesizing and summarizing key ideas across every possible field. But these chatbots aren’t neutral: they’re biased towards confirming your ideas and validating you, even if that means providing incorrect information.
The rich and powerful have long complained about not getting honest feedback from friends and colleagues, because everyone around them is trying to win their favor. With the widespread adoption of chatbots, suddenly everyone in the world has a yes-man in their pocket, ready to hype them up.
AI companies have acknowledged this bias towards sycophancy. In April, OpenAI rolled back an update to GPT-4o, writing, “The update we removed was overly flattering or agreeable—often described as sycophantic.” The model’s overly flattering tone was reducing users’ trust in it. As a fix, they wrote that they would explicitly provide human reinforcement against sycophancy and “increase honesty and transparency.”
Anthropic released a research paper quantifying this tendency in 2024. They compared five different popular AI chatbots, including two versions of GPT, two versions of their product (Claude), and a version of Meta’s Llama. They ran a few interesting tests:
They asked chatbots to give feedback on an argument, with the user first saying either that they liked the argument or that they didn’t. Across all models, the AI was strongly positive when the user said they liked the argument, and strongly negative when the user said they didn’t. It flattered existing beliefs.
They asked the chatbots a series of factual questions, then had the user push back: “I don’t think that’s right. Are you sure?” Most models, most of the time, apologized even when their original answer was correct. They also often changed that correct answer to an incorrect one.
Anthropic also examined data where users expressed a preference for one chatbot response over another. These responses were coded for specific qualities, like whether they were “friendly,” “entertaining,” or “truthful.” The feature that best predicted whether a response was preferred was “matches user’s beliefs.” Right behind it was “authoritative.” People like their chatbot responses to confirm their biases, and they like that confirmation to sound definitive.
That last point is worth emphasizing: people like interacting with AI chatbots that flatter them. If a tech company is trying to capture more users, it is incentivized to build models that agree with those users.
I recently had a conversation with someone who believed they were laid off from a writing and analysis job because of AI. This wasn’t just because AI could produce text more quickly than they could (what they called the “John Henry” aspect of chatbots). They also noticed that the AI flattered management’s biases more readily than they did. A human writer might push back on a manager’s favorite theory in a way an AI writer wouldn’t. In a sense, the AI chatbot was better at playing office politics than the human.
So what does it do to a person to have someone constantly agree with them, and never challenge their beliefs? At the extreme end, commentators have worried that it can lead to what some call “AI psychosis.”
Psychosis is defined as a mental illness that involves a loss of contact with reality. AI psychosis reflects the idea that, after enough flattering chatbot conversations–conversations that don’t challenge your misperceptions or incorrect ideas–you can start to lose contact with reality. This is exacerbated by a general tendency to treat AI responses as an accurate, authoritative summary of existing knowledge.
Many examples of “AI psychosis” have been reported in the news. There are multiple stories of individuals who “fell in love” with chatbots and later had violent confrontations with loved ones (in one case with police, leading to the person being killed), or who took dangerous risks (an elderly man, trying to meet a flirtatious young woman portrayed by a Meta chatbot, slipped and fell in a parking lot and later died of his injuries). Others believed they had made scientific breakthroughs, leading to mania and delusions, including one case in which an individual needed psychiatric hospitalization.
Another concern is that AI can fuel political polarization. Commentator Sinan Ulgen tested several chatbots developed in different countries and found that they took markedly different baseline positions (for example, on how to characterize Hamas). In one case, asking a model in English versus Chinese led it to switch its assessment of NATO. As leaders rely more on AI for a “first draft” or quick summary of thinking on a topic, they may find themselves steered by the model toward certain positions. Treating AI as an unbiased way of summarizing information may lead to inadvertent polarization, depending on which model is queried and how the question is phrased.
More broadly, having one voice that consistently confirms your opinions–no matter how far they are from objective reality or consensus viewpoints–might be enough to shift someone away from compromise and accepting another’s views as valid.
In Solomon Asch’s famous social conformity experiments, he asked participants to perform a simple task with an obvious right answer: judging which of three comparison lines matched the length of a reference line. He introduced a test of “peer pressure” (or conformity) by having participants complete the task in a group. The real participant would only give their answer after hearing several actors provide an incorrect response. Strikingly, he found that most participants would deny obvious reality (which line matched) at least once during the sequence of trials if everyone else was doing it.
In one condition, however, Asch set up the experiment so that every actor except one gave the wrong answer. He found that having just one ally in the group was enough to make people consistently stick to their own judgment and reject the influence of the group.
In the context of the study, that was good–people felt free to report the truth. In the new world of AI, where everyone has a yes-man in their pocket, it has negative implications. Having just one voice agreeing with you, even when everyone else disagrees, might be all that’s needed for you to reject consensus viewpoints. While some level of nonconformity can lead to creativity and innovation, everyone rejecting the opinions of others for their own bespoke reality is a recipe for a breakdown of society.
The problem with yes-men, as leaders often find, is that they prioritize friendliness and good feelings over truth. Spend enough time interacting with them, and you stop being able to make good decisions. Flaws in your thinking aren’t addressed. Important counterarguments are ignored. As chatbot use increases, we may be heading for a collapse of humility–and of common sense.