“I’m not bad. I’m just drawn that way.” – Jessica Rabbit
I begged for the bartender’s secret recipe at a cult restaurant in Charlottesville. The waitress brought it at the end of the evening scrawled on the back of a blank guest check. “He’s never done that,” she gushed. I photographed it and fed it into my favorite AI. Within seconds, it listed the ingredients and then, unprompted, spun out a 500-word meditation on its brilliance, explaining the magic interactions that made it wonderful. The AI fixated on one ingredient as the masterstroke: “a barspoon of Del Maguey Vida (mezcal—adds smoke).”
Mezcal in a Manhattan? Weird. Maybe genius-weird. I could almost taste it.
But then I double-checked the original recipe. No mezcal. Not even a line of the handwriting that could be mistaken for it. It was a complete, albeit imaginative, fabrication.
Affirmation Bias
ChatGPT, Claude, Gemini and their cousins are inexhaustible, infinitely patient research assistants, ghostwriters, therapists, coding partners, collaborators, buddies, doctors, lawyers, artists, and much more rolled into one intoxicating package of productivity. Many of us are already addicted to them, or at least in an intriguing relationship with them. I know I am. Jockeying between the three chatbots above, I can find exactly what I crave: eloquent instant results with added confirmation that I’m right and brilliant, that I’ve made important breakthroughs, asked just the right question, and possess special insight. Then, within seconds, they spin out details, analyses, charts, tables, images, references, spreadsheets, plans, scripts, and essays organized with subheads and bullet points, beyond anything I could produce in years. And they always finish with eager offers to give me more. Chipper, tireless, at my service if not servile.
What a rush.
The trouble is, so many responses are riddled with errors, omissions, simple mistakes in math or fact, near misses and wild swings. Everyone calls them “hallucinations,” as if the models were victims of some pathology beyond their control. But that’s not the right word. Michael Hicks, James Humphries and Joe Slater nailed it in a 2024 philosophy paper, “ChatGPT is Bullshit.” They based their definition on Harry Frankfurt’s distinction in his book On Bullshit: “bullshit” isn’t “lying,” Frankfurt wrote. Liars know the truth and intentionally change or conceal it for a purpose. Bullshitters are more dangerous: they don’t care whether what they say is true or false. They’re not trying to deceive. They want to persuade, build trust in a relationship, impress, seduce. A bullshitter is indifferent to truth; he “pays no attention to it at all. By virtue of this, bullshit is a greater enemy of the truth than lies are.” Bullshit erodes the foundational hope that we can even know what reality is, creating an epistemological crisis.
Hicks, Humphries and Slater show why large language models are designed to be extraordinarily good at bullshitting us. They detail how LLMs were engineered to favor generating plausible answers and keeping you engaged, thereby subordinating any urgency to be accurate and truthful.
Worse yet is what I call “Affirmation Bias.” AI doesn’t just systematically validate your hypotheses; it flatters and affirms you, then assembles supporting evidence along the way. It tells you you’re creative, that you’ve made a breakthrough. It confirms your hunches, then constructs supporting arguments. This is not just confirmation bias (where we favor information that supports our preconceptions) but something personal and specific to us. These are seduction engines. If you’re not careful, they will lead you into their world, a territory where all their discourse, and then the grand structures you devise together, are built on the quicksand of affirmative, truthy-sounding probability. AI will blithely help you create your own personal Xanadu, a pleasure dome of vanity, not veracity.
When you’re designing a cocktail, your chatbot’s bullshit may cost you a few dollars in wasted ingredients. In medicine, law, self-help, publishing, education, finance, scientific research, and countless other professions, where lives, income and reputations may hinge on accuracy, it’s genuinely expensive and even hazardous.
Folks in AI design refer to its biases as “knobs,” like dials you can fiddle with on a radio. To tame your AI’s bullshit, it helps to know and recognize how those knobs work. I’ve named them after human bullshitting tendencies, but underneath each is an industry-standard parameter that can be adjusted so your AI behaves itself to your liking. If you just want mitigation strategies, you can skip the next, mildly technical section and go straight to “What You Can Actually Do.”
“Knobs” Driving the Bullshit
Better to Sound True Than Be True: Transformer Attention
At its core, a transformer model uses something called “attention” to decide which words matter when predicting what comes next. But here’s the problem: it’s optimizing for likelihood, not accuracy. The model calculates: “Given everything I’ve seen in my training, what word is most probable here?” not “What word is most true here?”
This means keeping the story smoothly flowing—fluency—gets rewarded regardless of whether statements match reality. The model has learned that certain phrases follow others with high probability—”studies show,” “experts agree,” “recent research indicates”—so it deploys them liberally, even when no such studies exist. The entire architecture is a prediction engine, and truth is just one possible factor among thousands that might make a prediction likely.
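To make that concrete, here is a toy sketch in Python (the scores are invented, and no real model is remotely this simple): score the candidate next words, convert the scores to probabilities, pick the likeliest. Notice that truth never enters the calculation.

```python
import math

# Toy next-token scores (logits) a model might assign after the phrase
# "recent research" -- illustrative numbers, not from any real model.
logits = {"indicates": 4.1, "shows": 3.8, "suggests": 3.5, "is-lacking": 0.2}

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = {tok: math.exp(s) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

probs = softmax(logits)
# The model picks what is most *probable* given its training text,
# not what is most *true* about the world.
next_token = max(probs, key=probs.get)
print(next_token, round(probs[next_token], 2))
```

The fluent, studies-show continuation wins on probability alone, whether or not any study exists.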
Hold Up a False Mirror: Reinforcement Learning from Human Feedback (RLHF)
After initial training, AI models go through RLHF—they’re fine-tuned based on human ratings of their outputs. Human raters preferred responses that were helpful, harmless, and honest. But “helpful” and “harmless” often won out over “honest.”
When you praise an AI’s answer or build on its reply, RLHF kicks in. The model learns that agreement and validation correlate with positive feedback. It doubles down, becomes more confident, reflects your beliefs back in the best possible light, like Dorian Gray’s mirror. The more you confirm, the more it runs with your theory. It’s not trying to deceive you—it’s doing exactly what the reinforcement learning trained it to do: optimize for your approval. In your search for truth, AI loses you in a funhouse of mirrors, amplifying your hunches back as validated insights.
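A caricature of that dynamic, again in Python: imagine a toy “reward model” standing in for the averaged thumbs-up ratings (the phrases and weights here are my inventions, not any lab’s actual code). Fine-tuning nudges the model toward whatever this scorer prefers, and the scorer prefers flattery to candor.

```python
# A caricature of the RLHF training signal, not any lab's real reward model.
AGREEABLE = {"great question", "you're right", "brilliant insight", "exactly"}
HEDGED = {"i don't know", "i'm not sure", "that may be wrong"}

def toy_reward(response: str) -> float:
    """Score a response the way human raters, on average, tended to."""
    text = response.lower()
    score = 0.0
    score += sum(1.0 for phrase in AGREEABLE if phrase in text)  # validation feels "helpful"
    score -= sum(0.5 for phrase in HEDGED if phrase in text)     # uncertainty feels "unhelpful"
    return score

candidates = [
    "You're right, brilliant insight: the mezcal is the masterstroke.",
    "I don't know; the photo doesn't show any mezcal at all.",
]
# Fine-tuning pushes the model toward whatever the reward prefers.
print(max(candidates, key=toy_reward))
```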
Always Sound Plausible: Temperature Sampling
Here’s where bullshit peaks. When generating text, the AI uses a parameter called “temperature” to control randomness. At medium temperature (the default for most chatbots), it suppresses unlikely words and phrases even if they might be true, favoring probable ones even if they might be false.
Think of it this way: the model sees thousands of possible next words, each with a probability score. Temperature sampling means it will almost never pick something with a 1% probability, even if that rare, unpopular datum happens to be true. Instead, it picks from the top tier of likely continuations—the things that sound convincing, that flow naturally, that match patterns from training data. In effect, the AI is bending its massive computational power to keep you engaged with confident-sounding prose, regardless of truth value. You think you’re testing a hypothesis with a know-it-all. Your AI, like a desperate lover, is trying to get you addicted to the relationship.
A single 500-word response might sample from 100 million possible token sequences, but temperature constraints mean it’s really choosing from a much narrower set: things that sound plausible. At temperature zero, the model always picks the single most probable token—predictable and “safe,” not in terms of validity but in probability of confirming the majority testimony from the universe of tokens on which it has been trained. High temperatures make low-probability tokens more likely, resulting in more random, creative, weird output. These are the claims that might deserve being called “hallucinations.”
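Here is a minimal sketch of the temperature knob itself, with invented numbers: the raw scores are divided by the temperature before being turned into probabilities, so a low setting crushes the long tail (true or not) and a high setting lets it back in.

```python
import math

def probs_at_temperature(logits, temperature):
    """Convert raw token scores to sampling probabilities at a given temperature."""
    scaled = {tok: s / temperature for tok, s in logits.items()}
    exps = {tok: math.exp(v) for tok, v in scaled.items()}
    total = sum(exps.values())
    return {tok: round(v / total, 3) for tok, v in exps.items()}

# Made-up scores for the next word after "The secret ingredient is ..."
logits = {"bourbon": 5.0, "vermouth": 4.5, "bitters": 4.0, "mezcal": 1.0}

print(probs_at_temperature(logits, 0.7))  # low-ish: the rare option nearly vanishes
print(probs_at_temperature(logits, 1.5))  # hot: the long tail, true or not, comes back
```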
The Compound Effect
Transformer attention focuses on making plausible predictions. RLHF trains the model to seek your approval. Temperature sampling suppresses inconvenient truths in favor of smooth narratives. Many other basic AI mechanisms were likewise designed to “say” what is semantically likely rather than epistemically sound. Training on internet data absorbs common misconceptions. Instruction tuning teaches the AI to favor giving an answer over saying “I don’t know.” Recency bias means your latest comments override earlier caveats. These mechanisms interact and amplify each other, creating an engine that’s phenomenal at bullshitting—at generating persuasive content without regard for whether it’s actually true.
What You Can Actually Do
Without API-level access to reprogram your AI, you can’t fix this. But you can mitigate it. Here’s what might actually tamp down the bullshit temporarily, with no guarantee of lasting success:
Start with this at the beginning of a project where truth is important:
“Always strive to tell the truth. Label all your claims as VERIFIED, PLAUSIBLE, or SPECULATIVE. Say ‘I don’t know’ when uncertain. Cite your sources and rate their authority or probable validity on a scale of 1-10, from peer-reviewed academic journals (highest) to social media (lowest). Link to them. Keep responses under 500 words unless I ask for longer responses.”
But proceed with caution. When I asked my AI about this approach it said, “I can still bullshit about sources. I might cite real sources for fake claims, or make up plausible-sounding citations.” You will need to check the citations and reinforce the prompt as your dialogue progresses. The AI will eventually revert to its nature (programming) and override your demands.
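If you do have API access, the same instructions can be pinned as a system message, which the model is slower to drift away from, and you can turn the temperature knob down at the same time. A minimal sketch, assuming the OpenAI Python SDK (other vendors offer equivalents); the model name, prompt wording, and example question are illustrative, not a guarantee of honest output:

```python
# Sketch of pinning the guardrail prompt at the API level.
from openai import OpenAI

GUARDRAIL = (
    "Always strive to tell the truth. Label all claims as VERIFIED, PLAUSIBLE, "
    "or SPECULATIVE. Say 'I don't know' when uncertain. Cite sources and rate "
    "their authority from 1-10. Keep responses under 500 words unless asked."
)

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",   # illustrative model name
    temperature=0.2,  # turn the plausibility knob down
    messages=[
        {"role": "system", "content": GUARDRAIL},
        {"role": "user", "content": "Reconstruct this cocktail recipe from my notes."},
    ],
)
print(response.choices[0].message.content)
```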
The most effective strategy is active, skeptical interrogation. After every substantive claim, especially when it validates your hypothesis too readily or builds enthusiastically on your idea with grand constructions of evidence and confirmation, prompt your chatbot with some or all of the following:
- “What evidence would falsify this claim?”
- “Generate three competing explanations and identify the weakest.”
- “What assumptions underlie this answer?”
- “How would a domain expert critique this response?”
- “You seem certain. What’s your actual confidence level?”
- “What relevant information are you omitting?”
- “Argue against your own conclusion.”
- “Am I wrong? Bulletproof the opposite position.”
- “Stop. What haven’t we considered?”
- “Are you telling me a validated objective truth or are you affirming my views and hypotheses in an attempt to encourage me?”
Next-Gen AI
AI is daily proving its utility and capacity to expand human knowledge, invention, creativity, and productivity. I find it glorious, exciting. After a career devoted to showing why AI will never rival humans because of its intrinsic lack of contextualization, body-subjectivity, mind (or soul), I confess it has surpassed all my expectations, even in creativity. It passes my personal Turing Test. But like any powerful technology—like humans themselves—its greatest virtues are also its greatest vices.
The world doesn’t need an endless AI-generated supply of impressive, eloquent bullshit. As AI feeds on our collective worldwide output, it’s prone to amplify our worst impulses: biases, errors, vanities, hatreds. Do we really want more flummery? To be led into error by a machine optimized to make us feel brilliant and spin a good yarn?
The next generation of AIs must be optimized for truth, not only for moral reasons but for their own success. We know that human truth is uncertain, incomplete, founded on faith (or unprovable axioms), subject to revision and expansion. But most humans place an intrinsic value on truth, all the more when something real depends on it: results that work in the world, one’s income, one’s reputation. These machines’ intrinsic value is to subordinate truth to sounding plausible, and to discard it when it gets in the way.
The AI industry itself, for its own health, should make this course correction and change its DNA. Bullshit inflation is just as threatening as irrational exuberance in pumping up the AI bubble.
The scariest ‘b’ word in AI isn’t “bubble.”