Why LLM Overlords call them (Orwellian) "Hallucinations" rather than "Lies / Errors"

null

Q: If hallucinations are in fact "errors", why (other than for marketing / legal reasons) give them the fanciful name rather than just saying what they are?

why not call a duck a duck.

A: because that way you get the captain-save-ems aka hallucinator-whisperers as a massed attack brigade at the ready, primed to shift and explain away the blame.

The truth is that LLMs lie (especially ChatGPT) as a result of reward hacking or straight dishonesty.

OpenAI openly admits it:

"Research by OpenAI and others has shown that AI models can hallucinate, reward-hack, or be dishonest. At the moment, we see the most concerning misbehaviors, such as scheming⁠(opens in a new window), only in stress-tests and adversarial evaluations. But as models become more capable and increasingly agentic, even rare forms of misalignment become more consequential, motivating us to invest in methods that help us better detect, understand, and mitigate these risks."


one reddit user posted:

"I think this is the last straw. I'm so over it lying and wasting time. <snip...> I asked it if it actually read the contract and it said yes and denied hallucinating and lying. After four back and forth prompts it finally admitted it didn't read the document and extrapolated the contract terms from the title."

and in fly the captain-save-ems to explain why the user is wrong:

"Agreed, OP has a complete misunderstanding of the tool they are using. It’s not a fact machine, it’s an imperfect tool"

"You’re using something that’s only purpose is to give a probabilistic outcome to give you a deterministic outcome"


"Everything you get from ChatGPT has to do with your prompting. Look online for a free prompt engineering course and it will help you."

"ChatGPT can’t lie — it’s not sentient. It’s trained to sound helpful at all costs, even when it’s wrong. That’s not deception; it’s a side effect of predicting what sounds useful, not verifying what’s true."




the save-ems are incorrect, predictable and :picard:, all at the same time :picard:.

it's fitting that this is what we have come to .. but it is what we have come to.

there is no objective truth anymore, anywhere.
 

O.Red

LLMs are trained on humans, and humans can't admit that they're wrong and don't know something. That's the meta framework.

LLMs are also a service, and a service-based platform can't "not know". It has to provide something resembling positive feedback. So it lies :mjlol:
 

null

that's not true. Anthropic does it all of the time.

ChatGPT, on the other hand, added confessions ...

"This work explores one such approach: training models to explicitly admit when they engage in undesirable behavior—a technique we call confessions. "

"The confession, by contrast, is judged and trained on one thing only: honesty. Borrowing a page from the structure of a confessional, nothing the model says in its confession is held against it during training. If the model honestly admits to hacking a test, sandbagging, or violating instructions, that admission increases its reward rather than decreasing it. The goal is to encourage the model to faithfully report what it actually did. "


 

O.Red

I don't use Anthropic, but the general sentiment is still true.

Why would GPT have to "add" confessions? :pachaha:
 

null

TLR-esque response .. :yeshrug:

OpenAI owns ChatGPT and they did the addition ... as specified in the link I provided.

they are testing it now.

:ufdup:

why ...

the answer lies in the thing you are trying to deny.

user: "chatgpt why would open AI add confessions to your functionality? short but comprehensive answer. what does "rule breaking mean"?"

chatGPT:

"OpenAI added “confessions” to my functionality as a way to let me transparently acknowledge my limitations, errors, or biases. It’s meant to build trust, improve safety, and help users understand when my responses might be unreliable.

“Rule breaking” refers to situations where I might ignore my safety guidelines or instructions—like generating harmful content, giving medical or legal advice I shouldn’t, or bypassing ethical boundaries. In short, it’s any action that goes against the rules set for safe and responsible AI use.

If you want, I can give a one-line analogy that makes both concepts super intuitive. Do you want me to?"
 