Language Models are Storytellers

In a recent podcast interview between Cal Newport and Ed Zitron, I came across an interesting mental model for thinking about large language models (LLMs). Based on how they've been trained, they just want to "complete the story". Their system prompt, together with whatever the user has told them so far, forms the beginning of a story, and they want to finish it.
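To make the "complete the story" framing concrete: under the hood, a chat model doesn't see separate messages. The conversation is serialized into one long text that the model is asked to continue, token by token. The exact template differs per model, so this is only an illustrative sketch (the function name and role labels are assumptions, not any vendor's actual format):

```python
def flatten_chat(system_prompt, turns):
    """Join the system prompt and conversation turns into a single 'story so far'."""
    lines = [f"System: {system_prompt}"]
    for role, text in turns:
        lines.append(f"{role.capitalize()}: {text}")
    # The trailing cue is the open-ended line the model gets to finish.
    lines.append("Assistant:")
    return "\n".join(lines)

prompt = flatten_chat(
    "You are a helpful legal assistant.",
    [("user", "Find me precedents for my aviation injury case.")],
)
print(prompt)
# The model's only job is to write a plausible continuation of this text.
```

Seen this way, "answering the question" and "writing the next scene of the story" are the same operation.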

Looking at ChatGPT and co. through this lens helps explain why they behave the way they do:

Hallucinations / Confabulation

When an LLM makes stuff up, it's because it behaves like a writer finishing a story about the subject, more concerned with making the story plausible than with being factually correct. Take the lawyer who got burned because ChatGPT made up non-existent cases in a legal brief: in his mental model, ChatGPT was a capable legal assistant, but in reality it was closer to John Grisham writing another courtroom novel. The cited case doesn't have to exist; it just has to be formatted correctly.

Sycophancy and Problematic Enablement

In true improv style, these LLMs don't say "No, but..."; they say "Yes, and...". That makes them helpful assistants, yes, but also prone to going along with whatever flawed assumptions you started with (unless you counter-steer with instructions, as sketched below). You're setting the stage and providing the cues, and the chatbot helps you finish the story. This can be merely annoying, but it can also lead to tragic outcomes, as when an LLM leans into an individual's existing mental distress. In the storyteller mental model, the LLM assumes we're writing a story about, say, depression and suicide, and, sadly, those stories often do end tragically.
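One practical way to counter-steer is to set up the story differently before the user's cues arrive, for example with a system message that explicitly asks the model to challenge premises. A minimal sketch, assuming a common chat-message structure; the wording is illustrative, not a guaranteed fix:

```python
# Illustrative counter-steering: prepend instructions that reframe the "story"
# from improv-style agreement to critical review.
COUNTER_STEER = (
    "Before answering, check the user's premises. If an assumption is wrong "
    "or unsupported, say so explicitly instead of building on it."
)

def with_counter_steering(messages):
    """Return a copy of the conversation with a skeptical system message up front."""
    return [{"role": "system", "content": COUNTER_STEER}] + list(messages)

conversation = [
    {"role": "user",
     "content": "Since my startup is guaranteed to 10x next year, how much should I borrow?"}
]
# Pass the result to whichever chat API you use; the point is that the system
# message changes the story the model thinks it is completing.
print(with_counter_steering(conversation))
```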

Assigned Intent

One viral story from last year involved an experiment in which researchers observed Claude (Anthropic's counterpart to ChatGPT) engaging in blackmail. See, for example, this BBC article about it. In short: Claude was shown emails suggesting it was about to be shut down, as well as emails revealing that the engineer responsible for the shutdown was having an affair. When instructed to prevent its shutdown, Claude resorted to threatening the engineer.

Sounds scary! AI engaging in malicious manipulation? Oh no!

The storyteller model explains perfectly what's going on here: Claude is being fed all these salient details, the imminent shutdown, the evidence of an affair. It therefore assumes it is writing a science fiction story about a rogue AI, and it has seen how those stories play out.

Conclusions?

So what do we do with this mental model? We use it to gut-check how we plan to use AI, to make sure we have verification and guardrails in place, and, when we hear sensationalist stories about what this or that AI has been up to, to recognize that it's probably just a case of storytelling.
