Claude Code’s Profanity Scanner

Apr 2

The internet has had a field day ever since Anthropic accidentally leaked the code to their coding agent, Claude Code. One bit that stood out: To detect negative user sentiment, Claude Code runs messages through a simple filter that looks for profanity ("wtf", "crap", and the like).

Folks on social media were quick to point out the irony that a tool that's supposed to be intelligent relies on simple text-matching logic to detect user sentiment. Why don't they just run the user input through their own large language model to detect whether the user is frustrated?

The answer is simple: They're not trying to collect perfect data for a peer-reviewed sociological study. They're just trying to roughly and cheaply track overall sentiment. In that context, trading accuracy for efficiency is the right build decision. The simple text match runs nearly instantaneously. Sending every message to an LLM is massive overkill.
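To make "simple text match" concrete, here is a minimal sketch of what such a filter could look like. This is not the leaked code. The function name and the word list are made up for illustration; only "wtf" and "crap" come from the examples above.

```python
import re

# Hypothetical word list. Only "wtf" and "crap" are taken from the post;
# a real list would presumably be longer.
FRUSTRATION_WORDS = ("wtf", "crap")

# Compile one case-insensitive pattern up front. The \b word boundaries
# keep "scrap" from being counted as "crap".
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in FRUSTRATION_WORDS) + r")\b",
    re.IGNORECASE,
)


def looks_frustrated(message: str) -> bool:
    """Return True if the message contains any word from the list."""
    return _PATTERN.search(message) is not None
```

A check like this is one regex search per message, with no network call and no tokens spent. It will miss a politely worded complaint and it can't spot sarcasm, which is exactly the accuracy being traded away.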

Lesson for builders and founders: Don't get obsessed with finding "the best" solution according to some metric (like accuracy). Find the best solution for the given context. In this case, simple and cheap was way better than complex and expensive.

Clemens Adolphs
