Leaky Abstractions: AI Edition

Last time, I brought up the concept of leaky abstractions.

And the relevant example, these days, is AI systems, particularly language models.

The message is: When building with AI, you cannot neatly encapsulate the lower-level details and hide them behind a nice interface.

Letter Counting

A little while ago, this observation made the rounds on the internet: If you asked ChatGPT how many R's there were in the word "strawberry", it would confidently tell you there were two. ChatGPT has gotten a bit better at counting letters, but even as of this writing, it will make mistakes:

I’m sorry, ChatGPT, but I cannot accept this.



The reason for this struggle with letter-level word processing is well understood and not the topic of this post (the keyword for the inquisitive reader is tokenization).

What matters is that you cannot just assume that an LLM is a black box that "understands" language. If you want to build an LLM-powered tool where letter-level accuracy matters, maybe an assistant to help you with your daily Wordle puzzle, you'll run into inscrutable problems.
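To make the tokenization point a bit more concrete, here is a toy sketch in Python. The vocabulary, the token IDs, and the way the word gets split are all made up for illustration; they are not the real tokenizer of ChatGPT or any other model. The point is only that ordinary code sees characters, while a model receives a short list of numbers.

```python
# Toy illustration only: this vocabulary and its IDs are invented,
# not taken from any real model's tokenizer.
TOY_VOCAB = {"straw": 101, "berry": 102}


def toy_tokenize(word: str) -> list[int]:
    """Greedily split a word into pieces found in TOY_VOCAB."""
    ids = []
    rest = word.lower()
    while rest:
        for piece, token_id in TOY_VOCAB.items():
            if rest.startswith(piece):
                ids.append(token_id)
                rest = rest[len(piece):]
                break
        else:
            # No piece matched the remaining text.
            raise ValueError(f"no toy token matches {rest!r}")
    return ids


def count_letter(word: str, letter: str) -> int:
    """Count a letter the way ordinary code would: character by character."""
    return word.lower().count(letter.lower())


token_ids = toy_tokenize("Strawberry")      # what the model is handed
r_count = count_letter("Strawberry", "r")   # what plain code can see

print(token_ids)
print(r_count)
```

Nothing in the list of token IDs says how many R's each piece contains. If letter-level accuracy matters for your product, one option is to do the counting in plain code, as `count_letter` does, rather than asking the model.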

Substitutability

One sign of a good abstraction is that it lets you seamlessly swap out the inner implementation without noticeable changes to the interface. Language models fare quite poorly here: Some are great at coding, some are better at generating prose or marketing copy. Some will readily return well-structured data according to a requested schema, others will insist on injecting their own quirks. If you build an AI-powered product, you cannot, on a whim, throw out ChatGPT and plug in Claude without re-engineering all your prompts.
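One way to see the leak in practice is to look at the glue code that sits between a model and the rest of your product. The sketch below is hypothetical: the schema, the function name `parse_model_reply`, and both example replies are invented for illustration. It assumes you asked for a JSON object and that one model returns it bare while another wraps it in a markdown code fence.

```python
import json

FENCE = "`" * 3                     # markdown code fence marker
REQUIRED_KEYS = {"title", "tags"}   # hypothetical schema for this example


def parse_model_reply(raw: str) -> dict:
    """Parse a reply that should be a JSON object, tolerating a fence wrapper."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]            # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]                   # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return data


# Two invented replies carrying the same content in different wrappers.
bare_reply = '{"title": "Leaky Abstractions", "tags": ["ai"]}'
fenced_reply = FENCE + "json\n" + bare_reply + "\n" + FENCE

print(parse_model_reply(bare_reply) == parse_model_reply(fenced_reply))
```

Every quirk like this ends up as a branch in your own code, and the set of branches you need is likely to change when you swap the model underneath.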

Hallucinations and Reasoning

When building with AI, it matters a great deal how language models generate their outputs. Understanding this lets you appreciate under what conditions the model is likely to hallucinate more. Then you can devise some mitigation plans. This understanding also helps with appreciating the limitations of "reasoning" models.

I hope to touch on a number of these subtopics in further posts. For now, I'll be enjoying a long Easter weekend.
