Style Transfer: Solved
Interrupting our regular coverage for what's been lighting up the internet over the last few days. Both Google and OpenAI have released multi-modal models that, for the first time, integrate image generation and image editing right into the language model itself. This is a big deal because the model generating the images now has a solid understanding of the user's intent. One of the most obvious use cases is style transfer: "Turn this photo into a Simpsons cartoon" or "Make this a Van Gogh painting."
A brief history of style transfer
Repurposed Image Recognition Models
The first widely publicized algorithm for style transfer was published almost ten years ago. Gatys et al. recognized that, in an image recognition model, the first few layers mostly capture style, and the later layers mostly capture content. By feeding a content image (the photo of me) and a style image (Van Gogh's Starry Night) through the model, the algorithm optimizes for an output image whose activations in the style layers match the style image and whose activations in the content layers match the content image.
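For the technically curious, here's roughly what that optimization looks like in code. This is a minimal sketch, assuming PyTorch and a pretrained VGG19 from torchvision; the layer indices and the style-loss weight are illustrative choices, not the exact values from the paper.

```python
# A minimal sketch of Gatys-style neural style transfer.
# Layer choices and loss weights are illustrative.
import torch
import torch.nn.functional as F
from torchvision import models

vgg = models.vgg19(weights="IMAGENET1K_V1").features.eval()  # older versions: pretrained=True
for p in vgg.parameters():
    p.requires_grad_(False)

STYLE_LAYERS = {0, 5, 10, 19, 28}   # early layers: brush strokes, texture
CONTENT_LAYERS = {21}               # a later layer: overall content

def activations(image):
    """Collect feature maps from the layers we care about."""
    feats, x = {}, image
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_LAYERS | CONTENT_LAYERS:
            feats[i] = x
    return feats

def gram(feat):
    """Gram matrix: correlations between channels, i.e. 'style'."""
    b, c, h, w = feat.shape
    f = feat.view(c, h * w)
    return f @ f.t() / (c * h * w)

def style_transfer(content_img, style_img, steps=300):
    content_feats = activations(content_img)
    style_feats = activations(style_img)
    output = content_img.clone().requires_grad_(True)   # start from the photo
    opt = torch.optim.Adam([output], lr=0.02)
    for _ in range(steps):
        opt.zero_grad()
        out_feats = activations(output)
        content_loss = sum(F.mse_loss(out_feats[i], content_feats[i])
                           for i in CONTENT_LAYERS)
        style_loss = sum(F.mse_loss(gram(out_feats[i]), gram(style_feats[i]))
                         for i in STYLE_LAYERS)
        loss = content_loss + 1e6 * style_loss   # the weight is a tunable knob
        loss.backward()
        opt.step()
    return output.detach()
```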
Clever as it is, this algorithm and its successors were mostly good at matching the brush strokes of various art styles, but not so much the broader artistic ideas, like Picasso's way of messing with perspective or Dalí's "everything melts" style.
Diffusion Models
Without getting too technical, these models (DALL·E, Stable Diffusion, Midjourney, etc.) use a reverse diffusion process to start from pure noise and slowly generate a target image, guided by a text prompt. Early models famously had huge problems adhering to prompts (and let's not forget the horrific way they generated hands and fingers). But they were great at applying styles to a target prompt: "A bored cat in the style of Girl with a Pearl Earring," and so on. Instead of a text prompt, they could also be prompted with a source image and glean the style from there.
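For readers who do want a peek under the hood, the core sampling loop is surprisingly small. Here's a condensed sketch of textbook DDPM-style sampling, where `noise_predictor` is a hypothetical stand-in for the trained, prompt-conditioned denoising network and the schedule values are illustrative.

```python
import torch

def ddpm_sample(noise_predictor, prompt_emb, steps=1000, shape=(1, 3, 64, 64)):
    """Condensed DDPM sampling loop. `noise_predictor(x, t, prompt_emb)`
    stands in for the trained, text-conditioned denoising network."""
    betas = torch.linspace(1e-4, 0.02, steps)        # noise schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape)                           # start from pure noise
    for t in reversed(range(steps)):
        eps = noise_predictor(x, t, prompt_emb)      # predict the noise in x
        # remove the predicted noise component (posterior mean)
        x = (x - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)  # re-inject a little noise
    return x                                         # decoded into pixels downstream
```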
They could also be fine-tuned on your images, and after that time-consuming and finicky process, you could ask for a picture of yourself as, say, a Lego minifigure.
However, they were still bad at:
Text generation
Prompt adherence
Image editing
In simple terms, these models do not have a genuine concept of content, just a vague sense that the text and image embeddings are close to each other.
Multi-modal Models
By baking image generation right into their Large Multimodal Model (LMMM?), both Gemini and GPT-4o have access to all their text-based world knowledge and everything they have learned about how objects relate to each other. These models know that the Simpsons have yellow skin and an overbite. They can move parts of an image around while keeping the whole consistent, in a way the previous generative models couldn't because they were too focused on individual pixels.
Cheers, and enjoy the weekend!
Agile doesn’t work for AI
Oh, them's fighting words. Or maybe you're in the camp that would say, "Well, duh, Agile doesn't work at all."
But not so fast. The higher principles behind Agile very much apply. Avoid waste. Take small, safe steps. Have tight feedback loops to make sure you're building the right thing.
Where things break down is when rigidly codified "Agile" practices—which might make total sense when applied to standard software development—are applied without modification to AI projects.
Some examples:
AI development is much closer to research than to standard software development, so in Scrum terms, everything's a spike. You might as well not bother, then.
A large feature does not intuitively break down into smaller pieces. Case in point: to build an image recognition model, you don't start with a model that recognizes one category and then slowly add more categories. In fact, it's often the opposite: you start with a model that can recognize hundreds of categories, then throw most of those out and fine-tune for the ones you particularly care about.
There are non-intuitive discontinuities in mapping a user story to the required effort. Minor changes to the requirements turn "sure thing, give me an afternoon" into "uhm, give me millions in VC and five years" (hat tip to xkcd, whose predicted timeline for image recognition turned out just about right).
Test-driven development (TDD), a fantastic practice for software development, does not productively apply to ML. Sure, you can TDD that the plumbing around the machine learning system is correct, but you can't TDD your way to, say, better performance on a relevant model eval (the sketch after this list illustrates the contrast).
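To make that concrete, here's a sketch with hypothetical helpers (`predict_endpoint`, `load_fixture`, `evaluate`, and friends are made up for illustration). The first test is classic TDD territory; the second is an eval you can monitor but not test-drive.

```python
# Hypothetical helpers throughout; the point is the contrast, not the API.

def test_prediction_endpoint_returns_well_formed_response():
    # Classic TDD target: deterministic, fails for a known reason, drives a fix.
    response = predict_endpoint({"image": load_fixture("cat.jpg")})
    assert set(response) == {"label", "confidence"}
    assert 0.0 <= response["confidence"] <= 1.0

def test_model_beats_accuracy_threshold():
    # The "test" you actually care about in ML is an aggregate eval.
    # Writing it first doesn't tell you how to make it pass; that part is research.
    accuracy = evaluate(model, validation_set)
    assert accuracy >= 0.92   # may stay red for weeks, no matter how clean your code is
```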
So what are we going to do about it? I'd say the community at large is still figuring that out. I'm hoping to add my own thoughts to the discussion over time. It would be a shame if the solid principles were to get thrown out due to frustration with the concrete practices.
If it hurts, do it more often*
*At least for things that you should be doing anyway. Don't go stubbing your toe every hour ;)
This is quite obvious in some areas. If you go for a run once a month, you'll be sore for days afterward, but if you go for a run every other day, you'll do just fine. Or, as I try to tell my kids, tidying up once a day is no big deal versus letting the mess pile up for weeks.
In software engineering, releasing a new version to customers once every six months is a big, fraught, painful process where everything has to go right. With continuous deployment, releasing six times per day is a non-event.
The same is true for many things at many scales:
Integrating your code changes with those of your colleagues: a big pain with lots of conflicts to resolve if done every few days, a trivial exercise if done hourly.
Annual planning. So much uncertainty, so much handwringing about which of the many possible futures will come to pass. Much easier to keep the detailed planning for the shorter timescales, in the spirit of Lean and Agile.
Going meta: writing a long monthly or even weekly newsletter is a dreadful thought. It had better be of the highest quality, jam-packed with top-notch insight, with a carefully chosen topic. Writing every day takes all that hassle and pressure off.
In a way, this is a corollary of the idea of Exponential Shrinking. If you can get away with something much smaller, do that. If the overall quantity can't be reduced, slice it up and deliver more frequently.
What's something painful or annoying that you ought to do? What would happen if you upped the frequency?
Wilderness First Aid, or The Pull to Complexity
Even though we understand and accept that simplicity is better, we repeatedly end up with complex solutions. Why?
It's tempting to use what we know: I once took a comprehensive Wilderness First Aid Course, and for the next couple of outings, a little voice in my head said, "If only someone would sprain their ankle right now. I know exactly how to tape it based on the direction of the sprain. I'd be a hero." These are horrible thoughts, but we all crave to be competent and demonstrate that competence.
So, we read about strategies, processes, and design patterns and can't wait to use them. Resisting this pull goes against our nature.
And if it's not our desire to appear competent that pulls us towards complexity, it's our fear of appearing incompetent in front of peers, bosses, or clients. If we propose a simple solution, won't they think we are simple?
The antidote is a mindset shift from the baroque aesthetic of "more is more" to the minimalist aesthetic of "how little can we get away with?" Take pride in expressing things simply, in finding clarity, in discovering the connection that cuts through layers of complexity.
And alleviate your fear of appearing simple. The right people will appreciate it when you, the expert, give them a simple solution that works.
Vibe Cooking
In the past, I've successfully used GenAI to develop recipes based on available ingredients and what I was in the mood for. The instructions generally made sense, and the result was decent.
So, naturally, we put on our hype goggles and extrapolate to... Vibe Cooking.
Upload a picture of your fridge and pantry to the AI
Ask it what you should cook
Taste in between steps and ask the AI for adjustments
Rinse and repeat
Post on social media that chefs are going out of business and how you'll open a Michelin-starred restaurant despite having no professional cooking experience 👩‍🍳
(Optional: Learn the hard way that running a restaurant involves more than cooking random stuff on a whim 🤷‍♂️)
And don't stop there. Other professions are bound to get vibed, too. Here are five more.
Vibe architecture. Who needs architects when the AI can spit out blueprints and work breakdown structures, then write emails to coordinate the contractors?
Vibe medicine. (WebMD on steroids)
Vibe law. "Just a second, your honour, I'm feeding the opposing counsel's last remark into my AI..."
Vibe accounting. AI does your taxes. Getting audited? Feed the angry letter right back into the AI.
Vibe engineering: AI generates structural designs. If something collapses, simply input "bridge fell down" and ask for updated blueprints.
Happy Monday and happy vibing.
Comprehensive Tests Between Ecstasy and Agony
Only a short email as I'm deep in the weeds of debugging...
Throughout a big client project, a comprehensive suite of tests—both small-scale unit tests and more extensive integration/functional tests—has saved my bacon countless times and ultimately accelerated development. Good tests let you pinpoint exactly where something went wrong, like where a new feature messed up existing functionality. This is the ecstasy: you can make big strides and sweeping improvements to your code, confident that you have a solid safety net.
But today, I spent agonizing hours fighting with tests that pass locally but fail on the GitHub server, where the code gets checked before being integrated into the mainline. The issue comes down to arcane details of what you can and cannot do with GitHub Actions, Docker, and network calls, and it's still not solved 🤷‍♂️.
The lesson: There's a tradeoff in everything, nothing is purely good, and nothing ever works the way you'd hope it would. Our job as software engineers is to find a satisfying path through these tradeoffs that lets us make steady progress.
In my current testing conundrum, I've decided that pragmatism beats purity. A simple "hack" lets me circumvent the issues, but it's not the purist's way.
Anyway, thanks for listening to my rant. Enjoy your weekend and Monday we're hopefully back to our regularly scheduled content :)
Training Wheels vs Balance Bikes
Here's another way to think about the pitfalls of "Vibe Coding" and relying too much on AI: Some assistive technologies are like training wheels, and others are like balance bikes.
As a millennial, I learned to bike the traditional way: tricycle -> bike with training wheels -> bike. Progress is relatively slow because training wheels do nothing to develop balance, so once they come off, you're back to zero.
My kids learned biking the new way: balance bike -> bike. A balance bike is a bike without pedals. You push it around like a walker and then learn to roll with it. This is a much faster way to learn biking because the crucial part is learning how to keep your balance, not how to push the pedals.
And what does this all have to do with AI, and AI-assisted coding in particular? Aimlessly throwing requests at the AI is like using training wheels. In the moment, it feels comfortable, but once the training wheels come off, you're left floundering.
In contrast, using AI to refine your thinking, validate your approach, generate ideas, etc., is similar to the balance bike approach: You acquire critical skills, so you don't bump into a skill ceiling. Let AI be a stepping stone, not a roadblock!
The Capability-Impact Gap
Writer and computer scientist Cal Newport pointed out something interesting in a recent episode of his podcast, Deep Questions:
On the one hand, AI's capabilities are evolving rapidly and frequently prove the naysayers wrong:
Oh, it can't do X
Two months later, it can indeed do X
On the other hand, AI's economic impact has been relatively muted so far.
Soon after ChatGPT was released, massive disruptions were predicted in every knowledge-work industry. With very few exceptions (Chegg, a company that basically lets students cheat on their homework, saw a 99% decline in its stock price), that just hasn't happened.
Why is that? Cal explains that the current dominant paradigm of AI usage, posting questions into a chat box, does not lend itself to such massive disruption, and I agree.
In essence, the capabilities of AI are probably good enough, and what needs to happen now is a painful and slow period of finding that elusive product-market fit for a true killer app.
One compelling near-term use case is using AI to augment a user's capabilities. Cal uses the example of Microsoft Excel: most casual users are not aware of its more powerful features, like lookups, pivot tables, and scripting. By conversing with a built-in AI, users can unlock these features more readily than by reading tomes of documentation (especially when they wouldn't even know what to look for, or how to translate a feature's dry specification into how it makes their work easier).
More generally speaking, thinking beyond the chat box paradigm and focusing on empowerment will be the way to go!
AI vs VAs
Here's a quick litmus test for your fantastic AI idea, if it involves outsourcing human labour to AI: Why hasn't that labour been outsourced to cheap overseas virtual assistants (VAs) yet?
Hiring cheap assistants from low-wage countries was a big deal in productivity circles and among entrepreneurs about a decade ago (think Tim Ferriss). They can handle admin tasks, content creation, and turning your blog post into a tweet: all sorts of things GenAI could do for you, and likely even cheaper and without hallucinating. Yet while VAs are a growing market, they haven't been adopted universally. There are failure modes around availability, miscommunication, and trust. Still, comparing and contrasting these approaches to "automating" onerous tasks is instructive.
Looking at your specific use case, there might be good reasons why an AI model would be a better fit, but it's not a given. A sweet spot for the AI would be:
A large volume of incoming requests that wouldn't be cost-effective for human assistants to handle, even from low-wage countries
The need to be available 24/7
Dealing with highly sensitive data
Sufficiently narrow tasks so that hallucinations aren't an issue
Straightforward application of specialist knowledge, such as code generation
If those conditions are met, you might be on to something. If not, you might want to do a bit more user/market research.
Vibe Coding: Programming by Coincidence
You may or may not have wondered why there were suddenly no emails for the last two weeks. It turns out that writing a daily email isn't quite feasible when you're down with pneumonia. Now that I'm a week into taking antibiotics, things are looking better.
So, anyway. Right now, it seems like everyone is talking about Vibe Coding:
Just ask the AI coding agent for what you want.
If the program spits out errors, feed those back to the agent.
Accept all changes suggested by the AI.
Rinse and repeat until it sort of works.
Programming by Coincidence
I couldn't help but remember a chapter from one of my favourite books, The Pragmatic Programmer, about Programming by Coincidence.
Quoting the intro paragraphs:
Suppose Fred is given a programming assignment. Fred types in some code, tries it, and it seems to work. Fred types in some more code, tries it, and it still seems to work. After several weeks of coding this way, the program suddenly stops working, and after hours of trying to fix it, he still doesn’t know why. Fred may well spend a significant amount of time chasing this piece of code around without ever being able to fix it. No matter what he does, it just doesn’t ever seem to work right.
Fred doesn’t know why the code is failing because he didn’t know why it worked in the first place. It seemed to work, given the limited “testing’’ that Fred did, but that was just a coincidence. Buoyed by false confidence, Fred charged ahead into oblivion. Now, most intelligent people may know someone like Fred, but we know better. We don’t rely on coincidences—do we?
What's true for manual coding is doubly true for AI-assisted coding. If you never understood why something worked, you're stuck the moment it goes awry.
It's okay to vibe code some one-off personal-use tool, but certainly not a moving target like a client-facing web app. Who cares if the AI gets you, say, 50% or 60% or even 80% of the way there if the resulting code is of such low quality that finishing the rest is near impossible?
Wide and Narrow Thinking
Why are image-generation models fantastic at generating photorealistic content but hilariously bad at text?
And why can ChatGPT figure out challenging coding tasks but not reliably tell you whether 9.11 or 9.9 is larger?
It comes down to narrow versus wide thinking, in the loosest sense. If you ask DALL·E or Midjourney for a Renaissance painting of a cat dressed like Louis XIV, there are many paths the AI can take to get there. But if you ask it to add a text label, the space of acceptable outputs is vastly smaller.
The same applies to mathematical and logical reasoning. The space of acceptable steps is much smaller, and we're expecting quite a lot from an AI if it has to reconcile this focused, narrow, discrete thinking with its more random, free-flowing nature.
Tools to the rescue
Specifically for language models, the most promising approach to fixing this is tool use (as we've seen in the AI Agent case; there's nothing mystical about it). The "wide thinking" LLM recognizes when it's dealing with a math problem and can then defer to a calculator or write Python code to solve it. ChatGPT already does that, of course.
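As a toy illustration of the hand-off, assume a hypothetical `ask_llm` helper that wraps whatever model you use. Instead of asking the model for the answer, ask it for code and let Python do the narrow part.

```python
# Toy illustration only: `ask_llm` is a hypothetical wrapper around an LLM call,
# and eval-ing model output like this is fine for a demo, not for production.
generated = ask_llm(
    "Return a single Python expression that evaluates to the larger of 9.11 and 9.9."
)
# The model might return the string "max(9.11, 9.9)".
result = eval(generated)   # the precise, narrow part is handled by Python, not the LLM
print(result)              # 9.9
```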
Over time, I would imagine more tools getting integrated with an LLM so that it can focus on what it's good at (wide thinking) and defer to tools for the things it's not good at, or where more precision and repeatability is desired (narrow thinking).
I could imagine a few such cases, like matching an AI code assistant with static analysis, automated refactorings and other goodies. Every industry and job will have its own set of narrow tools to enhance AI assistants' usefulness and reliability.
The monkey on a pedestal
I love this story about where to focus your efforts, as related by Astro Teller: "Tackle the monkey first." Here's the short version:
Imagine your boss gives you three months to figure out how to
Get a monkey to recite Shakespeare...
...while standing on a pedestal.
Then, after a month or two, the boss checks in on progress, and you say: "Things are going great! Look at this amazing pedestal we made out of genuine Italian marble! Look at the intricate carvings!"
That sounds hilarious, but there is a tendency for all of us to procrastinate on hairy, audacious, vague things because, well, they're hard. It's no fun making no progress. Working on the low-hanging fruit gives us a sense of forward motion. But all that effort will be wasted if the uncertain thing turns out to be impossible or more complicated than anticipated.
That's why, on a recent project, we consciously pushed all the easy things to the very end, even though we could have completed them quickly. We didn't want them to take bandwidth away from the uncertain parts. After all, who cares about the font sizes on your web app if it just doesn't work?
It's time to stop polishing the pedestal and start training the monkey.
On AI Snobs
The ongoing hype around generative AI has led to an influx of tech influencers and enthusiasts. This, in turn, has led to an influx of snobs and cynics who will shake their fists at those who dare claim AI expertise without advanced degrees and experience with statistical methods and "standard" machine learning.
These AI snobs will give you a long study plan of all the things you have to master before putting anything AI-related into your LinkedIn headline:
Linear algebra
Vector calculus
Statistics
Classical ML methods (support vector machines, logistic regression, k-means clustering)
Stochastic Gradient Descent and other optimization methods
and on and on.
I call nonsense. First, if it's all about foundations, why stop at the math parts? I'd like to demand that anyone using a computer first learn about the quantum properties of semiconductors. 😬
Second, the best way to achieve valuable outcomes is to take a top-down approach. Use whatever tools are available, and only if you encounter sharp edges will you spend the time and effort to go deeper. (Note: The courses on fast.ai are a masterclass in this principle.)
Concrete examples
No-code platforms for rigging together LLM workflows and agents don't require any deep ML expertise. If those fit the bill, use them, AI snobs be damned.
On the other hand, if you are contemplating an AI project with an uncharted course and an unknown approach, it's helpful if you can rely on someone who has extensive experience with different techniques.
Even if it's not hallucinating, it's hallucinating
You may have noticed that, occasionally, your trusty AI assistant makes stuff up. I've had GitHub Copilot invent code libraries that don't exist, for example. And then there was that case of the lawyer who leaned on ChatGPT to find precedents and found out the hard (i.e., embarrassing) way that those were made up, too.
Those are called hallucinations, and a lot of effort goes into reducing how often LLMs produce them.
However, at their very core, all LLMs hallucinate everything. Let me explain (and for more detail, I highly recommend Andrej Karpathy's Deep Dive into LLMs. It's a three-hour video, so watch it over a couple of lunch breaks...).
Before you can create a useful AI assistant like Claude or ChatGPT, you need a base model. That base model is a neural network trained to guess the next word (a token, actually, but let's keep it high-level) in the training dataset, which is basically the entire Internet.
For a given sequence of words, the base model returns a list of probabilities for possible follow-up words. These probabilities match the statistics of the training text. In short, we've got ourselves an internet text simulator. This base model hallucinates everything, all the time. All it does is answer the question, "How would a text that starts like this likely continue?"
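You can watch this internet text simulator at work with one of the small open base models. Here's a sketch using GPT-2 via the Hugging Face transformers library (assuming it's installed); all the model gives you is a probability for every possible next token.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

prompt = "The best way to learn machine learning is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits[0, -1]   # scores for the next token only
probs = torch.softmax(logits, dim=-1)

# The most likely continuations, according to the statistics of the training text.
top = torch.topk(probs, 5)
for p, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)])!r}: {p.item():.2%}")
```

Generation is nothing more than sampling one of these tokens, appending it to the prompt, and repeating.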
All the work that's been done on top of this base model is about clever tricks that turn an internet text simulator into something useful:
Post-training with hand-curated examples of what a good answer looks like so that instead of an internet text simulator, we get a helpful assistant simulator
Reinforcement learning where human (or AI) critics provide feedback on the answers
Adding examples to the training set where the AI assistant is allowed to (supposed to, really) say "I don't know"
Enhancing the model through tool use (e.g. internet searches)
It's all about putting sufficiently tight reins on the model to ensure that the most probable continuation of the underlying text is actually something useful. Even if it's all just made up.
This is useful to keep in mind when evaluating potential LLM use cases.
Keeping up via known unknowns
Happy Monday!
There's never a shortage of new things we have to learn about. New model, new framework, new benchmark, new tool. It's a delicate balance. If we never bother to keep up, we'll get left behind. If all we do is try to keep up, we'll never get anything done.
Here's something I've stumbled on that works for me. It's based on Donald Rumsfeld's much-ridiculed classification of things we know (or don't know).
What we (don't) know
Known knowns: That's just the stuff you know.
Known unknowns: Stuff you don't know. But at least you know that you don't know.
Unknown unknowns: Stuff you don't know. And you don't even know that you don't know.
With unknown unknowns, you don't even know what the question is, or that you could be asking it.
Making the unknown known
Loosely keeping up with recent developments means turning the unknown unknowns into known unknowns. The first step here is just to pay attention and keep your eyes open. You're most likely doing that already! At this level, the question is relatively shallow: "What even is XYZ?" What might prevent us from digging further, though, is the sense that we'd have to spend a lot of time to gain a deep understanding and that this isn't sustainable given the number of new things to explore.
But what if all we do is push a little bit deeper so that our questions are a bit more nuanced? This can easily be achieved by a bit of browsing and skimming:
For a new tool, idly browse the documentation.
For a framework or package, skim quickly through the tutorial.
For a research paper, skim the abstract and call it a day.
It's enough to come out of this experience with many new things you don't know. But at least you'll know that! It'll give your mind something to latch onto and, in the future, notice when it becomes relevant to what you're doing. Then, when you have validation that diving deeper will be helpful, you can spend the time and feel good about it.
When AI really wanted to sell me more power drills
Here's another short example of where an "obvious" application of AI doesn't lead to good business outcomes:
I once bought a nice Bosch power drill from Amazon. For a long time after, Amazon would relentlessly push more power drills:
"Here are some power drills you might want to buy. Check out our deals on power drills. Hey, have you seen these latest power drills?"
But given that I had just bought my new drill, yet another drill would be among the least likely things I'd buy!
A typical recommendation algorithm, in straightforward terms, works like this (a toy sketch in code follows the list):
Look at all the stuff user A has bought
Find other users whose purchase history is similar to user A's.
Identify things those users have bought that user A hasn't yet bought and recommend those.
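To make those three steps concrete, here's a toy, back-of-the-envelope version of user-based collaborative filtering. The purchase matrix and item names are made up for illustration.

```python
# A toy version of user-based collaborative filtering with a tiny
# purchase matrix (users x items, 1 = bought).
import numpy as np

items = ["drill", "drill_bits", "diy_book", "bandaids", "novel"]
purchases = np.array([
    [1, 1, 1, 0, 0],   # user 0
    [1, 1, 0, 1, 0],   # user 1
    [1, 0, 0, 0, 1],   # user A: just bought a drill
])

def recommend(user_idx, purchases, top_n=2):
    target = purchases[user_idx]
    # step 2: cosine similarity between the target user and everyone else
    norms = np.linalg.norm(purchases, axis=1) * np.linalg.norm(target)
    similarity = purchases @ target / np.where(norms == 0, 1, norms)
    similarity[user_idx] = 0                 # don't compare the user to themselves
    # step 3: score items by how often similar users bought them,
    # excluding items the user already owns
    scores = similarity @ purchases
    scores[target == 1] = -np.inf
    return [items[i] for i in np.argsort(scores)[::-1][:top_n]]

print(recommend(2, purchases))   # what similar users bought that user A hasn't
```

Note that step 3 only excludes the exact items user A already bought; if the users most similar to a drill buyer are mostly other drill buyers, other drill models are precisely what can rise to the top.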
Of course, Amazon's recommendation system is more complex than the plain collaborative filtering I've described, and these days they do apply more sophisticated content-based recommendations as well. Still, this misfire shows that even sophisticated AI can get things wrong.
This type of recommendation works great for books, CDs, and movies: categories with a wide range of items that can be grouped by genre or other matters of taste. In that regard, it mimics human recommendations: if I love historical fiction and you love historical fiction, we can share book recommendations, and additional purchases are likely.
However, collaborative filtering fails for categories like power tools or consumer electronics, where purchases are one-time and driven by need. If I buy a Bosch drill, I don't want another drill. I want things to help me get the most out of the one I just bought.
What Amazon should have been recommending:
Books on DIY projects
Bandaids 😬
A set of drill bits (square ones, like we use here in Canada)
Instead of spamming me with drills, Amazon could have predicted what I needed next, turning a one-time sale into a series of useful purchases.
And to tie it all back to making AI work for you: Think all the way through to the intended outcome and don’t lazily stop short.
Agents demystified
Don't you hate it when someone tries to make you feel stupid by wrapping a simple concept in layers of manufactured complexity?
Case in point these days: AI Agents. Oh, Large Language Models (LLMs) are so 2024. But 2025 is the year of agents. Agentic AI this, Agentic AI that. You better sign up for all these Gartner webinars or, better yet, pay a million bucks to Accenture to build you a custom proof-of-concept project (true story). Don't miss out, or they'll be coming for you and your business!
Here, I lay out the simple logical steps that take you from an LLM to an Agent:
Step 0: The LLM as a chat bot
That's the ChatGPT and co we all know. We type in our query and get an answer back.
Step 1: Integration: Calling the LLM from another program
My to-do app has a feature that takes a large task and uses generative AI to break it down into smaller steps. Behind the scenes, the to-do app just posts a well-crafted question to the LLM and uses the answer to provide that functionality.
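Behind such a feature there's often nothing more than a function like the following. A minimal sketch assuming the OpenAI Python client; the model name, prompt, and parsing are illustrative.

```python
# A minimal sketch of "integration": the app wraps one well-crafted prompt.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def break_down_task(task: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Break the user's task into 3-7 concrete subtasks, one per line."},
            {"role": "user", "content": task},
        ],
    )
    # Turn the model's line-by-line answer into a plain list of subtasks.
    return [line.strip("- ").strip()
            for line in response.choices[0].message.content.splitlines()
            if line.strip()]

print(break_down_task("Plan a team offsite for 12 people"))
```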
Step 2: Workflows
Like Step 1, but with more complexity in the control logic and more than one call to the LLM involved. Examples include (a sketch of the first pattern follows the list):
Using one call to an LLM with one set of instructions to create some output and using another call with another set of instructions to critique or improve that output
Posing the same question to multiple LLMs, then aggregating their responses
Asking one LLM to break down a larger query into multiple steps, then feeding those steps to specialist LLMs
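Here's a sketch of the first pattern (generate, then critique, then improve). `ask_llm(system, user)` is a hypothetical helper wrapping a single LLM call, e.g. a thin layer over the client call from Step 1.

```python
# The control flow is hard-coded in ordinary Python; that's what makes this
# a workflow rather than an agent. `ask_llm` is a hypothetical one-call helper.
def generate_then_improve(topic: str) -> str:
    draft = ask_llm(
        system="You write short, punchy marketing copy.",
        user=f"Write three sentences about {topic}.",
    )
    critique = ask_llm(
        system="You are a ruthless editor. List concrete weaknesses only.",
        user=draft,
    )
    return ask_llm(
        system="Rewrite the draft so that it addresses every point of the critique.",
        user=f"Draft:\n{draft}\n\nCritique:\n{critique}",
    )
```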
Step 3: Agents
In the previous step, we hard-coded the control logic. In this step, we hand the reins over to the LLM. In addition, we provide access to tools:
An extra layer around the LLM checks its output for special commands where the LLM says that it wants to do something, like search the web, check something in your database, or run some code.
Part of the prompt involves telling the LLM what tools it has access to and how it should invoke them.
Then you just let it do its thing (a bare-bones version of that loop is sketched below).
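Stripped of framework ceremony, the loop can look like this. `ask_llm` is again a hypothetical one-call helper, and the tool registry is a toy; real implementations add validation, retries, and guardrails.

```python
import json

TOOLS = {
    "search_web": lambda query: f"(top results for {query!r})",  # stand-in for a real search
    "run_sql": lambda sql: f"(rows returned by {sql!r})",        # stand-in for a real database
}

SYSTEM_PROMPT = """You can call tools by replying with JSON:
{"tool": "search_web" or "run_sql", "input": "..."}
When you have the final answer, reply with {"tool": "none", "answer": "..."}."""

def run_agent(user_request: str, max_steps: int = 5) -> str:
    transcript = user_request
    for _ in range(max_steps):
        # The LLM decides what to do next; real code would validate this JSON.
        reply = json.loads(ask_llm(system=SYSTEM_PROMPT, user=transcript))
        if reply["tool"] == "none":
            return reply["answer"]                      # the model says it's done
        result = TOOLS[reply["tool"]](reply["input"])   # execute the requested tool
        transcript += f"\n[{reply['tool']} returned: {result}]"
    return "Gave up after too many steps."
```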
And that's the whole secret. Let's end with a quote from the excellent article by Anthropic on Building effective agents, because it fits so well with our general theme of simplicity:
Consistently, the most successful implementations weren't using complex frameworks or specialized libraries. Instead, they were building with simple, composable patterns.
AI content and fake hair
“All toupées look fake; I’ve never seen one that I couldn’t tell was fake.”
This is the Toupée Fallacy, and I was reminded of it by the endless stream of social media posts that want to teach you how to spot AI-generated content:
Watch out for fancy punctuation—like the em-dash. Who even writes like this?
Suspicious words: Let's not delve into those.
🚀 Too 👏 many 💠 emojis.
However, the biggest tell of bad AI-generated content is that it's so bland and formulaic.
But if someone were to use AI to
brainstorm
critique
point out gaps
suggest additional examples
iteratively edit their writing
and then incorporated that feedback in their own words, you could never tell that they used AI. It would be like trying to figure out whether I used a calculator to answer 12 x 12 = 144.
Shouldn't we instead focus on quality? The main problem with lazily generated AI content is that it is bland, generic, and tepid. But so is lots of 100% free-range human-created content.
If a text is genuinely inspiring and insightful, and the author crafted it with the help of AI, all the power to them. Isn't a significant promise of generative AI that it elevates our creative powers?
I'd be curious to hear if you've had particularly bad (or awesome) experiences with AI-generated content.
Disfluency
In last week's post about the Underpants Gnomes, one of the example inputs was
"Connect your documents and data to an AI chatbot so you can talk to your data."
For example, you could ask your AI, "Hey, what were last quarter's sales numbers in Europe?" and get a nice report without manual clicking and digging. This seems like an obvious use case for Generative AI. Why wouldn't you want to make your data more accessible?
Because easier isn't always better. In Smarter, Faster, Better, Charles Duhigg writes about a concept he calls disfluency:
Sometimes, deep insights can only develop if you have to wrestle with the data. He gives the example of teachers who had access to all sorts of fancy dashboards about student performance. It wasn't until they ditched all that for handwritten notes on stacks of index cards that they made important discoveries about how to help each student.
To use AI to its fullest potential, we'll want to dig deeper. Think beyond making access to raw information easier and instead use AI to generate insight:
Don't just dig up numbers for me that I could easily dig up myself. Tell me what I'm missing!
Don't just summarize a document. Tell me what's actionable.
Don't just list customer complaints. Identify trends and suggest fixes to improve satisfaction.
Don't just extract key metrics from reports. Flag emerging patterns that signal future opportunities or risks.
Don't just compile meeting notes. Highlight unresolved issues and recommend next steps for resolution.
Figuring out how to connect the raw information to the actual insight is the crucial Phase 2 in using AI productively.
What’s your Phase 2?
Season 2, Episode 17 of the animated TV series South Park introduces us to the Underpants Gnomes. At night, they sneak into children's bedrooms and steal their underpants. Eventually, they are confronted by the kids, who want to know what the gnomes do with all those underpants.
Their explanation has become pop culture history:
Phase 1 - Collect underpants
Phase 2 - ???
Phase 3 - Profit
Those silly gnomes. But it serves us well to remember that the temptation to gloss over the details, that pesky Phase 2, is all too strong. We assume that a certain input activity should eventually lead to profit (or other good things).
Replace Phase 1 with any of the following, and it's all too common that our mind leaps straight to Phase 3.
Build a shiny mobile app.
Hire the "best" people.
Publish a thought leadership article.
Invest in the latest AI tools.
Launch a social media campaign.
Raise venture capital.
Connect your documents and data to an AI chatbot so you can talk to your data.
What's your Phase 2?
To be fair, the examples above make more sense than collecting underpants. There is probably a way to profit with a great team, a great product, lots of resources, etc. It's just that you still have to do the strategic work to connect those inputs to the desired outcome.
Ask yourself: What's my Phase 2?
