That’s Not an Agent
There are two places where I've seen people misuse the term "agent". One of them is benign, the other not so much.
First, the benign version. When I talk with potential clients, I find they're genuinely curious about AI but aren't necessarily familiar with all the fine distinctions. So they have an idea for where AI might help them, and they call that solution an "agent". That's not the place to barge in with a "Well, actually...". What matters more is their intent and the problem they're facing, as well as what "solved" would look like for that problem. Once we design and present a solution, we'll explain that the final product may or may not end up being an agent. What matters is that the problem gets solved.
Now for the not-so-nice version: Folks who sell something software-related, knowing full well that it's not actually an agent, but they call it that to tap into hype and fear. I've seen simple automations ("Post a message to the team chat when a user files a bug") described as "our customer support agent". Ouch. If it's not a large language model (or multiple, at that) embedded in a system with a feedback loop, autonomously invoking tools to achieve an outcome, it's not an agent.
Why does it matter there, and not in a client conversation? Because if we're selling a service and positioning ourselves as experts, we have to be precise in our communications. We have to stand for what we advertise. You get what we say you get, and it won't be dressed up in colourful, hyped-up language.
Needless to say, if you're looking for someone to blow smoke and sound fancy, you can go somewhere else. But if you're after someone who'll solve challenging problems with what’s appropriate instead of what’s hip with the tech influencers, we're right here.
Don’t Distrust The Simple Approach
Phew, it's been a while. Summer, start of school, travels. Anyway.
I've recently come across multiple situations where simple is best, but gets overlooked in favour of something complex:
I've had discussions about diet and exercise. Simple (eat less, move more) is best, but people don't trust it, so they have a 12-step exercise routine and a complicated schedule of exactly which food group is verboten at what time of day.
I've had finance folks reach out about investing. Simple (index funds matched to your risk tolerance) is best. Still, people don't trust it, so they want a complicated, actively managed portfolio that gets adjusted every time the US president sends a tweet.
I've chatted about strategy and consulting with a friend. For exploratory work and new initiatives, the best approach is to just start and iterate on the feedback. But, of course, that just seems too simple, so instead we ask a big consulting company to make a deck with 60 slides, complete with SWOT analysis, 2x2 matrices, stakeholder personas, ROI projections, a RACI chart, a change management framework, risk register, industry benchmarking, and an executive summary that uses 'synergy' unironically.
We're all smart people here, so we have domain experience that's genuinely complex. That can bias us to distrust simple solutions. What we should adopt is a mindset that distrusts complexity and isn't ashamed to select and defend the simple approach.
Do You Have Experience With…?
It's a running gag among programmers that job descriptions, often created without input from technical team members, will ask for five years of experience in a technology that hasn't been around for even three years yet. And recently, nowhere has the fallacy in that been more apparent than with generative AI. In a sense, we're all newbies here. By the time you've become proficient in working with one particular model, the next one gets released. If we take this narrow, "HR needs to check the boxes"-style view of skill, then everybody is a bloody beginner at this.
This applies not just to individual job seekers, but to consultants and their companies as well. How many years of GenAI-productization experience does Deloitte have? Accenture? AICE Labs, for that matter? In every case the answer is, "as long as those things have been around, which isn't really that long".
Explicit experience, measured in years, with exactly one piece of technology or its subdomains is a poor measure of the likelihood that the hire will get you what you need. What changes is the new and shiny tool they get to wield. What stays the same is their methodical approach (or lack thereof...) to the underlying challenges. At the end of the day, it's engineering: Solving complex challenges with the best tools available under multiple competing constraints. Once you've got a knack for that, the actual tools become much more fluid, and checking how much time a practitioner has racked up in tool A versus tool B becomes much less relevant.
For instance, take someone with twenty years of programming experience but no prior JavaScript knowledge, who has deeply internalized the principles of good coding, and give them a one-hour overview of the language. Then pit them against a programming novice who spent three months in an intensive JavaScript bootcamp. I'd bet money the veteran will write better JavaScript.
With AI, we certainly have lots of new kids on the block who poured hours into prompting and vibe coding tutorials. They'll all be outperformed by those with solid engineering principles.
A quick personal note, it's the end-of-summer, almost-back-to-school chaos, so while I try, just for myself, to keep posting regularly, it's a bit more challenging than usual. :)
LLM Recommendations
Here's another area where large language models can do a fantastic job: Content recommendations.
It's no secret that the recommendation algorithms of YouTube, Spotify, TikTok, etc have come under scrutiny. One issue is that they're blindly optimized to drive engagement, which has been shown to lead viewers down rabbit holes of increasingly "extreme" content. TikTok, for example, is frighteningly good at sensing if content pushes your buttons, and before you know it, your feed is nothing but that, dialled up to 11 by the content creators vying for relevance on the platform.
But even if the algorithms were mostly harmless in their recommendations, they're also exceptionally bland. I have yet to make mind-blowing discoveries purely via the "you might also like" feature. These features are, for the most part, just presenting you with the average stuff people like you would listen to or watch. That inevitably pulls everything toward mediocrity and the lowest common denominator.
Recently, I tried just asking ChatGPT. I told it about the artists and styles I generally like, and that I needed music for a road trip. We went back and forth: it put forward albums to listen to, and I told it what I liked and didn't like about each one. We ended up in pretty unexpected places, and I discovered several new bands that I'll keep following.
Now, music picking isn't the most world-changing use case, but the implications are larger. Traditional recommender algorithms have no real understanding of the content, which limits their usefulness and leads you either down rabbit holes or in circles. With a better understanding of the underlying subject, a recommender can unearth gems that are just what you wanted.
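For the curious, here's what that back-and-forth could look like if you scripted it instead of using the ChatGPT app. This is only a minimal sketch assuming the openai Python client; the model name and prompts are placeholders, not what I actually used:

```python
# A minimal sketch of a conversational recommender loop.
# Assumptions: the openai Python client is installed and OPENAI_API_KEY is set;
# "gpt-4o" and the prompts below are placeholders.
from openai import OpenAI

client = OpenAI()

messages = [
    {
        "role": "system",
        "content": (
            "You are a music recommender. Suggest one album at a time, "
            "briefly explain why, and adjust based on my feedback."
        ),
    },
    {"role": "user", "content": "I like artists A, B, and C. I need music for a road trip."},
]

for _ in range(5):  # a few rounds of back-and-forth
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    suggestion = response.choices[0].message.content
    print(suggestion)
    messages.append({"role": "assistant", "content": suggestion})
    feedback = input("What did you like or dislike about it? ")
    messages.append({"role": "user", "content": feedback})
```

The point isn't the code itself; it's that the feedback loop (suggest, react, refine) is what surfaces the unexpected finds, and that loop is trivial to build.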
AI Coding Sweet Spots
It's no secret that I, along with many other seasoned software engineers, dislike the unbridled "vibe" approach to coding with AI. I have, however, found it immensely useful in several cases. Here are two:
Fixing small issues in open source projects
While a recent study of AI use on open source projects found that, on the whole, it does not actually make developers faster, those findings are extremely context-dependent. Here's an example from my own experience: An extension to my code editor of choice (VS Code, currently) was lacking a feature that I really wanted. In the past, I would have filed a feature request with the tool's maintainer and hoped for the best. But today, I just downloaded the code, asked Claude Code to implement the feature, tested it (obviously!) and submitted the changes for the maintainer to review.
AI, in this case, rapidly sped up the part where you slowly familiarize yourself with an existing codebase to find the place(s) where a change must be made. Given that I have no intention to become a permanent maintainer of that tool, that upfront time investment would be prohibitive if it weren't for AI.
Understanding and Improving Legacy Code
This is an area I'm pretty excited about. The world is replete with legacy code (and one can argue that vibe coding is only going to add to that pile). First, a definition. Legacy code isn't code that's old. Instead, according to the guy who wrote the book on legacy code (Working Effectively with Legacy Code by Michael Feathers):
Legacy code is code without tests
Simple as that. Why is that? Without tests, you can't safely change or extend the code, and you lack true documentation of the code's intent. That's when you're in a case of "well, this works for now, but we'd better not touch it, because Bob wrote it five years ago, he has since left the company, and we don't know how it works".
In that book, Feathers describes strategies for slowly turning legacy code into well-tested code. These steps are somewhat mechanical and deterministic but not quite as straightforward as automated refactorings. That makes them excellent for AI coding agents:
AI can scan a large codebase and figure out dependencies that span many files. If you ask the right questions, it'll do the tedious work of "look in this file, find a function being used, go look for the file where that function is defined, find another rabbit hole to go down", and so on.
AI can write, or at least suggest, useful tests. Once you have tests, you can move on to more aggressive refactorings and code improvements.
The existing code and its behaviour are a narrow source of truth that keeps the AI honest: If all you do is add a test, that new test must pass. If all you do is a structural change (aka refactoring), the existing tests must pass.
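To make that last point concrete, the kind of test you'd start with here is a characterization test in the Feathers sense: one that pins down what the code currently does rather than what it should do. A minimal sketch, with a hypothetical calculate_invoice_total function and made-up captured values:

```python
# A minimal sketch of a characterization test: we don't assert what the legacy
# function *should* do, only what it currently does, so that later structural
# changes (by a human or an AI agent) must preserve today's behaviour.
# `calculate_invoice_total`, the `billing` module, and the captured values
# are hypothetical.
import pytest

from billing import calculate_invoice_total  # hypothetical legacy module


@pytest.mark.parametrize(
    "line_items, customer_tier, expected",
    [
        ([(2, 9.99), (1, 100.00)], "standard", 119.98),  # captured from current output
        ([(1, 100.00)], "gold", 90.00),                   # observed: gold tier gets 10% off
        ([], "standard", 0.0),                            # edge case: empty invoice
    ],
)
def test_characterizes_current_behaviour(line_items, customer_tier, expected):
    assert calculate_invoice_total(line_items, customer_tier) == pytest.approx(expected)
```

Once a net of tests like this is in place, you can let the AI loose on structural changes with much less anxiety: if the tests still pass, today's behaviour survived.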
I'd be curious to see how far one could push autonomous agents to overhaul a legacy codebase; there are probably too many idiosyncrasies in each case to allow for a fully automated one-click solution. Still, it's a great opportunity: the cost of continuing to tolerate a legacy codebase might finally be larger than the cost of tackling it (and all without a full from-scratch rewrite!)
This was a longer and more technical post than usual, but I wanted to provide some positive counterweight to my bashing of vibe coding. The tools are what they are, and it's up to us to use them to good effect.
Fast Fashion, Fast Software
OpenAI's CEO, Sam Altman, recently tweeted (X'ed? 🤷♂️) that we were about to enter the "Fast Fashion" era of software. The implication: With tools like Lovable, Bolt, Replit and co, people will just quickly vibe-code a piece of software, only to discard it soon thereafter.
And I'm sitting here and don't want to live in that world. It's bad enough that clothes these days barely last beyond the third wash. But what does that even mean for software? With clothes, the idea is that it is so ridiculously cheap to make them that there's no point caring for them and maintaining them, especially when they'll go out of fashion in a second. But with software? Unless we're talking about disposable entertainment like the umpteenth clone of Candy Crush, these things hold a lot of our data, which we'd then have to migrate. I don't want my to-do app to disintegrate once I add one too many items, or my CRM to implode when I want to import a contact with a spécial chäracter.
Vibe Coding was never meant for things that seriously see the light of day. It's neat to be able to throw together quick one-offs when they're truly one-offs, like a one-time script for a one-time maintenance task.
In my view, the integration of AI on the user-facing side makes it even more important that the backend has rock-solid engineering.
The other part where the analogy breaks down is in terms of quantity. People passionate about fashion own countless garments, for each mood and occasion. Fine. But would you install twenty different to-do apps and alternate between them? That makes no sense.
So instead of fast fashion, we should look at, of course, slow fashion. Buy less, but better. Same with software. Build it well so you don't have to throw it away.
Outrunning a Bear
A friend and reader of the newsletter asked me whether I'm using AI much for the writing of these daily posts. As a general rule, I don't (maybe a topic for another post as to why) but today I'm feeling cheeky (and a bit off thanks to a stomach bug), so what follows below is 100% generated by Claude Opus 4.1 with no edits on my part. Enjoy this one, inspired by our recent travels in the Canadian Rockies.
Why Your AI Strategy Doesn't Need to Outrun the Bear
Remember that old joke about bear encounters? "You don't have to outrun the bear, just your slowest friend."
After dodging imaginary bears on vacation last week (everyone had bear spray, nobody saw bears), I realized this perfectly captures what I'm seeing with AI adoption right now.
Most companies are paralyzed, waiting for the "perfect" AI strategy. They're trying to outrun the bear.
But here's the thing: You don't need to be OpenAI. You don't need a $10M AI transformation. You just need to be moving faster than your competition—who, let's be honest, are probably still debating whether ChatGPT is a fad.
The bar is surprisingly low:
While they're forming committees, you're running small experiments
While they're waiting for perfect, you're shipping good enough
While they're protecting old processes, you're questioning why those processes exist
The bear isn't AI replacing your business. The bear is standing still while everyone else starts jogging.
And unlike the wilderness, in business you can actually help your "friends" run faster too. Rising tide, boats, etc.
P.S. - If you're the slowest friend reading this, hit reply. Let's get you moving. No bear spray required.
Just a few closing thoughts:
It's not bad, but it's cliche. "Here's the thing" and all
The title is too long and clunky
It tries hard to be friendly and ethical, so in the end, it's about helping your slow friend run faster instead of letting the bear eat them
Only a single em-dash. Take that, ChatGPT!
That final analogy about "the bear is standing still while everyone else starts jogging" makes no sense
Feel free to hit reply and let me know your own thoughts about where it lands and where it doesn't.
Almost There But Not Quite
Looking at the steep mountain faces on my recent vacation to the Canadian Rockies, I got a strange thought: If you could teleport yourself to anywhere along the mountain to a (relatively) secure spot, but you didn't have any rock climbing experience, you'd be just as stuck thirty meters off the ground as you'd be thirty meters from the top. For your stated goal of making it up the mountain alive, it's actually irrelevant how close to the top you'd be. Yet the providers of the teleportation device would tout that theirs gets you 90% of the way there while the competition only gets you 85% there. Irrelevant. It only becomes relevant if you're a seasoned rock climber with the right equipment and with the route ahead within your capabilities. Then, yes, it totally makes a difference how close to the top you can get teleported.
Such are the dangers of vibe coding and other vibe disciplines. If the AI gets you an app that's almost there, and you don't have that engineering background, then it doesn't matter how "almost" it is. You'll still need to determine what's missing and how to fill those gaps. Better to focus on your area of expertise and use AI to accelerate you on that journey. In short, don't use it to shore up your weaknesses. Use it to double down on your strengths.
Exponential vs S-Curve
GPT-5 has been out for a few days now, and apart from marketing hype, the response has been a resounding "meh". Some say it's great for their particular use case, others say it's mediocre at best.
In the endless cycle of which company's AI model is currently the best, we can get the impression that huge strides are being made. The whole "accelerationist" movement tells us that we can expect exponential growth of model capabilities, just because the early steps (GPT-1 to 2 to 3 and 4) were so monumental. They'd tell us that, before we know it, the AI would design better versions of itself and then we'd really be off to the races, towards the so-called singularity, super-human intelligence and, depending on your mood, annihilation or abundance for all.
Well, just like many other promises of endless growth, this one doesn't quite seem to pan out. Instead, progress levels off into incremental gains and diminishing returns. Just as medicine's tremendous strides in the 1900s made it look like life expectancy was on an exponential growth curve, yet it didn't keep growing indefinitely (don't tell Peter Thiel or Ray Kurzweil I said that, though), there are natural limits and constraints.
So, what does that mean? It means it's crunch time for engineers. We can't just sit around with the same old systems and same old prompts and just wait for better models. Models will keep getting better, but not at a rate that excuses laziness. Now's the time to tweak the prompts, the model selection, the vector databases and the scaffolding. Now's also the time to be less forgiving of products and tools that seemed to bank too hard on vast improvements in bare LLM capabilities. If it's nowhere near useful right now, don't let them tell you to "just wait until GPT-6 hits".
It's okay for bare LLM progress to slow down. It's not like in classical software engineering we write low-performance software and then say, "oh well, we'll just wait for the next version of PostgreSQL to make our bad queries execute faster". (Though there was that glorious time in the 90s where CPU speed doubled every time you blinked...)
Long story short, GPT-5 was underwhelming, but that fact itself is also underwhelming. Let's just get back to the actual work of engineering working solutions to real problems.
The Eagle’s Call
Back from vacation, here's a nature-inspired newsletter edition with a fun nature fact: The sound many movies use for a bald eagle's cry is actually the cry of the red-tailed hawk (Red-tailed hawk - Wikipedia). You know, the high-pitched, long and echo-y scream. Real eagle calls sound more like dolphin chatter (Bald eagle - Wikipedia).
Because movies use the wrong call so much, those of us who don't live in an area with abundant bald eagles end up thinking that that's their real sound. Now, the consequences of this misunderstanding are benign, but it points to a danger when we're presented with plausible but wrong information, a danger that AI has a chance to amplify. The analogy goes like this:
I have no idea what a bald eagle sounds like because I've never heard one in the wild.
I come across a movie with the wrong eagle sound.
It sounds plausible enough. Powerful and piercing. I now think that that's what eagles sound like.
With AI:
I have no idea how to write good marketing copy because I'm not a marketing specialist.
I ask AI to write me some good marketing copy.
To me, it sounds good enough. Punchy and engaging. I now think AI makes marketers and copywriters obsolete.
Just because the AI result seems good to me doesn't mean it's actually good (unless it is meant purely for my own consumption). It takes an expert at a given craft to judge whether the result is truly good. Which reiterates the point: In the hands of an expert, AI can be a great productivity boon. In the hands of a hack, it can be a dangerous delusion.
The Review Trap
Consider this scenario: An AI tool generates some output. Maybe it's code. Perhaps it's a marketing communication. A human reviews it before it gets sent out. This works and makes sense in scenarios where reviewing is faster than creating. Luckily, there are many such scenarios. Writing a good joke can take a comedian hours or days. Judging whether a joke is funny happens instantly. The same dynamics apply to writing and many creative endeavours. In such a scenario, the AI does not need to get it 100% right in one shot. The review will be brief, and if the output is 90% accurate, the subsequent changes will be rapid.
But there are scenarios where this dynamic doesn't apply, and that's reviews themselves. To judge whether an AI-generated review is correct, you have to do the whole review yourself. Otherwise, how can you be sure the AI didn't miss something critical? So in this scenario, where you want to use AI-generated reviews (or summaries, or style-transformations, ...), you're not better off unless you have complete trust in the tool's accuracy. That immediately sets a much higher bar for AI tools used in these scenarios.
In such a scenario, your current best bet is to use the AI review as a jumping-off point for your own thorough review: Let's say you use GitHub's Copilot Review functionality to provide an initial review of suggested code changes. Great. Take a look at those and then check the code thoroughly yourself, looking specifically for subtle things the AI might have missed. Just don't trust them blindly. And when thinking about AI use-cases in your org, pay attention to which scenario you're dealing with, generation or review, and don't fall into the trap of automating something that needs to be double-checked by a human anyway.
P.S.: It's a beautiful summer here, and I'm off on another camping trip, returning in mid-August with more AI-related posts.
GRR Martin Does Not Need a Faster Typewriter
Fans of the Game of Thrones book series have been waiting over a decade now for the next installment. Distractions like the TV show certainly didn't help in the writing process. Either way, I don't know what exactly the author, G.R.R. Martin, needs to finally finish "Winds of Winter", but it's definitely not a faster typewriter. The bottlenecks show up elsewhere, and the speed of typing is just small noise in the grand scheme.
When considering AI and its potential to accelerate your organization, keep this in mind: It's a comprehensive system you're trying to optimize, not just a single component. In many traditional software organizations, for example, any changes to production code undergo multiple stages of review and quality gates. If you can suddenly generate code at double the speed, that won't matter if you keep the speed of reviews the same. Conversely, you can probably speed up your delivery by a significant factor if you optimize these handoffs and gates first, before implementing an advanced AI solution.
I'm starting to see a pattern: AI brings outsized benefits to organizations that, even before AI, were agile, nimble, and well-organized, whereas AI will struggle and spin its wheels in an organization that's dysfunctional, brittle, and messy. My hope is that the promised benefits from AI will serve as a sufficient wake-up call for organizations to clean up their act.
Onboarding for AI
In an old episode of his podcast, productivity expert Cal Newport tackles a listener question: "We're drowning in email at work and everything's a chaotic mess. Should we bring in an assistant to help with that?"
Makes sense on the surface. There's too much to do, and it prevents you from focusing on your core work, so why not bring in help? But Cal cautions against it: If your organization is beset by chaos and lacks clear processes, throwing another person into the mix does not help. Instead, he suggests first getting clarity: What needs to happen and how? Write your Standard Operating Procedures (SOPs). Once you've got those, you can reassess: it might be that standardizing your processes has brought enough sanity to the organization that further assistance is no longer required. If that's not the case, and there's still too much work you'd rather not do yourself, then at least a new helping hand has everything they need for maximum success.
In short, don't just hire someone, hand them the keys to your inbox and say, "good luck, kid."
And now to AI
The same holds true, even more so, for AI tools. The AI equivalent of letting a poor unsuspecting soul loose on your inbox and docs would be to just hook up ChatGPT or Claude to your Gmail and Google Drive and claim victory. Maybe that's enough for a few simple workflow enhancements. But chances are you'll get much better results if you identify and map out the process you want to delegate to the AI. The numerous benefits include:
You'll be forced to articulate what "good" looks like for each of those tasks, and lay out which information sources must be consulted.
You can set up concrete evals that let you iterate and experiment with different AI tools, prompts, parameters, etc., to get hard data on what works and what doesn't (see the sketch after this list).
You can identify and isolate those parts of the workflow that are deterministic and can therefore be more effectively handled by non-AI software solutions, and rely on the slower and more expensive AI tools for the parts that only they can perform.
The AI lacks human-level discernment about what information truly matters in a given context, so throwing all the data at it, all the time, can degrade performance compared to providing only the context that's required for the task.
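As promised above, here's a minimal sketch of what a concrete eval could look like. The task (drafting a reply to a support ticket), the checks, and the call_model placeholder are all hypothetical and would be adapted to your own workflow:

```python
# A minimal sketch of an eval harness for one delegated task ("draft a reply to
# a customer ticket"). Everything here is a placeholder: the cases, the checks,
# and `call_model`, which would wrap whatever AI tool, prompt, and parameters
# you're currently experimenting with.

CASES = [
    {
        "ticket": "My invoice from March is missing.",
        "must_mention": ["invoice", "march"],
        "must_not_mention": ["refund"],  # policy: don't promise refunds unprompted
    },
    {
        "ticket": "How do I reset my password?",
        "must_mention": ["reset", "password"],
        "must_not_mention": [],
    },
]


def call_model(ticket: str) -> str:
    """Placeholder for the AI tool / prompt / parameters under test."""
    raise NotImplementedError


def run_evals() -> float:
    passed = 0
    for case in CASES:
        draft = call_model(case["ticket"]).lower()
        ok = all(word in draft for word in case["must_mention"]) and not any(
            word in draft for word in case["must_not_mention"]
        )
        passed += ok
    return passed / len(CASES)  # track this score as you vary prompts, models, etc.
```

Even a crude score like this gives you something to compare when you swap models or rewrite the prompt, which beats "it felt better this time".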
Just as a human assistant needs proper onboarding documentation to find success in their role, so do AI assistants need help and guidance to do their best.
AI Adoption for the Skeptical
A contact in the mining industry recently shared something fascinating with me: "I've got this client, they've always been anti-tech, but now they feel they need to do something with AI."
This is happening everywhere. Industries that spent decades perfecting their healthy skepticism of technology vendors are suddenly worried they're missing out. And honestly? That skepticism might be their biggest advantage.
Here's what I've learned: The companies that succeed with AI aren't the ones making headlines. They're not spending millions on "digital transformation initiatives" or hiring armies of consultants to build proof-of-concepts that gather dust.
Instead, they're asking better questions:
Where do our engineers waste hours searching through old reports?
Which compliance tasks eat up days but follow predictable patterns?
What knowledge is walking out the door when our veterans retire?
The $75K Pilot Beats the $2M Transformation
Big consultancies will tell you AI requires fundamental transformation. New systems! New processes! New everything! (New invoices!)
But generative AI actually works pretty well with your existing mess. Those thousands of PDFs collecting digital dust? That equipment manual from 1987? The handwritten inspection notes? Modern AI can (probably) read all of it.
You don't need perfect data. You need a specific problem.
Start Where It Doesn't Hurt (Much)
The best entry point? Read-only applications. Let AI search and summarize before it creates. Think of it as hiring a really fast intern who's read every document your company ever produced:
"What were the soil conditions in Block 7 in 2019?"
"Show me all safety incidents involving conveyor belts"
"Which permits mentioned groundwater contamination?"
Nobody's job is threatened. Nothing breaks if it's wrong (if you actually read what the AI digs up, of course). But suddenly, answers that took hours take seconds.
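Under the hood, such a read-only assistant doesn't have to be fancy. Here's a minimal sketch, assuming documents have already been extracted to plain text, with naive keyword retrieval standing in for proper search and the openai Python client with a placeholder model name:

```python
# A minimal sketch of a read-only "search and summarize" assistant.
# Assumptions: documents are already extracted to plain text, retrieval is a
# naive keyword match (real systems would use proper search or embeddings),
# the openai Python client is installed, and "gpt-4o" is a placeholder model.
from openai import OpenAI

client = OpenAI()

documents = {
    "inspection_2019_block7.txt": "...",   # placeholder document texts
    "safety_incidents_2021.txt": "...",
}


def answer(question: str) -> str:
    # Naive retrieval: keep documents that share words with the question.
    words = set(question.lower().split())
    relevant = [
        f"--- {name} ---\n{text}"
        for name, text in documents.items()
        if words & set(text.lower().split())
    ]
    context = "\n\n".join(relevant) or "No matching documents found."

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "Answer only from the provided documents and cite "
                           "the document names you used.",
            },
            {"role": "user", "content": f"Documents:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content


# print(answer("What were the soil conditions in Block 7 in 2019?"))
```

The key design choice is that it only reads and cites; nothing gets written back into your systems, which is exactly why it's a safe first rung.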
The Trust Ladder
Once people see AI finding information faster than their best document wizard, you can climb the ladder:
Search (no risk, high value)
Summarize (low risk, saves time)
Draft (medium risk, human reviews everything)
Integrate (only after proving value)
Most valuable applications never need to go past step 3. And that's fine.
Why Traditional Industries Have an Edge
That "anti-tech" instinct? It's actually perfect for AI adoption. You won't fall for hype. You'll demand ROI. You'll ask uncomfortable questions about what happens when it hallucinates.
Your skepticism forces vendors to prove value early on, not only after a big transformation for $2M. Your caution means you'll start small, fail fast, and scale what works.
The mining executive who said, "We need to do something with AI"? They're right. But that something should be specific, measurable, and boringly practical. Leave the moonshots to companies with venture capital to burn.
What's the most annoying document search in your organization? That's where your AI journey should start.
AI, the Genie
AI tools are often called "assistants". That word conjures up someone eminently capable, who will do their best to fulfil the requests made of them in letter and spirit. This is aspirational. Kent Beck, legendary programmer and co-creator of the original Agile manifesto, has picked a more apt metaphor: that of a genie.
In mythology, genies have to grant the wishes of their captors. But they're more than happy to grant them in letter only, and cause much mischief within those parameters. In coding tasks, that manifests itself in obviously nonsensical behaviour like, "I can make the tests pass by removing them!"
Armed with this metaphor, we can make better choices in using these tools: How do we instruct the genie so that it can't mess things up for us? How can we design a system where unleashing a genie behind the scenes ultimately creates something helpful and aligned? How can we build mechanisms of trust and verification into the system?
The genie won't go back in the bottle, so we might as well make the most of the wishes it grants us.
AI Slows You Down—Or does it?
As reported in a recent study, Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity - METR, developers using AI were, on average, 19% slower than those not using it. What's more, this observed result was contrary to how the developers felt about AI tools, both during and after the study. In short, developers felt 20% more productive but were actually 19% less so.
Take that, vibe coders! Robots 0, humans 1.
Or...?
I want to unpack two parts here. One is the discrepancy between experienced versus actual productivity, the other a larger view on what such results mean.
Motion is better than idleness
Once, coming back from a weekend trip, Google Maps showed me a lengthy slowdown on the typical highway route. It cheerfully told me that this was, still, the best route, despite a 30-minute delay. With tired kids in the backseat, I did not want to be stuck in a jam, though, so I opted for a route that took 45 minutes longer but allowed me to keep moving. My hypothesis is that, with an AI tool at your fingertips, you're never at a standstill. Who cares that you have to prompt and re-prompt it to get to your desired outcome if at least it feels like code is being written?
The takeaway here: Feelings matter, since we want to keep our tool-users happy, but in the end, they must be cross-checked against real data.
Adoption and the initial drop
You do things a certain way, you get good at it, and eventually you reach a peak. To then ascend to another, even higher peak, you first have to come down, at least partway, from your current peak. This effect is near-universal. Whether it's your golf swing, tennis serve, chess opening repertoire, programming style/language, any change that will ultimately make you better results in an initial dip. Given how early we still are with AI coding tools, I'm not surprised that we observe initially decreased performance. As the tools and our knowledge of how to use them improve, we should come out ahead. (The study, incidentally, shows one outlier who was vastly more efficient when using AI. Maybe that individual was a bit ahead of the general learning curve.)
The takeaway here: Whatever change you introduce, with the hope of making things better, will initially make things worse. Prepare yourself and your team for it and have clear expectations about how to assess progress.
How to Kill Your AI Initiative Before it Starts
In last week's podcast interview, one point that came up was the idea that "market validation" applies to internal AI initiatives just as much as it applies to outward-facing product development. You, as a business leader, might have a great idea for tools and enhancements, but that will fall flat if the (internal) market does not adopt it.
Sure, you can mandate from the top that all employees must use a given tool. However, if your people perceive too much friction or if you fail to gain buy-in and present a clear ROI (return on time and effort invested), adoption will be perfunctory at best and actively undermined at worst.
I've heard concrete examples of this: AI-assisted work scheduling software was rolled out with considerable fanfare and promised significant savings, only to be undermined by employees who didn't want to lose out on the now-eliminated overtime bonus payments. Another example, from older times, is the outright revolt of manufacturing workers when automation threatened to take their jobs. Your economic reasons notwithstanding, people can't be expected to embrace something they perceive as an existential threat: If the CEO of a company publicly talks about introducing AI to replace his workforce, don't be surprised if the workforce doesn't comply; this is the equivalent of, "We're firing you and will replace you with someone cheaper, but can you please make sure to onboard them properly so they know everything you know about this critical area?"
Both for technical and ethical reasons, I'm much more enthusiastic about models of working where AI enhances what humans are capable of. If you build something for humans, it pays to build it with humans.
The Role of Agility in Getting AI Investments Right
The amazing Yuval Yeret had me on his podcast "Scaling w/ Agility". We discuss agile, AI, and their combination, exploring where both can go off the rails.
From Yuval's episode description:
In this episode of the Scaling With Agility podcast, host Yuval Yeret welcomes Clemens Adolphs, co-founder of AIce Labs, a company specializing in helping organizations successfully implement AI initiatives. Their conversation dives into the intersection of AI and agility, exploring how to avoid the all-too-common proof-of-concept trap and instead focus on delivering genuine outcomes. From internal market validation to adapting agile methods to the unique context of AI projects, this discussion is essential listening for leaders who want to scale AI initiatives without falling into process theater.
Notable Quotes
“The proof of concept is where AI projects often go to die. You need to design for internal market validation early on.” — Clemens Adolphs
“Agile is not a one-size-fits-all recipe—especially when you’re dealing with the inherent uncertainty of AI.” — Yuval Yeret
“Metrics are useful, but if they’re not tied to actual adoption or impact, you’re just performing success, not achieving it.” — Clemens Adolphs
Check out the full episode here or wherever you get your podcasts.
The AI - Human Feedback Loop
You might have already seen Andrej Karpathy's recent keynote at the 2025 AI Startup School (if not, catch it here), but what stood out for me was his strong emphasis on partial autonomy, where the AI does some of the work while the human provides verification and guidance.
I've written before about the types of tasks that are good for AI automation, with Guided Intelligence (a task that is highly dependent on specific context and very open-ended) being the toughest of them.
High-context, open-ended tasks are precisely those where, with the current state-of-the-art models, we can hope at best for this partial autonomy. To make such a symbiotic relationship between human and AI work, Karpathy points out that the cycle between prompting the AI and getting outputs must be short.
A bad example is an AI coding agent that, after a single prompt, drops 1000 lines of code on you to review and verify. That creates lots of friction and slows you down immensely. It also gives the agent ample time to run at top speed in the wrong direction.
A good example is the way Claude Code splits a prompt into several small tasks, asks if those make sense, then presents small code snippets to accept or refine. Instead of firing and forgetting, we are rapid-firing.
So, when designing an AI tool that's meant to assist human workers in a complex task, don't ask "how can we get everything right in one shot?" Don't even ask "how can we do as much work in one shot as possible and then iterate?" Instead, ask "how can we create a tool that affords a low-friction fast feedback cycle"?
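In code, that question often translates into structuring the tool as a loop of small, human-approved steps rather than one big generation. A minimal sketch, where propose_next_step and execute are hypothetical stand-ins for your AI backend and your domain-specific actions:

```python
# A minimal sketch of a low-friction feedback loop: the AI proposes one small
# step at a time and the human accepts, refines, or stops before anything runs.
# `propose_next_step` and `execute` are hypothetical placeholders.

def propose_next_step(goal: str, history: list[str], feedback: str | None) -> str:
    """Ask the AI for the next small step toward the goal (placeholder)."""
    raise NotImplementedError


def execute(step: str) -> str:
    """Carry out an approved step and return the result (placeholder)."""
    raise NotImplementedError


def assist(goal: str) -> None:
    history: list[str] = []
    feedback: str | None = None
    while True:
        step = propose_next_step(goal, history, feedback)
        answer = input(f"Proposed step: {step}\n[a]ccept / [r]efine / [q]uit: ")
        if answer == "a":
            history.append(execute(step))
            feedback = None
        elif answer == "r":
            feedback = input("What should change? ")
        else:
            break
```

The structure is the point: every proposal is small enough to judge at a glance, and the human can redirect before the agent sprints off in the wrong direction.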
Is anyone NOT looking into AI these days?
Turns out, yes. According to an opinion piece in the Communications of the ACM, a survey among business executives turned up some shocking facts. I won't rehash the whole article here, just a few juicy quotes:
Most companies are struggling to articulate AI plans
This is an extraordinary—and dangerous—"wait-and-see" mind-set.
Despite all this publicity [about GenAI], only 20% of companies defined AI initiatives as high priority, and more than 47% defined them as insufficient or "unknown".
77% said that [they] had only looked at GenAI briefly or not at all.
To those of us surrounded by AI through our work, this feels baffling. In my immediate bubble, it seems like everybody and their dog is either building a custom GPT or vibe-coding a business. Yet this survey reveals a vast number of companies that look at this technology and shrug their shoulders.
Now, I don't want to pull the FOMO (Fear of Missing Out) card. Beware the hype. By all means, do things sensibly and measuredly. But do things. And don't treat it like a box-checking exercise where you bring in an expensive consultancy to run a one-day workshop on the usage of GenAI for your employees and call it a day.
I've assumed it as a given that every company would want to do something with AI, i.e., articulate an AI plan and explore where it might speed up critical workflows, take low-value drudgery away from high-paid specialists, finally make use of the vast amounts of messy data collected by the business, or any of the other emerging use-cases. It turns out that many companies haven't yet reached that level. So maybe it is time to play the FOMO card after all? Because these companies are missing out.
So here's a humble ask: If you know someone who runs a business, or a department inside a business, and they fit the profile of this post—haven't looked into AI, or don't know where to start—kindly connect them with me because I'd love to ask them questions to refine my understanding of how business leaders are thinking about AI.
