The Review Trap

Consider this scenario: An AI tool generates some output. Maybe it's code. Perhaps it's a marketing communication. A human reviews it before it goes out. This works and makes sense in scenarios where reviewing is faster than creating, and luckily, there are many such scenarios. Writing a good joke can take a comedian hours or days; judging whether a joke is funny happens instantly. The same dynamic applies to writing and many other creative endeavours. In these cases, the AI does not need to get it 100% right in one shot. The review will be brief, and if the output is 90% accurate, the remaining changes will be quick.

But there are scenarios where this dynamic doesn't apply, and reviews themselves are one of them. To judge whether an AI-generated review is correct, you have to do the whole review yourself. Otherwise, how can you be sure the AI didn't miss something critical? So in this scenario, where you want to use AI-generated reviews (or summaries, or style transformations, ...), you're no better off unless you have complete trust in the tool's accuracy. That immediately sets a much higher bar for AI tools used in these scenarios.

In such a scenario, your current best bet is to use the AI review as a jumping-off point for your own thorough review: Let's say you use GitHub's Copilot Review functionality to provide an initial review of suggested code changes. Great. Take a look at its comments, then check the code thoroughly yourself, looking specifically for subtle things the AI might have missed. Just don't trust its findings blindly. And when thinking about AI use cases in your org, pay attention to which scenario you're dealing with, generation or review, and don't fall into the trap of automating something that needs to be double-checked by a human anyway.

P.S.: It's a beautiful summer here, and I'm off on another camping trip, returning in mid-August with more AI-related posts.
