The XY Problem
Here's a frustrating exchange that novice programmers commonly run into on the popular Q&A site StackOverflow:
Q: "Hey, I'm trying to do Y, but I'm running into these issues and can't figure out how to do it."
A: "Are you sure you want to do Y? Y sounds weird. You can't do Y anyway. Anyway, here are some things you might try to get Y working."
... many iterations later ...
Q: "None of these work. :-/"
The issue is that the asker actually wants to do X. They have decided that Y is a required step towards X, so they ask about Y without revealing that, ultimately, they want to achieve X.
An example inspired by [this XKCD comic] would be:
Q: "How can I properly embed Youtube videos in a Microsoft Word document?"
The real question is: "How can I share YouTube videos via email?"
XY AI
Of course, we're all smart here and avoid such ridiculous situations. But when we jump too quickly to an imagined solution, we get stuck trying to solve Y when there is a much simpler way to solve X. Especially with AI, where things are not as clear-cut as in standard programming and minor tweaks can lead to large differences in outcome, we can easily fall prey to this kind of thinking:
Do you really need to fine-tune a large language model on custom data, or would it be enough to write a better prompt for the base model and simply provide the relevant custom data when querying it?
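For illustration, here is a minimal sketch of the prompting route: the custom data goes into the prompt at query time instead of into the model's weights. The `build_prompt` helper, the `query_model` stand-in, and the example documents are all made up for this sketch.

```python
# A minimal sketch: instead of fine-tuning, put the relevant custom data
# directly into the prompt. `query_model` is a stand-in for whatever client
# you use to call the base model; the documents below are invented.

def build_prompt(question: str, documents: list[str]) -> str:
    """Assemble a prompt that gives the base model the custom context it needs."""
    context = "\n\n".join(documents)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Refunds are issued to the original payment method within 5 business days.",
]
prompt = build_prompt("How long do customers have to request a refund?", docs)
# answer = query_model(prompt)  # call the base model here -- no fine-tuning involved
```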
Do you even need AI? Maybe the answer to "How do I stop this regression model from over-fitting my data?" is to use some plain, rules-based programming instead.
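As a toy illustration of the "no AI needed" case, here is a sketch where an explicit business rule replaces a fitted model entirely. The shipping-fee scenario and the thresholds are invented for this example.

```python
# A minimal sketch: if the relationship is already known, a plain rule can
# replace the over-fitting regression model entirely. The fee structure
# below is made up for illustration.

def shipping_fee(order_total: float, weight_kg: float) -> float:
    """Compute a shipping fee from explicit business rules, no model required."""
    if order_total >= 100.0:  # free shipping above a known threshold
        return 0.0
    base = 4.99
    per_kg = 1.50
    return base + per_kg * weight_kg

print(shipping_fee(order_total=42.0, weight_kg=2.0))  # 7.99
```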
Any question about loading a large language model and hosting it yourself, distributing it over multiple GPUs for parallel processing, and so on, might become moot if you just use a managed service that does all of that for you.
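For comparison, here is roughly what the managed-service route can look like, assuming an OpenAI-compatible hosted API; the model name and prompt are placeholders, so swap in whichever provider and model you actually use.

```python
# A minimal sketch, assuming an OpenAI-compatible managed service: the provider
# handles model loading, GPU placement, and scaling, so none of that appears in
# your code. Model name and prompt are placeholders.

from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # whichever hosted model your provider offers
    messages=[{"role": "user", "content": "Summarize the XY problem in one sentence."}],
)
print(response.choices[0].message.content)
```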
Good AI engineering means stepping back and taking the time to explore and evaluate multiple possible Ys for the current X. That's how you avoid getting stuck with a suboptimal approach and a load of frustration.