How to Actually Use AI for Research

Yesterday I wrote about why AI gets things confidently wrong. Today, the practical follow-up: if AI is fundamentally a text prediction engine rather than a knowledge retrieval system, how do you actually use it for research without getting burned?

A physicist friend of mine put it well: he asks AI for references, and it "tends to hallucinate references, or infer content from paper abstracts and book chapter headings that doesn't actually exist." He's tried asking for "non-hallucinated references without making any inferences," which helps somewhat, but not enough.

His instinct is exactly right. He's trying to constrain the output space, which is the single most important principle for getting useful results from an LLM. But he's applying it to a task that's structurally wrong for the tool.

Ask an LLM to find you real references, and here's what you get: The LLM knows what a bibliography looks like. It knows roughly what topics your paper covers. So it'll generate something that looks like a bibliography and sounds relevant, because that's what plausible text looks like in that context. The fact that "Smith et al., 2019" doesn't exist is a detail the prediction engine doesn't track.

So what does work?

Use AI where hallucination can't hurt you. The best AI research tasks are ones where you'd recognize a wrong answer. "Explain the intuition behind renormalization group flow" is a good prompt for a physicist, not because the AI will be perfectly accurate, but because the physicist can spot where it goes wrong and still benefit from the parts it gets right. "List all papers published on X in the last five years" is a terrible prompt, because you can't verify the output without doing the exact work you were trying to avoid.

Feed it, don't ask it. Instead of asking AI to retrieve information, give it information and ask it to work with what you've provided. Paste in a paper you've already read and ask for a summary, a critique, or connections to another concept. The model is much better at synthesizing material you supply than conjuring material from its training data.
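The "feed it, don't ask it" pattern amounts to building your prompt around material you supply, with an explicit instruction not to reach beyond it. A minimal sketch (the `build_prompt` helper is hypothetical, not part of any library):

```python
def build_prompt(paper_text: str, task: str) -> str:
    """Wrap source material you've already read in an explicit instruction,
    so the model synthesizes what you gave it instead of conjuring content."""
    return (
        f"Using ONLY the text between the markers below, {task}. "
        "Do not draw on outside sources.\n"
        "--- BEGIN PAPER ---\n"
        f"{paper_text}\n"
        "--- END PAPER ---"
    )

prompt = build_prompt(
    paper_text="(paste the full text of a paper you have already read)",
    task="summarize the main argument in three sentences",
)
```

No instruction can fully stop a model from inventing things, but anchoring it to supplied text shrinks the space of plausible completions toward what's actually on the page.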

Keep the tasks small and specific. "Help me think through the implications of X for Y" works better than "Tell me everything about Z." The more open-ended the question, the more room the model has to wander into plausible-but-wrong territory. Give it guardrails.

Treat it as a thinking partner, not a search engine. AI is excellent at helping you articulate vague ideas, explore conceptual connections, and pressure-test your reasoning. These are tasks where the AI's tendency to produce plausible completions actually helps, because you're not looking for ground truth, you're looking for intellectual scaffolding.

Use the right tool for retrieval jobs. For actual literature search, use Google Scholar, Semantic Scholar, or arXiv: all of these retrieve real documents from real databases. Use AI afterward: to help you read faster, compare approaches across papers, or draft the "related work" section of your own paper with references you've verified. (Bonus tip: Create skills for Claude, or your LLM of choice, to teach the LLM how to look for papers on these platforms.)
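To make the contrast concrete, here's a sketch of querying a real retrieval service instead of asking an LLM for references. The endpoint is arXiv's documented public API; the search terms are placeholders:

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def arxiv_query_url(terms: str, max_results: int = 10) -> str:
    """Build an arXiv API query URL; fetching it returns an Atom XML feed
    where every entry is an actual arXiv record, not a plausible guess."""
    params = {
        "search_query": f"all:{terms}",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"

url = arxiv_query_url("renormalization group flow")
# Fetch with urllib.request.urlopen(url) and parse the feed entries.
```

Every result this returns exists, because the database can only serve documents it holds. That guarantee is exactly what a prediction engine can't give you.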

The pattern here is simple: AI is strongest when your expertise is the quality filter. If you know the field, AI accelerates your thinking. If you don't, it accelerates your confusion.

That might sound like a limitation, and it is. But it's also the right way to think about a tool that's genuinely useful, as long as you stop asking it to be something it's not.
