Are Your Business Secrets Safe With AI?


Had a call with a founder today where I asked my standard question: How are you using AI these days?

The answer was that they'd use it here and there for research and for finding academic papers relevant to their work, but nothing beyond that, because they were worried about leaking sensitive company information to the outside world. That's good business sense. You don't want ChatGPT to leak your company's strategy to your competitors.

The flip side is that AI could help the founder with some of the more tedious tasks, like collating information from scattered sources into a nice board update slide deck. So I thought I'd use this opportunity for a quick overview of data security among the various generative AI providers.

The Concern

Large language models (LLMs) need to be trained on loads of text. Once all the publicly available text out in the wild has been vacuumed up, another big source of text is the very conversations we're having with the AI chatbots. For that reason, LLM providers would love to train their models on these conversations. If they do, it's then possible that sensitive information from these chats gets spat out when someone else asks the AI just the right question.

The Lay of the Land

Your strategy, or any other information that's better kept secret, can leak to competitors in one of three ways:

  1. Training data contamination. As described above, if your conversations enter the training set of an LLM, they could in principle leak to competitors asking the right question.

  2. Human review. On consumer tiers, platform employees may read your conversations for safety and quality purposes.

  3. Account compromise and platform bugs. Same as with any other SaaS or cloud product.

The Good News

If you are on a team/business or higher (i.e., enterprise) plan, none of the mainstream LLM providers (OpenAI's ChatGPT, Anthropic's Claude, Microsoft's Copilot, Google's Gemini) train on your data. These higher-tier plans also severely restrict how the provider's employees can access your data.

It's clearest-cut with Gemini in Google Workspace: If your business already trusts Google with board decks on Google Drive and strategy discussions over Google Mail, the same trust can be extended to Gemini.

A Quick Checklist

|                   | Free Tier                | Consumer Paid (Plus/Pro/Advanced) | Team/Business | Enterprise/API  |
| Personal/Private  | ⚠️ Only with training OFF | ⚠️ Only with training OFF          |               |                 |
| Business critical |                          |                                   |               |                 |
| Legally protected |                          |                                   |               | ✅ with BAA/DPA |

Hopefully that clears up some concerns. They are legitimate, but they aren't insurmountable.
