Using ChatGPT, Claude, Microsoft 365 Copilot, and Gemini for Workspace at work has moved past the question of "should we adopt them?" to "how do we operate them safely?" Meanwhile, ever since Samsung's 2023 incident, in which internal source code was entered into ChatGPT, the combination of "ban it for now" and "but staff are using it anyway" has driven shadow-IT adoption, which actually increases leakage risk.
This article — written for IT, compliance, and executive readers — covers the categories of leakage that occur with generative AI, the differences in each vendor's official policies, an internal guideline template, operational and technical controls, and the first response when an incident occurs. Citations are limited to each vendor's official terms and primary news reporting.
1. Four Categories of Leakage Risk with Generative AI
Generative-AI risks fall into four broad categories. Defining what your company is actually trying to protect, before writing the rules, keeps the policy coherent.
1-1. Leakage Through Input (Prompts)
The most frequent pattern is staff pasting business information directly into a generative AI.
- Pasting customer personal data for summarization or translation
- Inputting unannounced financial, HR, or M&A information into a slide-deck draft
- Inputting source code (including proprietary logic or auth keys) for debugging
- Asking AI to summarize counterparty contracts or NDA-covered documents
In April 2023, an incident was reported in Samsung Electronics' semiconductor division in which employees entered internal source code and meeting transcripts into ChatGPT (per primary reporting in Bloomberg, Forbes, and others). In May of the same year, Samsung was reported to have notified employees of a temporary ban on generative-AI services on company devices.
1-2. Leakage via Training and Model Improvement
This is the risk that input data is used by the provider to improve the model. As a general pattern, free and consumer plans may be used for training, while business, API, and enterprise plans are not used by default — but the details of opt-out and short-term retention for abuse detection differ by service (we cover the specifics by vendor in the next section).
1-3. Leakage Through Integrations and Plugins
Even when the AI itself is contained, leakage can occur via integrations with external services.
- Custom GPTs or Action features sending business data to external APIs
- Browser-extension "AI assistants" silently sending the work screen to the cloud
- Meeting-notes SaaS forwarding meeting audio to a third-party transcription API
- Free PDF-summarization or image-generation sites receiving uploads of confidential documents
Integration partners' policies are hard to audit at a glance, and free services tend to have the loosest terms. The realistic approach in your guidelines is a whitelist model: list the approved tools and prohibit anything outside the list.
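In practice, the whitelist can also be enforced at the network edge (an egress proxy or DNS filter) rather than relying on policy documents alone. The following is a minimal sketch of such a check; the domain set is illustrative, not a recommendation of specific tools.

```python
# Minimal sketch of whitelist enforcement: only allow outbound requests
# whose target domain appears on the approved-tools list.
from urllib.parse import urlparse

# Illustrative approved-domain set; populate from your own Approved Tools List.
APPROVED_DOMAINS = {
    "api.openai.com",        # e.g. ChatGPT Enterprise / API
    "api.anthropic.com",     # e.g. Claude Team / API
    "copilot.microsoft.com", # e.g. Microsoft 365 Copilot
}

def is_approved(url: str) -> bool:
    """Return True only if the request targets an approved AI service."""
    host = urlparse(url).hostname or ""
    return host in APPROVED_DOMAINS

print(is_approved("https://api.openai.com/v1/chat/completions"))  # True
print(is_approved("https://free-pdf-summarizer.example/upload"))  # False
```

The same logic can be expressed as proxy rules or firewall policy; the point is that "outside the list" fails closed, so a new free tool is blocked until IT reviews it.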
1-4. Human and Operational Risks
- Accounts of departed employees remain active and can still access business data
- Shared accounts make it impossible to trace who entered what
- Personal-account logins leave business data sitting in personal cloud storage
- Output (including hallucinations) is used in external documents without verification
Human risks are best contained through training and account management (SSO / IdP integration).
2. Data Handling Across Major Services: Free vs. Business Plans
Each vendor's policy is updated frequently. The summaries below reflect the official terms and privacy-center pages as of May 2026, but always check the latest official page before making operational decisions.
2-1. OpenAI ChatGPT / API
Per OpenAI's privacy policy and Enterprise privacy policy, handling differs as follows:
| Plan | Use of input for model training | Retention characteristics |
|---|---|---|
| ChatGPT Free / Plus (consumer) | May be used for training by default; can be disabled in settings (Data Controls). | Chat history retained by default; 30-day temporary retention for abuse detection. |
| ChatGPT Team | Not used for training by default. | Admin-configurable retention. |
| ChatGPT Enterprise / Edu | Not used for training (stated in the contract). | SOC 2 Type 2, SAML SSO, custom retention. |
| OpenAI API (standard) | Not used for training (API inputs and outputs). | Retained up to 30 days for abuse detection, then deleted (can be shortened with a Zero Data Retention contract). |
"On consumer plans, you can opt out of training in settings" is technically correct, but since the toggle depends on each individual employee, allowing consumer plans for business work is a high-risk configuration. For business, Team or above is the baseline; for systems integration, the API is standard.
2-2. Anthropic Claude
Per Anthropic's privacy policy and commercial terms of service, Claude is handled as follows:
| Plan | Use of input for model training | Notes |
|---|---|---|
| Claude Free / Pro / Max (consumer) | Used for training by default starting October 8, 2025 (opt-out available; turning it off in privacy settings excludes data from model improvement). | Retention up to 5 years when opted in; standard 30-day retention when opted out. |
| Claude Team / Enterprise | Not used for training (commercial terms). | SSO, audit logging, custom retention policies. |
| Anthropic API / Bedrock, etc. | Not used for training. | Retained for a fixed period for abuse detection (see terms). |
Note: In August 2025, Anthropic revised its consumer terms of service. From October 8, 2025, conversations and coding sessions on Claude Free / Pro / Max are used for model improvement by default. If consumer plans touch any business information, opt out in the privacy settings, or design the rollout around a move to Team / Enterprise / API, none of which use inputs for training by default. For business use, Team or above is also the practical answer for centralized chat-log management and SSO-based access control.
2-3. Microsoft 365 Copilot
Per Microsoft's official documentation "Data, Privacy, and Security for Microsoft 365 Copilot," Microsoft 365 Copilot is designed as follows:
- It uses tenant data (email, Teams, SharePoint, OneDrive, etc.), but neither inputs nor outputs are used to train the foundation models.
- Data is processed within the "service boundary" of the Microsoft 365 tenant.
- It only references information that the user already has permission to view (no permission overshoot).
- It complies with regional data requirements such as the EU Data Boundary.
A major advantage of Microsoft 365 Copilot is that it inherits your existing Microsoft 365 security and compliance foundation (Purview, Entra ID, sensitivity labels, and so on). Note that the separately branded free "Copilot" (formerly Bing Chat) has different training and data-handling policies, so do not conflate the two for business use.
2-4. Google Gemini for Workspace
Per Google's official help page "Gemini for Google Workspace and your data," paid Gemini for Workspace (organizational accounts) works as follows:
- Prompts and responses targeting Workspace data (Gmail, Drive, Docs, etc.) are not used to train the foundation models.
- It respects the existing access permissions inside the tenant.
- Data is processed within Google Cloud's enterprise-grade security boundary.
By contrast, the Gemini app on a personal Google account (formerly Bard) may, by default, have human reviewers inspect conversations and use them for model improvement, making it unsuitable for business use. Organizational Gemini and consumer Gemini are easy to confuse because the URLs and icons look similar; treat the distinction as critical.
2-5. A Practical Summary on Plan Selection
The details vary, but the three principles for business use are:
- Consumer and free plans should be off-limits. Settings rely on each employee, and governance breaks down.
- Business and enterprise plans default to "not used for training." Adopt them together with SSO, audit logs, and retention settings.
- API access, Microsoft 365 Copilot, and Gemini for Workspace are designed to process data within your own tenant's security boundary. If you want to handle internal knowledge safely, lean in this direction.
For the broader rollout picture, see also A Generative-AI Adoption Guide for SMEs.
3. Internal Guideline Template (10 Articles You Can Copy)
Long guidelines don't get read. The template Mihata proposes to clients fits on one or two A4 pages, in 10 articles. Use the following as a base and trim or extend to match your situation.
Article 1 (Purpose)
The purpose of these guidelines is to reduce the risks of information leakage, personal-data protection, intellectual property, and copyright when generative-AI services are used for business purposes at the company, and to promote safe and effective use.
Article 2 (Scope)
These guidelines apply to all situations in which the company's officers and all employees (including contractors and dispatched staff) use generative-AI services for business purposes.
Article 3 (Approved Services — Whitelist Model)
Generative-AI services that may be used for business purposes are limited to those listed in the attached "Approved Tools List." Use of any service outside the list for business purposes requires prior approval from the IT department.
Article 4 (Information That Must Not Be Input)
The following information may not be input into generative-AI services (including by copy-paste, file upload, or screenshot):
- Personal information as defined in Article 2 of Japan's Act on the Protection of Personal Information (APPI)
- "My Number" identifiers (under Japan's Act on the Use of Numbers to Identify a Specific Individual in the Administrative Procedures)
- Trade secrets as defined in Article 2, Paragraph 6 of Japan's Unfair Competition Prevention Act
- Confidential information of customers and counterparties (including NDA-covered material)
- Unannounced financial, HR, or M&A information
- Authentication credentials (passwords, API keys, tokens)
- Sections of source code containing proprietary logic or authentication credentials
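Some teams back Article 4 with a pre-submission redaction step that strips obvious credentials and personal data before anything is pasted into an AI tool. A minimal sketch, assuming simple regex patterns (real secret scanners use broader rule sets and entropy checks, so treat this as a starting point, not a guarantee):

```python
import re

# Illustrative redaction patterns; extend to match your own key formats.
PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9]{20,}"), "[REDACTED_API_KEY]"),    # OpenAI-style keys
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[REDACTED_EMAIL]"),  # email addresses
    (re.compile(r"(?i)(password|token)\s*=\s*\S+"), r"\1=[REDACTED]"),
]

def redact(text: str) -> str:
    """Replace obvious credentials and personal data before AI submission."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text

print(redact("api_key = sk-abcdefghijklmnopqrstuvwx; contact: dev@example.com"))
# → api_key = [REDACTED_API_KEY]; contact: [REDACTED_EMAIL]
```

A regex pass does not catch everything (proprietary logic, for example, cannot be pattern-matched), so redaction complements the prohibition list; it does not replace it.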
Article 5 (Account Management)
Business use is permitted only with company-issued business accounts (with SSO integration). Business use via personal accounts, and personal use via company accounts, are both prohibited. Upon resignation or transfer, the IT department shall promptly disable the account.
Article 6 (Duty to Verify Output)
Generative-AI output may not be used as-is for external documents, public content, or as the basis for decisions. Factual elements (numbers, proper nouns, laws and regulations, citations) must be verified by the user against primary sources before use.
Article 7 (Copyright and IP Considerations)
When inputting third-party copyrighted works, the requirements for permitted citation (clear attribution, minimum necessary scope) must be met. Use of generated images, code, and text must comply with the terms of service of each service and the company's IP policy.
Article 8 (Logs and Audit)
The company may collect and retain access logs and operation logs for business-use generative-AI services. By using those services, users are deemed to consent to such collection.
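Where AI access goes through a company-built gateway, Article 8 can be implemented as a thin audit wrapper around each call. A minimal sketch, assuming a hypothetical `call_ai_service` callable standing in for your actual client library; note it logs prompt size rather than content, to avoid duplicating sensitive data into the logs:

```python
import datetime
import json
import logging

# Audit logger for generative-AI usage (Article 8).
logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("genai.audit")

def audited_prompt(user_id: str, service: str, prompt: str, call_ai_service):
    """Record who sent what to which service, then forward the prompt."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user": user_id,
        "service": service,
        "prompt_chars": len(prompt),  # size only, not content
    }
    audit_log.info(json.dumps(record))
    return call_ai_service(prompt)

# Usage with a stub service in place of a real API client:
reply = audited_prompt("u123", "chatgpt-enterprise",
                       "Summarize this memo.", lambda p: "ok")
```

Whether to log full prompt content is a policy decision: full content helps incident investigation but creates a second store of sensitive data that itself needs retention and access controls.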
Article 9 (Training and Awareness)
The company shall provide training on these guidelines and related risks to new hires and to all employees at least once a year.