# How we protect privacy

Here’s a practical way to configure raia so customer data stays inside the guardrails of major privacy laws (GDPR/UK GDPR, CCPA/CPRA), and your AI Agents don’t accidentally leak or over-retain personal data.

## 1) Identity & access (who can see/do what)

* Enforce SSO + MFA for all raia admins and builders.
* Turn on **agent-level RBAC**: for each Agent, grant the minimum roles and restrict which users/teams can invoke it.
* Create **least-privilege data scopes** per Agent (e.g., Support Agent → read-only to tickets; Sales Agent → CRM “leads only”).\
  \
  Why: These align with GDPR’s data-minimisation & integrity/confidentiality principles and UK ICO guidance on proportionality and access control. ([ICO](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-principles/a-guide-to-the-data-protection-principles/?utm_source=chatgpt.com))
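
The RBAC and data-scope bullets above can be sketched as a deny-by-default policy check. This is a minimal illustration, not raia's configuration format: `AgentPolicy`, `can_invoke`, `can_access`, and the team and scope names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical per-Agent policy; not a raia API object."""
    name: str
    allowed_teams: frozenset[str]
    # Maps a data source to the operations the Agent may perform on it.
    data_scopes: dict[str, frozenset[str]] = field(default_factory=dict)


def can_invoke(policy: AgentPolicy, user_teams: set[str]) -> bool:
    """A user may invoke the Agent only if they belong to an allowed team."""
    return bool(policy.allowed_teams & user_teams)


def can_access(policy: AgentPolicy, source: str, operation: str) -> bool:
    """Deny by default: unknown sources or operations are refused."""
    return operation in policy.data_scopes.get(source, frozenset())


# Example from the text: Support Agent gets read-only access to tickets.
support_agent = AgentPolicy(
    name="support",
    allowed_teams=frozenset({"support"}),
    data_scopes={"tickets": frozenset({"read"})},
)
```

Anything not listed in `data_scopes` is refused, so adding a new data source never widens an Agent's access by accident.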

## 2) Data ingestion & indexing (what goes into the model context)

* **Field/record filters at the connector**: exclude sensitive fields (SSN, bank, health) before ingest.
* **PII redaction pipeline** (pre-vectorization): mask emails, phone numbers, national IDs; keep a reversible token only if absolutely required for the use-case.
* **Purpose tags** on data sources (e.g., “Support-answering only”) and bind Agents to allowed purposes.
* **Read-through vs. copy**: prefer real-time “read-through” access to systems of record over copying full datasets.\
  \
  Why: Enforces purpose limitation & minimisation under GDPR Art. 5. ([GDPR](https://gdpr-info.eu/art-5-gdpr/?utm_source=chatgpt.com), [ICO](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-principles/a-guide-to-the-data-protection-principles/?utm_source=chatgpt.com))
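
As a rough sketch of the pre-vectorization redaction step, the snippet below masks emails and phone numbers with stable tokens and, only when a `vault` is supplied, records the token-to-value mapping so the redaction is reversible. The regular expressions are deliberately simple assumptions for illustration; a production pipeline would need a dedicated PII detector. The function names and token format are hypothetical.

```python
import hashlib
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Simplified pattern for illustration; real phone formats vary widely.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def _token(kind, value):
    """Build a stable placeholder such as <EMAIL:1a2b3c4d>."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]
    return f"<{kind}:{digest}>"


def redact(text, vault=None):
    """Mask PII in text. If a vault dict is given, store token -> original
    value so the masking can be reversed; otherwise it is one-way."""

    def replacer(kind):
        def _sub(match):
            token = _token(kind, match.group(0))
            if vault is not None:
                vault[token] = match.group(0)
            return token
        return _sub

    # Emails first, so digits inside an address are not treated as a phone.
    text = EMAIL_RE.sub(replacer("EMAIL"), text)
    text = PHONE_RE.sub(replacer("PHONE"), text)
    return text
```

Calling `redact(text)` without a vault matches the default advice above: keep a reversible token only when the use case truly requires it.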

## 3) Retrieval & prompt guardrails (what the model can pull/use)

* **Retriever allowlists**: restrict each Agent to specific indices/collections and tenants/customer IDs.
* **Top-K and domain fences**: cap results (e.g., K≤5) and disallow open web unless explicitly approved.
* **Grounding-only mode** for regulated Agents: responses must cite retrieved internal docs; otherwise decline.
* **Prompt rules**:
  * “Never reveal secrets or raw PII unless the verified user is the data subject and the task requires it.”
  * “If asked for PII, verify identity, check purpose, log the disclosure.”\
    Why: Maps to GDPR transparency & accountability; NIST AI RMF recommends explicit controls around data use. ([NIST](https://www.nist.gov/itl/ai-risk-management-framework?utm_source=chatgpt.com))
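
The retriever allowlist and Top-K cap could be enforced by a small guard that runs before any search is issued. This is a sketch under assumptions: `RetrievalRequest`, `ALLOWLIST`, and `guard` are hypothetical names, and the cap of 5 simply mirrors the example in the text.

```python
from dataclasses import dataclass, replace

MAX_TOP_K = 5  # Mirrors the "K<=5" example above; tune per Agent.


@dataclass(frozen=True)
class RetrievalRequest:
    agent: str
    collection: str
    tenant_id: str
    top_k: int


# Hypothetical allowlist: agent -> permitted (collection, tenant) pairs.
ALLOWLIST = {
    "support": {("support_kb", "tenant-a")},
}


def guard(request: RetrievalRequest) -> RetrievalRequest:
    """Reject requests outside the allowlist and clamp top_k to [1, MAX_TOP_K]."""
    allowed = ALLOWLIST.get(request.agent, set())
    if (request.collection, request.tenant_id) not in allowed:
        raise PermissionError(
            f"Agent {request.agent!r} may not query "
            f"{request.collection!r} for tenant {request.tenant_id!r}"
        )
    return replace(request, top_k=max(1, min(request.top_k, MAX_TOP_K)))
```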

## 4) Tools & actions (what the Agent is allowed to do)

* **Tool allowlist** per Agent (e.g., read-only CRM for Support; no email-send tool unless a human approves each send).
* **Human-in-the-loop (HITL)** for any action that would disclose or move personal data outside the original system.
* **Rate limits & anomaly detection** on export/download tools to catch mass exfiltration patterns.\
  Why: ICO’s AI guidance emphasises risk-based controls and proportionate mitigations. ([ICO](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/?utm_source=chatgpt.com), [BDO UK](https://www.bdo.co.uk/en-gb/insights/advisory/risk-and-advisory-services/the-ico-s-updated-ai-and-data-protection-guidance-emphasising-fairness-and-transparency?utm_source=chatgpt.com))
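
One way to picture the three controls above is a tool gate that returns allow, needs-approval, or deny, paired with a sliding-window limiter on export calls. The tool names, the `DISCLOSING_TOOLS` set, and the limiter thresholds are illustrative assumptions, not raia settings.

```python
from collections import deque
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical configuration for illustration.
TOOL_ALLOWLIST = {"support": {"crm_read", "send_email"}}
DISCLOSING_TOOLS = {"send_email", "export_csv"}  # move data out of the source system


def authorize_tool(agent, tool, human_approved=False):
    """Deny tools off the allowlist; require approval for disclosing tools."""
    if tool not in TOOL_ALLOWLIST.get(agent, set()):
        return Decision.DENY
    if tool in DISCLOSING_TOOLS and not human_approved:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW


class ExportRateLimiter:
    """Sliding-window limiter: at most max_events per window_seconds."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = deque()

    def allow(self, now):
        # Drop timestamps that have aged out of the window.
        while self._events and now - self._events[0] >= self.window_seconds:
            self._events.popleft()
        if len(self._events) >= self.max_events:
            return False
        self._events.append(now)
        return True
```

A refused `allow()` call is a natural place to raise an alert, since repeated refusals on an export tool are one signal of a mass-exfiltration attempt.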

## 5) Output filtering (what leaves the Agent)

* **Response DLP**: run an outbound PII scanner; mask sensitive values unless policy & identity checks pass.
* **Safety fallbacks**: if the answer requires PII and checks fail, return a “can’t share—here’s how to proceed” template.\
  Why: Supports GDPR integrity/confidentiality & CPRA restrictions around sensitive personal info. ([ICO](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-principles/a-guide-to-the-data-protection-principles/?utm_source=chatgpt.com), [California Privacy Protection Agency](https://cppa.ca.gov/faq.html?utm_source=chatgpt.com))
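
The outbound filter described above might look like the following sketch: scan the response, pass it through when both checks succeed, fall back to a template when the answer depends on PII, and otherwise mask the matched values. The patterns, the `FALLBACK` wording, and the `pii_required` flag are assumptions made for illustration.

```python
import re

# Illustrative patterns only; a real DLP scanner covers far more categories.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

FALLBACK = (
    "I can't share that here. Please verify your identity through your "
    "account portal, or contact support to proceed."
)


def filter_response(text, identity_verified, policy_allows, pii_required=False):
    """Apply outbound DLP to a draft response before it leaves the Agent."""
    has_pii = any(p.search(text) for p in PII_PATTERNS.values())
    if not has_pii or (identity_verified and policy_allows):
        return text
    if pii_required:
        # The answer is useless without the PII, so decline with guidance.
        return FALLBACK
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {name}]", text)
    return text
```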

## 6) Logging, retention, and deletion (how long you keep it)

* **Log redaction**: store prompts/outputs with PII masked; keep an unredacted version only in your SIEM with strict access.
* **Short retention**: set chat/event logs to the minimum you truly need (e.g., 0–30 days) and enable **verified deletion** workflows.
* **DSR playbooks**: for data subject requests (access, correction, deletion), tag records by subject ID so you can search & purge across raia and connected systems.\
  Why: GDPR storage-limitation & data-subject rights; CPRA rights to know/correct/delete. ([GDPR](https://gdpr-info.eu/art-5-gdpr/?utm_source=chatgpt.com), [ICO](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-principles/a-guide-to-the-data-protection-principles/?utm_source=chatgpt.com), [California Privacy Protection Agency](https://cppa.ca.gov/faq.html?utm_source=chatgpt.com))
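
Retention limits and subject-ID purges can both be expressed as filters over tagged log records. The sketch below assumes an in-memory list and a hypothetical `LogRecord` shape; in practice the same logic would run against each store that holds logs, including connected systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LogRecord:
    subject_id: str       # Tag that makes per-subject search and purge possible.
    created_at: datetime
    text: str


def apply_retention(records, now, max_age_days=30):
    """Keep only records created within the retention window."""
    cutoff = now - timedelta(days=max_age_days)
    return [r for r in records if r.created_at >= cutoff]


def purge_subject(records, subject_id):
    """Remove every record for one data subject.

    Returns the remaining records and the number purged, so the count
    can be written to a deletion audit trail.
    """
    kept = [r for r in records if r.subject_id != subject_id]
    return kept, len(records) - len(kept)
```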

## 7) Cross-border data transfers (EU/UK personal data)

* If you process EU/UK personal data, **pin storage/processing in-region** where possible.
* Where transfers are necessary, use the **European Commission’s Standard Contractual Clauses (SCCs)** and document supplementary measures.\
  Why: GDPR Chapter V requires appropriate safeguards; SCCs are the primary legal tool endorsed by the Commission/EDPB. ([European Commission](https://commission.europa.eu/law/law-topic/data-protection/international-dimension-data-protection/standard-contractual-clauses-scc_en?utm_source=chatgpt.com), [European Data Protection Board](https://www.edpb.europa.eu/system/files/2023-02/edpb_guidelines_05-2021_interplay_between_the_application_of_art3-chapter_v_of_the_gdpr_v2_en_0.pdf?utm_source=chatgpt.com))
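
Region pinning can be backed by a deployment-time check that refuses out-of-region processing unless a transfer mechanism has been recorded. The region identifiers and function below are hypothetical, and the check only confirms that a mechanism is documented; it does not assess whether that mechanism is legally sufficient.

```python
# Hypothetical region names; substitute your hosting provider's identifiers.
PINNED_REGIONS = {
    "EU": {"eu-1", "eu-2"},
    "UK": {"uk-1"},
}


def check_placement(data_origin, processing_region, transfer_mechanism=None):
    """Return a placement status, or raise if an undocumented transfer is found."""
    allowed = PINNED_REGIONS.get(data_origin)
    if allowed is None:
        raise ValueError(f"No region policy defined for origin {data_origin!r}")
    if processing_region in allowed:
        return "in_region"
    if transfer_mechanism:
        # e.g. "SCCs", with supplementary measures documented elsewhere.
        return f"transfer_documented:{transfer_mechanism}"
    raise ValueError(
        f"{data_origin} data processed in {processing_region!r} "
        "without a documented transfer mechanism"
    )
```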

## 8) Vendors & model providers (OpenAI Enterprise, etc.)

* Use enterprise tiers that **do not train on your inputs/outputs by default** and allow **customer-controlled retention**; document these settings in your DPIA.
* Prefer providers with **SOC 2 reports** and map their controls to your own.\
  Why: OpenAI Enterprise states it doesn’t train on your business data by default and gives retention controls; SOC 2 demonstrates audited controls over security/confidentiality/privacy. ([OpenAI](https://openai.com/enterprise-privacy/?utm_source=chatgpt.com), [AICPA & CIMA](https://www.aicpa-cima.com/topic/audit-assurance/audit-and-assurance-greater-than-soc-2?utm_source=chatgpt.com))

## 9) Governance & proof (what you show auditors)

* Maintain a **DPIA** (Data Protection Impact Assessment) or LIA (Legitimate Interests Assessment) per Agent with: purposes, lawful basis, data categories, retention, vendors, transfers, and mitigations.
* Keep **change control** + **red-team reports** for risky Agents; align with NIST AI RMF generative-AI profile.
* Map your controls to **SOC 2 Trust Services Criteria** for ongoing assurance. ([NIST](https://www.nist.gov/itl/ai-risk-management-framework?utm_source=chatgpt.com), [Contentful](https://assets.ctfassets.net/rb9cdnjh59cm/72xv4p67HVXKp6CjWmjkPk/1cdbfa19f6307e2720396b66a6194dc9/trust-services-criteria-updated-copyright.pdf?utm_source=chatgpt.com))
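
To keep per-Agent assessments complete, the sections listed above can be modeled as a record with a completeness check. This is only a sketch of one possible structure; `DpiaRecord` and `missing_sections` are hypothetical names, and the fields simply follow the list in this section.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DpiaRecord:
    agent: str
    purposes: list = field(default_factory=list)
    lawful_basis: str = ""
    data_categories: list = field(default_factory=list)
    retention: str = ""
    vendors: list = field(default_factory=list)
    # Record an explicit entry such as "none" when there are no transfers,
    # so an empty list always means "not yet assessed".
    transfers: list = field(default_factory=list)
    mitigations: list = field(default_factory=list)


def missing_sections(record):
    """List the sections that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]
```

Running `missing_sections` in change control gives a simple gate: an Agent whose record still has empty sections is not ready for review.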
