How We Protect Privacy
Here’s a practical way to configure raia so that customer data stays inside the guardrails of major privacy laws (GDPR/UK GDPR, CCPA/CPRA) and your AI Agents don’t accidentally leak or over-retain personal data.
1) Identity & access (who can see/do what)
Enforce SSO + MFA for all raia admins and builders.
Turn on agent-level RBAC: for each Agent, grant the minimum roles and restrict which users/teams can invoke it.
Create least-privilege data scopes per Agent (e.g., Support Agent → read-only to tickets; Sales Agent → CRM “leads only”). Why: These align with GDPR’s data-minimisation & integrity/confidentiality principles and UK ICO guidance on proportionality and access control. (ICO)
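raia's actual RBAC configuration is product-specific, but the deny-by-default scope model above can be sketched in Python. The agent names, resource names, and `is_allowed` helper here are hypothetical, for illustration only:

```python
# Hypothetical per-Agent scope map -- raia's real RBAC config will differ.
# Anything not explicitly granted is denied.
AGENT_SCOPES = {
    "support-agent": {"tickets": {"read"}},             # read-only tickets
    "sales-agent":   {"crm.leads": {"read", "write"}},  # leads only, not all of CRM
}

def is_allowed(agent: str, resource: str, action: str) -> bool:
    """Deny by default: an Agent may act only on explicitly granted scopes."""
    return action in AGENT_SCOPES.get(agent, {}).get(resource, set())
```

The key property is that a missing agent, resource, or action falls through to an empty set and is denied, which is what "least privilege" means in practice.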
2) Data ingestion & indexing (what goes into the model context)
Field/record filters at the connector: exclude sensitive fields (SSN, bank, health) before ingest.
PII redaction pipeline (pre-vectorization): mask emails, phone numbers, national IDs; keep a reversible token only if absolutely required for the use case.
Purpose tags on data sources (e.g., “Support-answering only”) and bind Agents to allowed purposes.
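A minimal sketch of the pre-vectorization redaction step, using illustrative regexes only. Production redaction should rely on a vetted PII-detection library; the patterns and labels below are assumptions, not raia's implementation:

```python
import re

# Illustrative patterns only -- real pipelines should use a vetted PII library.
# SSN is checked before PHONE so the broader phone pattern can't claim it.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}

def redact(text: str) -> str:
    """Mask PII in a document before it is chunked and vectorized."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Running this on every document at the connector boundary means nothing sensitive ever reaches the vector index, which is far easier than trying to filter it back out at query time.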
3) Retrieval & prompt guardrails (what the model can pull/use)
Retriever allowlists: restrict each Agent to specific indices/collections and tenants/customer IDs.
Top-K and domain fences: cap retrieved results (e.g., K≤5) and disallow open-web retrieval unless it is explicitly approved.
Grounding-only mode for regulated Agents: responses must cite retrieved internal docs; otherwise decline.
Prompt rules:
“Never reveal secrets or raw PII unless the verified user is the data subject and the task requires it.”
“If asked for PII, verify identity, check purpose, log the disclosure.” Why: Maps to GDPR transparency & accountability; NIST AI RMF recommends explicit controls around data use. (NIST)
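The retriever allowlist and Top-K fence above can be enforced in one small wrapper. The agent name, index names, and `guarded_retrieve` helper are hypothetical; the point is that policy runs before the retriever is ever called:

```python
TOP_K_CAP = 5  # matches the "K<=5" fence above

# Hypothetical allowlist: which indices/collections each Agent may query.
RETRIEVER_ALLOWLIST = {
    "support-agent": {"kb-support", "tickets"},
}

def guarded_retrieve(agent: str, index: str, query_fn, k: int) -> list:
    """Enforce the index allowlist and Top-K cap before hitting the retriever."""
    if index not in RETRIEVER_ALLOWLIST.get(agent, set()):
        raise PermissionError(f"{agent} may not query {index}")
    return query_fn(index)[: min(k, TOP_K_CAP)]
```

A caller asking for k=8 still gets at most 5 results, and a query against a non-allowlisted index fails loudly instead of silently returning data.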
4) Tools & actions (what the Agent is allowed to do)
Tool allowlist per Agent (e.g., read-only CRM for Support; no email-send tool unless human-approve).
Human-in-the-loop (HITL) for any action that would disclose or move personal data outside the original system.
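As a sketch of the tool allowlist plus HITL gate, with hypothetical agent and tool names (raia's real action framework will differ):

```python
# Hypothetical tool policy: which tools each Agent may call, and which
# tools move/disclose personal data and therefore need human approval.
TOOL_ALLOWLIST = {
    "support-agent": {"crm.read"},
    "sales-agent":   {"crm.read", "email.send"},
}
REQUIRES_HITL = {"email.send", "crm.export"}

def invoke_tool(agent: str, tool: str, human_approved: bool = False) -> str:
    if tool not in TOOL_ALLOWLIST.get(agent, set()):
        return "denied: tool not allowlisted for this Agent"
    if tool in REQUIRES_HITL and not human_approved:
        return "pending: queued for human approval"
    return "executed"
```

Note the two independent gates: the allowlist answers "may this Agent ever use this tool?", and the HITL check answers "does this specific invocation need a human?"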
5) Output filtering (what leaves the Agent)
Response DLP: run an outbound PII scanner; mask sensitive values unless policy & identity checks pass.
Safety fallbacks: if the answer requires PII and checks fail, return a “can’t share—here’s how to proceed” template. Why: Supports GDPR integrity/confidentiality & CPRA restrictions around sensitive personal info. (ICO, California Privacy Protection Agency)
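An outbound DLP check like the one above might look like this minimal sketch. The regex and fallback wording are illustrative assumptions, not raia's scanner:

```python
import re

# Illustrative sensitive-value patterns (SSN or email); a real outbound
# DLP scanner would be far broader.
SENSITIVE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b|[\w.+-]+@[\w-]+\.[\w.-]+")

def filter_response(text: str, identity_verified: bool, purpose_ok: bool) -> str:
    """Outbound DLP: pass sensitive values only when policy and identity checks pass."""
    if identity_verified and purpose_ok:
        return text
    if SENSITIVE.search(text):
        # Safety fallback template: decline, and tell the user how to proceed.
        return ("I can't share that directly. Please verify your identity "
                "with support and we'll process your request.")
    return text
```

Responses with no sensitive content pass through untouched, so the filter only changes behavior in the failure case it exists for.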
6) Logging, retention, and deletion (how long you keep it)
Log redaction: store prompts/outputs with PII masked; keep an unredacted version only in your SIEM with strict access.
Short retention: set chat/event logs to the minimum you truly need (e.g., 0–30 days) and enable verified deletion workflows.
DSR playbooks: for access/correction/deletion requests, tag records by subject ID so you can search & purge across raia and connected systems. Why: GDPR storage-limitation & data-subject rights; CPRA rights to know/correct/delete. (GDPR, ICO, California Privacy Protection Agency)
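The retention window and subject-ID purge above reduce to two small operations over a log store. The event shape (`ts`, `subject_id`) is a hypothetical schema for illustration:

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # align with the 0-30 day window above

def purge_expired(logs: list, now: datetime) -> list:
    """Keep only log events younger than the retention window."""
    cutoff = now - timedelta(days=RETENTION_DAYS)
    return [e for e in logs if e["ts"] >= cutoff]

def purge_subject(logs: list, subject_id: str) -> list:
    """DSR deletion: drop every event tagged with the requesting subject's ID."""
    return [e for e in logs if e.get("subject_id") != subject_id]
```

Tagging every event with a subject ID at write time is what makes the deletion playbook a simple filter instead of a forensic search.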
7) Cross-border data transfers (EU/UK personal data)
If you process EU/UK personal data, pin storage/processing in-region where possible.
Where transfers are necessary, use the European Commission’s Standard Contractual Clauses (SCCs) and document supplementary measures. Why: GDPR Chapter V requires appropriate safeguards; SCCs are the primary legal tool endorsed by the Commission/EDPB. (European Commission, European Data Protection Board)
8) Vendors & model providers (OpenAI Enterprise, etc.)
Use enterprise tiers that do not train on your inputs/outputs by default and allow customer-controlled retention; document these settings in your DPIA.
Prefer providers with SOC 2 reports and map their controls to your own. Why: OpenAI Enterprise states it doesn’t train on your business data by default and gives retention controls; SOC 2 demonstrates audited controls over security/confidentiality/privacy. (OpenAI, AICPA & CIMA)
9) Governance & proof (what you show auditors)
Maintain a DPIA/LIA per Agent with: purposes, lawful basis, data categories, retention, vendors, transfers, and mitigations.
Keep change control + red-team reports for risky Agents; align with NIST AI RMF generative-AI profile.
Map your controls to SOC 2 Trust Services Criteria for ongoing assurance. (NIST, Contentful)