Compliance Centre Specification (v1.2)

1. Strategic Intent

Evalium targets high-stakes assessment use cases (Programmes, Certifications, Remediation). The Compliance Centre turns "GDPR anxiety" into a self-serve, auditable workflow that reduces support burden and creates a trust moat against competitors who only offer "email us to delete" workflows.


2. Compliance Principles

2.1 "Erasure vs. Evidence" as a Design Constraint

In high-stakes contexts, deleting assessment records can destroy the employer’s proof of competence (or incompetence) and undermine compliance reporting. Evalium’s architecture is already immutable (snapshots, append-only remediation). We must not break this for privacy requests unless absolutely necessary.

2.2 GDPR Grounding

  • Right of Access (SAR): Must respond “without undue delay” and within one month (extendable by up to two further months only for complex or numerous requests).
  • Right to Erasure: Not absolute and applies only in certain circumstances.
  • Data Minimisation: Defaults must minimise exposure.

2.3 Clear Terminology (Product Copy)

  • Forget User (Hybrid Scrub): Removes personal identifiers and deletes toxic unstructured PII while retaining structured assessment outcomes for integrity and reporting.
  • Unlink User (Org Scope): Removes a person’s access/roles from an Org Unit without affecting other Org Units where the person may exist.
  • Restrict Processing (Legal Hold): Temporarily blocks Forget User and automated retention actions until the hold is lifted.

3. Roles, Scope, and Capabilities

Evalium enforces tenant and org-unit isolation via TxManager + RLS. This isolation dictates who can execute compliance actions.

3.1 Capabilities

  • privacy.manage: Initiate jobs, view inventory, view ledger, generate receipts.
  • privacy.export: Generate DSAR export packages.
  • privacy.forget: Execute the Hybrid Scrub (Identity redaction + Evidence deletion).
  • privacy.restrict: Place/lift legal holds.
  • privacy.retention.manage: (Phase 2) Configure automated retention policies.

3.2 Role-to-Action Rules

| Role | Scope | Allowed Actions | Rationale |
|---|---|---|---|
| Owner / Global Admin | TENANT | Tenant-wide DSAR Export; Forget User (Hybrid Scrub); Restrict Processing | Accountable for tenant-wide compliance actions. |
| Org Admin | ORG_UNIT | Org-scoped DSAR Export; Unlink User | Must not destroy or export data belonging to other Org Units. |

3.3 Scope as a First-Class Concept

All Compliance Centre reads and actions are performed under an explicit scope stored on the job at creation:

  • TENANT: All org units (requires tenant-wide authority).
  • ORG_UNIT: Only the current org unit boundary.
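The role-to-action rules and the scope stamp can be sketched as a small lookup. This is an illustrative sketch only: the role keys (`owner`, `global_admin`, `org_admin`) and action keys (`dsar_export`, `forget_user`, and so on) are hypothetical names chosen for the example, and real enforcement is described above as going through TxManager + RLS rather than application code like this.

```python
from enum import Enum


class Scope(Enum):
    TENANT = "TENANT"
    ORG_UNIT = "ORG_UNIT"


# Hypothetical role keys mapped to the scope each role operates under (section 3.2).
ROLE_SCOPE = {
    "owner": Scope.TENANT,
    "global_admin": Scope.TENANT,
    "org_admin": Scope.ORG_UNIT,
}

# Hypothetical action keys; the sets mirror the "Allowed Actions" column.
ALLOWED_ACTIONS = {
    Scope.TENANT: {"dsar_export", "forget_user", "restrict_processing"},
    Scope.ORG_UNIT: {"dsar_export", "unlink_user"},
}


def authorize(role: str, action: str) -> Scope:
    """Return the scope to store on the job, or raise PermissionError."""
    scope = ROLE_SCOPE.get(role)
    if scope is None:
        raise PermissionError(f"role {role!r} has no compliance scope")
    if action not in ALLOWED_ACTIONS[scope]:
        raise PermissionError(f"{action!r} is not allowed under {scope.value} scope")
    return scope
```

Resolving the scope once, at job creation, means a worker can later re-check the stored scope instead of re-deriving it from whatever roles the actor holds at execution time.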

4. UX Surfaces

Entry: Admin → Compliance Centre

4.1 Tabs

  1. People: Search/manage data subjects.
  2. Requests: Track status of async jobs.
  3. Ledger: Immutable history of compliance actions.
  4. Retention: (Phase 2) Automated policies.

4.2 People → Person Detail

  • A) Identity & Scope:
    • Internal User ID / Email.
    • Status (Active / Redacted / Restricted).
    • Orphan Warning: "Also belongs to other Org Units" (details hidden from Org Admins).
  • B) Data Inventory (The KOE View):
    • Identity: Profile, Auth events.
    • Knowledge (K): Submissions, scores (Safe to retain).
    • Observation (O): Rubric ratings, Free-text notes (High PII risk).
    • Evidence (E): File uploads (Toxic PII risk).
    • Certificates: Issued credentials.
    • Audit: Logs regarding this subject.
  • Retention (Phase 2): Current backend automation only sweeps soft-deleted users for anonymisation. Once per-domain retention policies are defined (submissions, evidence blobs, audit logs, etc.), expand the worker to apply those rules and emit ledger receipts. Document policy storage and grace windows when added.

5. Core Workflows

5.1 DSAR Export (Right of Access)

  • SLA: "Within one month."
  • Mechanism: Async job (privacy_jobs).
  • Output: export.zip (JSON, CSV, HTML Summary).
  • Redaction Gates:
    • Free-text fields excluded or placeholder-replaced by default.
    • Raw evidence blobs excluded by default.
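A minimal sketch of the redaction gates, assuming each exported record is a flat dictionary. The free-text field names are the ones listed under the implemented hybrid scrub (`assessor_notes`, `comment`, `text_response`); the `evidence_keys` field, the placeholder string, and the opt-in flags are hypothetical.

```python
from typing import Any

FREE_TEXT_FIELDS = {"assessor_notes", "comment", "text_response"}
EVIDENCE_FIELDS = {"evidence_keys"}        # hypothetical field name
PLACEHOLDER = "[Excluded from export]"     # hypothetical placeholder text


def apply_redaction_gates(
    record: dict[str, Any],
    include_free_text: bool = False,
    include_evidence: bool = False,
) -> dict[str, Any]:
    """Return a copy of the record that is safe to write into export.zip."""
    out: dict[str, Any] = {}
    for key, value in record.items():
        if key in EVIDENCE_FIELDS and not include_evidence:
            continue  # raw evidence is dropped entirely by default
        if key in FREE_TEXT_FIELDS and not include_free_text:
            out[key] = PLACEHOLDER  # keep the column, hide the content
            continue
        out[key] = value
    return out
```

Both gates default to the restrictive setting, so an export only contains free text or evidence references when the caller explicitly opts in.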

5.2 "Forget User" (The Hybrid Scrub)

Who: Owner/Global Admin only (TENANT scope).

1) Identity (Users table + login identifiers) → REDACT

  • Overwrite name/email with "Redacted User" and a syntactically valid, unique email:
    • Format: deleted_<uuid>@redacted.invalid
    • Rationale: Passes regex validation, guarantees uniqueness, uses reserved .invalid TLD (RFC 2606) to prevent routing.
  • Clear/rotate authentication identifiers.
  • Invalidate active sessions/magic links.

2) Structured Data (Knowledge) → ANONYMISE & RETAIN

  • Keep: Submissions, scores, outcomes, version_snapshot references.
  • Link: Rows remain linked to the redacted user_id.
  • Rationale: Preserves analytics and programme completion logic.

3) Unstructured Data (Observation + Evidence) → HARD DELETE / REDACT

  • Delete: Evidence blobs (S3) and derivatives.
  • Redact: Free-text notes overwritten with [Redacted].
  • Rationale: Faces/voices/names in unstructured data cannot be reliably anonymised.
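The three steps above can be sketched against in-memory rows. This is an assumption-laden illustration, not the production worker: the `User` dataclass, the `evidence_keys` field, and the function names are hypothetical, and actual blob deletion in the object store is left to the caller.

```python
import uuid
from dataclasses import dataclass

# Free-text columns named under the implemented hybrid scrub.
FREE_TEXT_FIELDS = ("assessor_notes", "comment", "text_response")


@dataclass
class User:  # hypothetical shape of a users-table row
    user_id: str
    name: str
    email: str
    status: str = "Active"


def redact_identity(user: User) -> None:
    """Step 1: overwrite identifiers in place; user_id is left untouched."""
    user.name = "Redacted User"
    # .invalid is a reserved TLD, so the address can never be routed.
    user.email = f"deleted_{uuid.uuid4()}@redacted.invalid"
    user.status = "Redacted"


def scrub_submission(row: dict) -> tuple[dict, list[str]]:
    """Steps 2 and 3 for one submission row.

    Structured fields (scores, outcomes, user_id) are kept as they are.
    Returns the scrubbed copy plus the blob keys the caller should delete.
    """
    scrubbed = dict(row)
    for name in FREE_TEXT_FIELDS:
        if scrubbed.get(name):
            scrubbed[name] = "[Redacted]"
    blobs_to_delete = list(scrubbed.get("evidence_keys", []))
    scrubbed["evidence_keys"] = []
    return scrubbed, blobs_to_delete
```

Because `user_id` never changes, retained submissions stay joined to the redacted user row, which is what keeps analytics and programme completion logic intact.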

5.3 "Unlink User" (Org Admin Safe Action)

Who: Org Admin (ORG_UNIT scope).

  • Action: Remove user_roles bindings for that Org Unit.
  • Result: User disappears from that Org list but remains active in the Tenant.

5.3.1 Orphan User Handling (Edge Case)

  • Definition: User has 0 Org bindings and 0 roles after Unlink.
  • MVP Behavior: User exists in tenant table but has no accessible surfaces. Org Admin sees confirmation: "If this person belongs to no other Org Units, they may become an orphan user."
  • Ops Improvement (Phase 1.5): Nightly Reaper job flags orphans (>30 days) for Tenant Admin review.
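A possible shape for the Phase 1.5 reaper check, assuming each tenant user carries a binding count, a role count, and the time of their last unlink. All field and function names here are hypothetical; the 30-day threshold comes from the bullet above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

ORPHAN_GRACE = timedelta(days=30)


@dataclass
class TenantUser:  # hypothetical projection of the tenant user table
    user_id: str
    org_binding_count: int
    role_count: int
    last_unlinked_at: Optional[datetime]


def is_orphan(user: TenantUser) -> bool:
    """Orphan per 5.3.1: zero Org bindings and zero roles."""
    return user.org_binding_count == 0 and user.role_count == 0


def flag_stale_orphans(users: list[TenantUser], now: datetime) -> list[str]:
    """Return IDs of users orphaned for more than 30 days, for Tenant Admin review."""
    return [
        u.user_id
        for u in users
        if is_orphan(u)
        and u.last_unlinked_at is not None
        and now - u.last_unlinked_at > ORPHAN_GRACE
    ]
```

The job only flags users; deciding whether to forget or re-bind them stays with the Tenant Admin.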

5.4 Restrict Processing (Legal Hold)

  • Action: Blocks "Forget User" and automated retention.
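As a rough sketch of how a hold might gate other actions, the registry below keeps held subjects in memory and rejects the blocked actions. The class, the exception, and the action keys are hypothetical; a real implementation would persist holds and record place/lift events in the ledger.

```python
class LegalHoldError(Exception):
    """Raised when an action is attempted on a subject under legal hold."""


# Hypothetical action keys for the two things a hold blocks.
BLOCKED_BY_HOLD = {"forget_user", "retention"}


class HoldRegistry:
    def __init__(self) -> None:
        self._held: set[str] = set()

    def place(self, subject_user_id: str) -> None:
        self._held.add(subject_user_id)

    def lift(self, subject_user_id: str) -> None:
        self._held.discard(subject_user_id)

    def is_restricted(self, subject_user_id: str) -> bool:
        return subject_user_id in self._held

    def guard(self, subject_user_id: str, action: str) -> None:
        """Raise if the action is blocked for this subject; otherwise do nothing."""
        if action in BLOCKED_BY_HOLD and self.is_restricted(subject_user_id):
            raise LegalHoldError(f"{action} blocked: {subject_user_id} is under legal hold")
```

Actions outside the blocked set, such as a DSAR export, pass through the guard even while a hold is active.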

6. Backend Architecture

Follows the Transactional Outbox pattern.

6.1 Schema: privacy_jobs

  • id, tenant_id, job_type, scope, subject_user_id, status.
  • Job Decision Metadata (JSON):
    • ack_certificate_deidentify (bool) — Required for FORGET.
    • subject_has_certificates (bool).
    • subject_membership_count (int).
    • orphan_risk (bool).
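One way to picture the job row and its decision metadata is a dataclass with a validation step run before the job is accepted. The column names follow the schema above; the `PENDING` default, the `decision` attribute name, and the error strings are assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyJob:
    id: str
    tenant_id: str
    job_type: str            # e.g. "FORGET"
    scope: str               # "TENANT" or "ORG_UNIT"
    subject_user_id: str
    status: str = "PENDING"  # hypothetical initial status
    decision: dict = field(default_factory=dict)  # Job Decision Metadata (JSON)


def validate_job(job: PrivacyJob) -> list[str]:
    """Return a list of problems; an empty list means the job may be queued."""
    errors: list[str] = []
    if job.job_type == "FORGET":
        if job.scope != "TENANT":
            errors.append("FORGET requires TENANT scope")
        # The acknowledgement must be an explicit True, not merely truthy.
        if job.decision.get("ack_certificate_deidentify") is not True:
            errors.append("ack_certificate_deidentify is required for FORGET")
    return errors
```

Storing the acknowledgement on the job itself means the ledger can later show that the certificate warning was accepted at the time of the request.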

6.2 Schema: compliance_ledger

  • Separate from audit_logs. Legal proof of compliance.
  • action_type, legal_basis_text, artifact_hash (tamper-evident).
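The tamper-evidence idea behind `artifact_hash` can be illustrated with a content digest. The spec does not name the hash function, so SHA-256 is an assumption here, as are the helper names.

```python
import hashlib
import hmac


def artifact_hash(artifact: bytes) -> str:
    """Hex digest of the artifact bytes (SHA-256 assumed for illustration)."""
    return hashlib.sha256(artifact).hexdigest()


def make_ledger_entry(action_type: str, legal_basis_text: str, artifact: bytes) -> dict:
    """Build a ledger row with the columns listed in 6.2."""
    return {
        "action_type": action_type,
        "legal_basis_text": legal_basis_text,
        "artifact_hash": artifact_hash(artifact),
    }


def artifact_matches(entry: dict, artifact: bytes) -> bool:
    """True if the artifact is byte-for-byte what the ledger recorded."""
    return hmac.compare_digest(entry["artifact_hash"], artifact_hash(artifact))
```

An auditor holding a receipt can recompute the digest and compare it with the ledger row; any edit to the receipt after the fact changes the digest and the comparison fails.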

6.3 Compliance Receipt

  • PDF/HTML generated on completion containing timestamp, actor, redacted subject ref, and summary of actions (e.g., "7 Evidence files deleted").

7. Domain-Specific Refinements (KOE)

7.1 Toxic Data Handling

Hybrid Scrub is mandatory for Observation/Evidence.

7.2 Certificate Verification

Public verification endpoints must handle redacted users.

  • Behavior: Integrity check passes, but name renders as "Redacted User."
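A sketch of how a verification endpoint could satisfy this behaviour, under the assumption that the certificate digest covers only stable fields (certificate ID, user ID, issue date) and not the holder's display name. That design choice, the field names, and SHA-256 are all assumptions for the example.

```python
import hashlib
import json

# Assumption: the digest deliberately excludes the holder's name,
# so redacting the name cannot invalidate the certificate.
DIGEST_FIELDS = ("certificate_id", "user_id", "issued_at")


def certificate_digest(cert: dict) -> str:
    payload = {key: cert[key] for key in DIGEST_FIELDS}
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def verify_certificate(cert: dict, holder: dict) -> dict:
    """Public verification result: integrity flag plus the name to render."""
    valid = certificate_digest(cert) == cert["digest"]
    name = "Redacted User" if holder["status"] == "Redacted" else holder["name"]
    return {"valid": valid, "holder_name": name}
```

If the digest had included the holder name, the identity redaction in the Hybrid Scrub would break every issued certificate, which is why the sketch keeps the name out of the signed payload.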

7.2.1 Certificate Utility Warning (UX)

Forgetting a user de-identifies certificates, making them unusable for employment checks.

  • UX Requirement: "Forget User" modal must include a mandatory checkbox:

    "I understand that forgetting this user will anonymise certificate holder identity, which may make certificates unverifiable for future employment."


8. Implemented vs Deferred (Backend status)

Implemented (backend, production-grade)

  • Privacy jobs + compliance ledger with RLS, transactional outbox, idempotent inserts.
  • Hybrid scrub (forget): anonymise user, delete evidence blobs referenced by submissions, redact free-text (assessor_notes, comment, text_response), emit receipt + artifact hash to ledger.
  • DSAR export: async job writes JSON/HTML/CSV + receipt; emits privacy.dsar.failed on errors.
  • Restrict (legal hold): place/lift holds; holds block forget/retention/telemetry jobs and emit privacy.retention.blocked.
  • Retention: policy-driven jobs for anonymize_user, evidence_delete, telemetry_delete with blocked/partial semantics; receipts/hashes; required policies (no silent fallback).
  • Telemetry retention domain: deletes delivery_session_events, respects holds, emits privacy.telemetry.deleted.
  • Incidents surface: /api/v1/compliance/retention/incidents lists blocked/failed retention events with filters (domain, blockedReason) and sanitized metadata.
  • Event taxonomy centralised (internal/events) to avoid raw literals; ledger/outbox receipts include normalized fields (policy_id, cutoff, actor_type/id, action, partial flags).

Deferred (not built yet)

  • Observation/proctoring/media-first retention domains: deferred until those modules/tables exist (only current submission free-text/evidence surfaces are covered).
  • UI/dashboard consumers for incidents/ledger; “enterprise audit console.”
  • Dynamic “Privacy Pack” aggregator (see backlog below).
  • Additional reporting surfaces for compliance SLAs/ops dashboards.

Backlog (do soon, small)

  • Add a backend “Privacy Pack” artifact job that bundles latest DSAR export, forget/retention receipts, and active retention policy snapshot into one download.
  • Add SLA timestamps on jobs (requested_at, started_at, completed_at, duration) to demonstrate “without undue delay/within one month.”
  • Rate-limit/abuse controls on DSAR export and incidents (per-actor/per-subject throttles).
  • Definition-of-done for new PII tables: classify data (Identity/K/O/E/Telemetry), declare DSAR inclusion, and assign retention domain (or “none yet”).
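The SLA timestamp item can be sketched as a small report computed from the three proposed timestamps. The deadline rule used here (same day of the following month, clamped to that month's last day) is one common reading of "within one month" and is an assumption, as are the function names and the report keys.

```python
import calendar
from datetime import datetime


def add_one_month(ts: datetime) -> datetime:
    """Same day next month, clamped to the last day when that date does not exist."""
    year = ts.year + ts.month // 12
    month = ts.month % 12 + 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


def sla_report(requested_at: datetime, started_at: datetime, completed_at: datetime) -> dict:
    """Derive duration and deadline compliance from the job timestamps."""
    deadline = add_one_month(requested_at)
    return {
        "queue_delay": started_at - requested_at,
        "duration": completed_at - requested_at,
        "deadline": deadline,
        "within_sla": completed_at <= deadline,
    }
```

Measuring `duration` from `requested_at` rather than `started_at` matters because the statutory clock starts when the request is received, not when a worker picks it up.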

10. Architecture Flow: Hybrid Scrub (Forget User)

flowchart TD
A["Admin triggers #quot;Forget User#quot;"] --> B["Create privacy_jobs row<br/>(scope=TENANT, type=FORGET)"]
B --> C["Worker picks job<br/>(Transactional Outbox / polling)"]
C --> D{"Pre-checks<br/>Restricted? Scope TENANT?"}
D -- Fail --> E["Mark job FAILED<br/>Write compliance_ledger entry<br/>(reason)"]
D -- Pass --> F["Hybrid Scrub executes in parallel domains"]
F --> G["Identity REDACT (DB)<br/>name -> 'Redacted User'<br/>email -> deleted_#lt;uuid#gt;@redacted.invalid"]
F --> H["Structured RETAIN (DB)<br/>submissions/scores/outcomes kept<br/>linked to redacted user_id"]
F --> I["Unstructured DELETE/REDACT<br/>Evidence blobs delete (object store)<br/>Free-text -> '[Redacted]'"]
G --> J["Invalidate sessions/tokens<br/>(revoke auth artifacts)"]
I --> K["Delete derivatives<br/>(thumbnails/transcodes)"]
J --> L["Generate Compliance Receipt<br/>(PDF/HTML)"]
H --> L
K --> L
L --> M["Write compliance_ledger row<br/>+ artifact_hashes"]
M --> N["Mark job COMPLETED<br/>Return artifacts in Requests UI"]