Data Inventory & PII Checklist
This inventory tracks PII-bearing domains and the required Definition of Done for any new table or feature that touches PII. Update this file whenever a schema or query that touches PII data changes.
Definition of Done for new PII
- Classification: Identity / Knowledge / Observation / Evidence / Telemetry / Audit / Operational.
- DSAR: Included or excluded. If excluded, state the rationale.
- Retention domain/policy: Which retention policy applies, or “none yet” with rationale.
- Sanitisation: Fields to redact before export/logging (free-text, payloads, error strings, etc.).
- Holds: Confirm behaviour under legal holds (block vs allow).
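The checklist above can be captured as a small record with a validation helper, so a missing item is caught before review. This is a minimal sketch, not an existing tool in this repository: `PiiEntry`, `missing_items`, and the field names are hypothetical, and the `Operational` classification is taken from the coverage table rather than from a confirmed policy list.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Classifications named in the checklist, plus "Operational", which the
# coverage table uses (assumption: it is an accepted value).
CLASSIFICATIONS = {
    "Identity", "Knowledge", "Observation", "Evidence",
    "Telemetry", "Audit", "Operational",
}
# Holds values as they appear in the coverage table.
HOLD_BEHAVIOURS = {"Block", "Allow", "N/A"}


@dataclass
class PiiEntry:
    """One inventory row (hypothetical shape, for illustration only)."""
    table: str
    classification: str
    dsar_included: bool
    dsar_rationale: Optional[str] = None
    retention_policy: Optional[str] = None  # None stands for "none yet"
    retention_rationale: Optional[str] = None
    redact_fields: List[str] = field(default_factory=list)  # empty = none required
    holds: str = "Block"


def missing_items(entry: PiiEntry) -> List[str]:
    """Return the Definition of Done items the entry fails to satisfy."""
    problems = []
    if entry.classification not in CLASSIFICATIONS:
        problems.append(f"unknown classification: {entry.classification}")
    if not entry.dsar_included and not entry.dsar_rationale:
        problems.append("DSAR exclusion needs a rationale")
    if entry.retention_policy is None and not entry.retention_rationale:
        problems.append('retention "none yet" needs a rationale')
    if entry.holds not in HOLD_BEHAVIOURS:
        problems.append(f"unknown holds behaviour: {entry.holds}")
    return problems


# Usage: the values mirror the users row of the coverage table.
users = PiiEntry(
    table="users",
    classification="Identity",
    dsar_included=True,
    retention_policy="anonymize_user",
    redact_fields=["metadata"],
    holds="Block",
)
print(missing_items(users))
```

An empty list means every checklist item is answered; each string in a non-empty list names one item that still needs attention.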
Current coverage (v1)
| Domain/Table(s) | Classification | DSAR | Retention | Sanitisation / Notes | Holds |
|---|---|---|---|---|---|
| users | Identity | Included | anonymize_user | Export sanitises metadata; scrub anonymises all fields. | Block |
| assignment_invites | Identity | Included | none yet | email field requires scrubbing on user deletion. | Allow |
| assignment_overrides | Operational | Included | anonymize_user (by user_id) | Redact reason text. | Block |
| assignments | Operational | Included | anonymize_user | security_config may require redaction. | Block |
| attempt_question_exposure | Telemetry | Excluded (operational) | telemetry_delete | None required. | Allow |
| audit_logs | Audit | Excluded | Retain (audit) | Redact metadata before export/surfacing. | Allow |
| certificates | Evidence | Included | anonymize_user | Redact metadata. | Block |
| compliance_ledger | Audit | Excluded | Retain (audit) | metadata sanitised on creation. | Allow |
| compliance_ledger_outbox | Audit | Excluded | Retain (audit) | metadata sanitised on creation. | Allow |
| delivery_session_events | Telemetry | Excluded (operational) | telemetry_delete | metadata sanitised on creation. | Allow |
| delivery_sessions | Telemetry | Included | telemetry_delete | Redact client_ip, client_user_agent, device_fingerprint, metadata. | Block |
| group_members | Identity | Included | anonymize_user | Redact metadata. | Block |
| groups | Identity | Included | none yet | Redact allowed_domains, membership_rule, metadata. | Block |
| magic_links | Identity | Included | none yet | email field requires scrubbing on user deletion. | Allow |
| media_assets | Evidence | Included (via submissions) | evidence_delete | Delete object and row; no sanitisation needed for asset refs. | Block |
| media_text_tracks | Evidence | Included (via asset refs) | evidence_delete | Redact raw_text. | Block |
| ledger_events | Audit | Excluded (system/execution audit trail; actor linkage is provenance, not DSAR subject data) | Retain (audit) | Redact metadata, context_metadata, and any entity-reference payload before export/surfacing. | Allow |
| outbox_events | Operational | Excluded | none yet | Redact payload, headers. | Allow |
| passages | Operational | Excluded (authoring content corpus; no learner/runtime subject linkage by itself) | none yet | Redact title, content, and tags if copied into logs/exports; authored text may contain sensitive material. | Allow |
| passage_versions | Operational | Included (contains created_by actor linkage for authoring provenance) | anonymize_user (by created_by) | Redact title, content, content_plain_text, content_plain_text_all_locales; keep hashes/internal lineage fields internal. | Block |
| privacy_holds | Audit | Included | Retain (audit) | Redact reason. | N/A |
| privacy_jobs | Audit | Excluded | Retain (audit) | Redact payload, decision_metadata, error_message. | Allow |
| program_enrolments | Knowledge | Included | anonymize_user | Redact metadata, program_snapshot. | Block |
| program_progress | Knowledge | Included | anonymize_user | None required. | Block |
| programme_assignment_outbox | Operational | Excluded | none yet | None required. | Allow |
| result_correction_batches | Audit | Excluded | Retain (audit) | Redact reason. | Allow |
| result_corrections | Audit | Excluded | Retain (audit) | Redact note, new_key. | Allow |
| submissions | Evidence | Included | evidence_delete | Redact answers, result, version_snapshot. | Block |
| submission_items | Evidence | Included | evidence_delete | Redact answer, metadata. | Block |
| submission_score_versions | Knowledge | Included | evidence_delete | Redact outcome_label. | Block |
| content_packs | Operational | Excluded (authoring metadata; no subject PII) | none yet | Redact description; created_by is actor provenance only. | Allow |
| content_pack_items | Operational | Excluded (authoring linkage) | none yet | Redact metadata before export/logging. | Allow |
| content_pack_revisions | Operational | Excluded (authoring snapshot metadata) | none yet | Redact manifest if surfaced externally; contains content structure refs only. | Allow |
| content_pack_revision_items | Operational | Excluded (authoring linkage) | none yet | Redact metadata before export/logging. | Allow |
| content_pack_installs | Operational | Included (contains created_by actor linkage) | anonymize_user (by created_by) | No free text; treat ids as internal provenance data. | Block |
| taxonomy_facets | Operational | Excluded (taxonomy config) | none yet | Redact description if copied into logs. | Allow |
| taxonomy_terms | Operational | Excluded (taxonomy config) | none yet | Redact description if copied into logs. | Allow |
| taxonomy_term_relations | Operational | Excluded (taxonomy graph) | none yet | None required. | Allow |
| taxonomy_skill_mapping_sets | Operational | Excluded (mapping config) | none yet | Redact description if exported to broad audiences. | Allow |
| taxonomy_skill_mapping_set_versions | Operational | Excluded (mapping lifecycle metadata) | none yet | None required. | Allow |
| taxonomy_skill_mapping_rules | Operational | Excluded (mapping config) | none yet | No subject PII; keep rule metadata internal. | Allow |
| question_terms | Operational | Excluded (content-term link table) | none yet | None required. | Allow |
| evaluation_terms | Operational | Excluded (content-term link table) | none yet | None required. | Allow |
| subject_terms | Operational | Excluded (subjects-taxonomy link table; contains only internal subject/term references) | none yet | None required. | Allow |
| passage_terms | Operational | Excluded (passage-taxonomy link table; contains only internal passage/term references) | none yet | None required. | Allow |
| inventory_views | Operational | Included (owner-level saved views via owner_user_id) | anonymize_user (by owner_user_id) | Redact filters if they may include sensitive search predicates. | Block |
| reporting.defensibility_exception_triage | Audit | Included (contains owner_user_id) | Retain (audit/workflow) | Redact free-text-like operator fields (reasonCodes remain controlled enums). | Block |
| user_org_units | Identity | Included | anonymize_user | None required. | Block |
| user_roles | Identity | Included | anonymize_user | None required. | Block |
Note on Skipped Tables: Pure lookup tables (locales, capabilities), entity configuration tables (evaluations, programs, roles), and join tables without user context have been omitted as they do not contain PII.
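One way to keep the inventory and the skipped-table list honest is a drift check that flags any schema table that is neither inventoried nor explicitly skipped. The sketch below is illustrative only: `inventoried_tables` and `uncovered_tables` are hypothetical helpers, and how the schema table names are obtained (migration files, database introspection) is left as an assumption.

```python
from typing import Iterable, Set


def inventoried_tables(markdown: str) -> Set[str]:
    """Collect the first cell of every six-column data row in the coverage table."""
    names = set()
    for line in markdown.splitlines():
        if "|" not in line:
            continue
        # Tolerate rows written with or without the outer pipes.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 6:
            continue
        first = cells[0]
        # Skip the header row and the |---| separator row.
        if first.startswith("Domain") or set(first) <= {"-", ":"}:
            continue
        names.add(first)
    return names


def uncovered_tables(
    schema_tables: Iterable[str],
    inventoried: Iterable[str],
    skipped: Iterable[str],
) -> Set[str]:
    """Tables present in the schema but neither inventoried nor skipped."""
    return set(schema_tables) - set(inventoried) - set(skipped)


# Usage with a two-row excerpt of the table; new_feature_notes is a made-up name.
excerpt = """\
| Domain/Table(s) | Classification | DSAR | Retention | Sanitisation / Notes | Holds |
|---|---|---|---|---|---|
| users | Identity | Included | anonymize_user | Export sanitises metadata. | Block |
| user_roles | Identity | Included | anonymize_user | None required. | Block |
"""
schema = ["users", "user_roles", "locales", "new_feature_notes"]
print(uncovered_tables(schema, inventoried_tables(excerpt), skipped=["locales"]))
```

Any name the check reports needs either a row in the coverage table or an entry in the skipped list with a reason.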