Data Inventory & PII Checklist

This inventory tracks PII-bearing domains and the required Definition of Done for any new table or feature that touches PII. Update this file whenever a schema or query that touches PII data changes.

Definition of Done for new PII

  • Classification: Identity / Knowledge / Observation / Evidence / Telemetry / Audit / Operational.
  • DSAR: Included or excluded. If excluded, state the rationale.
  • Retention domain/policy: Which retention policy applies, or “none yet” with rationale.
  • Sanitisation: Fields to redact before export/logging (free-text, payloads, error strings, etc.).
  • Holds: Confirm behaviour under legal holds (block vs allow).
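The checklist above can be expressed as a small record plus a validator. This is a minimal sketch, not the project's actual tooling: `InventoryEntry`, `missing_items`, and every field name are hypothetical, and the allowed classification set assumes "Operational" is valid because the coverage table uses it.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Classifications named in the Definition of Done, plus "Operational",
# which the coverage table also uses (assumption: it is a valid value).
CLASSIFICATIONS = {
    "Identity", "Knowledge", "Observation", "Evidence",
    "Telemetry", "Audit", "Operational",
}
# Hold behaviours seen in the coverage table.
HOLD_BEHAVIOURS = {"Block", "Allow", "N/A"}


@dataclass
class InventoryEntry:
    """One inventory row. All field names here are hypothetical."""
    table: str
    classification: str
    dsar_included: bool
    dsar_rationale: str = ""        # required when DSAR is excluded
    retention: str = "none yet"     # policy name, or "none yet"
    retention_rationale: str = ""   # required when retention is "none yet"
    sanitise_fields: list[str] = field(default_factory=list)
    holds: str = "Block"


def missing_items(entry: InventoryEntry) -> list[str]:
    """Return the Definition of Done items the entry does not satisfy."""
    problems = []
    if entry.classification not in CLASSIFICATIONS:
        problems.append(f"unknown classification: {entry.classification}")
    if not entry.dsar_included and not entry.dsar_rationale.strip():
        problems.append("DSAR exclusion needs a rationale")
    no_policy = entry.retention.strip().lower() == "none yet"
    if no_policy and not entry.retention_rationale.strip():
        problems.append("retention 'none yet' needs a rationale")
    if entry.holds not in HOLD_BEHAVIOURS:
        problems.append(f"unknown hold behaviour: {entry.holds}")
    return problems
```

An empty list means the entry meets the checklist. A check like this could run in CI against a machine-readable copy of the inventory, if one is maintained.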

Current coverage (v1)

| Domain/Table(s) | Classification | DSAR | Retention | Sanitisation / Notes | Holds |
| --- | --- | --- | --- | --- | --- |
| users | Identity | Included | anonymize_user | Export sanitises metadata; scrub anonymises all fields. | Block |
| assignment_invites | Identity | Included | none yet | email field requires scrubbing on user deletion. | Allow |
| assignment_overrides | Operational | Included | anonymize_user (by user_id) | Redact reason text. | Block |
| assignments | Operational | Included | anonymize_user | security_config may require redaction. | Block |
| attempt_question_exposure | Telemetry | Excluded (operational) | telemetry_delete | None required. | Allow |
| audit_logs | Audit | Excluded | Retain (audit) | Redact metadata before export/surfacing. | Allow |
| certificates | Evidence | Included | anonymize_user | Redact metadata. | Block |
| compliance_ledger | Audit | Excluded | Retain (audit) | metadata sanitised on creation. | Allow |
| compliance_ledger_outbox | Audit | Excluded | Retain (audit) | metadata sanitised on creation. | Allow |
| delivery_session_events | Telemetry | Excluded (operational) | telemetry_delete | metadata sanitised on creation. | Allow |
| delivery_sessions | Telemetry | Included | telemetry_delete | Redact client_ip, client_user_agent, device_fingerprint, metadata. | Block |
| group_members | Identity | Included | anonymize_user | Redact metadata. | Block |
| groups | Identity | Included | none yet | Redact allowed_domains, membership_rule, metadata. | Block |
| magic_links | Identity | Included | none yet | email field requires scrubbing on user deletion. | Allow |
| media_assets | Evidence | Included (via submissions) | evidence_delete | Delete object and row; no sanitisation needed for asset refs. | Block |
| media_text_tracks | Evidence | Included (via asset refs) | evidence_delete | Redact raw_text. | Block |
| ledger_events | Audit | Excluded (system/execution audit trail; actor linkage is provenance, not DSAR subject data) | Retain (audit) | Redact metadata, context_metadata, and any entity-reference payload before export/surfacing. | Allow |
| outbox_events | Operational | Excluded | none yet | Redact payload, headers. | Allow |
| passages | Operational | Excluded (authoring content corpus; no learner/runtime subject linkage by itself) | none yet | Redact title, content, and tags if copied into logs/exports; authored text may contain sensitive material. | Allow |
| passage_versions | Operational | Included (contains created_by actor linkage for authoring provenance) | anonymize_user (by created_by) | Redact title, content, content_plain_text, content_plain_text_all_locales; keep hashes/internal lineage fields internal. | Block |
| privacy_holds | Audit | Included | Retain (audit) | Redact reason. | N/A |
| privacy_jobs | Audit | Excluded | Retain (audit) | Redact payload, decision_metadata, error_message. | Allow |
| program_enrolments | Knowledge | Included | anonymize_user | Redact metadata, program_snapshot. | Block |
| program_progress | Knowledge | Included | anonymize_user | None required. | Block |
| programme_assignment_outbox | Operational | Excluded | none yet | None required. | Allow |
| result_correction_batches | Audit | Excluded | Retain (audit) | Redact reason. | Allow |
| result_corrections | Audit | Excluded | Retain (audit) | Redact note, new_key. | Allow |
| submissions | Evidence | Included | evidence_delete | answers, result, version_snapshot. | Block |
| submission_items | Evidence | Included | evidence_delete | answer, metadata | Block |
| submission_score_versions | Knowledge | Included | evidence_delete | outcome_label | Block |
| content_packs | Operational | Excluded (authoring metadata; no subject PII) | none yet | Redact description; created_by is actor provenance only. | Allow |
| content_pack_items | Operational | Excluded (authoring linkage) | none yet | Redact metadata before export/logging. | Allow |
| content_pack_revisions | Operational | Excluded (authoring snapshot metadata) | none yet | Redact manifest if surfaced externally; contains content structure refs only. | Allow |
| content_pack_revision_items | Operational | Excluded (authoring linkage) | none yet | Redact metadata before export/logging. | Allow |
| content_pack_installs | Operational | Included (contains created_by actor linkage) | anonymize_user (by created_by) | No free text; treat ids as internal provenance data. | Block |
| taxonomy_facets | Operational | Excluded (taxonomy config) | none yet | Redact description if copied into logs. | Allow |
| taxonomy_terms | Operational | Excluded (taxonomy config) | none yet | Redact description if copied into logs. | Allow |
| taxonomy_term_relations | Operational | Excluded (taxonomy graph) | none yet | None required. | Allow |
| taxonomy_skill_mapping_sets | Operational | Excluded (mapping config) | none yet | Redact description if exported to broad audiences. | Allow |
| taxonomy_skill_mapping_set_versions | Operational | Excluded (mapping lifecycle metadata) | none yet | None required. | Allow |
| taxonomy_skill_mapping_rules | Operational | Excluded (mapping config) | none yet | No subject PII; keep rule metadata internal. | Allow |
| question_terms | Operational | Excluded (content-term link table) | none yet | None required. | Allow |
| evaluation_terms | Operational | Excluded (content-term link table) | none yet | None required. | Allow |
| subject_terms | Operational | Excluded (subjects-taxonomy link table; contains only internal subject/term references) | none yet | None required. | Allow |
| passage_terms | Operational | Excluded (passage-taxonomy link table; contains only internal passage/term references) | none yet | None required. | Allow |
| inventory_views | Operational | Included (owner-level saved views via owner_user_id) | anonymize_user (by owner_user_id) | Redact filters if they may include sensitive search predicates. | Block |
| reporting.defensibility_exception_triage | Audit | Included (contains owner_user_id) | Retain (audit/workflow) | Redact free text-like operator fields (reasonCodes remain controlled enums). | Block |
| user_org_units | Identity | Included | anonymize_user | None required. | Block |
| user_roles | Identity | Included | anonymize_user | None required. | Block |
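The Sanitisation column can be applied mechanically before a row is exported or logged. The sketch below assumes rows are plain dictionaries and uses a few entries copied from the table. `REDACT_FIELDS`, `PLACEHOLDER`, and `sanitise_row` are hypothetical names, and the placeholder string is an illustrative choice rather than a documented convention.

```python
# Fields to redact per table, copied from the Sanitisation column for a few
# rows of the coverage table. A real map would cover every table in scope.
REDACT_FIELDS = {
    "delivery_sessions": {
        "client_ip", "client_user_agent", "device_fingerprint", "metadata",
    },
    "groups": {"allowed_domains", "membership_rule", "metadata"},
    "outbox_events": {"payload", "headers"},
    "privacy_jobs": {"payload", "decision_metadata", "error_message"},
    "program_progress": set(),  # "None required."
}

PLACEHOLDER = "[REDACTED]"  # hypothetical marker value


def sanitise_row(table: str, row: dict) -> dict:
    """Return a copy of row with the table's sensitive fields masked.

    Tables missing from REDACT_FIELDS raise KeyError, so a new table cannot
    be exported until someone has recorded its sanitisation rule.
    """
    if table not in REDACT_FIELDS:
        raise KeyError(f"no sanitisation rule recorded for table: {table}")
    sensitive = REDACT_FIELDS[table]
    return {
        key: PLACEHOLDER if key in sensitive and value is not None else value
        for key, value in row.items()
    }
```

Failing on unknown tables is a deliberate choice in this sketch: it ties export code to the inventory, so adding a table without an inventory entry surfaces as an error instead of a silent leak. Tables marked "None required." still need an explicit empty entry.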

Note on Skipped Tables: Pure lookup tables (locales, capabilities), entity configuration tables (evaluations, programs, roles), and join tables without user context have been omitted as they do not contain PII.