Data Flow Map
This document outlines the primary data flows within the Evalium backend. It provides a high-level overview of how major domains connect and a detailed view of the critical “attempt → durable submission → reporting projection” boundary, including the defensible audit trail.
1. The Big Picture: Core Data Flows
Key contract (non-negotiable):
- Sessions are runtime state (retention-bound).
- Submissions (+ submission_items) are the durable source of truth for historical reporting.
- Evidence is linked durably to submissions and subject to storage tiering.
- The Ledger (ledger_events) is the immutable audit trail for all state changes.
- Reporting tables are derived projections (rebuildable from durable sources).
graph TD
subgraph "Audit & Defensibility (Immutable)"
L[Ledger Events]
end
subgraph "Authoring & Content (Durable)"
Q[Questions & Passages] --> EV[Evaluation Versions]
T[Taxonomy / Skills] --> Q
end
subgraph "People, Groups & Subjects (Durable)"
U[Users] --> G[Groups]
SBT[Subjects] --> SUB
end
subgraph "Assignment & Policy (Durable)"
EV --> A[Assignments]
U --> A
G --> A
A --> EP[Effective Policy]
end
subgraph "Delivery Runtime (Retention-bound)"
EP --> S[Delivery Session]
S --> SE[Session Events / Telemetry]
S --> SI[Submission Items / Findings]
EVD[Evidence / Media Uploads] --> SI
end
subgraph "Durable Outcomes (Source of Truth)"
S --> SUB[Submission]
SI --> SUB
EVD --> SUB
L -.-> SUB
end
subgraph "Post-Submission Lifecycle (Reactive)"
SUB --> REM[Remediation]
SUB --> CLM[Claims / Disputes]
SUB --> PRG[Programme Progress]
REM -.-> L
CLM -.-> L
end
subgraph "Analytics (Derived / Rebuildable)"
SUB --> R[Reporting Projections]
R --> PRG
end
2. Source of Truth by Domain
- Audit Truth (Defensibility): ledger_events. Every critical state change (publish, submit, review, remediate) MUST append to the ledger.
- Content Truth: evaluation_versions, question_versions, taxonomy_terms.
- Targeting + Policy Truth: assignments + assignment_overrides.
- Runtime Truth (Ephemeral): delivery_sessions, delivery_session_events (telemetry).
- Durable Attempt Truth (Historical): submissions, submission_items, submission_subjects, submission_evidence, version_snapshot.
- Analytics Truth (Derived): reporting.* tables; must be recomputable from durable attempt truth.
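The "recomputable" requirement can be illustrated with a minimal sketch. It assumes a hypothetical in-memory shape for durable submissions: the Submission dataclass, its score field, and rebuild_projection are illustrative names, not Evalium's schema. The point is only that a projection is a pure function of durable rows, so it can be dropped and rebuilt with the same result.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    # Hypothetical durable row; field names are assumptions for illustration.
    submission_id: str
    assignment_id: str
    score: float


def rebuild_projection(submissions):
    """Derive per-assignment statistics purely from durable submissions."""
    totals = defaultdict(lambda: {"count": 0, "score_sum": 0.0})
    for sub in submissions:
        bucket = totals[sub.assignment_id]
        bucket["count"] += 1
        bucket["score_sum"] += sub.score
    return {
        assignment_id: {
            "count": bucket["count"],
            "mean_score": bucket["score_sum"] / bucket["count"],
        }
        for assignment_id, bucket in totals.items()
    }


durable = [
    Submission("s1", "a1", 0.5),
    Submission("s2", "a1", 1.0),
    Submission("s3", "a2", 0.25),
]
projection = rebuild_projection(durable)
```

Because the function reads nothing except the durable rows, running it twice over the same input yields identical output, which is the property a rebuild relies on.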
3. Detailed Sub-Flow: Defensibility & The Ledger
The Ledger is cross-cutting. Any service mutation that affects a durable record must follow this pattern:
- Validate business rules.
- Execute the database mutation (within a transaction).
- Emit a ledger_event (within the same transaction).
- Notify downstream workers (e.g., reporting projection) via event bus or table triggers.
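The four steps can be sketched as a small helper. This is a minimal illustration using Python's built-in sqlite3 module rather than the platform's actual database layer; mutate_with_ledger, make_demo_db, the two-column ledger_events layout, and the notify callback are all assumptions made for the example.

```python
import json
import sqlite3


def mutate_with_ledger(conn, validate, mutate, event_type, payload, notify=None):
    """Run one durable mutation following the four-step pattern."""
    validate(conn)  # 1. Business rules; raises before anything is written.
    with conn:  # 2 + 3. One transaction: commit on success, roll back on error.
        mutate(conn)
        conn.execute(
            "INSERT INTO ledger_events (event_type, payload) VALUES (?, ?)",
            (event_type, json.dumps(payload)),
        )
    if notify is not None:  # 4. Only reached after a successful commit.
        notify(event_type, payload)


def make_demo_db():
    """Hypothetical two-table schema, just enough to exercise the pattern."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE evaluation_versions (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE ledger_events (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL
        );
        INSERT INTO evaluation_versions (id, status) VALUES ('ev1', 'draft');
        """
    )
    return conn
```

If the ledger insert fails, the surrounding transaction rolls back the mutation as well, so a durable record never changes without a matching ledger entry, and no downstream notification is sent.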
3.1 The Durability Boundary (Finalisation)
When a session is submitted or an observation is finalised:
sequenceDiagram
participant API
participant ResultsService
participant DB_Public
participant Ledger
participant Worker
API->>ResultsService: Submit(session_id)
ResultsService->>DB_Public: BEGIN TRANSACTION
ResultsService->>DB_Public: INSERT submissions (from session runtime)
ResultsService->>DB_Public: UPDATE submission_items SET submission_id
ResultsService->>DB_Public: INSERT submission_evidence (link uploads)
ResultsService->>DB_Public: INSERT submission_subjects (link subjects)
ResultsService->>Ledger: INSERT ledger_events (type: 'submission.finalised')
ResultsService->>DB_Public: COMMIT
ResultsService->>Worker: Trigger Reporting Projection
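The sequence above can be approximated in a few lines. The sketch below is not the ResultsService implementation: it uses sqlite3 with a simplified, assumed schema (SCHEMA), keeps the ledger in the same database so that it shares the transaction, and stands in for the worker with a plain trigger_projection callback. The function name finalise_submission and all column names are hypothetical.

```python
import sqlite3

# Simplified, assumed schema; the real tables carry many more columns.
SCHEMA = """
CREATE TABLE submissions (id INTEGER PRIMARY KEY, session_id TEXT UNIQUE NOT NULL);
CREATE TABLE submission_items (
    id INTEGER PRIMARY KEY, session_id TEXT NOT NULL, submission_id INTEGER
);
CREATE TABLE submission_evidence (submission_id INTEGER NOT NULL, upload_id TEXT NOT NULL);
CREATE TABLE submission_subjects (submission_id INTEGER NOT NULL, subject_id TEXT NOT NULL);
CREATE TABLE ledger_events (
    id INTEGER PRIMARY KEY, event_type TEXT NOT NULL, submission_id INTEGER NOT NULL
);
"""


def finalise_submission(conn, session_id, upload_ids, subject_ids, trigger_projection):
    """Create the durable submission and its ledger event in one transaction."""
    with conn:  # BEGIN ... COMMIT, or ROLLBACK if any statement raises.
        cur = conn.execute(
            "INSERT INTO submissions (session_id) VALUES (?)", (session_id,)
        )
        submission_id = cur.lastrowid
        conn.execute(
            "UPDATE submission_items SET submission_id = ? WHERE session_id = ?",
            (submission_id, session_id),
        )
        conn.executemany(
            "INSERT INTO submission_evidence (submission_id, upload_id) VALUES (?, ?)",
            [(submission_id, upload_id) for upload_id in upload_ids],
        )
        conn.executemany(
            "INSERT INTO submission_subjects (submission_id, subject_id) VALUES (?, ?)",
            [(submission_id, subject_id) for subject_id in subject_ids],
        )
        conn.execute(
            "INSERT INTO ledger_events (event_type, submission_id) VALUES (?, ?)",
            ("submission.finalised", submission_id),
        )
    # The projection is triggered only after the commit has succeeded.
    trigger_projection(submission_id)
    return submission_id
```

In this sketch the UNIQUE constraint on session_id means a second finalisation of the same session fails and rolls back, leaving the first submission and its single ledger event untouched.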
4. Observational Assessment Flow
Unlike the candidate-led delivery flow, observational assessments are observer-driven:
- Subject Selection: Observer selects one or more subjects (users or entities).
- Findings Recording: Observer records findings (which map to submission_items).
- Evidence Attachment: Observer attaches evidence (photos, documents) to specific findings.
- Finalisation: The observer "submits" the observation, creating a submission record with the subjects linked via submission_subjects.
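As a rough illustration of how the four stages map onto the durable tables, here is a small in-memory sketch. The Finding and Observation classes, their fields, and the dictionary returned by finalise are hypothetical and only mirror the table names used in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    # Maps to one submission_items row; evidence holds upload references.
    description: str
    evidence: list = field(default_factory=list)


@dataclass
class Observation:
    observer_id: str
    subject_ids: list  # Subject Selection: one or more subjects.
    findings: list = field(default_factory=list)

    def record_finding(self, description):
        """Findings Recording: add a finding and return it for evidence attachment."""
        finding = Finding(description)
        self.findings.append(finding)
        return finding

    def finalise(self):
        """Finalisation: produce the rows that would be written durably."""
        if not self.subject_ids:
            raise ValueError("an observation needs at least one subject")
        return {
            "submission": {"observer_id": self.observer_id},
            "submission_subjects": list(self.subject_ids),
            "submission_items": [f.description for f in self.findings],
            "submission_evidence": [e for f in self.findings for e in f.evidence],
        }


observation = Observation(observer_id="obs-1", subject_ids=["subj-1", "subj-2"])
finding = observation.record_finding("Guard rail missing on platform B")
finding.evidence.append("upload-42")  # Evidence Attachment
rows = observation.finalise()
```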
5. Post-Submission Lifecycle (Remediation & Disputes)
Submissions may be modified after finalisation through formal, audited processes:
- Results Remediation: An admin updates an answer key or drops a flawed question. This triggers a results.remediated ledger event and queues a reporting re-projection.
- Claims & Disputes: A candidate challenges a result. The workflow (claim -> dispute -> resolution) is recorded in the claims and disputes tables and mirrored in the ledger.
- Programme Orchestration: When a submission is finalised and scored, the programme_orchestrator evaluates whether it meets the requirements of any enrolled programmes. If so, it updates programme_enrolment_progress.
6. Retention & Cleanup Rules
- Runtime Telemetry: delivery_session_events are high-volume and may be purged after 30-90 days.
- Evidence Tiers: Fresh evidence is in "Hot" storage (S3/MinIO). After a period (e.g., 1 year), the evidence_storage_tier_worker moves it to "Cold" storage (Glacier/Archive) and updates the record.
- Audit Trail: ledger_events are never purged; they are the permanent record of platform activity.
- Durable Submissions: Retained according to the organization's retention_policy (configured in Compliance Centre).
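The rules above reduce to a few date comparisons. The sketch below is illustrative only: the 90-day and 365-day constants are assumptions taken from the example ranges in this section, and the function names are hypothetical rather than those used by evidence_storage_tier_worker.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds, taken from the example values in this section.
TELEMETRY_RETENTION = timedelta(days=90)  # upper end of the 30-90 day window
HOT_TIER_PERIOD = timedelta(days=365)     # the "e.g., 1 year" example


def telemetry_is_purgeable(recorded_at, now):
    """delivery_session_events may be purged once past the retention window."""
    return now - recorded_at > TELEMETRY_RETENTION


def evidence_tier(uploaded_at, now):
    """Evidence starts in hot storage and moves to cold after the hot period."""
    return "cold" if now - uploaded_at >= HOT_TIER_PERIOD else "hot"


def ledger_is_purgeable(recorded_at, now):
    """ledger_events are never purged, regardless of age."""
    return False


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
old_event = datetime(2024, 9, 1, tzinfo=timezone.utc)
recent_event = datetime(2024, 12, 1, tzinfo=timezone.utc)
old_upload = datetime(2023, 12, 1, tzinfo=timezone.utc)
```

Retention for durable submissions is deliberately left out, since it depends on the per-organisation retention_policy rather than a fixed threshold.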
7. Practical Implications for Feature Design
- Idempotency: All mutations at the durability boundary must be idempotent, using idempotency_key to handle network retries safely.
- Visibility Audit: Every time a user views sensitive data (a submission, evidence, or subject), a viewed event should be appended to the ledger (handled by visibility_helpers).
- Reporting is Recomputable: If the reporting.* tables are corrupted or the schema changes, the reporting_projection_worker can be re-run against all submissions to rebuild the analytics state.
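The idempotency rule can be sketched with an in-memory stand-in for the durable store. SubmissionStore and its finalise method are hypothetical; a real implementation would more likely enforce this with a unique constraint on idempotency_key inside the finalisation transaction.

```python
class SubmissionStore:
    """In-memory stand-in for the durable store, keyed by idempotency_key."""

    def __init__(self):
        self._by_key = {}
        self.ledger = []

    def finalise(self, idempotency_key, session_id):
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            # Retry of a request already handled: return the first result
            # and write nothing new (no second submission, no second event).
            return existing
        submission = {
            "submission_id": len(self._by_key) + 1,
            "session_id": session_id,
        }
        self._by_key[idempotency_key] = submission
        self.ledger.append(("submission.finalised", submission["submission_id"]))
        return submission


store = SubmissionStore()
first = store.finalise("key-abc", "sess-1")
retry = store.finalise("key-abc", "sess-1")  # e.g. the client timed out and resent
```

Here the retry returns the original submission and the ledger still holds a single submission.finalised entry, which is the behaviour a client retrying after a network failure needs.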