🛡️ Delivery Resilience & Hardening Roadmap
Owner: Product Engineering
Status: Draft
Pillar: 4 — Delivery Engine
Related: architecture, FOUNDATION.md, assignments-roadmap.md, roles-and-access-control.md
This document defines the Operational Hardening required for the Delivery Engine (ResultsService).
The core logic (preview, snapshotting, scoring) is functionally correct and the schema is mature.
What is missing is the Resilience Layer required for high-scale, adversarial, and lossy network conditions.
We are moving from “trusting the client” to “enforcing on the server”.
0. Scope & Assumptions
This roadmap assumes:
- delivery_sessions, submission_items, and submissions already exist and are wired to:
  - CreateDeliverySession
  - RecordAnswer
  - SubmitSession
- EvaluationService and ResultsService already:
  - Resolve buckets deterministically via seed.
  - Generate version_snapshot when creating submissions.
  - Compute scores and outcomes from the snapshot.
We also follow the Evalium golden rule:
- Configuration (The Law)
  - Lives in evaluation_versions and, for overrides, assignments / assignment_overrides.
  - It is immutable and snapshotted.
- State (The Reality)
  - Lives in delivery_sessions (and child tables).
  - It is mutable and describes what is actually happening in a specific attempt.
- Enforcement (The Judge)
  - ResultsService compares Reality to The Law on every request.
This work is not about rewriting the engine.
It is about hardening and operationalising it so that:
- timing can never be spoofed by the browser,
- background workers can enforce deadlines efficiently,
- monitoring can distinguish “truly active” from “zombie” sessions,
- audits can explain why a session ended, not just that it ended, and
- the schema is ready for high-stakes delivery features (section timing, linear navigation, lockdown browser) without painful rewrites later.
1. Objectives
- Zero-Trust Timing
  The server never relies on the browser to decide whether a test has expired.
- Scalable Expiry
  Deadlines are computed once and stored in expires_at, enabling cheap, indexed checks.
- Zombie Detection
  We distinguish between sessions that are merely “marked active” and sessions with recent activity.
- Forensic Auditability
  We capture why a session ended (timeout vs user vs admin vs system) and when the client says an answer was produced (for offline/bulk-upload detection).
- High-Stakes Readiness
  The schema can express:
  - per-section timing,
  - linear navigation constraints, and
  - secure browser configuration / forensics,
  even if UI and full enforcement logic arrive later.
Phase 1 — Schema Hardening
Goal: Equip the database to answer operational questions without complex calculations or repeated joins.
1.1 delivery_sessions Table Upgrades
We add three mandatory columns and one optional column to move key state from “runtime calculation” into “persistent, indexed fields”.
ALTER TABLE delivery_sessions
-- 1. Performance: the absolute wall-clock deadline for this session.
-- Computed ONCE at creation/resume. Nullable if no time limit applies.
ADD COLUMN expires_at TIMESTAMPTZ,
-- 2. Monitoring: liveness / heartbeat.
-- Updated on every interaction that touches the session (answers, heartbeat, resume).
ADD COLUMN last_active_at TIMESTAMPTZ DEFAULT NOW(),
-- 3. Audit: why did this session end?
-- Preferably use a dedicated enum type in the DB, e.g.
-- 'user_submit', 'auto_expired', 'admin_forced', 'system_error'
ADD COLUMN termination_reason TEXT,
-- 4. (Optional) Idempotency / optimistic locking.
-- Only use once the service layer is prepared to increment and check it.
ADD COLUMN lock_version INT DEFAULT 1;
-- CRITICAL: Index for the Auto-Close Worker and cheap expiry checks.
CREATE INDEX idx_sessions_active_expires
ON delivery_sessions (status, expires_at)
WHERE status = 'active';
Note: If the codebase is ready, consider introducing a proper enum type for termination_reason (e.g. CREATE TYPE delivery_session_termination_reason AS ENUM (...)) instead of raw text.
1.2 submission_items Forensics
ALTER TABLE submission_items
-- Forensic evidence: when did the CLIENT say this interaction happened?
-- This is for analysis, not for enforcement. Server-side timestamps remain authoritative.
ADD COLUMN client_timestamp TIMESTAMPTZ;
Assumption: submission_items already has server-side created_at / updated_at columns. If not, they should be added and populated as well.
Phase 2 — The Single Source of Truth (Service Layer)
Goal: Centralise deadline logic so it is never re-implemented across multiple handlers, where the copies could drift apart.
2.1 CalculateSessionExpiry (Domain Function)
We must not scatter started_at + time_limit arithmetic across handlers and workers.
We define a single internal domain function, used by:
- CreateDeliverySession
- Any future ResumeSession
- Any mid-flight “extend time” operations (e.g. late accommodations)
Behaviour:
- Fetch Baseline
  - Read evaluation_version.time_limit (or equivalent settings) for the chosen version.
- Apply Assignment Rules
  - Apply any assignment.time_limit_override from the assignment that spawned this session.
- Apply Overrides / Accommodations
  - Apply any assignment_overrides.time_limit_extension (or similar “mercy time” field).
- Compute Final Duration
  - effective_duration = baseline_limit + overrides + extensions
  - If no time limit exists, return nil.
- Compute Expiry
  - expires_at = started_at + effective_duration
- Persist
  - Store the result in delivery_sessions.expires_at.
Invariant: From this point on, all expiry checks in the system must consult delivery_sessions.expires_at as the source of truth, never recompute from raw settings.
If a manual override grants extra time mid-flight:
- call CalculateSessionExpiry again,
- update expires_at in a single, transactional operation.
Phase 3 — Enforcement (Guardrails in Handlers)
Goal: Make the runtime services actively enforce the rules encoded in the schema.
3.1 Hardening RecordAnswer (PUT /sessions/{id}/answers)
Before persisting any answer:
- Session Load & Status Check
  - Load the session by ID under RLS.
  - Reject if status is not active.
- Zero-Trust Expiry Check
  - Use server time only (ExpiresAt is assumed to be a nullable time such as sql.NullTime):
      if session.ExpiresAt.Valid && now.After(session.ExpiresAt.Time.Add(GracePeriod)) {
          return 403, "Session has expired"
      }
  - GracePeriod is a small server-side constant (e.g. 5–30 seconds) to tolerate minor clock skew and network jitter.
- Liveness Update
  - Set last_active_at = NOW() in the same transaction as the answer write.
- Persist Answer & Forensics
  - Upsert submission_items for (session_id, question_version_id, item_pos):
    - ensure this remains idempotent using the existing unique constraint.
  - Store:
    - server-side timestamps (created_at / updated_at),
    - client_timestamp from the payload (if provided).
Important: client_timestamp is never used to override expiry decisions. It is purely for later analysis (offline drift, bulk uploads, etc.).
3.2 Hardening SubmitSession
SubmitSession must be safe under retries and races (e.g. user double-click, network flakiness, concurrent auto-close).
Behaviour:
- Expiry Check (same as RecordAnswer)
  - Reject manual submit if the session is already past expires_at + GracePeriod:
    - optionally treat this as an auto-expire flow instead.
- Idempotent Status Transition
  - Use a guarded update:
      UPDATE delivery_sessions
      SET status = 'submitted',
          termination_reason = 'user_submit',
          last_active_at = NOW()
      WHERE id = $1
        AND status = 'active';
  - Check the affected row count:
    - If 0, the session was already submitted or auto-expired; surface a friendly response to the client.
- Submission Snapshot & Scoring
  - If the status change succeeded, run:
    - buildSubmissionSnapshot
    - computeAndPersistMetrics
  - Ensure the submission creation pipeline is idempotent:
    - either via a unique constraint on submissions.session_id,
    - or by checking if a submission already exists for that session before inserting.
Note: The same idempotent pattern should be used by the auto-close worker (Phase 4) so user submit and auto-expire cannot both generate submissions.
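To illustrate why the guarded update makes the transition safe under retries and races, here is a minimal in-memory model. The Store type stands in for the delivery_sessions table and CloseIfActive plays the role of the UPDATE with its "status = 'active'" condition; both are hypothetical names, and a real implementation would rely on the database's row-level atomicity rather than a mutex.

```go
package main

import "sync"

// Session holds the two columns the guarded update writes.
type Session struct {
	Status            string
	TerminationReason string
}

// Store is an in-memory stand-in for the delivery_sessions table.
type Store struct {
	mu       sync.Mutex
	sessions map[string]*Session
}

// CloseIfActive mirrors the guarded UPDATE: it changes the row only while
// status is still "active" and reports whether a row was affected.
func (st *Store) CloseIfActive(id, reason string) bool {
	st.mu.Lock()
	defer st.mu.Unlock()
	s, ok := st.sessions[id]
	if !ok || s.Status != "active" {
		return false // zero rows affected: already closed or unknown
	}
	s.Status = "submitted"
	s.TerminationReason = reason
	return true
}

// RaceClosers fires one goroutine per reason at the same session and
// returns how many of them actually performed the transition.
func RaceClosers(st *Store, id string, reasons []string) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	wins := 0
	for _, r := range reasons {
		wg.Add(1)
		go func(reason string) {
			defer wg.Done()
			if st.CloseIfActive(id, reason) {
				mu.Lock()
				wins++
				mu.Unlock()
			}
		}(r)
	}
	wg.Wait()
	return wins
}
```

However many callers race (a double-click, a network retry, the auto-close worker), exactly one sees a successful transition, so only that caller should go on to build the snapshot and compute metrics.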
Phase 4 — Operational Resilience (Background Jobs & Monitoring)
Goal: Automatically clean up stale sessions and express liveness clearly in monitoring.
4.1 The “Reaper” (Auto-Close Worker)
A background worker (cron or internal ticker) that finalises sessions whose deadline has passed.
Query (cheap due to index):
SELECT id
FROM delivery_sessions
WHERE status = 'active'
AND expires_at IS NOT NULL
AND expires_at < NOW() - $1::interval; -- $1 = GracePeriod, e.g. '30 seconds'
Action (per session ID):
- Attempt to transition status:
    UPDATE delivery_sessions
    SET status = 'submitted',
        termination_reason = 'auto_expired'
    WHERE id = $1
      AND status = 'active';
- If the row was updated:
  - Run the same submission pipeline as SubmitSession:
    - snapshot + scoring.
- If no row was updated:
  - Another actor has already closed or submitted the session; do nothing.
Invariant: Submission creation must remain idempotent with respect to session_id so that manual submit and the Reaper can race safely.
4.2 Liveness Indicators in the Command Centre
The Assignment Monitor (Pillar 3) can now use last_active_at to visualise liveness:
- Green: last_active_at > NOW() - 30s (very recent activity)
- Amber: last_active_at > NOW() - 5m (but older than 30s)
- Grey (Zombie): last_active_at <= NOW() - 5m while status = 'active'
This does not change any backend invariants but provides:
- realistic admin expectations (“this user is idle, not necessarily still present”), and
- a foundation for future alerts or interventions (e.g. admin force-close).
Phase 5 — High-Stakes Delivery Foundation (Schema Only)
Goal: Ensure the schema supports strict constraints for high-stakes delivery:
- section-level timing,
- linear navigation (“no going back”),
- secure / locked-down browser delivery,
without needing disruptive schema changes later. UI and full enforcement logic can follow in later iterations.
5.1 Section State Tracking (delivery_session_section_states)
High-stakes exams often require per-section timing (e.g. Section 1: 20 minutes, Section 2: 20 minutes), where unused time from one section cannot be carried over.
Configuration (The Law):
- Section time limits live in the evaluation configuration, e.g. in the evaluation JSON / snapshot:
    sections: [{ id: "s1", time_limit_seconds: 1200 }, ...]
State (The Reality):
- We track per-section timing in a child table:
-- Tracks the state and timing of specific sections within a session
CREATE TABLE delivery_session_section_states (
session_id UUID NOT NULL REFERENCES delivery_sessions(id) ON DELETE CASCADE,
section_id TEXT NOT NULL, -- Matches the section ID in the evaluation snapshot
status TEXT DEFAULT 'locked', -- 'locked', 'open', 'completed'
started_at TIMESTAMPTZ,
expires_at TIMESTAMPTZ, -- Calculated at the moment the section is opened
PRIMARY KEY (session_id, section_id)
);
Enforcement (The Judge) – future logic:
- When a section is opened:
  - backend creates/updates the corresponding row with:
    - status = 'open',
    - started_at = NOW(),
    - expires_at = NOW() + effective_section_limit (respecting any accommodations).
- When RecordAnswer is called:
  - service can check delivery_session_section_states for the relevant section_id:
    - reject if status != 'open' or NOW() > expires_at.
This keeps the section timing rules (The Law) in the evaluation snapshot and the enforced state (The Reality) in delivery_session_section_states, consistent with the broader architecture.
5.2 Navigation State for Linear Delivery
Some high-stakes exams require linear navigation (no revisiting previous items) or controlled progression per section.
We introduce navigation state on the session:
ALTER TABLE delivery_sessions
ADD COLUMN max_viewed_item_index INT DEFAULT 0,
ADD COLUMN current_section_id TEXT;
- max_viewed_item_index:
  - Highest item index the candidate has been allowed to see so far.
  - Supports rules such as “you cannot go back behind the furthest point reached”.
- current_section_id:
  - Optional convenience field to reflect which section is currently open,
  - complements delivery_session_section_states for quickly routing and validating navigation.
Future enforcement examples (not implemented in this phase):
- GetSession:
  - only returns items up to max_viewed_item_index in non-review modes.
- RecordAnswer:
  - if a request attempts to answer an item index < max_viewed_item_index while linear mode is configured:
    - reject with 403.
The key here is schema readiness; strict linear/section logic can be layered on later.
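As a rough sketch of how max_viewed_item_index could later drive linear enforcement, the two helpers below implement the rule exactly as stated above. The function names and the linear flag are assumptions; where the flag is read from (evaluation snapshot or assignment) is left open.

```go
package main

// CanAnswer applies the RecordAnswer rule sketched above: in linear mode,
// an item behind the furthest point reached can no longer be answered.
func CanAnswer(linear bool, maxViewedItemIndex, itemIndex int) bool {
	if !linear {
		return true // free navigation: any item may be answered
	}
	return itemIndex >= maxViewedItemIndex
}

// AdvanceMaxViewed returns the new high-water mark after the candidate
// views itemIndex. The mark only ever moves forward.
func AdvanceMaxViewed(maxViewedItemIndex, itemIndex int) int {
	if itemIndex > maxViewedItemIndex {
		return itemIndex
	}
	return maxViewedItemIndex
}
```

A handler that receives false from CanAnswer would respond with 403, matching the future enforcement example for RecordAnswer.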
5.3 Security Configuration for Locked-Down Browsers
For “locked-down browser” scenarios (e.g. Safe Exam Browser – SEB), we separate:
- Configuration (The Law):
  - what is expected (allowed SEB keys, IP allowlists),
- State (The Reality):
  - what the backend actually saw during the session.
5.3.1 Assignment Security Configuration
We ensure the assignment can store security and lockdown expectations:
ALTER TABLE assignments
ADD COLUMN security_config JSONB;
-- Example:
-- {
-- "seb_allowed_hashes": ["a7f...", "b2c..."],
-- "ip_allowlist": ["192.168.0.0/24"],
-- "require_seb": true
-- }
This is where we can store:
- expected SEB Browser Exam Keys (hashes derived from SEB config),
- IP allowlists / ranges,
- boolean flags like require_seb.
Future enforcement:
- Middleware on sensitive routes (e.g. /api/v1/sessions/**) will:
  - read assignments.security_config,
  - validate request headers such as X-SafeExamBrowser-RequestHash,
  - compare against seb_allowed_hashes,
  - reject (403) requests that do not present valid lockdown proofs.
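The decision at the heart of such middleware could be sketched as follows. The SecurityConfig struct mirrors the example JSON above; the method name and the way the client IP and presented hash reach it are assumptions. The sketch compares the presented value directly against seb_allowed_hashes, as described above. How that value is derived from the request header is deliberately left out, and a production check may need to recompute it per request.

```go
package main

import (
	"crypto/subtle"
	"encoding/json"
	"net/netip"
)

// SecurityConfig mirrors the example assignments.security_config document.
type SecurityConfig struct {
	SEBAllowedHashes []string `json:"seb_allowed_hashes"`
	IPAllowlist      []string `json:"ip_allowlist"`
	RequireSEB       bool     `json:"require_seb"`
}

// ParseSecurityConfig decodes the JSONB column value.
func ParseSecurityConfig(raw []byte) (SecurityConfig, error) {
	var cfg SecurityConfig
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

// Allowed reports whether a request from clientIP presenting presentedHash
// satisfies the configuration. An empty allowlist means "any IP".
func (c SecurityConfig) Allowed(clientIP, presentedHash string) bool {
	if len(c.IPAllowlist) > 0 {
		addr, err := netip.ParseAddr(clientIP)
		if err != nil {
			return false
		}
		inRange := false
		for _, cidr := range c.IPAllowlist {
			prefix, err := netip.ParsePrefix(cidr)
			if err == nil && prefix.Contains(addr) {
				inRange = true
				break
			}
		}
		if !inRange {
			return false
		}
	}
	if !c.RequireSEB {
		return true
	}
	for _, h := range c.SEBAllowedHashes {
		// Constant-time comparison avoids leaking how much of a hash matched.
		if subtle.ConstantTimeCompare([]byte(h), []byte(presentedHash)) == 1 {
			return true
		}
	}
	return false
}
```

Middleware would call Allowed once per request and respond with 403 when it returns false, while recording what was actually seen for later forensics.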
5.3.2 Session Device Fingerprint (Forensics)
We also record what we actually see during delivery for later analysis:
ALTER TABLE delivery_sessions
ADD COLUMN device_fingerprint JSONB;
-- Example:
-- {
-- "user_agent": "SEB/3.0.1 (Windows 10)",
-- "ip": "203.0.113.42",
-- "seb_hash": "a7f...",
-- "platform": "win32"
-- }
This supports:
- debugging “why was a session blocked?”,
- downstream analytics (e.g. distribution of platforms and SEB versions),
- future anomaly detection.
Important: Enforcement still happens in middleware / service guards.
device_fingerprint is for audit and monitoring; security_config expresses the law.
Summary of Changes
| Feature | Old Behaviour | New Resilience Behaviour / Foundation |
|---|---|---|
| Deadlines | Calculated on read (joins, JSON, logic) | Computed once and stored in expires_at |
| Time Checks | Relied on client timer | Server-side rejection via expires_at + grace |
| Activity | Binary (active / submitted) | Includes last_active_at liveness tracking |
| Completion | Ambiguous cause | Explicit termination_reason (why it ended) |
| Cheating View | No visibility | client_timestamp for offline / bulk-upload forensics |
| Concurrency | Implicit, best-effort | Idempotent transitions & submission creation |
| Section Timing | Only global session timing possible | Per-section timing via delivery_session_section_states |
| Linear Nav | No explicit navigation state | max_viewed_item_index / current_section_id schema in place |
| Lockdown Browser | No structured config or device evidence | assignments.security_config + device_fingerprint for SEB-style enforcement |
This roadmap converts the Delivery Engine from a primarily passive storage layer into an active enforcement engine, capable of supporting:
- high concurrency,
- untrusted clients,
- lossy network conditions,
- credible audits, and
- future high-stakes features (section timing, linear navigation, locked-down browsers),
and forms the required foundation for Programmes & Certifications (Pillar 5).