📗 UX-INSIGHTS.md

This document specifies the UX patterns for the Insights category. This is the "Durable Intelligence" layer of Evalium, where raw assessment outcomes are transformed into audit-proof reporting, skill projections, and high-stakes corrections.

Insights UX must prioritise provenance, interpretability, and defensibility so that every score and skill level can be explained.


Capability Baseline (Validated 2026-02-25)

This Insights plan spans both currently live backend capabilities and future-facing UX targets.

Backend-live now:

  • Submission-centric insights: koeStatus + dual-time proofReadiness (defensibleAtExecution, readyNow) on submission detail/list surfaces.
  • Defensibility exceptions queue with triage on submission, engagement, and programme-requirement lenses (including refresh + suppression semantics).
  • Ledger-derived reporting projections (summary/health/range), snapshot-aware submission retrieval, and remediation ledger flows.

Not backend-ready yet (feature-flag or park in UX):

  • Skills inference projections and competence profiles.
  • Deterministic skills explainability backed by persisted skill_evidence_facts provenance.
  • Skills recalculation/backfill operator jobs and progress surfaces.

Related backend gap to account for in UX:

  • Explicit first-class proctor command endpoints (pause / resume / terminate) are still pending; model proctor history as event/timeline data when present.

0. Insights Doctrine (MUST)

0.1 Durable over Ephemeral

Insights must only use durable truth:

  • submissions (+ submission_items where needed)
  • version_snapshot (the “what the candidate saw” contract)
  • skill_evidence_facts (durable skill contribution facts, once skills inference is enabled)
  • read models / projections derived from the above

Runtime session data is excluded from Insights to ensure reports survive retention cleanup.

0.2 Explainability

Every derived value (skill level, pass/fail outcome, corrected score, rollup metric) must provide an Explain affordance that links to one of:

  1. Evidence Basis (facts/submissions that contributed)
  2. Snapshot Scope (evaluation/rubric/mapping version used)
  3. Calculation Summary (human-readable “how computed”, not code/SQL)

0.3 Version Awareness

All Insights surfaces must show:

  • Scope indicators (evaluation version, mapping set version, framework version, cohort/run label where relevant)
  • Freshness indicators (last updated timestamp)
  • Processing indicators (queued/running/completed/failed), if a worker pipeline is involved

0.4 Projection Lag & Data States

Insights data can be in one of the following states:

  • Fresh: updated recently; no job pending
  • Processing: a known job is queued/running
  • Stale: no job running, but freshness exceeds a threshold OR refresh failed
  • Failed: job failed; actionable error text shown with a retry path (capability gated)

The UI must visibly distinguish Processing vs Stale.
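
The state machine is small enough to pin down directly. A minimal TypeScript sketch, assuming illustrative field names (lastUpdatedAt, activeJob, stalenessThresholdMs) rather than a confirmed API:

```ts
type InsightDataState = 'fresh' | 'processing' | 'stale' | 'failed';

interface ProjectionStatus {
  lastUpdatedAt: Date;
  activeJob?: { state: 'queued' | 'running' | 'completed' | 'failed' };
}

function deriveInsightDataState(
  status: ProjectionStatus,
  stalenessThresholdMs: number,
  now: Date = new Date(),
): InsightDataState {
  // Failed wins: the last refresh attempt errored and needs a retry path.
  if (status.activeJob?.state === 'failed') return 'failed';
  // Processing: a known job is queued or running.
  if (status.activeJob?.state === 'queued' || status.activeJob?.state === 'running') {
    return 'processing';
  }
  // Stale: no job pending, but freshness exceeds the threshold.
  const ageMs = now.getTime() - status.lastUpdatedAt.getTime();
  return ageMs > stalenessThresholdMs ? 'stale' : 'fresh';
}
```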

0.5 Insights Interface Guardrails (MUST)

  • Every derived value must provide an Explain path or an explicit, non-ambiguous unavailable reason.
  • Report blocks must always include scope chips, freshness, and processing state where applicable.
  • Small-sample metrics must render Insufficient data instead of inferred confidence.
  • Charts require accessible equivalents (table/text summaries) and keyboard operability.
  • Status/readiness language must use canonical labels; raw backend enums must not be primary UI text.
  • Mobile/iPad supports reading, filtering, and Explain inspection; complex report configuration may be desktop-optimised with an explicit handoff.

1. Core Insights Primitives (Reusable Patterns)

1.1 Insight Block (Dashboard Card)

All dashboards are composed of modular blocks.

Block MUST include

  • Title + short description (“what this represents”)
  • Scope chips (version/timeframe/cohort)
  • Freshness indicator
  • Explain link (Evidence Basis / Calculation / Snapshot Scope)
  • Drilldown CTA

1.2 Scope Chips (MUST)

Any derived view must show applicable scope chips. Common chips:

  • Evaluation Version (e.g., “Eval v3”)
  • Mapping Set Version (e.g., “Mapping v7”)
  • Framework Version (if applicable)
  • Run Label / Cohort (e.g., “Compliance_Q1_2025”)
  • Time window (e.g., “Last 30 days”)

Only show chips backed by current data contracts for that surface (e.g., mapping/framework chips are hidden until skills projections are live).
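
Sections 1.1 and 1.2 together imply a common props contract for blocks. A hedged TypeScript sketch; the type and field names (ScopeChip, InsightBlockProps, and so on) are assumptions for illustration, not the shipped component API:

```ts
type ScopeChip =
  | { kind: 'evaluationVersion'; label: string }  // e.g. “Eval v3”
  | { kind: 'mappingSetVersion'; label: string }  // hidden until skills projections are live
  | { kind: 'frameworkVersion'; label: string }
  | { kind: 'runLabel'; label: string }           // e.g. “Compliance_Q1_2025”
  | { kind: 'timeWindow'; label: string };        // e.g. “Last 30 days”

interface InsightBlockProps {
  title: string;
  description: string;                            // “what this represents”
  scopeChips: ScopeChip[];                        // only chips backed by current data contracts
  freshness: {
    state: 'fresh' | 'processing' | 'stale' | 'failed'; // mirrors the 0.4 states
    lastUpdatedAt: Date;
  };
  explain: { kind: 'evidenceBasis' | 'calculation' | 'snapshotScope'; href: string };
  drilldownHref: string;                          // drilldown CTA target
}
```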

1.3 Freshness & Processing Banner (MUST)

A consistent banner pattern used across Insights pages:

  • Updated: “Updated 2m ago”
  • Processing: “Processing updates…”
  • Stale: “Data may be stale — last updated 12m ago”
  • Failed: “Update failed — retry” (capability gated)

1.4 Evidence Basis Drawer (MUST)

A standard drawer used by all Explain links that point to evidence.

Drawer MUST include

  • Clear title (“Evidence Basis: [Skill]” / “Evidence Basis: Pass rate”)
  • Scope chips (mapping/eval/version/timeframe)
  • Evidence list (facts or submissions) in a timeline
  • Links to underlying Attempt Viewer(s)
  • “Open full page” escape hatch where appropriate
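
A hedged sketch of the drawer contract, reusing the ScopeChip union from the 1.2 sketch; all other identifiers are illustrative assumptions:

```ts
interface EvidenceItem {
  source: 'K' | 'O' | 'E';
  label: string;                // human-friendly name
  occurredAt: Date;
  attemptViewerHref?: string;   // link to the underlying Attempt Viewer, where applicable
}

interface EvidenceBasisDrawerProps {
  title: string;                // “Evidence Basis: [Skill]” / “Evidence Basis: Pass rate”
  scopeChips: ScopeChip[];      // mapping/eval/version/timeframe
  evidence: EvidenceItem[];     // rendered as a timeline
  fullPageHref?: string;        // “Open full page” escape hatch, where appropriate
}
```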

2. Competence Profiles (Skill Projections, Planned/Parked)

This section is a target-state contract, not a current-implementation contract. Do not ship this surface until the backend persists deterministic skill provenance facts (mapping set version, rule IDs, source term IDs) and exposes explain-ready read models.

The Competence Profile is a view-layer projection of a user's abilities based on the KOE model:

  • K Knowledge (tests)
  • O Observation (assessor judgements)
  • E Evidence (uploads / artefacts)

The default experience is heatmap-first to show strengths and weaknesses at a glance.


2.1 The Profile Hub (Heatmap-First)

Primary View: Skill Heatmap (MUST)

A grid that visualises strengths/weaknesses defensibly.

Layout

  • Skills grouped by framework/category (collapsible sections)
  • Each skill has a heatmap cell representing attainment
  • A secondary overlay represents confidence/coverage (evidence sufficiency)

Heatmap Cell Semantics

  • Attainment (strength/weakness) is the primary dimension.
  • Confidence/Coverage is separate and must not be confused with attainment:
    • Use a compact overlay pattern (e.g., mini meter, dot density, or badge)
    • Tooltip shows explicit counts: “Evidence: 3 of 5 required” (or equivalent rule)

Status Chip per Skill Row

  • Not Started
  • In Progress
  • Achieved

KOE Mix Hint (SHOULD)

A tiny indicator shows evidence composition:

  • “K:2 O:1 E:0” (or icons)

This helps users understand why confidence is low without opening the drawer.

Last Evidence Date

  • Shows the most recent contributing evidence date
  • Click opens Evidence Basis drawer focused on “most recent”
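
To keep attainment and confidence/coverage from being conflated in code as well as in the UI, a cell can carry them as separate fields. A TypeScript sketch; names and scales are assumptions:

```ts
interface HeatmapCell {
  skillId: string;
  // Primary dimension: attainment (strength/weakness), e.g. on a 0–1 scale.
  attainment: number | null;                      // null = no attainment signal yet
  // Separate dimension: evidence sufficiency, never blended into attainment.
  coverage: { have: number; required: number };   // tooltip: “Evidence: 3 of 5 required”
  status: 'notStarted' | 'inProgress' | 'achieved';
  koeMix: { k: number; o: number; e: number };    // “K:2 O:1 E:0” hint
  lastEvidenceAt?: Date;                          // most recent contributing evidence
}
```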

Heatmap Controls (SMB-Simple) (MUST)

  • Framework selector (if multiple)
  • Filter: “Needs attention”
    • Low attainment OR low evidence coverage
  • Sort:
    • Lowest attainment first
    • Lowest coverage first
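
A minimal sketch of the “Needs attention” filter and the two sorts, reusing the HeatmapCell shape above; the attainment threshold is an illustrative assumption:

```ts
// Illustrative threshold for “low attainment”.
const LOW_ATTAINMENT = 0.5;

function needsAttention(cell: HeatmapCell): boolean {
  const lowAttainment = cell.attainment !== null && cell.attainment < LOW_ATTAINMENT;
  const lowCoverage = cell.coverage.have < cell.coverage.required;
  return lowAttainment || lowCoverage; // low attainment OR low evidence coverage
}

// “Lowest attainment first”; cells with no signal sort last.
const byAttainment = (a: HeatmapCell, b: HeatmapCell) =>
  (a.attainment ?? Infinity) - (b.attainment ?? Infinity);

// “Lowest coverage first”, compared as a fraction of the requirement.
const byCoverage = (a: HeatmapCell, b: HeatmapCell) =>
  a.coverage.have / a.coverage.required - b.coverage.have / b.coverage.required;
```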

Strengths & Gaps Summary (SHOULD)

Above the heatmap:

  • “Top strengths” list
  • “Key gaps” list

Defensibility rule

If coverage is low, label the gap as:

  • “Low evidence” rather than “Weak”

Each list item is clickable → Evidence Basis drawer.


2.2 Evidence Basis Drill-down Drawer (MUST)

Clicking any skill cell opens a drawer revealing the “why” behind the rating.

Drawer content

  • Evidence Timeline: list of skill_evidence_facts in chronological order
  • Source labels: K / O / E (with human-friendly names)
  • Contribution summary: what each fact contributed (e.g., “+1 achieved”, “meets threshold”, “partial”)
  • Mapping Provenance:
    • Mapping Set Version chip
    • Rule name/identifier (human-readable)
    • Optional “Calculation summary” link (what the rule does)

Links

  • Each evidence item links to the Attempt Viewer (Submission) where applicable
  • Mapping rule detail opens a small “Rule detail” drawer (readable explanation, not raw logic)
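
For the parked target state, the drawer content above suggests a durable fact row roughly like the following. This is a sketch only; apart from skill_evidence_facts itself, the field names are assumptions:

```ts
interface SkillEvidenceFact {
  factId: string;
  skillId: string;
  source: 'K' | 'O' | 'E';         // with human-friendly names resolved in the UI
  contribution: string;            // e.g. “+1 achieved”, “meets threshold”, “partial”
  submissionId?: string;           // Attempt Viewer link target, where applicable
  mappingSetVersion: string;       // provenance chip
  ruleId: string;                  // human-readable rule name/identifier
  recordedAt: Date;                // orders the Evidence Timeline
}
```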

2.3 Profile Freshness & Processing (MUST)

Competence Profiles must show:

  • Updated time
  • Processing state if recalculation jobs are queued/running
  • Stale badge if data is old and no job is running

3. Reporting & Analytics

Reporting provides version-scoped summaries of performance and operational outcomes.


3.1 Blocks-Based Dashboards (MUST)

Dashboards are composed of modular Insight Blocks. Recommended baseline blocks:

  • Hero Block
    • Pass rate, average score, completion volume
  • Funnel Block
    • Assignment → Started → Submitted → Graded (if subjective grading exists)
  • Distribution Block
    • Score histogram / bands for difficulty detection
  • Item Health Block
    • Flags items with poor discrimination / unexpected patterns
    • Includes data sufficiency guardrails

Data Sufficiency Guardrails (MUST)

If sample size is low, blocks must degrade safely:

  • Show “Insufficient data” (with threshold) instead of presenting fragile metrics as truth
  • Explain link clarifies why the block is limited
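
A minimal sketch of the sufficiency guard; the threshold value is illustrative, not a product decision:

```ts
// Illustrative threshold, not a product decision.
const MIN_SAMPLE = 30;

type BlockResult<T> =
  | { kind: 'metric'; value: T }
  | { kind: 'insufficientData'; threshold: number; sampleSize: number };

function guardMetric<T>(sampleSize: number, compute: () => T): BlockResult<T> {
  if (sampleSize < MIN_SAMPLE) {
    // Render “Insufficient data” with the threshold; the Explain link says why.
    return { kind: 'insufficientData', threshold: MIN_SAMPLE, sampleSize };
  }
  return { kind: 'metric', value: compute() };
}
```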

Filters (MUST)

Common filters should be consistent across dashboards:

  • Evaluation / Programme
  • Version (see 3.2)
  • Run label / cohort
  • Time window
  • Org unit (hidden if single org unit tenant)

3.2 Version-Scoped Summaries (MUST)

Because content is immutable and attempts are snapshot-based, reporting must make version scope explicit.

UI pattern

  • Version picker defaults to Latest Published
  • Allows:
    • switch to prior versions
    • compare (optional) across versions (Phase 3+)

Version scope chips must appear on every block.

Provisional Badge (MUST)

If subjective grading is pending for the selected cohort/scope:

  • Label results as Provisional
  • Explain link opens “What’s pending” drawer (e.g., count of submissions awaiting subjective marking)

3.3 Drilldowns (MUST)

Every block’s drilldown must land on a consistent “Detail Report” page:

  • Table + filters
  • Scope chips + freshness
  • Explain affordances per row (where applicable)
  • Links to Attempt Viewer for per-candidate analysis (capability gated)

3.4 Proof Readiness & Exceptions (MUST)

Insights must include a first-class operator view of defensibility readiness, not just scores.

Minimum contract:

  • Show proofReadiness dual-time states:
    • defensibleAtExecution
    • readyNow
  • Show stable reason codes and policy refs used in computation.
  • Present KOE status alongside readiness so operators can see why action is needed.
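
A TypeScript sketch of that minimum contract; koeStatus, proofReadiness, defensibleAtExecution, and readyNow come from this document, while the surrounding field names are assumptions:

```ts
interface ProofReadiness {
  defensibleAtExecution: boolean;  // defensible at the time the attempt ran
  readyNow: boolean;               // defensible under current policy
  reasonCodes: string[];           // stable reason codes used in computation
  policyRefs: string[];            // policy references used in computation
}

interface SubmissionReadinessRow {
  submissionId: string;
  koeStatus: string;               // shown alongside readiness to explain why action is needed
  proofReadiness: ProofReadiness;
}
```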

Exceptions queue behaviour:

  • Queue rows include lens + subject identifiers and reason summary.
  • Triage metadata is explicit: state, owner, firstSeenAt, lastSeenAt, suppressedUntil.
  • Support lens-specific views:
    • submission
    • engagement rollup
    • programme requirement rollup
  • Default list should hide suppressed items unless explicitly requested.

Status label contract:

  • Readiness labels in the UI are Ready, Needs review, and Blocked (mapped from backend enums).
  • Triage labels in the UI are Open, Acknowledged, and Resolved (mapped from backend enums).
  • Do not mix alternate wording (for example, Action needed) into readiness/triage workflows.
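
A sketch of the canonical label mapping; the backend enum values on the left are assumptions, since only the UI labels are specified here:

```ts
const READINESS_LABELS: Record<string, string> = {
  READY: 'Ready',
  NEEDS_REVIEW: 'Needs review',
  BLOCKED: 'Blocked',
};

const TRIAGE_LABELS: Record<string, string> = {
  OPEN: 'Open',
  ACKNOWLEDGED: 'Acknowledged',
  RESOLVED: 'Resolved',
};

// Raw enums must never surface as primary UI text; returning undefined forces
// callers to handle unknown enum values explicitly.
const readinessLabel = (backendEnum: string): string | undefined =>
  READINESS_LABELS[backendEnum];
```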

Explainability rule:

  • Readiness/KOE summaries are derived projections; execution truth remains in ledger/snapshots.
  • External glass-box style surfaces must capability-redact detailed reasons by default.

4. Attempt Viewer (Submissions)

The Attempt Viewer is the standard interface for reviewing a specific candidate’s result. It is the canonical truth page for an attempt.


4.1 Historical Integrity (MUST)

Snapshot View (MUST)

The UI renders the evaluation exactly as it was at the time of the attempt, using the version_snapshot.

  • “Snapshot Scope” is visible as chips (Eval vX, rubric version, etc.)
  • “View snapshot detail” opens a drawer (read-only)

Activity Ledger (MUST)

Displays the durable history of the attempt:

  • timestamps (started/completed)
  • duration
  • completion method/reason (where available)
  • proctor notes/events (explicit command workflow is backend-pending)
  • overrides applied (effective limits shown)
  • remediation applied (links to batch)

4.2 Score & Corrections (Ledger UX) (MUST)

Scores are append-only and may be corrected.

UI requirements

  • Show the current score/outcome
  • If corrected, show:
    • “Corrected” badge
    • Original → Current
    • Link to remediation batch + reason

Score Version Strip (SHOULD)

A compact timeline:

  • v1 (original) → v2 (corrected) → …

Clicking a version shows “what changed” in a drawer.
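
An append-only ledger makes both the “Corrected” badge and the version strip cheap to derive. A hedged sketch; identifiers are illustrative:

```ts
interface ScoreVersion {
  version: number;              // v1 (original), v2 (corrected), …
  score: number;
  outcome: string;              // e.g. pass/fail
  remediationBatchId?: string;  // present on corrections
  reason?: string;              // mandatory on corrections
  recordedAt: Date;
}

// The current score is always the latest entry; earlier entries are never mutated.
const currentScore = (ledger: ScoreVersion[]) => ledger[ledger.length - 1];
// “Corrected” badge: more than one version, with Original → Current derivable.
const isCorrected = (ledger: ScoreVersion[]) => ledger.length > 1;
```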

4.3 Feedback Rendering

The UI respects feedback_mode defined in the snapshot:

  • none / overall / tags / items

SMB principle

  • Candidates see only what is configured
  • Admins always see full diagnostics (tags + item-level) where permitted

5. Results Remediation (Correcting Truth)

Remediation allows authors to fix unfair scoring without deleting data.


5.1 Correction Batch Wizard

Scope First (MUST)

The wizard begins with explicit scope selection:

  • Evaluation (and version)
  • Run label / cohort (optional but recommended)
  • Date range (optional)

This prevents accidental broad corrections.

Rule Builder (MUST)

Define correction actions:

  • Mark correct
  • Drop item
  • Replace key

(Extensible later.)

Impact Preview (SHOULD)

Before applying, show a dry-run summary:

  • “X candidates move from Fail → Pass”
  • “Average score changes 72% → 75%”
  • “Y submissions affected”

If not available yet, show a placeholder state:

  • “Impact preview not available for this batch” (with rationale)
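
A sketch of the dry-run payload behind the preview, with field names inferred from the example copy above (all assumptions):

```ts
interface ImpactPreview {
  submissionsAffected: number;   // “Y submissions affected”
  outcomeChanges: Array<{ from: string; to: string; count: number }>; // “X candidates move Fail → Pass”
  averageScoreBefore: number;    // e.g. 72
  averageScoreAfter: number;     // e.g. 75
}

type ImpactPreviewState =
  | { kind: 'available'; preview: ImpactPreview }
  | { kind: 'unavailable'; rationale: string }; // drives the placeholder copy
```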

5.2 Remediation Ledger (MUST)

Every batch creates a permanent record.

UI requirements

  • Mandatory Reason
  • Clear status: queued / applying / applied / failed
  • Link to affected submissions list

Idempotency Feedback (MUST)

If re-applied:

  • “No changes needed — already applied”
  • Provide link to prior application record

Revert Visibility (SHOULD)

If revert/compensating corrections exist:

  • Provide a “Revert” action (capability gated)
  • Revert must create its own ledger record and reason

6. Skills Recalculation (Backfills, Planned/Parked)

When Skill Mapping Sets change, SMBs can re-project historical data.

This remains parked until skills projection/provenance pipelines are backend-mature.


6.1 Recalculation Job UI

Scope Selection (MUST)

Choose recalculation scope:

  • Org unit
  • Evaluation
  • Date range
  • Run label (optional)

Processing State (MUST)

Show job states consistently:

  • queued / running / completed / failed

Progress Tracker (MUST)

  • Progress bar + counts processed
  • “Estimated completion” is optional; do not show if unreliable
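
One way to honour “do not show if unreliable” is to withhold the estimate until enough work has completed for it to stabilise. A sketch; the 10% heuristic is an illustrative assumption:

```ts
interface RecalcProgress {
  state: 'queued' | 'running' | 'completed' | 'failed';
  processed: number;           // counts processed, drives the progress bar
  total: number;
  startedAt?: Date;
}

// Withhold the ETA until at least 10% of rows are processed, so an early,
// noisy estimate is never shown; undefined means omit “Estimated completion”.
function estimatedCompletion(p: RecalcProgress, now: Date = new Date()): Date | undefined {
  if (p.state !== 'running' || !p.startedAt || p.total === 0 || p.processed < p.total * 0.1) {
    return undefined;
  }
  const elapsedMs = now.getTime() - p.startedAt.getTime();
  const remainingMs = (elapsedMs / p.processed) * (p.total - p.processed);
  return new Date(now.getTime() + remainingMs);
}
```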

Result Summary (MUST)

Once complete:

  • Rows affected
  • Skills impacted
  • Link to updated competence profiles
  • Freshness updated timestamp

7. Export & Defensibility Surfaces

7.1 Exports (MUST)

Exports must preserve scope and provenance:

  • Include scope chips in export metadata (version/timeframe/run label)
  • If data is provisional or corrected, exports must label it
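
A minimal sketch of export metadata that carries this provenance, reusing the ScopeChip union from the 1.2 sketch; the remaining names are assumptions:

```ts
interface ExportMetadata {
  scope: ScopeChip[];            // version / timeframe / run label
  generatedAt: Date;
  provisional: boolean;          // subjective grading still pending in scope
  containsCorrections: boolean;  // corrected scores present and labelled as such
}
```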

Common exports:

  • CSV (tables)
  • PDF (Attempt Viewer / summary report)
  • DSAR inclusion surfaces are governed elsewhere; include competence profiles only when skills projections are active and defensible

7.2 Audit-Proof Reporting (MUST)

Any page that presents “final truth” must make it defensible:

  • Explain links present
  • Ledger visible
  • Version scope explicit
  • Corrections visible with reason

8. Summary of Navigation Contracts

| Insights Task | Pattern | Primary CTA | Link Contract (Peek + Full Page) |
| --- | --- | --- | --- |
| View Skills (Future) | Heatmap/Profile Hub | Recalculate | Evidence Basis (facts + submissions) |
| Review Defensibility Queue | Exceptions Table + Triage | Acknowledge / Suppress / Resolve | Submission/Engagement/Programme detail |
| Review Result | Attempt Viewer | Download PDF | Version Snapshot Detail |
| Fix Scores | Remediation Wizard | Apply Batch | Affected Submissions List |
| Export Data | Table + Filter | Export CSV | Context Drawer (User / Org / Version Scope) |