Results Remediation

1. Goals & constraints

What we want from remediation:

  • Fix author mistakes (wrong key, bad item) after delivery.
  • Keep snapshots immutable – never rewrite what the candidate saw.
  • Keep an audit trail – show what changed, who did it, when, and why.
  • Let scores and outcomes change safely – including any future Programme / requirement checks.
  • Be incremental – we can start simple (item-level corrections) and expand if needed.

Given your current design (immutable version_snapshot + results_service scoring), the cleanest pattern is:

Keep submissions and snapshots immutable as records of what happened, but add a correction ledger and a score history, and make “current score” always reflect the latest rules.


2. Data model

2.1 Correction batches (the “what and why”)

A batch groups one or more corrections with a reason and actor.

Table: result_correction_batches

  • id UUID PK
  • tenant_id UUID
  • created_by UUID – admin user ID
  • reason TEXT – “Q3 had wrong answer key”, etc.
  • created_at TIMESTAMPTZ
  • Optional: evaluation_version_id UUID – if it’s scoped to a single test

A batch is your “remediation event”: “On this date, we fixed these questions for this exam”.
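
As a sketch, assuming PostgreSQL, the batch table could be created like this (defaults and NOT NULL choices are assumptions, not a final migration):

```sql
CREATE TABLE result_correction_batches (
    id                     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id              UUID NOT NULL,
    created_by             UUID NOT NULL,        -- admin user who raised the batch
    reason                 TEXT NOT NULL,        -- e.g. "Q3 had wrong answer key"
    evaluation_version_id  UUID,                 -- optional: scope to a single test
    created_at             TIMESTAMPTZ NOT NULL DEFAULT now()
);
```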


2.2 Correction rules (the “how to treat this item now”)

Each rule describes what to do with a specific broken item.

Table: result_corrections

  • id UUID PK

  • batch_id UUID REFERENCES result_correction_batches(id)

  • evaluation_version_id UUID – target eval version

  • question_version_id UUID – the specific item

  • correction_type TEXT – start with a small enum-like set:

    • drop_item – item contributes 0 to max and score.
    • mark_correct – treat all answers as fully correct.
    • replace_key – use new_key instead of the original for scoring.
  • new_key JSONB NULL – only used when correction_type = 'replace_key'.

    • E.g. { "qtype": "mcq_single", "correctIds": ["c2"] }
  • note TEXT – optional explanation

  • applied_at TIMESTAMPTZ

These rules are not applied by mutating snapshots. They are used by the scoring engine as an overlay when re-evaluating affected submissions.
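
A matching sketch for the rules table, with the enum-like set enforced by a CHECK constraint (constraint details and the uniqueness rule are assumptions):

```sql
CREATE TABLE result_corrections (
    id                     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    batch_id               UUID NOT NULL REFERENCES result_correction_batches(id),
    evaluation_version_id  UUID NOT NULL,        -- target eval version
    question_version_id    UUID NOT NULL,        -- the specific item
    correction_type        TEXT NOT NULL
        CHECK (correction_type IN ('drop_item', 'mark_correct', 'replace_key')),
    new_key                JSONB,                -- only when correction_type = 'replace_key'
    note                   TEXT,
    applied_at             TIMESTAMPTZ,
    -- assumption: at most one rule per item per eval version within a batch
    UNIQUE (batch_id, evaluation_version_id, question_version_id)
);
```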


2.3 Score history (the “before and after”)

We want to:

  • know what the original score and outcome were,
  • know what the new ones are, and
  • still have submissions hold the current truth for day-to-day reporting.

Table: submission_score_versions (or submission_score_history)

  • id UUID PK
  • submission_id UUID REFERENCES submissions(id)
  • version_no INT – 1 for original, 2+ for each remediation
  • source TEXT – initial | remediation
  • batch_id UUID NULL – link to result_correction_batches when source = remediation
  • score NUMERIC
  • max_score NUMERIC
  • outcome_code TEXT
  • created_at TIMESTAMPTZ

And a small addition to submissions (optional but helpful):

  • latest_score_version INT – pointer to the latest row in submission_score_versions.
  • Or you can infer “latest” by max(version_no) per submission, but a pointer is cheaper.
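
A possible shape for the history table and the pointer column (again a sketch; constraints and defaults are assumptions):

```sql
CREATE TABLE submission_score_versions (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    submission_id  UUID NOT NULL REFERENCES submissions(id),
    version_no     INT NOT NULL,                 -- 1 = original, 2+ = remediations
    source         TEXT NOT NULL CHECK (source IN ('initial', 'remediation')),
    batch_id       UUID REFERENCES result_correction_batches(id),  -- set when source = 'remediation'
    score          NUMERIC NOT NULL,
    max_score      NUMERIC NOT NULL,
    outcome_code   TEXT,
    created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (submission_id, version_no)
);

-- Optional pointer so "latest" never needs a max(version_no) scan.
ALTER TABLE submissions ADD COLUMN latest_score_version INT;
```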

Behaviour:

  • On the first scoring (today’s behaviour), you:

    • insert a submission_score_versions row with source='initial', version_no=1,
    • write the same score/outcome into submissions for fast queries.
  • On rescoring, you:

    • add a new row with source='remediation', version_no = previous + 1, batch_id=...,
    • update submissions.score, submissions.max_score, submissions.outcome_code to the new values and bump latest_score_version.
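
In SQL terms the two write paths could look roughly like this (the :placeholders stand for values computed by results_service):

```sql
-- First scoring: record history row 1 (the same values are mirrored onto submissions today).
INSERT INTO submission_score_versions
    (submission_id, version_no, source, score, max_score, outcome_code)
VALUES
    (:submission_id, 1, 'initial', :score, :max_score, :outcome_code);

-- Rescoring for a correction batch: append history, then update the current truth.
INSERT INTO submission_score_versions
    (submission_id, version_no, source, batch_id, score, max_score, outcome_code)
SELECT s.id, s.latest_score_version + 1, 'remediation', :batch_id,
       :new_score, :new_max_score, :new_outcome_code
FROM submissions s
WHERE s.id = :submission_id;

UPDATE submissions
SET score                = :new_score,
    max_score            = :new_max_score,
    outcome_code         = :new_outcome_code,
    latest_score_version = latest_score_version + 1
WHERE id = :submission_id;
```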

This way:

  • The history is in submission_score_versions.
  • The current truth is in submissions.
  • You never touch the version_snapshot.

3. Correction logic (what a “rescore” actually does)

At a high level, rescoring for a batch would:

  1. Identify all affected submissions.

  2. For each submission:

    • Load its version_snapshot and recorded answers.
    • Apply the correction rules for any impacted questions.
    • Recalculate item scores + total score + outcome.
    • Append a new entry to submission_score_versions.
    • Update submissions with the new score/outcome.

3.1 Finding affected submissions

In practice, a batch may specify one or more (evaluation_version_id, question_version_id) pairs.

To find affected submissions, you can scope to:

  • submissions.evaluation_version_id = <evaluation_version_id> AND

  • either:

    • look up answers in submission_items by question_version_id, or
    • (if needed) inspect version_snapshot.questions – but submission_items will usually be cheaper.

You don’t need to rewrite anything in those rows; you just reuse them as input to scoring.
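
A sketch of that lookup, assuming submission_items carries submission_id and question_version_id per recorded answer:

```sql
-- All submissions touched by a batch's correction rules.
SELECT DISTINCT s.id
FROM result_corrections rc
JOIN submissions      s  ON s.evaluation_version_id  = rc.evaluation_version_id
JOIN submission_items si ON si.submission_id         = s.id
                        AND si.question_version_id   = rc.question_version_id
WHERE rc.batch_id = :batch_id;
```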

3.2 How the correction types behave

  • drop_item

    • Treat the item as if it has 0 max_score and 0 score for everyone.
    • Effectively normalises the test as “total score from the remaining items”.
    • This usually means changing the denominator: max_score_new = max_score_old - item_max_score.
  • mark_correct

    • For everyone who attempted the question, give full credit regardless of their selected response.
    • score_item = item_max_score for all.
    • This keeps the denominator the same but pushes everyone up on that item.
  • replace_key

    • Treat new_key as the authoritative marking key.
    • Re-score the item for each candidate using the new key.
    • This can push some scores up and others down.

Because you have full snapshots and answers, the scoring engine can apply these rules deterministically.
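
The overlay itself boils down to a pair of CASE expressions over whatever per-item marks the normal scoring pass produces. The snippet below is an illustration with literal sample rows standing in for those inputs (all column names here are assumptions):

```sql
-- Illustration: how each correction type changes an item's (score, max) pair.
WITH item_marks (question_version_id, marked_score, item_max_score, correction_type, rescored_score) AS (
    VALUES
        ('qv-1', 0.0, 1.0, 'drop_item',    NULL),  -- removed from numerator and denominator
        ('qv-2', 0.0, 1.0, 'mark_correct', NULL),  -- full credit for every attempt
        ('qv-3', 1.0, 1.0, 'replace_key',  0.0),   -- re-marked against new_key
        ('qv-4', 1.0, 1.0, NULL,           NULL)   -- untouched item keeps its original mark
)
SELECT question_version_id,
       CASE correction_type
           WHEN 'drop_item'    THEN 0
           WHEN 'mark_correct' THEN item_max_score
           WHEN 'replace_key'  THEN rescored_score
           ELSE marked_score
       END AS effective_score,
       CASE correction_type
           WHEN 'drop_item' THEN 0
           ELSE item_max_score
       END AS effective_max
FROM item_marks;
```

Summing effective_score and effective_max across a submission's items gives the new total and the new denominator.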


4. Interaction with Programmes (future pillar)

Programmes will, at some point, decide:

  • “Has this candidate met Requirement A?” based on:

    • current scores/outcomes for certain submissions.

If submissions.score and submissions.outcome_code always reflect the latest scoring version, then Programmes don’t need to know about corrections explicitly. They will:

  • read the current score/outcome, and
  • evaluate requirements accordingly.

What you may want later, once Programmes exist:

  • A small job or service that, when you apply a correction batch, also:

    • re-evaluates program_progress for any enrolments tied to affected submissions,
    • updates requirement statuses (e.g., someone who just crossed the threshold).

The important bit: do not bake the original score into the Programme tables. Always derive requirement satisfaction from the current submission score/outcome, so remediation naturally flows through.
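
As a hedged illustration of that rule, a Programme requirement check could eventually be no more than a read over current submission results (table and column names below are hypothetical, since Programmes are not built yet):

```sql
-- Hypothetical check: has this candidate currently passed the required evaluation?
-- It reads only the current score/outcome on submissions, so remediation flows through automatically.
SELECT EXISTS (
    SELECT 1
    FROM submissions s
    WHERE s.candidate_id          = :candidate_id              -- hypothetical column
      AND s.evaluation_version_id = :required_eval_version_id
      AND s.outcome_code          = 'pass'                     -- hypothetical outcome code
) AS requirement_satisfied;
```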


5. Phased implementation suggestion

Here’s how I’d phase this so it stays manageable:

Phase 1 – Minimal but safe remediation engine

  • Add:

    • result_correction_batches
    • result_corrections
    • submission_score_versions
  • Implement:

    • Recording of initial scores into submission_score_versions when a submission is first scored.

    • A backend-only “apply batch” function that:

      • takes a batch ID,
      • finds affected submissions,
      • re-runs scoring with correction rules,
      • writes history + updates submissions.

Usage at this stage could be:

  • via a CLI or admin-only API; no fancy UI needed yet.
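
The apply-batch function itself can stay thin: a single transaction around the pieces described above, with the actual per-submission scoring done in results_service (the outline below is a sketch, not working code for the scoring step):

```sql
BEGIN;

-- 1. Load the rules for the batch.
SELECT * FROM result_corrections WHERE batch_id = :batch_id;

-- 2. Find affected submissions (the query from section 3.1).

-- 3. For each affected submission: re-run scoring with the correction overlay,
--    append a submission_score_versions row, and update submissions (section 2.3).

-- 4. Stamp the rules as applied.
UPDATE result_corrections SET applied_at = now() WHERE batch_id = :batch_id;

COMMIT;
```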

Phase 2 – Operator experience

Later, when you have more UI capacity:

  • Add simple admin tooling:

    • list batches and their status,
    • see how many submissions were affected,
    • see before/after scores for a sample candidate,
    • show a flag “rescored on <date> due to correction <reason>” in result views.

Phase 3 – Programme awareness (once Programmes exist)

When Programmes are in place:

  • Add a small service (or job) that, when a correction batch runs:

    • identifies any program_enrolments whose requirements depend on affected submissions,
    • re-evaluates requirement satisfaction using the current scores,
    • updates program_progress accordingly.

That way:

  • A mis-keyed question in a certification path can be corrected and will retroactively fix the Programme status too.