ADR 0007: Observations as Assessments (Including Group Observation)

Date: 2025-12-04
Status: Accepted

Context

Evalium needs to support on-the-job / practical assessments (“observations”) where:

  • An assessor evaluates how a learner performs tasks.
  • Data is collected via checklists/rubrics rather than traditional test items.
  • In many real scenarios, one assessor observes a group of learners at once.

Market research indicated:

  • Most LMS / AMS tools treat observations as simple 1:1 checklists.
  • Group observation (one event, many learners) is almost completely unsupported.
  • Competency tools often separate observation from tests in both model and UI.

We had to decide how to model observations:

  • As a separate module (e.g. “Observation Checklists” with their own lifecycle), or
  • As a specialised form of Evaluation within the existing assessment model.

Decision

Observations will be implemented as Observation Evaluations within the existing AMS model:

  • evaluation.delivery_mode = "assessor" identifies an Observation Evaluation.
  • Observations use rubric/checklist-style question types:
    • Multi-level ratings (e.g. Unsafe / Needs Support / Competent / Excellent).
    • Yes/No/NA checks.
    • Optional hidden assessor guidance.
  • Both 1:1 and group observations will produce Submissions via the normal lifecycle:
    • Assignment → Session → Submission.
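The shape described above can be sketched in TypeScript. All names here (DeliveryMode, ObservationItem, isObservation, and the field names) are illustrative assumptions for this ADR, not the actual Evalium schema:

```typescript
// Hypothetical sketch — names are illustrative, not the real Evalium types.

type DeliveryMode = "self" | "assessor";

// Rubric/checklist-style question types used by observations.
type ObservationItem =
  | { kind: "rating"; prompt: string; levels: string[] } // e.g. Unsafe … Excellent
  | { kind: "check"; prompt: string }                    // Yes / No / NA
  | { kind: "guidance"; text: string };                  // hidden assessor guidance

interface Evaluation {
  id: string;
  deliveryMode: DeliveryMode;
  items: ObservationItem[];
}

// An Observation Evaluation is identified purely by its delivery mode;
// no separate aggregate or engine is introduced.
function isObservation(evaluation: Evaluation): boolean {
  return evaluation.deliveryMode === "assessor";
}
```

The point of the sketch is that "observation" is a predicate over existing Evaluations, not a new domain root.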

For group observation:

  • A single Observation Session may involve:
    • one assessor_id, and
    • multiple subject_user_ids.
  • The UI will present a group grid so the assessor can rate multiple learners in one event.
  • The backend will still persist per-learner Submissions (or equivalent row-level data) tied together via a shared group_session_id (or similar link), preserving RLS, audit and analytics.
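A minimal sketch of the group-session fan-out, under the same caveat that these interfaces and field names are assumptions for illustration, not the persisted schema:

```typescript
// Hypothetical sketch of group-observation fan-out; field names are illustrative.

interface ObservationSession {
  id: string;
  assessorId: string;       // one assessor…
  subjectUserIds: string[]; // …observing many learners in one event
}

interface Submission {
  evaluationId: string;
  learnerId: string;
  assessorId: string;
  groupSessionId: string; // shared link preserving audit/analytics; RLS still scopes rows per learner
}

// One group event still persists one Submission row per learner,
// all tied together by the shared group session id.
function fanOutSubmissions(evaluationId: string, session: ObservationSession): Submission[] {
  return session.subjectUserIds.map((learnerId) => ({
    evaluationId,
    learnerId,
    assessorId: session.assessorId,
    groupSessionId: session.id,
  }));
}
```

This is why group observation stays "a session + UI concern": the storage layer only sees ordinary per-learner Submissions plus one extra linking column.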

We will not build a separate “Observation Checklists” engine outside Evaluations.

Options Considered

1. Separate Observation Module (rejected)

Introduce a top-level Observation aggregate with its own:

  • Template model,
  • Lifecycle,
  • APIs and storage separate from Evaluations.

Pros:

  • Clear separation from online tests.
  • Could be tuned specifically for field/ops workflows.

Cons:

  • Duplicates a large part of the Evaluation/Submissions machinery.
  • Splits analytics and reporting between “tests” and “observations”.
  • Complicates RLS and snapshots (different tables, different policies).
  • Harder to achieve a truly unified competence view.

2. Minimal 1:1 Checklists Only (rejected)

Treat observations as:

  • Simple 1:1 checklists inside Evaluations,
  • No special support for group observation.

Pros:

  • Simple to implement.
  • Works for basic sign-off workflows.

Cons:

  • Misses a clear market gap: efficient group observation.
  • Forces assessors into clumsy workarounds (multiple open checklists).
  • Undermines one of Evalium’s differentiators.

3. Observation as Evaluation with Group Mode (chosen)

Model observations as Evaluations with delivery_mode = "assessor" and:

  • Rich rubric/checklist items,
  • Optional attachment of evidence,
  • Support for both 1:1 and group sessions.

Pros:

  • Reuses existing Evaluation/Assignment/Submission pipelines.
  • Keeps RLS, snapshots and scoring consistent across knowledge (K) and observation (O) evaluations.
  • Enables unified analytics and competence profiles across tests and observations.
  • Group observation becomes primarily a session + UI concern, not a new domain.

Cons:

  • Requires careful design of the group observation data model and UI to remain intuitive.
  • Some extremely specialised observation workflows may still require future enhancements or integrations.

Consequences

Positive:

  • Simple story for developers and customers: “Observations are assessments where the assessor holds the pen.”
  • Strong differentiator via group observation without a separate engine.
  • Unified analytics: observation data sits alongside test data for any individual or team.
  • Lower maintenance: one core assessment engine to evolve over time.

Negative:

  • Observation requirements must be mapped into the Evaluation/Submission model, which may constrain some niche use cases.
  • The group observation feature adds complexity to session handling and UI that must be well-tested, especially for offline/mobile scenarios.

Notes

  • Detailed group observation specifications (session schema, UI flows, offline behaviour) will be captured in separate design docs and linked from this ADR.
  • Any future observation-related features should be evaluated against this ADR before introducing new domain roots.