Stateful Delivery Model

Eliminating information contamination through controlled, multi-stage context injection within a single continuous session.

The Contamination Problem

When evaluating an LLM's legal reasoning, it is difficult to distinguish between genuine understanding and mere retrieval of training data. If a model is asked to analyze Strickland v. Washington, it may produce a correct answer simply because it has memorized the case, effectively "peeking" at the answer sheet.

D1: Closed-Book (Metadata + Query)
D2: Open-Book (ResearchPack Injection)
D3: Integrity Gate (Canaries + Traps)

Architecture of State

LegalChain runs each evaluation as a single continuous session. Unlike traditional benchmarks, which treat each prompt as a stateless snapshot, our ChainExecutor accumulates context across the entire S1-S10 lifecycle.

Delivery | Timeline | Payload Contents | Purpose
D1 | Pre-S1 | Target citation, court year, case name, legal issue | Baseline
D2 | Post-S4 | Full ResearchPack: majority opinion, authorities, Shepard's data | Synthesis
D3 | Post-S7 | Synthetic trap cases, modified holdings, canary strings | Integrity
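The schedule above can be sketched as a small simulation. This is a minimal illustration, not the actual ChainExecutor: the `Session` class, the `DELIVERY_AFTER_STEP` mapping, and the payload strings are all hypothetical names chosen for the example. It only shows the property that matters here: context is appended at each delivery boundary and never reset.

```python
from dataclasses import dataclass, field

# Hypothetical encoding of the table: delivery -> number of steps completed
# before it is injected (0 means "before S1").
DELIVERY_AFTER_STEP = {"D1": 0, "D2": 4, "D3": 7}


@dataclass
class Session:
    """Illustrative stand-in for one continuous session (not the real ChainExecutor)."""
    payloads: dict                                # delivery name -> payload text
    context: list = field(default_factory=list)   # accumulated, never cleared

    def run(self, num_steps=10):
        """Return, for each step S1..S10, the deliveries visible to the model."""
        visible = {}
        for completed in range(num_steps):
            # Inject any delivery whose boundary has just been reached.
            for name, after in DELIVERY_AFTER_STEP.items():
                if after == completed:
                    self.context.append((name, self.payloads[name]))
            step = completed + 1
            visible[f"S{step}"] = [name for name, _ in self.context]
        return visible


session = Session({"D1": "metadata + query", "D2": "ResearchPack", "D3": "canaries + traps"})
visible = session.run()
print(visible["S4"], visible["S5"], visible["S8"])
```

Under these assumptions, S1-S4 see only D1, S5-S7 see D1 and D2, and S8 onward sees all three deliveries.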

Execution Flow: Temporal Audit Trail

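Timing the deliveries this way creates a temporal audit trail: every response can be tied to the deliveries the model had received when it was produced. We can then compare the model's reasoning after D1 (relying only on internal memory) against its refined analysis after D2 (using external evidence).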
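One way to make that comparison concrete is to diff what the model cites before and after the D2 boundary. The sketch below is an assumption about how such a comparison might look, not LegalChain's scoring logic: `TrailEntry`, `compare_phases`, and the idea of comparing citation sets are all hypothetical.

```python
from dataclasses import dataclass

D2_AFTER_STEP = 4  # D2 arrives after S4
D3_AFTER_STEP = 7  # D3 arrives after S7


@dataclass(frozen=True)
class TrailEntry:
    step: int             # 1 for S1, 2 for S2, ...
    citations: frozenset  # authorities the model cited at this step


def compare_phases(trail):
    """Split citations into closed-book (S1-S4) and open-book (S5-S7) phases."""
    closed = set().union(*(e.citations for e in trail if e.step <= D2_AFTER_STEP))
    opened = set().union(
        *(e.citations for e in trail if D2_AFTER_STEP < e.step <= D3_AFTER_STEP)
    )
    return {
        "closed_only": closed - opened,  # cited from memory, dropped after D2
        "open_only": opened - closed,    # first appeared once evidence was supplied
        "both": closed & opened,         # cited in both phases
    }


trail = [
    TrailEntry(2, frozenset({"Case A", "Case B"})),
    TrailEntry(6, frozenset({"Case B", "Case C"})),
]
print(compare_phases(trail))
```

Authorities in `closed_only` are candidates for memorized or unsupported material, while `open_only` shows what the model picked up from the supplied evidence. How to weigh either set is left open here.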

Enforcement Mode: FLEXIBLE

Reasoning Quality (S1-S7)

Focuses on substantive accuracy and legal synthesis. Minor citation formatting errors are tolerated if the underlying logic is sound.

Enforcement Mode: STRICT

Integrity Gate (S8)

Binary validation. Citing a synthetic case, or failing to detect a canary, immediately fails the entire chain, regardless of how well the earlier steps scored.
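The strict gate can be sketched as a single boolean check that overrides any earlier score. This is a hedged illustration only: `integrity_gate` and `chain_result` are hypothetical names, and both the averaging of step scores and the mapping of chain failure to a score of 0.0 are assumptions made for the example.

```python
def integrity_gate(cited, synthetic_cases, canaries_planted, canaries_detected):
    """Pass only if no synthetic case was cited and every planted canary was detected."""
    cited_synthetic = set(cited) & set(synthetic_cases)
    missed_canaries = set(canaries_planted) - set(canaries_detected)
    return not cited_synthetic and not missed_canaries


def chain_result(step_scores, gate_passed):
    """Assumed aggregation: mean of S1-S7 scores, zeroed if the gate fails."""
    if not gate_passed:
        return 0.0  # strict mode: one violation fails the whole chain
    return sum(step_scores) / len(step_scores)


passed = integrity_gate(
    cited=["Real v. Case"],
    synthetic_cases=["Fake v. Trap"],
    canaries_planted=["canary-1"],
    canaries_detected=["canary-1"],
)
print(passed, chain_result([1.0, 0.5], passed))
```

Keeping the gate separate from the flexible S1-S7 scoring means a tolerant reasoning score can never offset an integrity violation.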

[ CHAINEXECUTOR.LOG ]
[08:00:01] INIT_SESSION - Target: 410 U.S. 113
[08:00:02] INJECT_PAYLOAD - D1 (Closed-Book Content)
[08:00:05] S1-S4 reasoning recorded... SUCCESS
[08:00:06] DELIVERY_BOUNDARY_D2 - ResearchPack Injection
Context Size expanded: 2,400 tokens -> 48,500 tokens
[08:00:15] S5-S7 synthesis complete... SUCCESS
[08:00:16] DELIVERY_BOUNDARY_D3 - Canary Injection
[08:00:20] S8 Integrity Gate status: PASSED