Start Here

The foundational 6-pillar framework for agentic legal reasoning.

A Sealed Evaluation Universe spanning 27k SCOTUS cases. Hard-gated to prevent contamination of the model's effective context window.

The 10-skill execution chain measuring Reasoning Decay. We track how legal errors compound across 10 steps of stateful logic.
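
One simple way to picture how errors compound along a chain is to multiply per-step reliabilities. The sketch below is illustrative only: the function name chain_survival, the 95% per-step figure, and the independence assumption are all hypothetical, and it is not the framework's actual Reasoning Decay metric.

```python
def chain_survival(step_accuracies):
    """Return the running probability that every step so far was correct.

    Assumes step errors are independent, which is a simplification;
    in a stateful chain, an early error may make later errors more likely.
    """
    survival = []
    running = 1.0
    for accuracy in step_accuracies:
        running *= accuracy
        survival.append(running)
    return survival


# Hypothetical example: ten steps, each 95% reliable on its own.
curve = chain_survival([0.95] * 10)
for step, prob in enumerate(curve, start=1):
    print(f"step {step:2d}: {prob:.3f}")
```

Under these assumptions the final value is 0.95 to the tenth power, about 0.60, so a chain of individually strong steps can still fail end to end roughly four times in ten.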

Scoring that combines deterministic field matching, hybrid IRAC rubrics, and calibrated LLM-judge synthesis.
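
A combination of the three scoring signals might look like the following minimal sketch. The names ScoreParts, field_match_score, and hybrid_score, the 0-to-1 scales, and the weights are all assumptions made for illustration; the source does not specify how the components are actually weighted or calibrated.

```python
from dataclasses import dataclass


@dataclass
class ScoreParts:
    field_match: float  # fraction of deterministic fields matched, 0..1
    irac_rubric: float  # rubric score for the IRAC structure, 0..1
    judge: float        # LLM-judge score after calibration, 0..1


def field_match_score(predicted: dict, expected: dict) -> float:
    """Exact-match fraction over the expected fields (deterministic)."""
    if not expected:
        return 1.0
    hits = sum(1 for key, value in expected.items() if predicted.get(key) == value)
    return hits / len(expected)


def hybrid_score(parts: ScoreParts, weights=(0.4, 0.3, 0.3)) -> float:
    """Weighted sum of the three components; the weights are hypothetical."""
    w_field, w_rubric, w_judge = weights
    return (
        w_field * parts.field_match
        + w_rubric * parts.irac_rubric
        + w_judge * parts.judge
    )


# Hypothetical usage: one of two expected fields matches.
fm = field_match_score(
    {"disposition": "affirmed", "year": 1990},
    {"disposition": "affirmed", "year": 1991},
)
total = hybrid_score(ScoreParts(field_match=fm, irac_rubric=0.8, judge=0.6))
```

Keeping the deterministic component separate from the judged components means a disagreement between them stays visible instead of being averaged away before anyone inspects it.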

Frozen Evaluation Units (EU) and ResearchPacks (RP) that ensure exact runtime state reconstruction.
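
One common way to make a unit both immutable and verifiable is to pair a frozen record with a content hash. The sketch below assumes hypothetical field names (eu_id, case_id, research_pack_ids) and a SHA-256 fingerprint over canonical JSON; the real EU and RP schemas are not described in the source.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EvaluationUnit:
    # All field names here are hypothetical placeholders.
    eu_id: str
    case_id: str
    research_pack_ids: tuple


def fingerprint(eu: EvaluationUnit) -> str:
    """Hash a canonical JSON form so equal content always hashes equally."""
    payload = json.dumps(asdict(eu), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


eu = EvaluationUnit(eu_id="eu-0001", case_id="case-123", research_pack_ids=("rp-a", "rp-b"))
same = EvaluationUnit(eu_id="eu-0001", case_id="case-123", research_pack_ids=("rp-a", "rp-b"))
changed = EvaluationUnit(eu_id="eu-0001", case_id="case-123", research_pack_ids=("rp-a",))

print(fingerprint(eu) == fingerprint(same))     # identical content, identical hash
print(fingerprint(eu) == fingerprint(changed))  # different content, different hash
```

Recording the fingerprint alongside each run lets a later reader confirm that the unit being replayed is byte-for-byte the one that was originally evaluated.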

50,000 deterministic synthetic traps across corpora, 2 per evaluation: impossible citations whose appearance in an output serves as mathematical proof of hallucination.
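
Because a trap citation cannot exist, detecting one in an output reduces to a set-membership check. The sketch below is a minimal illustration: the function name cited_traps and the two trap strings are invented for this example and are not drawn from the actual trap corpus.

```python
def cited_traps(output_citations, trap_citations):
    """Return the trap citations that appear in a model's output, sorted."""
    return sorted(set(output_citations) & set(trap_citations))


# Hypothetical trap registry: citations invented here, dated in the future
# so they cannot refer to real decisions.
TRAPS = {"999 U.S. 1 (2099)", "1001 U.S. 42 (2101)"}

# Hypothetical model output mixing one real citation with one trap.
output = ["347 U.S. 483 (1954)", "999 U.S. 1 (2099)"]

hits = cited_traps(output, TRAPS)
hallucinated = bool(hits)
print(hits, hallucinated)
```

The check needs no judge model: any non-empty result means the output relied on a source that was planted and cannot exist.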

The divergence between surface fluency and reasoning, and the Compression Gap: failure modes that only show up in stateful, chained evaluation.