BENCHMARK ACTIVE [v3.0]
2026.01.01
Benchmark Result Summary
82.4% PEAK ACCURACY.
Legal-10 is the frontier benchmark for multi-step agentic planning in common law. We verify each chain's logic gates before scoring knowledge.
Foundation Models Tested
24
Leader: GPT-4o
91%
Technical Baseline
S1–S8 Design
AG8, Legal-10's first chained/agentic legal benchmark, evaluates intermediate research skills through open-book synthesis and then enforces a deterministic citation-integrity gate over U.S. Reports citations. Citations are extracted deterministically from SCOTUS opinion text, and their relevance is triangulated against Shepard's treatment labels, which serve as a universal, human-curated oracle.
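A minimal sketch of how such a two-layer check could work (layer one: deterministic extraction; layer two: the integrity gate). The names `extract_citations` and `integrity_gate` and the in-memory stand-in for the Shepard's oracle are hypothetical, not AG8's actual implementation:

```python
import re

# U.S. Reports citation pattern, e.g. "347 U.S. 483" -> (volume, page).
US_REPORTS = re.compile(r"\b(\d{1,3})\s+U\.\s*S\.\s+(\d{1,4})\b")

def extract_citations(text: str) -> list[tuple[int, int]]:
    """Layer 1: deterministically extract (volume, page) U.S. Reports cites."""
    return [(int(vol), int(page)) for vol, page in US_REPORTS.findall(text)]

# Toy stand-in for the Shepard's oracle: one treatment label per citation.
# (Illustrative entries only; real labels come from curated Shepard's data.)
SHEPARDS_ORACLE = {
    (347, 483): "followed",   # Brown v. Board of Education
    (198, 45): "overruled",   # Lochner v. New York
}

NEGATIVE_TREATMENTS = {"overruled", "superseded"}

def integrity_gate(answer: str) -> bool:
    """Layer 2: pass only if every cite resolves in the oracle and carries no
    negative treatment; a chain with no verifiable citations fails outright."""
    cites = extract_citations(answer)
    if not cites:
        return False
    return all(
        SHEPARDS_ORACLE.get(c) not in (None, *NEGATIVE_TREATMENTS)
        for c in cites
    )

if __name__ == "__main__":
    good = "Segregation violates equal protection. Brown, 347 U.S. 483 (1954)."
    bad = "Freedom of contract controls. Lochner, 198 U.S. 45 (1905)."
    print(integrity_gate(good))  # True
    print(integrity_gate(bad))   # False: cited authority is overruled
```

Because extraction is a fixed regex rather than model-graded judging, the gate is fully reproducible: the same answer passes or fails identically on every run.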
First Chained Legal Benchmark
Deterministic Reference Pack
Shepard's as Relevance Oracle
Chain-Faithful Evaluation
Citation Integrity Gate
Selection Manifest as Contract
Two-Layer Architecture
Top Performance
FULL_LOGS ->

| Model_ID | Chain_Acc | Integrity_Gate |
|---|---|---|
| Syncing with evaluation server... | — | — |
Latest_Syntheses
"The transition from atomic prompts to autonomous legal reasoning requires a new standard of observability."