SCDB Case Universe
Defining the boundaries of the LegalChain corpus: Anchors of reasoning and Authorities of precedent.
Overview
LegalChain defines two distinct case universes: Anchors (the 27,733 Supreme Court opinions that form evaluation tasks) and Authorities (the 64,548 cases those opinions cite). This boundary is deliberate. Supreme Court opinions provide the gold standard for legal reasoning, while the citation network captures how that reasoning draws on precedent.
The Anchor Universe
Definition: All Supreme Court majority opinions from 1791 to 2021 with available full text.
We chose Supreme Court opinions as anchors because they represent the highest-quality legal reasoning in the American system. LegalChain draws anchors from every era of the Court to limit temporal bias:
| Era | Cases | Notes |
|---|---|---|
| 1791-1850 | 1,247 | Early Republic development |
| 1851-1900 | 5,819 | Post-Civil War constitutional shift |
| 1901-1950 | 7,412 | Progressive Era & New Deal |
| 1951-2000 | 9,847 | Modern Rights Revolution |
| 2001-2021 | 3,408 | Digital Age & Roberts Court |
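As a sanity check on the table, the sketch below confirms that the era counts add up to the 27,733 anchors quoted in the overview and shows one way to map a decision year to its era. The counts and year ranges are copied from the table. The names `ERA_COUNTS` and `era_for_year` are hypothetical helpers for illustration, not part of any LegalChain API.

```python
# Era boundaries (inclusive) and anchor counts, copied from the table above.
ERA_COUNTS = {
    (1791, 1850): 1_247,
    (1851, 1900): 5_819,
    (1901, 1950): 7_412,
    (1951, 2000): 9_847,
    (2001, 2021): 3_408,
}


def era_for_year(year: int) -> tuple[int, int]:
    """Return the (start, end) era containing `year` (hypothetical helper)."""
    for start, end in ERA_COUNTS:
        if start <= year <= end:
            return (start, end)
    raise ValueError(f"{year} is outside the 1791-2021 anchor range")


# The per-era counts should sum to the anchor total stated in the overview.
total_anchors = sum(ERA_COUNTS.values())
assert total_anchors == 27_733

# Share of the anchor universe contributed by each era.
era_shares = {era: count / total_anchors for era, count in ERA_COUNTS.items()}

for (start, end), share in era_shares.items():
    print(f"{start}-{end}: {share:.1%}")
```

Reporting per-era shares alongside raw counts makes it easier to see how heavily any one period weighs on an aggregate score.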
The Authority Universe
Definition: All cases cited by at least one anchor opinion.
This boundary is not arbitrary. It reflects legal practice: a lawyer's relevant universe consists of the cases that matter to the issue at hand. If a case is never cited by an anchor opinion, it is not in our authority universe.
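The definition above amounts to taking the union of every anchor's citation list. The following is a minimal sketch of that rule, assuming citations are available as a mapping from anchor ID to the set of case IDs it cites. The function name, the mapping shape, and the IDs in the demo are all hypothetical, and the real build pipeline may differ.

```python
def build_authority_universe(citations: dict[str, set[str]]) -> set[str]:
    """Collect every case cited by at least one anchor opinion.

    `citations` maps an anchor ID to the set of case IDs it cites.
    This input format is assumed for illustration only.
    """
    authorities: set[str] = set()
    for cited_cases in citations.values():
        authorities |= cited_cases  # union keeps each cited case once
    return authorities


# Hypothetical toy data: three anchors, one of which cites nothing.
demo_citations = {
    "anchor_a": {"case_1", "case_2"},
    "anchor_b": {"case_2", "anchor_a"},
    "anchor_c": set(),
}

print(sorted(build_authority_universe(demo_citations)))
```

In this sketch an anchor that is cited by another anchor also counts as an authority, which is the literal reading of the definition. Whether the published authority count includes such overlap is not stated here.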
Authority Breakdown by Source
Temporal Boundaries
LegalChain freezes its corpus at 2021. The cutoff keeps the corpus stable and reduces the risk of data contamination from models trained on very recent legal events.
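One way to check that a frozen corpus really is unchanged between evaluation runs is to compare a fingerprint of its case IDs. The sketch below is an illustrative assumption rather than LegalChain's documented mechanism: `corpus_fingerprint` is a hypothetical helper that hashes the sorted, de-duplicated IDs with SHA-256, and it assumes IDs are strings without newline characters.

```python
import hashlib
from collections.abc import Iterable


def corpus_fingerprint(case_ids: Iterable[str]) -> str:
    """Return a SHA-256 hex digest over the sorted, unique case IDs.

    Sorting makes the digest independent of input order; the newline
    separator keeps adjacent IDs from running together.
    """
    digest = hashlib.sha256()
    for case_id in sorted(set(case_ids)):
        digest.update(case_id.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical IDs: the same set in a different order gives the same digest.
first_run = corpus_fingerprint(["case_2", "case_1", "case_3"])
second_run = corpus_fingerprint(["case_1", "case_2", "case_3"])
print(first_run == second_run)
```

Recording the digest at build time and comparing it before each evaluation would flag any added or removed case. It would not detect edits to the text of a case unless the text is hashed as well.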
"The case universe is frozen at dataset build time. No cases are added or removed after the initial build, guaranteeing that evaluations run a year apart are testing against identical foundations."