SCDB Case Universe

Defining the boundaries of the LegalChain corpus: Anchors of reasoning and Authorities of precedent.

Overview

LegalChain defines two distinct case universes: Anchors (the 27,733 Supreme Court opinions that form evaluation tasks) and Authorities (the 64,548 cases those opinions cite). This boundary is deliberate. Supreme Court opinions provide the gold standard for legal reasoning, while the citation network captures how that reasoning draws on precedent.

27,733 Anchor Cases (SCOTUS Majority Opinions)
64,548 Authority Cases (Cited Precedents)

The Anchor Universe

Definition: All Supreme Court majority opinions from 1791 to 2021 with available full text.

We chose Supreme Court opinions as anchors because they represent the highest-quality legal reasoning in the American system. LegalChain draws anchors from every era of the Court to limit temporal bias:

Era         Cases   Notes
1791-1850   1,247   Early Republic development
1851-1900   5,819   Post-Civil War constitutional shift
1901-1950   7,412   Progressive Era & New Deal
1951-2000   9,847   Modern Rights Revolution
2001-2021   3,408   Digital Age & Roberts Court
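
As a quick sanity check, the era counts in the table above can be summed and compared against the 27,733 anchor total. The sketch below is illustrative only: the names `ERA_COUNTS` and `era_for_year` are hypothetical and not part of LegalChain, and it assumes each opinion is assigned to an era by its decision year.

```python
from __future__ import annotations

# Era boundaries and case counts copied from the table above.
ERA_COUNTS = {
    (1791, 1850): 1_247,
    (1851, 1900): 5_819,
    (1901, 1950): 7_412,
    (1951, 2000): 9_847,
    (2001, 2021): 3_408,
}


def era_for_year(year: int) -> tuple[int, int]:
    """Return the (start, end) era that contains a decision year.

    Hypothetical helper: assumes eras are inclusive on both ends.
    """
    for start, end in ERA_COUNTS:
        if start <= year <= end:
            return (start, end)
    raise ValueError(f"{year} is outside the 1791-2021 anchor range")


# The per-era counts should add up to the stated anchor total.
total_anchors = sum(ERA_COUNTS.values())

# Share of the anchor universe contributed by each era.
era_shares = {era: count / total_anchors for era, count in ERA_COUNTS.items()}
```

The shares make the uneven distribution explicit, which matters if results are ever reported per era rather than in aggregate.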

The Authority Universe

Definition: All cases cited by at least one anchor opinion.

This boundary is not arbitrary: it reflects legal practice, where a lawyer's relevant universe consists of the cases that matter to the issue at hand. A case that no anchor opinion cites is not in the authority universe.
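
The definition above amounts to taking the union of every anchor's citation list. The following is a minimal sketch under that reading, not LegalChain's actual build code: the function name, the mapping shape, and the case identifiers are all hypothetical.

```python
from typing import Dict, Iterable, Set


def build_authority_universe(
    citations_by_anchor: Dict[str, Iterable[str]],
) -> Set[str]:
    """Collect every case cited by at least one anchor opinion.

    `citations_by_anchor` maps a (hypothetical) anchor ID to the IDs of
    the cases that anchor cites. A case enters the result as soon as one
    anchor cites it; cases cited by no anchor never appear.
    """
    authorities: Set[str] = set()
    for cited_cases in citations_by_anchor.values():
        authorities.update(cited_cases)
    return authorities


# Toy data with made-up identifiers, for illustration only.
example_citations = {
    "anchor_a": ["case_1", "case_2"],
    "anchor_b": ["case_2", "anchor_a"],  # an anchor may cite another anchor
    "anchor_c": [],                      # an anchor with no citations
}
example_universe = build_authority_universe(example_citations)
```

In the toy data, `anchor_a` lands in the authority universe because another anchor cites it, which follows from the definition as written: membership depends only on being cited.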

Authority Breakdown by Source

Source                            Cases
Supreme Court (SCOTUS)            21,505
Circuit Courts (F., F.2d, F.3d)   36,552
District Courts (F.Supp)          6,491
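
The breakdown above groups authorities by reporter series, so one plausible way to reproduce it is to read the reporter out of each citation string. The sketch below assumes citations of the form "volume reporter page" and assumes Supreme Court cases are cited to the U.S. Reports ("U.S."); neither assumption is confirmed by the source, and `source_for_citation` is a hypothetical name. Parallel reporters such as S. Ct. are deliberately left unclassified.

```python
import re
from typing import Optional

# Counts copied from the breakdown above.
AUTHORITY_COUNTS = {"scotus": 21_505, "circuit": 36_552, "district": 6_491}

# Assumed citation shape: "<volume> <reporter> <page>", e.g. "123 F.2d 456".
_CITATION = re.compile(r"^\d+\s+(?P<reporter>.+?)\s+\d+$")
_CIRCUIT_REPORTERS = {"F.", "F.2d", "F.3d"}


def source_for_citation(citation: str) -> Optional[str]:
    """Map a citation string to a source bucket, or None if unrecognized."""
    match = _CITATION.match(citation.strip())
    if match is None:
        return None
    # Drop internal spaces so "F. Supp. 2d" and "F.Supp.2d" compare equal.
    reporter = re.sub(r"\s+", "", match.group("reporter"))
    if reporter == "U.S.":  # assumption: SCOTUS cases cited to U.S. Reports
        return "scotus"
    if reporter.startswith("F.Supp"):  # checked before the circuit reporters
        return "district"
    if reporter in _CIRCUIT_REPORTERS:
        return "circuit"
    return None


# The three buckets should account for the full authority universe.
total_authorities = sum(AUTHORITY_COUNTS.values())
```

Summing the three counts gives 64,548, matching the authority total stated in the overview.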

Temporal Boundaries

LegalChain freezes its corpus at 2021. This keeps the benchmark stable and reduces the risk of data contamination from models trained on very recent legal events.
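
One way to confirm that two evaluation runs used the same frozen corpus is to fingerprint the set of case identifiers and compare the digests. This is a sketch of that idea, not a documented LegalChain feature: `universe_fingerprint` and the identifiers in the example are hypothetical.

```python
import hashlib
from typing import Iterable


def universe_fingerprint(case_ids: Iterable[str]) -> str:
    """Return a SHA-256 hex digest over a set of case IDs.

    IDs are de-duplicated and sorted first, so the digest depends only on
    which cases are present, not on their order or on repeats.
    """
    digest = hashlib.sha256()
    for case_id in sorted(set(case_ids)):
        digest.update(case_id.encode("utf-8"))
        digest.update(b"\n")  # separator so "ab","c" differs from "a","bc"
    return digest.hexdigest()


# Made-up identifiers, for illustration only.
build_time = universe_fingerprint(["case_1", "case_2", "case_3"])
one_year_later = universe_fingerprint(["case_3", "case_1", "case_2"])
after_addition = universe_fingerprint(["case_1", "case_2", "case_3", "case_4"])
```

Matching digests indicate an identical case universe. Any added or removed case changes the digest, so a mismatch signals that the corpus is no longer the frozen build.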

"The case universe is frozen at dataset build time. No cases are added or removed after the initial build, guaranteeing that evaluations run a year apart are testing against identical foundations."