CAP Corpus & Metadata
LegalChain integrates the Caselaw Access Project (CAP) to provide lower federal court context, resolving Supreme Court citations to canonical records in the circuit and district court corpus.
Populations: 43,043 Resolved Cases
Scale: 1.37 GB JSONL Snapshots
Resolution: 98.2% Match Rate (S2)
The CAP Repository
A typical SCOTUS opinion cites circuit and district court rulings to establish procedural history. LegalChain maintains a subset of the CAP corpus specifically mapped to these upstream dependencies.
Data Transformation
STEP 1. NORMALIZE: Strip punctuation to produce a canonical citation string.
STEP 2. RESOLVE: Map the canonical string to CAP_CASES_META in DuckDB.
STEP 3. INGEST: Load the casebody into the S4 Research Pack.
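The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not LegalChain's implementation: the function names are hypothetical, the lowercasing and whitespace collapsing are assumptions beyond the "strip punctuation" rule, and a plain dictionary stands in for the CAP_CASES_META table that the pipeline queries in DuckDB.

```python
import re
import string

# Hypothetical stand-in for the CAP_CASES_META table (the real lookup runs in DuckDB).
# Keys are canonical citation strings, values are CAP case ids.
CAP_CASES_META = {
    "202 f 305": 1403610,
}

# Translation table that deletes every ASCII punctuation character.
_PUNCT = str.maketrans("", "", string.punctuation)


def normalize_citation(raw: str) -> str:
    """STEP 1: strip punctuation, then (assumed) collapse whitespace and lowercase."""
    stripped = raw.translate(_PUNCT)
    return re.sub(r"\s+", " ", stripped).strip().lower()


def resolve_citation(raw: str, index=CAP_CASES_META):
    """STEP 2: look up the canonical string; returns a case id or None if unmatched."""
    return index.get(normalize_citation(raw))


print(normalize_citation("202 F. 305"))
print(resolve_citation("202 F. 305"))
```

Returning None for unmatched citations keeps unresolved references visible to the caller, which is one way a match rate like the one reported above could be measured.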
Case Bundle Manifest
| Bundle Name | Count | Size |
|---|---|---|
| cap_appellate_text.jsonl | 36,552 | 1.1 GB |
| cap_trial_text.jsonl | 6,491 | 262 MB |
| Corpus Total | 43,043 | 1.37 GB |
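The counts and sizes in the manifest could be re-derived by streaming each bundle line by line. The sketch below assumes each bundle is a JSONL file with one case record per line. The helper name summarize_bundle is hypothetical, and an in-memory stream stands in for a real bundle file.

```python
import io
import json
from typing import Iterable, Tuple


def summarize_bundle(lines: Iterable[bytes]) -> Tuple[int, int]:
    """Return (record_count, total_bytes) for a JSONL stream read in binary mode."""
    count = 0
    size = 0
    for line in lines:
        size += len(line)
        if line.strip():
            json.loads(line)  # raises ValueError if a record is malformed
            count += 1
    return count, size


# In-memory sample standing in for a bundle such as cap_trial_text.jsonl.
sample = io.BytesIO(b'{"id": 1}\n{"id": 2}\n')
print(summarize_bundle(sample))
```

For a file on disk, the same function can be passed a handle opened with `open(path, "rb")`, so the bundle is never loaded into memory in full.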
Schema Excerpt: cap_case
{
  "id": 1403610,
  "name": "U.S. v. Jackson",
  "citations": ["202 F. 305"],
  "casebody": {
    "opinions": [{ "type": "majority" }]
  },
  "pagerank": 0.0000123
}
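A record shaped like the excerpt above could be read into a typed structure as follows. This is a sketch under assumptions: the CapCase class and parse_cap_case function are hypothetical names, only the fields shown in the excerpt are modeled, and fields other than id and name are treated as optional.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class CapCase:
    id: int
    name: str
    citations: List[str]
    opinion_types: List[str]
    pagerank: float


def parse_cap_case(line: str) -> CapCase:
    """Parse one JSONL line into a CapCase, tolerating missing optional fields."""
    record = json.loads(line)
    opinions = record.get("casebody", {}).get("opinions", [])
    return CapCase(
        id=record["id"],
        name=record["name"],
        citations=list(record.get("citations", [])),
        opinion_types=[op.get("type", "") for op in opinions],
        pagerank=float(record.get("pagerank", 0.0)),
    )


# The record from the schema excerpt, written as a single JSONL line.
SAMPLE = (
    '{"id": 1403610, "name": "U.S. v. Jackson", "citations": ["202 F. 305"], '
    '"casebody": {"opinions": [{"type": "majority"}]}, "pagerank": 0.0000123}'
)
case = parse_cap_case(SAMPLE)
print(case.name, case.citations, case.opinion_types)
```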