Federal→CAP Crosswalk
This stage resolves 55,534 raw citation strings to verified Caselaw Access Project (CAP) records, bridging the gap between the citation strings found in Supreme Court opinions and canonical federal authorities.
| Input Citation | Normalized Citation | CAP Resolution | Status |
|---|---|---|---|
| 495 F. 2d 1187 | 495 F.2d 1187 | CAP_ID: 1046253 | MATCH |
| 202 F. Supp. 305 | 202 F.Supp. 305 | CAP_ID: 1403610 | MATCH |
| See 105 F.3d 12 | 105 F.3d 12 | CAP_ID: 2209141 | MATCH |
| 999 U.S. 999 | 999 U.S. 999 | NULL | TRAP |
When a Supreme Court opinion cites "495 F.2d 1187," it provides a string pointer. To include this authority in a ResearchPack, we must connect it to a stable identity in the CAP database.
Our crosswalk achieves a 98.2% match rate through deterministic normalization. If a direct lookup against official citation fields fails, the engine falls back to volume-page matching within the reporter series.
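
The lookup-then-fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the in-memory dictionaries stand in for the real index, the names `CAP_BY_CITE`, `vol_page_key`, and `resolve` are hypothetical, and the `"exact"` / `"volume-page"` / `"unresolved"` labels are invented for the example. The CAP IDs are copied from the table above.

```python
import re
from typing import Optional, Tuple

# Hypothetical stand-in for the official citation fields of the CAP index.
CAP_BY_CITE = {
    "495 F.2d 1187": 1046253,
    "202 F.Supp. 305": 1403610,
    "105 F.3d 12": 2209141,
}

# volume, reporter, page: the reporter is whatever sits between the two numbers.
CITE_RE = re.compile(r"^(\d+)\s+(.+?)\s+(\d+)$")


def vol_page_key(cite: str) -> Optional[Tuple[str, int, int]]:
    """Reduce a citation to (reporter series, volume, page), ignoring spacing and dots."""
    m = CITE_RE.match(cite.strip())
    if m is None:
        return None
    volume, reporter, page = m.groups()
    series = re.sub(r"[\s.]", "", reporter).lower()  # "F. 2d" -> "f2d"
    return (series, int(volume), int(page))


# Secondary index used only when the direct lookup misses.
CAP_BY_VOL_PAGE = {vol_page_key(c): cap_id for c, cap_id in CAP_BY_CITE.items()}


def resolve(cite: str) -> Tuple[Optional[int], str]:
    """Try the exact citation first, then fall back to volume-page matching."""
    cap_id = CAP_BY_CITE.get(cite)
    if cap_id is not None:
        return cap_id, "exact"
    key = vol_page_key(cite)
    if key in CAP_BY_VOL_PAGE:
        return CAP_BY_VOL_PAGE[key], "volume-page"
    return None, "unresolved"
```

Under these assumptions, `resolve("495 F. 2d 1187")` misses the direct lookup because of the extra space but still resolves through the volume-page key, while `resolve("999 U.S. 999")` stays unresolved because no record exists for it.
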
1. Normalize: strip citation signals (e.g., "See") and stray punctuation, unify spacing, and standardize reporter abbreviations (e.g., "F. 2d" becomes "F.2d").
2. Look up: search the unified CAP/SCOTUS index (Parquet) for the canonical string.
3. Fall back: apply heuristic matching to citations with heavy OCR noise or non-standard formatting.
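
As a rough illustration of the normalization step, the sketch below reproduces the normalized forms shown in the table. It is an assumption-laden example rather than the project's implementation: the signal-word pattern `SIGNALS`, the `REPORTERS` mapping, and the `normalize` function are all hypothetical, and a real reporter table would be far larger.

```python
import re
from typing import Optional

# Assumed list of leading citation signals to drop; illustrative only.
SIGNALS = re.compile(r"^(?:see also|see|cf\.|accord)\s+", re.IGNORECASE)

# Assumed mapping from a squashed reporter spelling to its canonical abbreviation.
REPORTERS = {
    "f2d": "F.2d",
    "f3d": "F.3d",
    "fsupp": "F.Supp.",
    "us": "U.S.",
}

# volume, reporter, page
CITE_RE = re.compile(r"^(\d+)\s+(.+?)\s+(\d+)$")


def normalize(raw: str) -> Optional[str]:
    """Return the canonical citation string, or None if the input is not recognized."""
    text = SIGNALS.sub("", raw.strip())    # drop a leading "See", "Cf.", ...
    text = text.strip(" ,;.")              # remove stray punctuation at the edges
    text = re.sub(r"\s+", " ", text)       # unify internal spacing
    m = CITE_RE.match(text)
    if m is None:
        return None
    volume, reporter, page = m.groups()
    squashed = re.sub(r"[\s.]", "", reporter).lower()  # "F. Supp." -> "fsupp"
    canonical = REPORTERS.get(squashed)
    if canonical is None:
        return None
    return f"{volume} {canonical} {page}"
```

Returning `None` for unrecognized input, rather than guessing, leaves such strings for the heuristic fallback to handle.
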