Federal→CAP Crosswalk

Resolving 55,534 raw citation strings to verified Caselaw Access Project records. This stage bridges the gap between Supreme Court string pointers and canonical federal authorities.

MAPPING MATRIX: SCOTUS -> CAP
[ 98.2% RESOLVED ]
Input Citation   | Normalization   | CAP Resolution  | Status
495 F. 2d 1187   | 495 F.2d 1187   | CAP_ID: 1046253 | MATCH
202 F. Supp. 305 | 202 F.Supp. 305 | CAP_ID: 1403610 | MATCH
See 105 F.3d 12  | 105 F.3d 12     | CAP_ID: 2209141 | MATCH
999 U.S. 999     | 999 U.S. 999    | NULL            | TRAP
[ METHODOLOGY ]

When a Supreme Court opinion cites "495 F.2d 1187," it provides a string pointer. To include this authority in a ResearchPack, we must connect it to a stable identity in the CAP database.

Our crosswalk resolves 98.2% of citations: 97.0% by direct lookup after deterministic normalization and 1.2% by fallback. If a direct lookup against the official citation fields fails, the engine falls back to volume-page matching within the reporter series.

Resolution Algorithm
01 Normalization

Unify spacing, strip leading signals such as "See", and standardize reporter abbreviations (for example, "F. 2d" becomes "F.2d").

02 Index Query

Search the unified CAP/SCOTUS index (Parquet) for the canonical string.

03 Fallback Resolution

If the direct lookup fails, match on volume and page within the reporter series. This step handles citations with high OCR noise or non-standard formatting.
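
The three steps can be sketched in Python roughly as follows. This is a minimal illustration, not the production engine: the in-memory `INDEX` dict stands in for the Parquet index, the signal list, the function names (`normalize`, `fallback_key`, `resolve`), and the status labels are all hypothetical, and the fallback shown is only one plausible reading of "volume-page matching within the reporter series".

```python
import re

# Hypothetical stand-in for the unified CAP/SCOTUS index (stored as Parquet
# in the real pipeline). Keys are canonical citations, values are CAP IDs
# copied from the mapping matrix.
INDEX = {
    "495 F.2d 1187": 1046253,
    "202 F.Supp. 305": 1403610,
    "105 F.3d 12": 2209141,
}

# Assumed list of introductory signals; the source only shows "See".
SIGNALS = re.compile(r"^(?:see also|see|cf\.|accord)\s+", re.IGNORECASE)

# volume, reporter, page
CITE = re.compile(r"^(\d+)\s+(.+?)\s+(\d+)$")


def normalize(raw):
    """Step 01: unify spacing, drop a leading signal, tighten the reporter."""
    text = " ".join(raw.split())
    text = SIGNALS.sub("", text)
    match = CITE.match(text)
    if not match:
        return text
    volume, reporter, page = match.groups()
    # Close the gap after a period inside the reporter: "F. 2d" -> "F.2d".
    reporter = re.sub(r"\.\s+(?=\S)", ".", reporter)
    return f"{volume} {reporter} {page}"


def fallback_key(citation):
    """Step 03 key: (volume, reporter with punctuation/spacing removed, page)."""
    match = CITE.match(citation)
    if not match:
        return None
    volume, reporter, page = match.groups()
    series = re.sub(r"[^a-z0-9]", "", reporter.lower())
    return (int(volume), series, int(page))


# Secondary lookup table built from the same index.
FALLBACK_INDEX = {fallback_key(cite): cap_id for cite, cap_id in INDEX.items()}


def resolve(raw):
    """Return (cap_id, method); cap_id is None when nothing matches."""
    canonical = normalize(raw)
    if canonical in INDEX:                      # Step 02: direct index query
        return INDEX[canonical], "direct"
    key = fallback_key(canonical)               # Step 03: volume-page fallback
    if key in FALLBACK_INDEX:
        return FALLBACK_INDEX[key], "fallback"
    return None, "unresolved"


for raw in ["495 F. 2d 1187", "See 105 F.3d 12", "495 F 2d 1187", "999 U.S. 999"]:
    print(raw, "->", resolve(raw))
```

In this sketch, "495 F 2d 1187" (a missing period, as OCR noise might produce) fails the direct query but resolves through the fallback key, while the trap citation "999 U.S. 999" stays unresolved.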

[ ACCURACY REPORT ]
Direct Match 97.0%
Fallback 1.2%
Unresolved 1.8%
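
As a quick consistency check, the reported rates can be added up and converted to approximate counts. The sketch below assumes the percentages are taken over the 55,534 input strings; because the published rates are rounded to one decimal place, the counts it derives are estimates, not figures from the pipeline.

```python
TOTAL = 55_534  # raw citation strings, from the introduction

# Rates from the accuracy report, in percent.
rates = {"direct": 97.0, "fallback": 1.2, "unresolved": 1.8}

# Direct and fallback matches together give the headline resolution rate.
resolved_pct = round(rates["direct"] + rates["fallback"], 1)
total_pct = round(sum(rates.values()), 1)

# Approximate counts implied by the rounded percentages.
approx_counts = {name: round(TOTAL * pct / 100) for name, pct in rates.items()}

print(f"resolved: {resolved_pct}%  (all categories: {total_pct}%)")
for name, count in approx_counts.items():
    print(f"{name}: ~{count}")
```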
[ ARTIFACT ]
scotus_to_cap_map.jsonl
Size: 24 MB. Contains mapping from normalized citations to CAP IDs and decision metadata.
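
A minimal sketch of reading the artifact into a lookup table follows. The record layout is an assumption: the field names `citation` and `cap_id` are hypothetical, and the real file also carries decision metadata that this example ignores.

```python
import json


def load_crosswalk(lines):
    """Build {normalized citation: CAP ID} from an iterable of JSONL lines.

    Field names "citation" and "cap_id" are assumed for illustration.
    """
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        mapping[record["citation"]] = record["cap_id"]
    return mapping


# Two sample records in the assumed layout, using IDs from the mapping matrix.
sample = [
    '{"citation": "495 F.2d 1187", "cap_id": 1046253}',
    '{"citation": "202 F.Supp. 305", "cap_id": 1403610}',
]
crosswalk = load_crosswalk(sample)
print(crosswalk.get("495 F.2d 1187"))
print(crosswalk.get("999 U.S. 999"))
```

Because `load_crosswalk` accepts any iterable of lines, an open file handle for scotus_to_cap_map.jsonl can be passed in place of `sample`.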