ResearchPacks

Cryptographically sealed, self-contained evidence bundles that eliminate external database dependencies at runtime.

  • Sealed — zero external dependencies, no runtime lookups
  • SHA-256 — integrity verified, bit-exact reproducibility
  • 15% of original size — achieved via summarization

The Reproducibility Problem

Traditional benchmarks evaluate models against live external datasets. When those datasets change, scores become incomparable. When databases go offline, evaluations break. Reproducing coverage numbers from six months ago often requires complex database reconstruction.

LegalChain solves this by sealing inputs at build time. A ResearchPack travels with its evaluation instance, guaranteeing that the inputs used to produce a score today are identical to those used next year.

Anatomy of a ResearchPack

  • DOC1 — Anchor Opinion: the full, unaltered text of the Supreme Court majority opinion under analysis.
  • DOC2 — Cited Authorities: authoritative summaries (Syllabi/Head Matter) of the top-ranked precedents cited by the anchor.
  • DOC3 — Citation Evidence: context snippets, treatment signals ("followed", "distinguished"), and metadata.
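The three documents above could be modeled in memory roughly as follows. This is an illustrative sketch; the field names and types are assumptions, not the actual pack schema.

```python
from dataclasses import dataclass

@dataclass
class CitationEvidence:
    """One cited-authority record (DOC3). Field names are illustrative."""
    cited_case_id: str
    context_snippet: str   # text surrounding the citation in the anchor opinion
    treatment: str         # e.g. "followed", "distinguished"
    signal: str            # introductory signal, e.g. "see", "cf."

@dataclass
class ResearchPack:
    """Hypothetical in-memory shape of a sealed pack's three documents."""
    anchor_opinion: str                         # DOC1: full majority opinion text
    cited_summaries: dict[str, str]             # DOC2: case id -> syllabus/head matter
    citation_evidence: list[CitationEvidence]   # DOC3: per-citation records
```

Because everything a scorer needs lives in these fields, evaluation never has to reach out to an external database.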

Deterministic Ranking

To select the most relevant authorities, we use a deterministic formula combining multiple signals:

Scoring Factors

  • Log-scaled Citation Frequency
  • Shepard's Treatment (Followed/Criticized)
  • Introductory Signals ("see", "cf.")
  • Fowler Precedential Score
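One way to combine these factors deterministically is sketched below. The weights, signal values, and the exact combination are assumptions for illustration; the document does not specify the real formula. The key property shown is that identical inputs always produce an identical ranking, including a fixed tie-break.

```python
import math

# Hypothetical weights; the actual constants are not specified in this document.
TREATMENT_WEIGHTS = {"followed": 1.0, "distinguished": -0.25, "criticized": -0.5}
SIGNAL_WEIGHTS = {"": 1.0, "see": 0.8, "cf.": 0.5}

def authority_score(citation_count, treatment, signal, fowler):
    """Deterministic relevance score combining the four factors above."""
    freq = math.log1p(citation_count)            # log-scaled citation frequency
    t = TREATMENT_WEIGHTS.get(treatment, 0.0)    # Shepard's treatment
    s = SIGNAL_WEIGHTS.get(signal, 0.5)          # introductory signal
    return freq * s + t + fowler                 # fowler: precedential score

def top_authorities(candidates, k=10):
    # Tie-break on case id so the ordering is bit-exact across runs.
    return sorted(
        candidates,
        key=lambda c: (
            -authority_score(c["citations"], c["treatment"], c["signal"], c["fowler"]),
            c["case_id"],
        ),
    )[:k]
```

No randomness, no floating-point ordering ambiguity beyond the score itself, and a stable tie-break: rebuilding the pack next year selects the same authorities.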

Optimized Summarization

Including full text for every cited case would explode the context window (>400k characters). Instead, we use Authoritative Summaries:

  • SCOTUS Syllabi — official summaries prepared by the Reporter of Decisions; high-quality, court-approved abstracts. Avg length: 1,800 chars.
  • CAP Head Matter — editorial summaries and headnotes from published reporters, widely used by legal professionals. Avg length: 1,300 chars.
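A minimal sketch of the selection and budgeting logic this implies: prefer the court-approved syllabus when one exists, fall back to head matter, and check the resulting character budget. Field names and the full-text average are illustrative assumptions.

```python
SYLLABUS_AVG = 1_800     # avg SCOTUS syllabus length (chars), from the text
HEAD_MATTER_AVG = 1_300  # avg CAP head matter length (chars), from the text

def best_summary(case):
    """Prefer the court-approved syllabus; fall back to editorial head matter.
    The dict keys here are hypothetical, not the actual pack schema."""
    if case.get("syllabus"):
        return case["syllabus"]
    if case.get("head_matter"):
        return case["head_matter"]
    return None

def pack_chars(n_syllabi, n_head_matter):
    """Rough character budget for the cited-authorities document (DOC2)."""
    return n_syllabi * SYLLABUS_AVG + n_head_matter * HEAD_MATTER_AVG
```

For example, twenty summarized authorities fit in tens of thousands of characters, versus the >400k characters that full opinion texts would require.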

Directory Structure

rps/
  rpv2__1963-130/          # Unique deterministic ID
    research_pack.json     # Sealed Payload
    rp_manifest.json       # Audit Trail
    rp_sha256.txt          # Verification Hash
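Verifying a pack's integrity then reduces to recomputing the hash and comparing it to the recorded one. This sketch assumes rp_sha256.txt stores a bare hex digest covering research_pack.json; the actual hashing scope is not specified here.

```python
import hashlib
from pathlib import Path

def verify_pack(pack_dir):
    """Recompute SHA-256 of the sealed payload and compare to rp_sha256.txt.
    Assumption: the recorded digest covers research_pack.json only."""
    pack_dir = Path(pack_dir)
    payload = (pack_dir / "research_pack.json").read_bytes()
    expected = (pack_dir / "rp_sha256.txt").read_text().split()[0].strip()
    return hashlib.sha256(payload).hexdigest() == expected
```

A failed comparison means the payload was altered after sealing, so any score produced from it is not comparable to the original run.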