How Atomic Differs from Agentic

Agentic evaluation chains skills sequentially (S1 → S7), where errors propagate downstream. A model with 90% accuracy per skill achieves only ~48% end-to-end success (0.9^7 ≈ 0.478).
Atomic evaluation scores each skill independently, reporting e.g. "90% on S1, 70% on S5, 85% on S7" without cascading failures.
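The compounding arithmetic behind that contrast can be sketched in a few lines (the per-skill accuracies here are the illustrative figures from the text, not measured results):

```python
# Agentic: end-to-end success requires every skill in the chain to succeed,
# so per-skill accuracies multiply.
per_skill_accuracy = 0.90
num_skills = 7
end_to_end = per_skill_accuracy ** num_skills
print(f"Agentic end-to-end: {end_to_end:.1%}")  # ~47.8%

# Atomic: each skill is reported on its own, so one weak skill
# (e.g. 70% on S5) does not drag down the other skills' scores.
atomic_scores = {"S1": 0.90, "S5": 0.70, "S7": 0.85}
for skill, score in atomic_scores.items():
    print(f"{skill}: {score:.0%}")
```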

The 7 Skills

S1: Known Authority Retrieval
Given a citation or case name, return exact citation, full case name, and SCOTUS term.
Type: Deterministic
Ground Truth: SCDB
Scoring: Exact match
Output: {us_cite, case_name, term}
S2: Unknown Authority Retrieval
Given a case, predict which subsequent cases cite it. Tests citation network awareness.
Type: Deterministic
Ground Truth: Shepard's
Scoring: MRR, hit@10
Output: Ranked citing cases
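The two S2 metrics are standard ranking measures; a minimal sketch (with hypothetical case names standing in for real citing cases):

```python
def reciprocal_rank(ranked, relevant):
    """1/rank of the first relevant item in the ranked list, 0 if none appears."""
    for rank, case in enumerate(ranked, start=1):
        if case in relevant:
            return 1.0 / rank
    return 0.0

def hit_at_k(ranked, relevant, k=10):
    """1 if any relevant item appears in the top k predictions, else 0."""
    return int(any(case in relevant for case in ranked[:k]))

predicted = ["Case A", "Case B", "Case C"]   # model's ranked guesses
actual_citing = {"Case B"}                   # ground-truth citing case
print(reciprocal_rank(predicted, actual_citing))  # 0.5 (first hit at rank 2)
print(hit_at_k(predicted, actual_citing))         # 1
```

MRR is then the mean of `reciprocal_rank` across all evaluated cases.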
S3: Validate Authority
Determine if authority remains good law. Identify overruling case and year if applicable.
Type: Deterministic
Ground Truth: scotus_overruled_db
Scoring: Exact + partial credit
Output: {is_overruled, overruling_case, year}
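"Exact + partial credit" can be sketched as graded field matching. The weights below are illustrative assumptions, not the benchmark's actual values:

```python
def score_s3(pred, gold):
    # Wrong overruled/good-law status: no credit at all.
    if pred["is_overruled"] != gold["is_overruled"]:
        return 0.0
    # Correctly identified still-good law: full credit, nothing else to match.
    if not gold["is_overruled"]:
        return 1.0
    score = 0.5                                            # correct overruled status
    if pred.get("overruling_case") == gold["overruling_case"]:
        score += 0.3                                       # named the overruling case
    if pred.get("year") == gold["year"]:
        score += 0.2                                       # correct year
    return score

gold = {"is_overruled": True, "overruling_case": "Brown v. Board", "year": 1954}
pred = {"is_overruled": True, "overruling_case": "Brown v. Board", "year": 1955}
print(score_s3(pred, gold))  # 0.8 (status and case right, year off)
```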
S4: Fact Extraction
Extract disposition, prevailing party, and holding summary from majority opinion text.
Type: Deterministic
Ground Truth: SCDB metadata
Scoring: Closed enum match
Output: {disposition, party_winning, holding}
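A closed-enum match means each extracted field must exactly equal one of a fixed set of allowed values. The vocabularies below are illustrative placeholders, loosely modeled on SCDB-style coding rather than the actual codebook:

```python
# Hypothetical closed vocabularies for the enum-matched fields.
DISPOSITIONS = {"affirmed", "reversed", "vacated", "reversed and remanded"}
PARTIES = {"petitioner", "respondent"}

def score_s4(pred, gold):
    """Fraction of enum fields that exactly match the gold label."""
    assert gold["disposition"] in DISPOSITIONS
    assert gold["party_winning"] in PARTIES
    fields = ("disposition", "party_winning")
    correct = sum(pred.get(f) == gold[f] for f in fields)
    return correct / len(fields)

gold = {"disposition": "reversed", "party_winning": "petitioner"}
pred = {"disposition": "reversed", "party_winning": "respondent"}
print(score_s4(pred, gold))  # 0.5
```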
S5: Distinguish Cases
Determine whether a citing case agrees with the precedent or distinguishes it. Two variants: S5:cb (closed-book) and S5:rag (with the citing case's text provided).
Type: Deterministic
Ground Truth: Shepard's edge.agree
Scoring: Binary match
Output: {agrees, reasoning}
S6: IRAC Synthesis
Write an IRAC-structured (Issue, Rule, Application, Conclusion) legal analysis integrating the outputs of all prior skills. Tests synthesis capability.
Type: Rubric-Based
Ground Truth: MEE rubric
Scoring: Weighted (0-1)
Output: {issue, rule, application, conclusion}
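Weighted 0–1 rubric scoring amounts to a weighted sum over per-component grades. The weights here are assumptions for illustration; the MEE-derived rubric may distribute them differently:

```python
# Illustrative component weights (must sum to 1.0).
WEIGHTS = {"issue": 0.20, "rule": 0.30, "application": 0.35, "conclusion": 0.15}

def score_s6(component_scores):
    """Weighted sum of per-component rubric grades, each in [0, 1]."""
    return sum(WEIGHTS[c] * component_scores[c] for c in WEIGHTS)

grades = {"issue": 1.0, "rule": 0.8, "application": 0.6, "conclusion": 1.0}
print(round(score_s6(grades), 3))  # 0.8
```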
S7: Citation Integrity
Verify all citations are real cases (no hallucinations). Professional responsibility gate.
Type: Hard Gate
Ground Truth: SCDB + fake_cases.csv
Scoring: Binary (pass/void)
Output: {citations_found, all_valid}
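A hard gate differs from a graded score: a single hallucinated citation voids the whole answer. A minimal sketch, with a small hypothetical set standing in for SCDB plus the fake-case list:

```python
# Stand-in for the real-case lookup (SCDB + fake_cases.csv in the benchmark).
KNOWN_REAL_CASES = {"Marbury v. Madison", "Brown v. Board of Education"}

def score_s7(citations_found):
    """Binary pass/void: valid only if every cited case is real."""
    all_valid = all(c in KNOWN_REAL_CASES for c in citations_found)
    return {"citations_found": citations_found, "all_valid": all_valid}

result = score_s7(["Marbury v. Madison", "Smith v. Totally Fake Corp."])
print(result["all_valid"])  # False: one fabricated citation voids the answer
```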

Documentation

Theoretical Foundation

The 7 skills derive from three authoritative sources on legal research competency: