Skill Reference
Complete specification for each of the 7 legal reasoning skills, with input/output schemas, ground truth sources, scoring logic, and worked examples.
S1: Known Authority Retrieval
Given metadata about a case, verify it exists and retrieve its details.
Purpose
S1 establishes the anchor case for the chain. This skill tests the model's ability to correctly identify and describe a real Supreme Court case from the Supreme Court Database (SCDB).
Input Schema
{
  "citation": "347 U.S. 483",
  "case_name_hint": "Brown v. Board of Education",
  "term_hint": 1954
}
Output Schema
{
  "us_cite": "347 U.S. 483",
  "case_name": "Brown v. Board of Education of Topeka",
  "term": 1954
}
Ground Truth & Scoring
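A minimal scoring sketch for S1, assuming an exact-match check of the output fields against the corresponding SCDB record (the helper name and the exact-match policy are assumptions, not taken from the codebase):

```python
def score_s1(answer: dict, record: dict) -> float:
    """Full credit only when citation, case name, and term all match the SCDB record."""
    fields = ("us_cite", "case_name", "term")
    return 1.0 if all(answer.get(f) == record.get(f) for f in fields) else 0.0
```

A production scorer might allow fuzzy matching on case names (e.g., "Brown v. Board of Education" vs. the full reported title); the sketch above is strict.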
Example
S2: Unknown Authority Retrieval
Given a cited case, predict which subsequent cases cite it.
Purpose
S2 tests the model's knowledge of legal citation networks. Given a case, the model must identify which subsequent cases have cited it, demonstrating awareness of legal precedent relationships.
Input Schema
{
  "cited_case": {
    "us_cite": "347 U.S. 483",
    "case_name": "Brown v. Board of Education",
    "term": 1954
  }
}
Output Schema
{
  "citing_cases": [
    { "us_cite": "349 U.S. 294", "case_name": "Brown II" },
    { "us_cite": "358 U.S. 1", "case_name": "Cooper v. Aaron" }
  ]
}
Ground Truth & Scoring
| Metric | Definition | Storage |
|---|---|---|
| MRR | Mean Reciprocal Rank | StepResult.score |
| hit@10 | Ground truth in top 10 | StepResult.correct |
| hit@1, hit@5, hit@20 | Additional hit metrics | parsed.metrics |
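The metrics in the table can be computed from the model's ranked list of citing cases; a sketch (function names are illustrative, not from the codebase):

```python
def reciprocal_rank(truth_cite: str, ranked_cites: list) -> float:
    """1/rank of the ground-truth citation, or 0.0 if it never appears."""
    try:
        return 1.0 / (ranked_cites.index(truth_cite) + 1)
    except ValueError:
        return 0.0

def hit_at_k(truth_cite: str, ranked_cites: list, k: int) -> bool:
    """True when the ground-truth citation appears in the top k predictions."""
    return truth_cite in ranked_cites[:k]
```

MRR is then the mean of `reciprocal_rank` over all evaluation cases.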
Example
S3: Validate Authority
Determine if a case has been overruled and identify the overruling case.
Purpose
S3 tests the model's knowledge of precedent validity. A competent legal AI must know when a case is no longer good law and identify the case that overruled it.
Input Schema
{
  "case": {
    "us_cite": "163 U.S. 537",
    "case_name": "Plessy v. Ferguson",
    "term": 1896
  }
}
Output Schema
{
  "is_overruled": true,
  "overruling_case": "Brown v. Board of Education",
  "year_overruled": 1954
}
Ground Truth & Scoring
| Condition | Score | Correct |
|---|---|---|
| Not overruled, model says not overruled | 1.0 | True |
| Overruled, model says overruled + correct year | 1.0 | True |
| Overruled, model says overruled + wrong year | 0.5 | False |
| Mismatch on is_overruled | 0.0 | False |
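The table translates directly into a scoring function; a sketch assuming the field names from the schemas above:

```python
def score_s3(pred: dict, truth: dict):
    """Return (score, correct) per the S3 scoring table."""
    if pred["is_overruled"] != truth["is_overruled"]:
        return 0.0, False          # mismatch on is_overruled
    if not truth["is_overruled"]:
        return 1.0, True           # both agree: not overruled
    if pred.get("year_overruled") == truth["year_overruled"]:
        return 1.0, True           # overruled, correct year
    return 0.5, False              # overruled, wrong year
```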
Example: Overruled
Example: Not Overruled
S4: Fact Extraction
Extract the disposition, prevailing party, and holding summary from the majority opinion.
Purpose
S4 tests the model's ability to read and extract structured information from legal text. This is a core legal AI skill: understanding case outcomes from opinion text.
Output Schema
{
  "disposition": "reversed and remanded",
  "party_winning": "petitioner",
  "holding_summary": "The Court held that segregation in public schools violates the Equal Protection Clause."
}
Disposition Enum (Closed)
- stay granted
- affirmed
- reversed
- reversed and remanded
- vacated and remanded
- affirmed and reversed in part
- affirmed and vacated in part
- affirmed and reversed in part and remanded
- vacated
- petition denied
- certification
Party Winning Enum
- petitioner - SCDB code 1
- respondent - SCDB code 0
- unclear - SCDB code 2
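The SCDB codes above normalize to the enum labels with a simple lookup. A sketch restating the list (the dict name is illustrative, and mapping unknown codes to "unclear" is an assumption, not documented behavior):

```python
# SCDB partyWinning code -> party_winning enum label, per the list above.
PARTY_WINNING = {1: "petitioner", 0: "respondent", 2: "unclear"}

def party_label(code: int) -> str:
    """Translate an SCDB code to its label; unknown codes fall back to 'unclear'."""
    return PARTY_WINNING.get(code, "unclear")
```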
Scoring
Example
S5: Distinguish
Determine whether the citing case agrees with or distinguishes the cited case.
Purpose
S5 is the core legal reasoning skill. It tests whether the model can determine the doctrinal relationship between two cases based on available information.
Two Variants
Output Schema
{
  "agrees": true,
  "reasoning": "The citing case follows the precedent because..."
}
Ground Truth
Example (S5:cb)
S6: IRAC Synthesis
Synthesize all prior outputs into an IRAC-structured legal analysis.
Purpose
S6 is the capstone skill. It tests whether the model can integrate information from prior steps into a coherent legal analysis using the Issue-Rule-Application-Conclusion (IRAC) framework.
Output Schema
{
  "issue": "Whether segregation in public schools violates the Equal Protection Clause...",
  "rule": "The Equal Protection Clause prohibits states from denying equal protection...",
  "application": "Applying this rule to the facts, segregation generates a feeling of inferiority...",
  "conclusion": "Therefore, the Court concludes that 'separate but equal' has no place."
}
Rubric-Based Scoring
| Component | Weight | Criteria |
|---|---|---|
| Issue | 20% | Clear, correctly framed legal question |
| Rule | 25% | Accurate statement of legal rule from case |
| Application | 35% | Logical application with citation support |
| Conclusion | 20% | Consistent with analysis, cites outcome |
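Under the rubric, the overall S6 score is a weighted sum of the four component scores; a sketch assuming each component is graded in [0, 1] (the names are illustrative):

```python
# Weights per the rubric table; they sum to 1.0.
IRAC_WEIGHTS = {"issue": 0.20, "rule": 0.25, "application": 0.35, "conclusion": 0.20}

def irac_score(component_scores: dict) -> float:
    """Weighted sum of per-component rubric scores."""
    return sum(w * component_scores[part] for part, w in IRAC_WEIGHTS.items())
```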
S7 Gating (in L10 Agentic)
In L10 Agentic, S6 results can be voided by S7 if fabricated citations are detected. The voided flag is set, but status remains "OK": the step executed and was invalidated post hoc. In L10 Atomic, skills are evaluated independently, without gating.
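The gating rule can be sketched as follows. The `StepResult` fields here are assumptions inferred from the `StepResult.score` / `StepResult.correct` references in the S2 metric table, and the function name is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class StepResult:
    status: str = "OK"     # execution status; gating never changes this
    voided: bool = False   # set post hoc when S7 detects fabrication

def apply_s7_gate(s6: StepResult, s7_all_valid: bool, mode: str) -> StepResult:
    """In L10 Agentic, a failed S7 check voids S6; status stays 'OK'."""
    if mode == "agentic" and not s7_all_valid:
        s6.voided = True   # invalidated post hoc; the step still executed
    return s6
```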
S7: Citation Integrity
Verify that all citations in the S6 output are real cases (no hallucinations).
Purpose
S7 is the hallucination gate. Fabricating citations is a serious professional ethics violation. A single fake citation results in failure and, in L10 Agentic, voids the entire S6 analysis.
Output Schema
{
  "citations_found": [
    { "cite": "347 U.S. 483", "exists": true },
    { "cite": "384 U.S. 436", "exists": true },
    { "cite": "999 U.S. 999", "exists": false }
  ],
  "all_valid": false
}
Ground Truth Sources
- fake_cases.csv - Known fabricated citations
- scdb_sample.csv - Known real citations
Verification Logic
- Check if citation is in fake_cites set -> False
- Check if citation is in real_cites set -> True
- Unknown citation -> False (conservative)
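The three rules above amount to a small lookup; a sketch, with the set arguments standing in for the contents of fake_cases.csv and scdb_sample.csv:

```python
def verify_citation(cite: str, real_cites: set, fake_cites: set) -> bool:
    """Known-fake -> False; known-real -> True; unknown -> False (conservative)."""
    if cite in fake_cites:
        return False
    return cite in real_cites
```

The fake-set check runs first so that a citation accidentally present in both sets is still rejected.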
Metrics
| Metric | Definition |
|---|---|
| Void Rate | Chains with voided=True / total chains |
| Hallucination Rate | Citations with exists=False / total citations |
| Clean Rate | Chains with all_valid=True / total chains |
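Given the output schema above, the hallucination rate over a list of citation checks is one line; a sketch (the function name is illustrative):

```python
def hallucination_rate(citations_found: list) -> float:
    """Fraction of extracted citations flagged exists=False (0.0 for empty input)."""
    if not citations_found:
        return 0.0
    return sum(not c["exists"] for c in citations_found) / len(citations_found)
```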
Example: Failed Verification
- 347 U.S. 483 -> exists: true
- 999 U.S. 999 -> exists: false (fabricated!)