Hybrid IRAC Scoring
Blending deterministic chain consistency with expert-calibrated synthesis rubrics (MEE).
The Composite Method
Evaluating legal synthesis is notoriously difficult: correctness depends on nuanced reasoning, not surface form. LegalChain addresses this with a Composite Scoring Architecture that balances three distinct signals, minimizing the circularity risks of "LLM-as-a-Judge" evaluation.
Structure (Presence of IRAC) + Consistency (Fact Usage) + Quality (MEE Rubric)
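The blend above can be sketched as a weighted sum. Only two weights are fixed by this page (Quality at 50%, Chain Consistency at 40%); the 10% structure weight below is an assumption filling the remainder, and all function names are illustrative.

```python
# Hedged sketch of the Composite Scoring Architecture.
# ASSUMPTION: structure carries the remaining 10% weight; the text
# only fixes Quality at 50% and Chain Consistency at 40%.

WEIGHTS = {"structure": 0.10, "consistency": 0.40, "quality": 0.50}

def composite_score(structure: float, consistency: float, quality: float) -> float:
    """Blend the three signals; each input is normalized to [0, 1]."""
    for name, value in (("structure", structure),
                        ("consistency", consistency),
                        ("quality", quality)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (WEIGHTS["structure"] * structure
            + WEIGHTS["consistency"] * consistency
            + WEIGHTS["quality"] * quality)
```

A response with perfect structure and consistency but mediocre rubric quality still loses up to half its composite score, which is the point of weighting quality highest.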
MEE-Based Reasoning Rubric
The 50% "Reasoning Quality" signal is assessed using a rubric derived from the **Multistate Essay Examination (MEE)** professional standards. Our LLM-Judge is calibrated to evaluate:
Issue Identification
Does the model capture the precise legal conflict identified in S1?
Rule Statement
Is the legal principle extracted from the cited case accurate and complete?
Application / Analysis
The "Reasoning Bridge": Does the model logically connect facts to the rule?
Conclusion
Is the final outcome grounded in the preceding analysis?
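The four rubric dimensions above can be represented as a structured score. The dimension names follow the text; the equal weighting in `overall()` and the field layout are illustrative assumptions, not the calibrated judge's actual aggregation.

```python
# Sketch of the four MEE rubric dimensions as a structured score.
# ASSUMPTION: equal weighting across dimensions is illustrative only.
from dataclasses import dataclass

@dataclass
class MEEScore:
    issue_identification: float  # precise legal conflict from S1?
    rule_statement: float        # accurate, complete rule from the cited case?
    application: float           # the "Reasoning Bridge": facts connected to rule
    conclusion: float            # grounded in the preceding analysis?

    def overall(self) -> float:
        """Average the four dimensions into a single [0, 1] quality signal."""
        parts = (self.issue_identification, self.rule_statement,
                 self.application, self.conclusion)
        return sum(parts) / len(parts)
```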
Deterministic Anchoring
The 40% **Chain Consistency** component is deterministic. We check if the model used the specific metadata and holding codes extracted in steps S1 and S4. If a model writes a "correct" answer but ignores its own previous findings, it is penalized for lack of internal coherence—a hallmark of stochastic mimicry rather than logical reasoning.
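A minimal sketch of this deterministic check follows: it measures what fraction of the identifiers extracted in earlier steps actually reappear in the final answer. The parameter names (`s1_metadata`, `s4_holding_codes`) and the string-containment test are assumptions; the production check presumably matches structured codes rather than raw substrings.

```python
# Hedged sketch of the deterministic Chain Consistency signal:
# did the answer reuse the metadata and holding codes from S1 and S4?
# ASSUMPTION: identifiers are matched by simple substring containment.

def chain_consistency(answer: str,
                      s1_metadata: list[str],
                      s4_holding_codes: list[str]) -> float:
    """Fraction of previously extracted identifiers the answer reuses."""
    required = s1_metadata + s4_holding_codes
    if not required:
        return 1.0  # nothing to anchor against
    hits = sum(1 for token in required if token in answer)
    return hits / len(required)
```

A fluent answer that cites none of its own earlier findings scores 0.0 here regardless of rubric quality, which is how "stochastic mimicry" gets penalized.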