Run Artifacts

A view of the complete evidence chain generated for every evaluation run.

The Trace Lifecycle

In LegalChain, an evaluation "Run" is more than a final score. It is a persistent collection of **Run Artifacts** that lets researchers audit the specific logic and data used at every step, so that every high-level finding is backed by low-level transactional evidence.

Pre-Flight

Run Configuration ID

A unique fingerprint of model parameters, prompt versions, and data slices (e.g., CR-01-v3.0).
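
The page does not say how the fingerprint is derived. As a minimal sketch, assuming the configuration can be represented as a plain dictionary, a deterministic content hash could accompany a human-readable label such as CR-01-v3.0. The function name and the configuration keys below are hypothetical, not part of LegalChain.

```python
import hashlib
import json


def run_config_fingerprint(config: dict) -> str:
    """Return a short, deterministic hash of a run configuration (hypothetical helper)."""
    # Sorted keys and fixed separators give a canonical serialization,
    # so the same settings always hash to the same value.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Illustrative values only; real parameter names are not given in this page.
config_a = {
    "model": "example-model",
    "temperature": 0.0,
    "prompt_version": "v3.0",
    "data_slice": "slice-01",
}
# Same content with the keys in a different order.
config_b = dict(reversed(list(config_a.items())))
# One parameter changed.
config_c = {**config_a, "temperature": 0.7}

same = run_config_fingerprint(config_a) == run_config_fingerprint(config_b)
different = run_config_fingerprint(config_a) != run_config_fingerprint(config_c)
```

Key order does not affect the fingerprint, while any change to a parameter, prompt version, or data slice produces a different one.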

Execution

Langfuse Traces

Detailed traces of every LLM call, including raw tokens, latency, and the intermediate outputs of steps S1 through S5, stored as JSON objects.
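
To make the shape of such a record concrete, the sketch below models one trace as plain Python dataclasses and serializes it to JSON. It deliberately does not use the Langfuse SDK. The class names and field names are assumptions for illustration and do not reflect the actual Langfuse or LegalChain schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class StepRecord:
    """One intermediate step of a run (hypothetical schema)."""
    step: str           # step label, e.g. "S1" through "S5"
    output: dict        # the step's intermediate output
    latency_ms: float   # wall-clock latency of the LLM call
    token_count: int    # number of tokens consumed by the call


@dataclass
class RunTrace:
    """All step records that belong to one run configuration."""
    run_config_id: str
    steps: list = field(default_factory=list)

    def log_step(self, record: StepRecord) -> None:
        self.steps.append(record)

    def to_json(self) -> str:
        # asdict() recurses into the nested StepRecord instances.
        return json.dumps(asdict(self), sort_keys=True)


trace = RunTrace(run_config_id="CR-01-v3.0")
trace.log_step(StepRecord("S1", {"citation": "example-citation"}, 412.5, 230))
trace.log_step(StepRecord("S2", {"summary": "example-summary"}, 388.0, 198))

restored = json.loads(trace.to_json())
```

Round-tripping through JSON keeps every step's output, latency, and token count attached to the run configuration that produced it.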

Finalization

Delta Reports

The final synthesis artifact comparing S6 (Closed) and S7 (Open) performance metrics for that specific run.
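
A comparison of this kind can be sketched as a per-metric difference between the two result sets. The metric names, the sample values, and the `delta_report` function are hypothetical. The actual report format is not described here.

```python
def delta_report(closed: dict, open_: dict) -> dict:
    """Compare S6 (Closed) and S7 (Open) metrics for one run (hypothetical helper)."""
    # Only metrics present in both result sets can be compared.
    shared = sorted(closed.keys() & open_.keys())
    return {
        metric: {
            "s6_closed": closed[metric],
            "s7_open": open_[metric],
            # Positive delta: the open setting scored higher than the closed one.
            "delta": round(open_[metric] - closed[metric], 4),
        }
        for metric in shared
    }


# Illustrative numbers only, not real results.
s6_closed = {"accuracy": 0.5, "citation_f1": 0.25}
s7_open = {"accuracy": 0.75, "citation_f1": 0.5, "retrieval_hits": 1.0}

report = delta_report(s6_closed, s7_open)
```

Metrics that appear in only one of the two result sets, such as `retrieval_hits` here, are left out of the report rather than compared against a missing value.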

Governance via Constitution

Run artifacts are governed by the **LegalChain Constitution (v1.0)**, which enforces a strict "Proposed-First" development workflow. Under this workflow, run artifacts must be version-pinned, and they become immutable once a Certified Run (CR) is initiated.
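
One way to picture version pinning and immutability is a frozen record that stores a version string and a content digest, and that can later verify the content has not changed. This is a sketch under assumptions: the `PinnedArtifact` class and its fields are hypothetical and are not how the Constitution is actually enforced.

```python
import hashlib
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class PinnedArtifact:
    """A version-pinned artifact whose fields cannot be reassigned (hypothetical)."""
    name: str
    version: str
    sha256: str

    def matches(self, content: bytes) -> bool:
        # The artifact is unchanged only if its content still hashes to the pinned digest.
        return hashlib.sha256(content).hexdigest() == self.sha256


prompt_text = b"Example prompt template"
artifact = PinnedArtifact(
    name="prompt-template",
    version="v3.0",
    sha256=hashlib.sha256(prompt_text).hexdigest(),
)

try:
    artifact.version = "v3.1"  # any reassignment on a frozen dataclass raises
    mutated = True
except FrozenInstanceError:
    mutated = False

unchanged = artifact.matches(prompt_text)
tampered = artifact.matches(b"Edited prompt template")
```

The frozen dataclass blocks in-place edits to the record, and the digest check detects changes to the underlying content.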

Pillar: Static Labs

"The Lab Floor is provisioned with 8.7GB of static ResearchPacks, ensuring unified, open-book evaluation that is 100% executionally reproducible."

Pillar: Governance

"Full enterprise-grade observability is active via Langfuse, providing real-time tracing of the Reasoning Bridge and multiplicative error propagation."

Auditability

Because every step in the chain is logged alongside its associated **Ground Truth Contract**, researchers can pinpoint the step at which a model diverged from the expected output. Was it a failure of citation ID (S1), context extraction (S4), or doctrinal logic (S6)? The artifacts provide the answer.
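
The lookup described above can be sketched as a walk through the steps in order, returning the first one whose logged output differs from the expected value. The structure of a Ground Truth Contract is not specified in this page, so the sketch assumes a simple mapping from step label to expected output. The function name and sample values are hypothetical.

```python
from typing import Optional

# Step labels in execution order; the labels come from this page, the list itself is assumed.
STEP_ORDER = ["S1", "S2", "S3", "S4", "S5", "S6", "S7"]


def first_divergence(outputs: dict, ground_truth: dict) -> Optional[str]:
    """Return the earliest step whose output differs from its expected value."""
    for step in STEP_ORDER:
        # Steps without an expected value are skipped rather than flagged.
        if step in ground_truth and outputs.get(step) != ground_truth[step]:
            return step
    return None


# Illustrative values only.
expected = {"S1": "cite-001", "S4": "passage-A", "S6": "holding-X"}
logged_ok = {"S1": "cite-001", "S4": "passage-A", "S6": "holding-X"}
logged_bad = {"S1": "cite-001", "S4": "passage-B", "S6": "holding-Y"}

clean_run = first_divergence(logged_ok, expected)
failed_at = first_divergence(logged_bad, expected)
```

In the second run both S4 and S6 differ from the expected values, but the function reports S4, since a later mismatch may simply be a consequence of the earlier one.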