Governance & scorecards

Every learned artifact in Cadenza — a LoRA adapter, a residual policy, a distilled student, a GRD-adapted VLA — passes through the same governance gate before it can be promoted. Governance is what turns “the loss went down” into a deployment decision you can trust.

The verdict

Each governed command measures the artifact locally, sends the raw metrics to the Cadenza API, and the API returns one of three verdicts:

| Verdict | Meaning | What happens |
| --- | --- | --- |
| `DEPLOY` | Passes the gate. | Promoted as the new baseline (with `--promote` / `--gate`). |
| `BLOCK` | Fails a safety or regression check. | Rolled back to the previous baseline, never promoted. |
| `NEEDS_DATA` | Insufficient coverage to decide. | Kept but not promoted; collect more examples and re-run. |

Verdicts are computed server-side. The client only measures; it never decides. That’s why the governed commands (`lora eval`, `lora finetune --gate`, `residual train/eval/bench`, `distill eval`, `vla grd/eval`) require sign-in: the API needs to authenticate the run and attribute the baseline to your account.
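As a rough illustration of the verdict table, the sketch below maps a server-returned verdict to the outcome on the client side. It is not Cadenza's client code: the `Verdict` enum, the `next_step` function, and the outcome strings are hypothetical names, and the behaviour of a `DEPLOY` verdict without `--promote` / `--gate` is an assumption.

```python
from enum import Enum


class Verdict(Enum):
    """The three verdicts the API can return (names taken from the table)."""
    DEPLOY = "DEPLOY"
    BLOCK = "BLOCK"
    NEEDS_DATA = "NEEDS_DATA"


def next_step(verdict: Verdict, promote: bool) -> str:
    """Map a verdict to a client-side outcome (hypothetical helper)."""
    if verdict is Verdict.DEPLOY:
        # Assumption: promotion only happens when the run asked for it
        # via --promote or --gate; otherwise the candidate is just kept.
        return "promote" if promote else "keep"
    if verdict is Verdict.BLOCK:
        # A blocked candidate is rolled back and never promoted.
        return "rollback"
    # NEEDS_DATA: keep the candidate, but do not promote it.
    return "keep"
```

Note that the function never computes a verdict itself. It only reacts to one, which mirrors the split between a measuring client and a deciding server.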

What gets scored

The scorecard dimensions depend on the artifact, but the shape is the same — a mix of fidelity, safety, coverage, stability, and regression:

| Stage | Scored on |
| --- | --- |
| LoRA adapter (`env lora`) | fidelity · safety · coverage · stability · regression |
| Residual policy (`env residual`) | success · collision · residual-sanity · regression |
| Distilled student (`env distill`) | teacher↔student gap · success · regression |
| GRD / VLA adapter (`env vla`) | governed by the λ / RL-budget / change-cap loop |

Promotion, rollback & baselines

Governance is stateful. Each project keeps a promoted baseline per artifact kind. When a new artifact earns `DEPLOY`, it replaces the baseline and the old one is snapshotted. When a candidate earns `BLOCK`, the gate restores the previous baseline automatically, so a bad run can never leave you worse off than before.

`--gate` (on `env lora finetune`) runs the scorecard inline and promotes or rolls back automatically as part of the training command.
`--promote` (on the eval commands) deploys the candidate only if it earns `DEPLOY`; a `BLOCK` is refused and rolled back.
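The baseline bookkeeping described above can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the real storage layer: `BaselineStore`, its fields, and the `apply` method are hypothetical, and artifacts are represented by plain string identifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BaselineStore:
    """Hypothetical model: one promoted baseline per artifact kind."""
    current: Dict[str, str] = field(default_factory=dict)
    snapshots: Dict[str, List[str]] = field(default_factory=dict)

    def apply(self, kind: str, candidate: str, verdict: str) -> Optional[str]:
        """Apply a verdict and return the baseline in effect afterwards."""
        if verdict == "DEPLOY":
            old = self.current.get(kind)
            if old is not None:
                # Snapshot the outgoing baseline before replacing it.
                self.snapshots.setdefault(kind, []).append(old)
            self.current[kind] = candidate
        # BLOCK: the candidate is discarded, so the previous baseline stays
        # in effect (the net result of a rollback in this simplified model).
        # NEEDS_DATA: the candidate is kept elsewhere but not promoted.
        return self.current.get(kind)
```

In this model a `BLOCK` can never change `current`, which is the property the gate guarantees: a bad run leaves the promoted baseline exactly where it was.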

Steering the next round

For the closed-loop stages (`env residual train`, `env vla grd`), the API does more than judge: it steers. Each round it returns the hyperparameters and dials for the next round (the residual’s PPO schedule; GRD’s λ / RL-budget / change-cap), so the loop converges toward a deployable artifact under the gate rather than just optimizing a raw objective.

You don’t need to read or tune the schedule — the client surfaces opaque per-round progress and the final verdict. Governance is the product: a verdict you can act on, not a curve you have to interpret.
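To make the shape of the closed loop concrete, here is a minimal sketch of a client that trains, reports metrics, and takes its next-round dials from a judge. Everything in it is illustrative: `run_governed_loop`, `train_round`, and `judge` are hypothetical names, the judge stands in for the API, and stopping on the first `DEPLOY` or `BLOCK` is an assumption rather than documented behaviour.

```python
from typing import Callable, Dict, Tuple

Dials = Dict[str, float]    # e.g. lambda, RL budget, change cap (opaque here)
Metrics = Dict[str, float]  # raw measurements from one round


def run_governed_loop(
    train_round: Callable[[Dials], Metrics],
    judge: Callable[[Metrics], Tuple[str, Dials]],
    initial_dials: Dials,
    max_rounds: int = 10,
) -> Tuple[str, int]:
    """Run rounds until a final verdict; return (verdict, rounds used)."""
    dials = initial_dials
    verdict = "NEEDS_DATA"
    for round_idx in range(1, max_rounds + 1):
        # Client side: train with the given dials and measure, nothing more.
        metrics = train_round(dials)
        # Server side (stubbed): return a verdict plus the next round's dials.
        verdict, dials = judge(metrics)
        # Assumption: DEPLOY and BLOCK both end the loop.
        if verdict in ("DEPLOY", "BLOCK"):
            return verdict, round_idx
    return verdict, max_rounds
```

The client never inspects or adjusts `dials`; it only passes them from the judge back into the next training round, which is why the schedule can stay opaque.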
