Cadenza gives you three complementary ways to improve a project after you’ve run
it.
env finetune: export VLA training data
Convert a run log into (prompt, action, reward) records for your own
vision-language-action SFT or offline-RL pipeline.
cadenza env finetune rescue-dog .cadenza-env/<run-id>.log.jsonl -o train.jsonl
| Arg / flag | Description |
|---|---|
| <project> | Project directory. |
| <log> | A .log.jsonl produced by env run. |
| -o <file> | Output path for the rendered records. |
Prompts are rendered with the project’s vla_finetune.prompt_template (see the
schema).
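To make the conversion concrete, here is a minimal Python sketch of turning JSONL log lines into (prompt, action, reward) records. It is not Cadenza's implementation: the template text and the log field names (goal, observation, action, reward) are hypothetical stand-ins, since the real ones come from the project's vla_finetune.prompt_template and the log written by env run.

```python
import json
from typing import Iterable, Iterator

# Hypothetical template; the real one is the project's
# vla_finetune.prompt_template.
PROMPT_TEMPLATE = "Goal: {goal}\nObservation: {observation}\nNext action:"


def render_records(lines: Iterable[str],
                   template: str = PROMPT_TEMPLATE) -> Iterator[dict]:
    """Yield one (prompt, action, reward) record per log line that has an action."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        # Assumption: lines without an "action" key (e.g. run metadata)
        # are not training examples and are skipped.
        if "action" not in event:
            continue
        prompt = template.format(
            goal=event.get("goal", ""),
            observation=event.get("observation", ""),
        )
        yield {
            "prompt": prompt,
            "action": event["action"],
            "reward": event.get("reward", 0.0),
        }


# Two made-up log lines: one step event and one metadata event.
sample_log = [
    '{"goal": "find the victim", "observation": "rubble ahead", '
    '"action": "crawl_forward 0.5", "reward": 1.0}',
    '{"event": "run_started"}',
]
records = list(render_records(sample_log))
for record in records:
    print(json.dumps(record))
```

Writing one JSON object per line keeps the output in the same JSONL shape as the train.jsonl file named in the command above.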
env train: rewrite the system prompt
Runs a Groq LLM-as-Judge over the project’s cached runs and rewrites the
project’s SYSTEM_PROMPT to fix the failure modes it finds.
export GROQ_API_KEY="gsk_..."
cadenza env train rescue-dog
env lora: fine-tune and govern the action head
Fine-tunes the cadenza-lab LoRA action head for a project (on the project’s
own base/VLA if it ships a lora_encoder.py), then governs it with a scorecard.
Once trained, drive a mission with it via env run --policy lora.
Requires the lora extra: pip install -e ".[lora]" (installs torch).
| Subcommand | What it does |
|---|---|
| env lora add <project> "<goal>" --steps '<...>' [--image PATH] | Add a goal→action training example (optionally with visual context). |
| env lora data <project> [--finetune PATH] | Show the current training dataset. |
| env lora finetune <project> [--epochs N] [--lr LR] [--rank R] [--gate] | Generate goal→action data and fine-tune the adapter. --gate runs the governance scorecard and promotes or rolls back automatically. |
| env lora eval <project> [--promote] | Run the governance scorecard on the trained adapter. --promote deploys it if it passes. |
| env lora decode <project> "<goal>" | Decode a goal into actions using the trained adapter. |
Example
cadenza env lora add rescue-dog "enter the debris field" \
--steps 'walk_forward 1.0, crawl_forward 0.5'
cadenza env lora finetune rescue-dog --epochs 5 --rank 8 --gate
cadenza env lora decode rescue-dog "search for the victim"
cadenza env run rescue-dog --headless --policy lora
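The --steps string in the example reads as a comma-separated list of action names, each followed by a number. As an illustration only, the Python sketch below parses that shape into (action, value) pairs. The parse_steps helper is hypothetical, and both the grammar and the meaning of the number (duration, distance, or something else) are inferred from the single example above rather than taken from Cadenza.

```python
from typing import List, Tuple


def parse_steps(spec: str) -> List[Tuple[str, float]]:
    """Parse 'name value, name value' into a list of (name, value) pairs."""
    steps = []
    for chunk in spec.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # ignore a trailing comma or empty segment
        parts = chunk.split()
        if len(parts) != 2:
            raise ValueError(f"expected '<action> <value>', got {chunk!r}")
        name, value = parts
        steps.append((name, float(value)))
    return steps


steps = parse_steps("walk_forward 1.0, crawl_forward 0.5")
print(steps)  # [('walk_forward', 1.0), ('crawl_forward', 0.5)]
```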
Governance scorecard
env lora eval and env lora finetune --gate both score the adapter on fidelity,
safety, coverage, stability, and regression, then produce a verdict with
next-step guidance:
| Verdict | Meaning |
|---|---|
| DEPLOY | Passes the gate. Safe to promote. |
| BLOCK | Fails a safety/regression check. Rolled back, not promoted. |
| NEEDS_DATA | Insufficient coverage. Collect more examples (env lora add). |
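One way to picture how the verdicts relate is the small Python sketch below. It is a hypothetical model, not Cadenza's gate: the 0-to-1 score scale, the "higher is better" convention, the threshold values, and the order of the checks are all assumptions. The source does not say how fidelity or stability affect the verdict, so the sketch records them without gating on them.

```python
from dataclasses import dataclass


@dataclass
class Scorecard:
    # Assumption: every score is in [0, 1] and higher is better.
    fidelity: float
    safety: float
    coverage: float
    stability: float
    regression: float


# Hypothetical thresholds, chosen only for illustration.
MIN_SAFETY = 0.95
MIN_REGRESSION = 0.90
MIN_COVERAGE = 0.60


def verdict(card: Scorecard) -> str:
    """Map a scorecard to DEPLOY, BLOCK, or NEEDS_DATA."""
    # Safety and regression failures take priority: never promote these.
    if card.safety < MIN_SAFETY or card.regression < MIN_REGRESSION:
        return "BLOCK"
    # Otherwise, thin coverage means more examples are needed first.
    if card.coverage < MIN_COVERAGE:
        return "NEEDS_DATA"
    # fidelity and stability are not gated here (unspecified in the source).
    return "DEPLOY"


good = Scorecard(fidelity=0.9, safety=0.99, coverage=0.8, stability=0.9, regression=0.95)
unsafe = Scorecard(fidelity=0.9, safety=0.50, coverage=0.8, stability=0.9, regression=0.95)
sparse = Scorecard(fidelity=0.9, safety=0.99, coverage=0.2, stability=0.9, regression=0.95)

for card in (good, unsafe, sparse):
    print(verdict(card))
```

Checking safety and regression before coverage reflects the table's wording that a BLOCK is rolled back, while NEEDS_DATA only asks for more examples.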