Examples

Complete, runnable walkthroughs. Each assumes cadenza is installed with the gym extra.

1. Build, run, and export a mission

# scaffold + inspect
cadenza mkdir rescue-dog
cadenza env show rescue-dog

# run headless, score automatically
cadenza env run rescue-dog --headless

# analyze and export training data
cadenza env stats rescue-dog
cadenza env finetune rescue-dog .cadenza-env/<run-id>.log.jsonl -o train.jsonl

You now have train.jsonl, a file of (prompt, action, reward) rows ready for supervised fine-tuning (SFT) of a vision-language-action (VLA) model.
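Before training on the export, it can help to sanity-check it. The sketch below is not part of cadenza; it assumes each line of train.jsonl is a JSON object with the keys prompt, action, and reward (inferred from the row description above, so adjust the key names if your export differs).

```python
import json
from pathlib import Path
from statistics import mean


def summarize_finetune(path):
    """Return (row_count, mean_reward, distinct_prompt_count) for a JSONL export.

    Assumes one JSON object per line with "prompt" and "reward" keys
    (an assumption based on the row description, not a documented schema).
    """
    rows = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                rows.append(json.loads(line))

    rewards = [float(row["reward"]) for row in rows]
    prompts = {row["prompt"] for row in rows}
    mean_reward = mean(rewards) if rewards else 0.0
    return len(rows), mean_reward, len(prompts)
```

Calling summarize_finetune("train.jsonl") after the export step returns the three summary values; an unexpectedly low row count or a single distinct prompt is a hint that the run ended early.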

2. Author a custom phase

Edit rescue-dog/env.json to add a phase that ends when the robot lingers near the victim marker:

{
  "name": "find_victim",
  "goal_prompt": "search the debris interior and stop next to the victim",
  "success_when": {"near_object": {"tag": "victim", "max_distance_m": 0.4, "for_ticks": 3}},
  "fail_when":    {"flipped": true},
  "reward": {
    "step_cost": -0.05,
    "distance_to_tag": {"tag": "victim", "weight": 0.5},
    "contact_with_rubble_penalty": -0.3,
    "phase_success_bonus": 3.0
  },
  "max_ticks": 40
}

Re-run to see it in action:

cadenza env run rescue-dog --headless

See the full environment schema for every predicate and reward term.
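To make the success_when condition above concrete, here is a minimal sketch of how a near_object predicate with for_ticks could be evaluated. It assumes the ticks must be consecutive and that the distance is compared with <=; both are guesses at the semantics, and the NearObject class is hypothetical, not cadenza's implementation.

```python
from dataclasses import dataclass


@dataclass
class NearObject:
    """Hypothetical evaluator for {"near_object": {..., "for_ticks": N}}."""

    max_distance_m: float
    for_ticks: int
    _streak: int = 0  # consecutive ticks spent within range so far

    def update(self, distance_m: float) -> bool:
        """Feed one tick's distance; return True once the phase would succeed."""
        if distance_m <= self.max_distance_m:
            self._streak += 1
        else:
            self._streak = 0  # assumption: leaving the radius resets the count
        return self._streak >= self.for_ticks
```

With max_distance_m=0.4 and for_ticks=3, a robot that brushes past the marker for one tick does not finish the phase; it has to stay within 0.4 m for three ticks in a row under this reading.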

3. Create and reuse an action

cadenza login ada my-token

# a macro that enters and searches
cadenza env action create rescue-dog enter_and_search \
  --group --steps 'walk_forward,crawl_forward,turn_left'

cadenza env action list        # confirm it synced to your account

4. Fine-tune a LoRA policy and drive with it

# requires the lora extra: pip install -e ".[lora]"
cadenza env lora add rescue-dog "enter the debris field" \
  --steps 'walk_forward 1.0, crawl_forward 0.5'
cadenza env lora finetune rescue-dog --epochs 5 --rank 8 --gate
cadenza env lora eval rescue-dog
cadenza env run rescue-dog --headless --policy lora
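
The --steps strings in these walkthroughs are comma-separated action names, each optionally followed by a number. If you generate them from your own tooling, a small parser helps catch typos early. This is a sketch under that reading of the format; parse_steps is a hypothetical helper, and what the number means (distance, duration, or something else) is not stated here, so it is kept as an opaque value.

```python
def parse_steps(spec, default=None):
    """Split a steps string into (action_name, value) pairs.

    Accepts 'walk_forward,crawl_forward' as well as
    'walk_forward 1.0, crawl_forward 0.5'. A step without a number
    gets `default` as its value.
    """
    steps = []
    for chunk in spec.split(","):
        parts = chunk.split()
        if not parts:  # tolerate trailing or doubled commas
            continue
        name = parts[0]
        value = float(parts[1]) if len(parts) > 1 else default
        steps.append((name, value))
    return steps
```

For example, parse_steps("walk_forward 1.0, crawl_forward 0.5") yields two pairs, one per step, which you can validate against your project's action list before passing the string to the CLI.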

5. Adapt a frozen base with residual RL, then distill it

# requires the rl extra: pip install -e ".[rl]"
cadenza login ada my-token

# scaffold + profile the residual head on the frozen base
cadenza env residual init rescue-dog --alpha 0.15

# governed PPO (the API steers + returns DEPLOY|BLOCK|NEEDS_DATA)
cadenza env residual train rescue-dog

# benchmark it against full RL on compute / cost / accuracy
cadenza env residual bench rescue-dog

# distill into a compact, int8, base-free onboard student + govern it
cadenza env distill rescue-dog --quantize
cadenza env distill eval rescue-dog --promote

See Residual RL & distillation.
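
For intuition about --alpha, residual RL commonly adds a small learned correction to the output of a frozen base policy. The sketch below shows that general idea, assuming alpha scales the residual and the result is clipped to an action range; whether cadenza combines the two in exactly this way is an assumption, and residual_action and its bounds are hypothetical.

```python
def residual_action(base_action, residual, alpha=0.15, low=-1.0, high=1.0):
    """Combine a frozen base action with a scaled residual correction.

    final = clip(base + alpha * residual, low, high), element by element.
    The clipping range is an illustrative assumption.
    """
    if len(base_action) != len(residual):
        raise ValueError("base_action and residual must have the same length")
    return [
        min(high, max(low, b + alpha * r))
        for b, r in zip(base_action, residual)
    ]
```

A small alpha keeps the adapted policy close to the base behavior: with alpha=0.15, even a residual at its extreme can shift each action component by at most 0.15.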

6. Adapt a standalone VLA with GRD

# requires the lora extra (plus the vla extra for a real model)
cadenza env vla init reach-arm --robot franka --model default
cadenza env vla data reach-arm add "reach the red block" \
  --steps 'move_forward 1.0, lower_arm 0.5, close_gripper'
cadenza env vla grd reach-arm --rounds 4
cadenza env vla eval reach-arm --promote

See VLA mode & GRD.

7. Drive a mission from Python

Take any project built above and run it with your own policy via the SDK:

from pathlib import Path
from cadenzalabs.customenv import CustomEnv, load_config

# load the mission definition and build a headless environment
cfg = load_config(Path("rescue-dog/env.json"))
env = CustomEnv(cfg, headless=True)

obs, info = env.reset()
done = False
while not done:
    # your_policy is a placeholder: supply your own callable
    action = your_policy(obs, info["goal_prompt"])   # ActionCall, dict, or str
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

# export the episode as fine-tuning data
env.write_finetune(Path("train.jsonl"))

Full SDK guide: Custom environments.
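
The loop above leaves your_policy for you to define. As a starting point, here is a minimal scripted stand-in that ignores the observation and cycles through a fixed list of action names. The names are borrowed from the macro in example 3; whether your mission accepts them as bare strings is an assumption, and make_scripted_policy is a hypothetical helper, not part of the SDK.

```python
from itertools import cycle


def make_scripted_policy(actions):
    """Build a policy that returns the given action names in a repeating order."""
    sequence = cycle(actions)

    def policy(obs, goal_prompt):
        # A real policy would condition on obs and goal_prompt;
        # this placeholder ignores both.
        return next(sequence)

    return policy


# Hypothetical action names, taken from the enter_and_search macro above.
your_policy = make_scripted_policy(["walk_forward", "crawl_forward", "turn_left"])
```

A scripted policy like this is mainly useful for checking that the environment steps, terminates, and exports cleanly before you plug in a learned model.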
