VLA-only mode is the path for adapting a vision-language-action model without the full Cadenza mission stack. A VLA project is just two things in a vla.json: a model and an action library. The GRD loop then adapts only the model’s LoRA adapter — the VLA itself stays frozen — in a single governed pass that mixes imitation and RL.
The GRD loop needs the lora extra (pip install -e ".[lora]"). Loading a real downloaded model (smolvla / hf / local) additionally needs the vla extra (pip install -e ".[vla]", which adds transformers + lerobot). The governed commands require sign-in.

env vla init: scaffold a model + action library

cadenza env vla init reach-arm --robot franka --model default
Flag                                Default   Purpose
--robot <name>                      go1       Robot whose action library to attach.
--model default|smolvla|hf|local    default   Which backbone to adapt. default uses the built-in stable encoder; the others load a real VLA.
--model-id <repo|path>                        HF repo id (hf) or local path (local/smolvla).
--force                             off       Overwrite an existing vla.json.
Writes reach-arm/vla.json. Inspect it any time with env vla show reach-arm.

env vla data: author goal→action pairs

GRD’s imitation signal comes from goal→action examples you author against the model’s action library.
# add one pair
cadenza env vla data reach-arm add "reach the red block" \
  --steps 'move_forward 1.0, lower_arm 0.5, close_gripper'

# list the whole dataset
cadenza env vla data reach-arm
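
Authoring pairs one command at a time gets tedious for larger datasets. The sketch below wraps the documented add command in a loop over a tab-separated file. The function name add_pairs, the CADENZA override variable, and the "goal<TAB>steps" file layout are assumptions made for illustration; they are not part of the Cadenza CLI.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the CLI): add every "goal<TAB>steps" line
# of a file to a VLA project using the documented `env vla data ... add` call.
# Set CADENZA=echo for a dry run that only prints the argument lists.
add_pairs() {
  local project="$1" pairs_file="$2"
  local goal steps
  while IFS=$'\t' read -r goal steps || [ -n "$goal" ]; do
    # Skip blank lines and lines starting with '#'.
    if [ -z "$goal" ]; then continue; fi
    case "$goal" in \#*) continue ;; esac
    # Redirect stdin so the command cannot consume the rest of the file.
    "${CADENZA:-cadenza}" env vla data "$project" add "$goal" --steps "$steps" < /dev/null
  done < "$pairs_file"
}
```

With the pairs saved in a file such as pairs.tsv, calling add_pairs reach-arm pairs.tsv adds them in order; prefixing the call with CADENZA=echo previews the commands without touching the project.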

env vla grd: the governed GRD loop

GRD = Govern · Refine · Decide. In one loop it fine-tunes the LoRA adapter on your goal→action data (imitation) and RL-tunes it, then asks the Cadenza API for a verdict. The API steers the loop — it sets three dials and returns DEPLOY | BLOCK | NEEDS_DATA:
Dial                What it controls
--lambda <l>        The imitation ↔ RL mix.
--rl-budget <n>     How much RL the round is allowed.
--change-cap <c>    How far the adapter may move from the frozen base in one round.
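
The source does not give formulas for these dials. As intuition only, one conventional way to read a mix weight and a change cap is a weighted objective with a bounded per-round update. Everything in the block below is an assumption, including the direction of the weighting (here λ = 1 means pure imitation), the choice of norm, and the symbols themselves; it is not Cadenza's documented objective.

```latex
% Assumed illustration, not the documented GRD objective.
% \theta: adapter parameters, \theta_{\text{start}}: their values when the round begins,
% \lambda \in [0, 1]: the --lambda dial, c: the --change-cap dial.
\mathcal{L}(\theta) = \lambda\,\mathcal{L}_{\text{imit}}(\theta)
  + (1 - \lambda)\,\mathcal{L}_{\text{RL}}(\theta),
\qquad \text{subject to } \lVert \theta - \theta_{\text{start}} \rVert \le c
```

Under this reading, --rl-budget would bound how much RL experience contributes to the RL term in a given round.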
cadenza env vla grd reach-arm --rounds 4
Flag                Default        Purpose
--lambda <l>        API-steered    Imitation/RL mix (the API overrides per round).
--rl-budget <n>     API-steered    RL budget per round.
--change-cap <c>    API-steered    Per-round adapter change cap.
--rounds <r>                       Max GRD rounds before stopping.
--device <d>        auto           Training device.
Only the LoRA adapter is trained; the VLA stays frozen throughout.

env vla eval: one-shot governance

Score the current adapter without running a full loop:
cadenza env vla eval reach-arm --promote
Returns the same DEPLOY | BLOCK | NEEDS_DATA verdict (see Governance); --promote deploys the adapter if it passes.
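
In a script it can help to branch on the verdict. The sketch below assumes the verdict keyword appears verbatim somewhere in the command's standard output, which the source does not confirm; the output format and exit codes are undocumented here. The helpers extract_verdict and next_step are hypothetical names, and the suggested follow-ups are one plausible reading of each verdict.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the first verdict keyword found on stdin,
# or UNKNOWN if none is present. ASSUMPTION: the CLI prints the keyword
# verbatim as a whole word (so "BLOCKED" does not count as BLOCK).
extract_verdict() {
  local verdict
  verdict="$(grep -owE 'DEPLOY|BLOCK|NEEDS_DATA' | head -n 1 || true)"
  printf '%s\n' "${verdict:-UNKNOWN}"
}

# Hypothetical helper: suggest a follow-up for each verdict.
next_step() {
  case "$1" in
    DEPLOY)     echo "adapter passed; rerun eval with --promote to deploy" ;;
    NEEDS_DATA) echo "author more pairs with env vla data, then rerun grd" ;;
    BLOCK)      echo "adapter blocked; do not promote" ;;
    *)          echo "no verdict found in output" ;;
  esac
}

# Usage (requires the cadenza CLI and sign-in):
#   output="$(cadenza env vla eval reach-arm || true)"  # exit codes are undocumented
#   next_step "$(printf '%s\n' "$output" | extract_verdict)"
```

Running eval without --promote first and promoting only after a DEPLOY verdict keeps the deployment step explicit.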

Full loop

cadenza env vla init reach-arm --robot franka --model default
cadenza env vla data reach-arm add "reach the red block" \
  --steps 'move_forward 1.0, lower_arm 0.5, close_gripper'
cadenza env vla grd reach-arm --rounds 4
cadenza env vla eval reach-arm --promote
VLA-only mode (env vla) and the LoRA action-head workflow (env lora) both fine-tune a LoRA adapter, but they are different entry points. env lora adapts the action head inside a full env.json mission, while env vla adapts a standalone model with no mission spec: just a model and an action library.