This tutorial walks through a complete project end to end: a Go1 navigates a scene with obstacles to a target waypoint and sits when it arrives, driven by your own world model through the inference stack. Every snippet except the optional viewer in step 4 is copy-paste runnable and headless (no display needed), so it also works in CI. The captured output in step 3 is from running these exact files.

0. Set up

pip install cadenza-lab
mkdir patrol && cd patrol

The finished project contains two files:
patrol/
├── policy.py      # the world model + a perception modality
└── run_patrol.py  # builds the scene and runs the loop

1. The world model (policy.py)

The model turns each observation into a named action. Proximity is a modality that injects target_dist into every observation. HeadingPolicy reads it, turns toward the target, walks, and sits when within 0.6 m.
policy.py
"""policy.py: a heuristic world model and a perception modality for the patrol."""
import math
from pathlib import Path
from cadenza_lab import (WorldModelAdapter, AdapterReply, ProposedAction,
                         Modality, ModalityResult)

TARGET = (-3.0, 0.0)   # where we want the robot to end up (Cadenza forward = -x)

class Proximity(Modality):
    """Adds 'target_dist' to every observation the model sees."""
    name = "proximity"
    def compute(self, observation) -> ModalityResult:
        d = math.hypot(TARGET[0] - observation.pos[0], TARGET[1] - observation.pos[1])
        return ModalityResult(keys={"target_dist": d}, summary=f"target {d:.2f}m away")

class HeadingPolicy(WorldModelAdapter):
    """Turn toward the target, walk, and sit when close enough."""
    name = "heading-policy"
    description = "Greedy go-to-target controller over the action vocabulary."

    @classmethod
    def detect(cls, root: Path):
        return None  # we pass this adapter in explicitly

    def propose_actions(self, observation, goal, vocabulary, history=None):
        dist = observation.get("target_dist", 9.0)
        pos, yaw = observation["pos"], observation["rpy"][2]
        if dist < 0.6:
            return AdapterReply(actions=[ProposedAction("sit")], done=True, note="arrived")
        desired = math.atan2(TARGET[1] - pos[1], TARGET[0] - pos[0])
        err = (desired - (yaw + math.pi) + math.pi) % (2 * math.pi) - math.pi
        if abs(err) > 0.4:
            name = "turn_left" if err > 0 else "turn_right"
            return AdapterReply(
                actions=[ProposedAction(name, {"rotation_rad": min(abs(err), 0.8)})],
                note=f"correct heading ({math.degrees(err):.0f}°)")
        return AdapterReply(
            actions=[ProposedAction("walk_forward", {"distance_m": min(dist, 1.0)})],
            note=f"advance ({dist:.2f}m to go)")
The stack enforces two rules: the first parameter of propose_actions must be named observation (it’s passed by keyword), and you must return an AdapterReply with done=True to end the loop. In addition, ProposedAction.name must be a real action; check the available names with cadenza.list_actions("go1").
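The decision logic in propose_actions can be exercised without the simulator. The following is a minimal sketch that mirrors the thresholds and the heading-error expression from policy.py; the names wrap_angle, heading_error, and decide are hypothetical helpers, not part of cadenza_lab, and the function returns plain tuples instead of AdapterReply so that it runs with the standard library alone.

```python
import math

TARGET = (-3.0, 0.0)  # same waypoint as policy.py


def wrap_angle(a):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def heading_error(pos, yaw, target=TARGET):
    """Signed angle from the robot's forward direction to the target.

    Forward is -x at yaw = 0, so the forward heading is yaw + pi.
    """
    desired = math.atan2(target[1] - pos[1], target[0] - pos[0])
    return wrap_angle(desired - (yaw + math.pi))


def decide(pos, yaw, target=TARGET):
    """Hypothetical stand-in for propose_actions: returns (action_name, params)."""
    dist = math.hypot(target[0] - pos[0], target[1] - pos[1])
    if dist < 0.6:                      # inside the arrival radius
        return ("sit", {})
    err = heading_error(pos, yaw, target)
    if abs(err) > 0.4:                  # heading is off: turn at most 0.8 rad
        name = "turn_left" if err > 0 else "turn_right"
        return (name, {"rotation_rad": min(abs(err), 0.8)})
    return ("walk_forward", {"distance_m": min(dist, 1.0)})


if __name__ == "__main__":
    print(decide((0.0, 0.0), 0.0))          # facing the target from the origin
    print(decide((0.0, 0.0), math.pi / 2))  # heading off by a quarter turn
    print(decide((-2.8, 0.1), 0.0))         # already inside the arrival radius
```

Keeping the arithmetic in small pure functions like these makes it straightforward to unit-test the thresholds before wiring them into an adapter.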

2. The mission (run_patrol.py)

Build a Scene, compile it to an XML file that the stack loads via xml_path, then run the loop.
run_patrol.py
"""run_patrol.py: build a scene, hand it to the stack, drive to the target."""
from pathlib import Path
import cadenza_lab as cadenza
from policy import HeadingPolicy, Proximity, TARGET

# 1. Build a world with two obstacles and compile it to a model the stack loads.
scene = (cadenza.Scene()
         .add_box(position=(-1.5, 0.5, 0.08), size=(0.15, 0.15, 0.08))
         .add_sphere(position=(-2.2, -0.4, 0.12), radius=0.12, rgba=(0.9, 0.3, 0.2, 1)))
xml = scene.compile(cadenza.Go1.model())

# 2. Run the perceive-reason-act loop, headless.
result = cadenza.stack.run(
    robot="go1",
    goal="patrol to the marker and sit",
    target=TARGET,
    world_model=HeadingPolicy,
    modalities=[Proximity()],
    xml_path=str(xml),
    headless=True,
    render_camera=False,
    max_iterations=15,
    verbose=True,
)

# 3. Report.
print("\n=== mission summary ===")
print("reached goal:", result.done)
print("actions executed:", result.total_actions)
print("final position:", result.final_observation.pos.round(2))
Path(xml).unlink(missing_ok=True)
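For CI, it can help to turn the report into a pass/fail check. The sketch below assumes only the two fields the script above already reads, result.done and result.final_observation.pos; the helper name mission_ok is hypothetical, and a SimpleNamespace stands in for the real result object so the example runs without the simulator.

```python
import math
from types import SimpleNamespace


def mission_ok(result, target, radius=0.6):
    """True if the run reported done and ended within `radius` metres of `target`."""
    x, y = result.final_observation.pos[:2]
    dist = math.hypot(target[0] - x, target[1] - y)
    return bool(result.done) and dist <= radius


# Stand-in for the object returned by cadenza.stack.run (an assumption for
# illustration; the real object may carry more fields).
fake_result = SimpleNamespace(
    done=True,
    final_observation=SimpleNamespace(pos=[-2.78, 0.22, 0.25]),
)

if __name__ == "__main__":
    print("mission ok:", mission_ok(fake_result, (-3.0, 0.0)))
```

In run_patrol.py, something like sys.exit(0 if mission_ok(result, TARGET) else 1) would then let a CI job fail when the robot does not arrive.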

3. Run it

python run_patrol.py
Output:
  [stack] world model: heading-policy (source=explicit, checkpoint=None)
  [stack] vocabulary: 21 actions for go1; goal="patrol to the marker and sit"
  [stack] modalities: proximity
  [stack]   target 3.01m away
  [stack] iter 1/15: 1 proposed | done=False | advance (3.01m to go)
  [stack]   -> walk_forward (~6.7s)
  [stack]   target 2.04m away
  [stack] iter 2/15: 1 proposed | done=False | advance (2.04m to go)
  [stack]   -> walk_forward (~6.7s)
  [stack]   target 1.09m away
  [stack] iter 3/15: 1 proposed | done=False | advance (1.09m to go)
  [stack]   -> walk_forward (~6.7s)
  [stack]   target 0.26m away
  [stack] iter 4/15: 1 proposed | done=True | arrived
  [stack]   -> sit (~3.0s)
  [stack] finished: 4 actions executed, done=True

=== mission summary ===
reached goal: True
actions executed: 4
final position: [-2.78  0.22  0.25]
The robot walked from the origin to ≈ (-2.78, 0.22), within the 0.6 m arrival radius of the (-3.0, 0.0) target, and sat.
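The arrival claim can be checked by hand from the numbers in the summary. This short calculation uses only the target from policy.py, the printed final position, and the 0.6 m radius:

```python
import math

TARGET = (-3.0, 0.0)       # from policy.py
final_xy = (-2.78, 0.22)   # x, y from the mission summary
ARRIVAL_RADIUS = 0.6       # threshold used in HeadingPolicy

# Planar distance between the final position and the target.
d = math.hypot(TARGET[0] - final_xy[0], TARGET[1] - final_xy[1])
print(f"{d:.2f} m from target; inside radius: {d < ARRIVAL_RADIUS}")
# prints: 0.31 m from target; inside radius: True
```

The log's 0.26 m reading was taken before the sit action ran, so it need not match the distance computed from the final position.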

4. Watch it (optional)

Everything above is headless. To see the same scene in the MuJoCo viewer, drop this beside run_patrol.py (needs a display):
watch.py
import cadenza_lab as cadenza

scene = (cadenza.Scene()
         .add_box(position=(-1.5, 0.5, 0.08), size=(0.15, 0.15, 0.08))
         .add_sphere(position=(-2.2, -0.4, 0.12), radius=0.12, rgba=(0.9, 0.3, 0.2, 1)))
cadenza.view(robot="go1", scene=scene)

Where to take it next

Swap in a real VLA

Replace HeadingPolicy.propose_actions with calls into your trained model.

Add perception

Add a vision/depth Modality and reason over observation['camera'].

Harder terrain

Add slopes, a snake of boxes, or dynamic (fixed=False) objects.

Ship to hardware

Drive the physical Go1 with the same action vocabulary.
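For the "Harder terrain" idea, one way to lay out a snake of boxes is to generate the positions first and then feed them to add_box. The sketch below is standard-library Python only; snake_positions and its parameters are hypothetical, and the spacing values are illustrative rather than tuned for the Go1.

```python
def snake_positions(n=6, x_start=-0.8, x_end=-2.6, offset=0.5, z=0.08):
    """Return n (x, y, z) box centres spaced along -x, alternating sides of y = 0."""
    positions = []
    for i in range(n):
        t = i / (n - 1)                          # 0.0 at the first box, 1.0 at the last
        x = x_start + t * (x_end - x_start)
        y = offset if i % 2 == 0 else -offset    # alternate left and right
        positions.append((round(x, 3), y, z))
    return positions


if __name__ == "__main__":
    for p in snake_positions():
        print(p)
    # Each tuple could then be passed to the scene builder used earlier,
    # e.g. scene.add_box(position=p, size=(0.15, 0.15, 0.08)).
```

Because HeadingPolicy steers straight at the target and ignores obstacles, boxes placed closer to the y = 0 line would likely require a policy that also reasons about obstacle positions.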