Custom environments

CustomEnv is the bridge between the CLI and your own code. It loads a project’s env.json and exposes a Gym-style loop: reset() returns (obs, info) and step() returns the 5-tuple (obs, reward, terminated, truncated, info). You can therefore run the same phase-aware mission with your own policy or VLA (vision-language-action model) and emit the same fine-tune records the CLI does. It lives in the CLI package:

from pathlib import Path
from cadenzalabs.customenv import CustomEnv, load_config

Drive a mission

cfg = load_config(Path("rescue-dog/env.json"))   # -> EnvConfig
env = CustomEnv(cfg, headless=True)

obs, info = env.reset()
done = False
while not done:
    # info["goal_prompt"] is the current phase's goal; feed it to your model.
    action = your_policy(obs, info["goal_prompt"])   # ActionCall, dict, or str
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

env.write_finetune(Path("train.jsonl"))   # same records as `env run`
env.write_log(Path("run.log.jsonl"))
env.close()                               # shut down the sim when finished

Constructor

CustomEnv(cfg, *, headless=True, xml_path=None, seed=0, adapter_factory=None)

| Param | Description |
| --- | --- |
| `cfg` | An `EnvConfig` from `load_config(...)`. |
| `headless` | Run without the viewer. |
| `xml_path` | Override the scene XML. |
| `seed` | RNG seed. |
| `adapter_factory` | Supply a custom sim adapter (advanced). |

Methods

| Member | Returns | Description |
| --- | --- | --- |
| `reset()` | `(obs, info)` | Start the mission at phase 0. |
| `step(action)` | `(obs, reward, terminated, truncated, info)` | Advance one tick. `action` may be an `ActionCall`, a dict, or a command string. |
| `current_phase()` | phase | The active phase object. |
| `records` (attribute) | list | Accumulated per-step records. |
| `mission_success()` | `bool` | Whether all phases were completed. |
| `judge_summary()` | dict | LLM-judge verdict for the rollout. |
| `signal_summary()` | dict | Reward/signal breakdown. |
| `write_finetune(path)` | None | Write `(prompt, action, reward)` rows via the project’s template. |
| `write_log(path)` | None | Write the full tick-by-tick log. |
| `close()` | None | Shut down the sim. |

The info dict from reset() / step() carries the current goal_prompt, phase index, and predicate state: everything your policy needs to act per phase.
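
One way to act per phase is to wrap your model so it notices when the goal changes. The sketch below relies only on `info["goal_prompt"]`, since the exact keys for the phase index and predicate state are not listed here. `PhaseAwarePolicy` and its `goals_seen` attribute are hypothetical names, and treating a changed prompt as a phase transition is an assumption.

```python
from typing import Any, Callable, List, Optional


class PhaseAwarePolicy:
    """Wrap a model callable and record each new goal prompt it is given."""

    def __init__(self, model: Callable[[Any, str], Any]) -> None:
        self._model = model
        self._last_goal: Optional[str] = None
        self.goals_seen: List[str] = []

    def __call__(self, obs: Any, goal_prompt: str) -> Any:
        if goal_prompt != self._last_goal:
            # Assumption: a different goal prompt means a new phase began.
            self.goals_seen.append(goal_prompt)
            self._last_goal = goal_prompt
        return self._model(obs, goal_prompt)


if __name__ == "__main__":
    # Toy model that just echoes the goal back as a command string.
    policy = PhaseAwarePolicy(lambda obs, goal: f"do: {goal}")
    for goal in ["find the dog", "find the dog", "return to base"]:
        policy(None, goal)
    print(policy.goals_seen)
```

An instance can be passed wherever the loop expects `your_policy`, for example `action = policy(obs, info["goal_prompt"])`. Two consecutive phases with identical prompts would not be distinguished by this approach; a phase index from `info` would be more robust where available.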

Build and iterate on missions with the CLI, then bring the finished env.json here to evaluate a real policy against it, with no re-authoring required.

Related config types in cadenzalabs.customenv: EnvConfig, Phase, FineTuneConfig, and EnvConfigError (raised on an invalid env.json).
