CLI Reference

llem has three commands (run, config, doctor) and one flag (--version).

llem run [CONFIG] [OPTIONS] # run an experiment or study
llem config [OPTIONS] # show environment and configuration status
llem doctor # verify Docker images match the host schema
llem --version # print version and exit

llem run

Run an experiment or study. Detects study mode automatically when the YAML config contains sweep: or experiments: keys.
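
For instance, a config is treated as a study as soon as it contains a sweep: or experiments: key. A minimal sketch (only the trigger key comes from the detection rule above; the other field names are assumed to mirror the CLI flags, not a confirmed schema):

# study.yaml: the sweep: key triggers study mode
model: gpt2
engine: pytorch
sweep:
  batch_size: [1, 8, 32]   # hypothetical sweep axis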

Arguments:

Argument   Type   Required   Description
config     path   no         Path to experiment or study YAML config

Options:

Flag                   Short   Type    Default   Description
--model                -m      str               Model name or HuggingFace path
--engine               -e      str               Inference engine (pytorch, vllm, tensorrt)
--dataset              -d      str               Dataset source (aienergyscore or .jsonl file path)
-n                             int               Number of prompts to run (dataset.n_prompts)
--batch-size                   int               Batch size (Transformers engine only)
--dtype                -p      str               Model dtype (float32, float16, bfloat16)
--output               -o      str               Output directory for results
--dry-run                      flag    false     Validate config and estimate VRAM without running
--quiet                -q      flag    false     Suppress progress bars
--verbose              -v      flag    false     Show detailed output and tracebacks
--cycles                       int               Number of measurement cycles (study mode)
--order                        str               Cycle ordering: sequential, interleave, shuffle (study mode)
--no-gaps                      flag    false     Disable thermal gaps between experiments (study mode)
--skip-preflight               flag    false     Skip Docker pre-flight checks (GPU visibility, CUDA/driver compatibility)
--resume                       flag    false     Resume most recent interrupted study
--resume-dir                   path              Resume a specific study directory
--fail-fast                    flag    false     Abort study on first failure (circuit breaker threshold=1)
--no-circuit-breaker           flag    false     Disable circuit breaker entirely
--timeout                      float             Study wall-clock timeout in hours (e.g. 24, 1.5)
--no-lock                      flag    false     Disable GPU lock files (advanced)

CLI effective defaults for study mode (applied when neither the YAML study_execution: block nor a CLI flag specifies the value):

  • --cycles defaults to 3 (Pydantic model default is 1)
  • --order defaults to shuffle (Pydantic model default is sequential)

These defaults are applied at the CLI layer to give better statistical coverage out of the box. To use the conservative model defaults, set them explicitly in the YAML study_execution: block.
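
For example, pinning the conservative model defaults in YAML might look like this (a sketch: the cycles/order key names are assumed to mirror the --cycles/--order flags):

study_execution:
  cycles: 1          # Pydantic model default; CLI would otherwise apply 3
  order: sequential  # Pydantic model default; CLI would otherwise apply shuffle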

Exit codes: 0 success, 1 experiment/engine/preflight error, 2 config validation error, 130 interrupted (Ctrl-C).


llem config

Show environment and configuration status. Always exits 0 — this command is informational only.

Options:

Flag        Short   Type   Default   Description
--verbose   -v      flag   false     Show driver version, engine versions, and full config diff

Example output:

GPU
NVIDIA A100-SXM4-80GB 80.0 GB
Engines
transformers: installed
vllm: not installed (runs in Docker — see docs/development.md)
tensorrt: not installed (runs in Docker — see docs/development.md)
Energy
Energy: nvml
Config
Path: /home/user/.config/llenergymeasure/config.yaml
Status: using defaults (no config file)
Python
3.12.0

llem doctor

Verify that every engine's resolved Docker image matches the host's ExperimentConfig schema. The check compares the llem.expconf.schema.fingerprint OCI label baked into each image against a fingerprint computed from the host's current ExperimentConfig.model_json_schema().

Image resolution follows the same chain as llem run: local build (llenergymeasure:{engine}) first, then the versioned GHCR tag (ghcr.io/henrycgbaker/llenergymeasure/{engine}:v{version}).
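
To read the label on a resolved image by hand, a standard docker inspect query works (a sketch using the local vllm tag from the resolution chain above):

docker inspect -f '{{ index .Config.Labels "llem.expconf.schema.fingerprint" }}' llenergymeasure:vllm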

Exit codes: 0 when every reachable image matches (or labels are absent on legacy images); 1 when at least one image's schema fingerprint differs from the host.

Columns:

Column    Meaning
Engine    Engine identifier (pytorch, vllm, tensorrt)
Image     Resolved image tag (local or GHCR)
Pkg ver   org.opencontainers.image.version label (llenergymeasure release)
Img FP    First 12 chars of the llem.expconf.schema.fingerprint label
Host FP   First 12 chars of the host ExperimentConfig fingerprint
Status    OK / MISMATCH / UNVERIFIED (pre-handshake image) / UNREACHABLE (no such image)

Bypass: LLEM_SKIP_IMAGE_CHECK=1 disables the runtime handshake in llem run; when set, llem doctor still reports the true status but prints a warning in the footer.
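
For example, to run with the handshake disabled for a single invocation:

LLEM_SKIP_IMAGE_CHECK=1 llem run study.yaml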

Example:

llem doctor
Engine Image Pkg ver Img FP Host FP Status
---------------------------------------------------------------------------------------------
pytorch llenergymeasure:transformers 0.9.0 a1b2c3d4e5f6 a1b2c3d4e5f6 OK
vllm llenergymeasure:vllm 0.9.0 9988776655ff a1b2c3d4e5f6 MISMATCH
└─ repull: docker pull vllm/vllm-openai:0.7.3
tensorrt llenergymeasure:tensorrt 0.9.0 a1b2c3d4e5f6 a1b2c3d4e5f6 OK

Host llenergymeasure version: 0.9.0
Host ExperimentConfig fingerprint: a1b2c3d4e5f6…

llem --version

Print version and exit.

llem --version

Example output:

llem v0.9.0

Examples

Single experiment via flags

llem run --model gpt2 -e pytorch

Single experiment via YAML

llem run experiment.yaml
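
A minimal experiment.yaml might look like the following sketch (key names are an assumption, mirroring the --model/--engine/--dataset flags rather than a confirmed schema):

# experiment.yaml: keys assumed to mirror the CLI flags above
model: gpt2
engine: pytorch
dataset: aienergyscore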

Dry run (validate config, estimate VRAM)

llem run experiment.yaml --dry-run

Study with cycle override

# Run 5 cycles in interleave order instead of the CLI default (3 shuffle)
llem run study.yaml --cycles 5 --order interleave

Suppress thermal gaps (testing only)

llem run study.yaml --no-gaps

Skip Docker pre-flight (when Docker daemon is on a remote host)

llem run study.yaml --skip-preflight

Resume an interrupted study

# Auto-detect most recent resumable study
llem run study.yaml --resume

# Resume a specific study directory
llem run study.yaml --resume-dir results/full-suite-all-engines_20260329_1716/

Fail-fast mode (abort on first failure)

llem run study.yaml --fail-fast

Set a wall-clock timeout

# Abort after 24 hours, mark remaining experiments as skipped
llem run study.yaml --timeout 24

Environment check

llem config
llem config --verbose