study-config
Configuration Reference
Full reference for all ExperimentConfig fields.
All fields except model are optional and have sensible defaults.
Sections:
- Top-Level Fields
- Warmup (warmup:)
- Baseline (baseline:)
- Transformers Engine (transformers:)
- Transformers Engine Params (transformers.engine_params:)
- Transformers Sampling Params (transformers.sampling_params:)
- vLLM Engine (vllm:)
- vLLM Engine Params (vllm.engine_params:)
- vLLM Sampling Params (vllm.sampling_params:)
- TensorRT-LLM Engine (tensorrt:)
- TensorRT-LLM Engine Params (tensorrt.engine_params:)
- TensorRT-LLM Sampling Params (tensorrt.sampling_params:)
- Harness Overrides (harness:)
- Transformers Harness (harness.transformers:)
Top-Level Fields
| Field | Type | Default | Description |
|---|---|---|---|
| task | TaskConfig | (see section) | Task configuration: model, dataset, workload shape |
| engine | Engine | (see section) | Inference engine |
| measurement | MeasurementConfig | (see section) | Measurement methodology: warmup, baseline, energy sampling |
| sampling_preset | 'deterministic' \| 'standard' \| 'creative' | | |
| transformers | Config \| None | null | |
| vllm | Config \| None | null | |
| tensorrt | Config \| None | null | |
| harness | HarnessConfig \| None | null | |
| passthrough_kwargs | dict \| None | null | |
Warmup (warmup:)
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Enable warmup phase |
| n_prompts | integer | 5 | Number of full-length warmup prompts in fixed mode |
| thermal_floor_seconds | number | 60.0 | Minimum seconds to wait after warmup before measuring (thermal stabilisation). Minimum 30s enforced. |
| convergence_detection | boolean | false | Enable CV-based adaptive convergence (governed by min_prompts, max_prompts, cv_threshold, window_size) |
| cv_threshold | number | 0.05 | CV target for convergence (only used when convergence_detection=True) |
| max_prompts | integer | 20 | Maximum warmup prompts when CV mode is on (safety cap) |
| window_size | integer | 3 | Sliding window size for CV calculation (3 balances responsiveness and stability) |
| min_prompts | integer | 5 | Minimum prompts before checking convergence (warm start) |
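The interaction between min_prompts, max_prompts, cv_threshold and window_size can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: it assumes the CV (coefficient of variation) is the population standard deviation divided by the mean of the last window_size per-prompt latencies, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean (assumed CV definition)."""
    m = mean(samples)
    return pstdev(samples) / m if m > 0 else float("inf")


def warmup_prompts_used(latencies, min_prompts=5, max_prompts=20,
                        cv_threshold=0.05, window_size=3):
    """Return (prompts_run, converged) for a sequence of warmup latencies.

    Hypothetical helper: defaults mirror the table above.
    """
    prompts_run = 0
    for prompts_run in range(1, min(len(latencies), max_prompts) + 1):
        # Check convergence only after min_prompts have run and a full window exists.
        if prompts_run >= max(min_prompts, window_size):
            window = latencies[prompts_run - window_size:prompts_run]
            if coefficient_of_variation(window) <= cv_threshold:
                return prompts_run, True
    # Either the latencies ran out or the max_prompts safety cap was reached.
    return prompts_run, False


settling = [2.0, 1.6, 1.3, 1.1, 1.05, 1.0, 1.0, 1.0]
noisy = [1.0, 2.0] * 15

print(warmup_prompts_used(settling))  # (6, True)
print(warmup_prompts_used(noisy))     # (20, False)
```

With the defaults, a latency series that settles quickly stops shortly after min_prompts, while a series that never stabilises runs until the max_prompts cap.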
Baseline (baseline:)
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Enable baseline power measurement |
| duration_seconds | number | 30.0 | Baseline measurement duration in seconds |
| strategy | 'cached' \| 'validated' \| 'fresh' | | |
| cache_ttl_seconds | number | 7200.0 | How long a cached baseline remains valid before re-measurement, in seconds. Only used with strategy='cached' or 'validated'. |
| validation_interval | integer | 5 | Re-validate baseline every N experiments. Only used with strategy='validated'. |
| drift_threshold | number | 0.1 | Power drift threshold (fraction) to trigger re-measurement. Only used with strategy='validated'. |
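One plausible reading of how the three strategies combine with cache_ttl_seconds, validation_interval and drift_threshold is sketched below. The function and argument names are hypothetical, and two points are assumptions rather than documented behaviour: that 'fresh' always re-measures, and that drift is the relative difference between a quick spot-check reading and the cached baseline.

```python
def relative_drift(cached_watts, current_watts):
    """Fractional change of the current reading relative to the cached baseline."""
    return abs(current_watts - cached_watts) / cached_watts


def should_remeasure(strategy, age_seconds, experiments_since_validation,
                     cached_watts, spot_check_watts,
                     cache_ttl_seconds=7200.0, validation_interval=5,
                     drift_threshold=0.1):
    """Decide whether a full baseline measurement is needed (illustrative only)."""
    if strategy == "fresh":
        # Assumption: 'fresh' never reuses a cached baseline.
        return True
    if cached_watts is None or age_seconds >= cache_ttl_seconds:
        # No cached value, or the cached value has outlived its TTL.
        return True
    if strategy == "validated" and experiments_since_validation >= validation_interval:
        # Every N experiments, compare a spot check against the cached value.
        return relative_drift(cached_watts, spot_check_watts) > drift_threshold
    return False


# A cached 50 W baseline, 100 s old, with a 56 W spot check (12% drift).
print(should_remeasure("cached", 100, 5, 50.0, 56.0))     # False
print(should_remeasure("validated", 100, 5, 50.0, 56.0))  # True
```

In this sketch, 'cached' only looks at the age of the baseline, whereas 'validated' additionally re-measures when the spot check drifts past the threshold.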
Transformers Engine (transformers:)
| Field | Type | Default | Description |
|---|---|---|---|
| engine_params | EngineParams \| None | null | |
| sampling_params | SamplingParams \| None | null | |
Transformers Engine Params (transformers.engine_params:)
| Field | Type | Default | Description |
|---|---|---|---|
| dtype | any \| None | null | |
| attn_implementation | any \| None | null | |
| load_in_4bit | any \| None | null | |
| load_in_8bit | any \| None | null | |
| bnb_4bit_compute_dtype | any \| None | null | |
| bnb_4bit_quant_type | any \| None | null | |
| bnb_4bit_use_double_quant | any \| None | null | |
| use_cache | boolean \| None | null | |
| cache_implementation | string \| None | null | |
| num_beams | integer \| None | null | |
| early_stopping | boolean \| None | null | |
| length_penalty | number \| None | null | |
| no_repeat_ngram_size | integer \| None | null | |
| prompt_lookup_num_tokens | integer \| None | null | |
| device_map | any \| None | null | |
| max_memory | any \| None | null | |
| low_cpu_mem_usage | any \| None | null | |
| tp_plan | any \| None | null | |
| tp_size | any \| None | null | |
Transformers Sampling Params (transformers.sampling_params:)
| Field | Type | Default | Description |
|---|---|---|---|
| temperature | number \| None | null | |
| do_sample | boolean \| None | null | |
| top_k | integer \| None | null | |
| top_p | number \| None | null | |
| repetition_penalty | number \| None | null | |
| min_p | number \| None | null | |
| min_new_tokens | integer \| None | null | |
vLLM Engine (vllm:)
| Field | Type | Default | Description |
|---|---|---|---|
| engine_params | EngineParams \| None | null | |
| sampling_params | SamplingParams \| None | null | |
vLLM Engine Params (vllm.engine_params:)
| Field | Type | Default | Description |
|---|---|---|---|
| dtype | 'auto' \| 'half' \| 'float16' | | |
| gpu_memory_utilization | number \| None | 0.9 | |
| cpu_offload_gb | number \| None | 0 | |
| block_size | integer \| None | null | |
| kv_cache_dtype | 'auto' \| 'float16' \| 'bfloat16' | | |
| enforce_eager | boolean \| None | false | |
| enable_chunked_prefill | boolean \| None | null | |
| max_num_seqs | integer \| None | null | |
| max_num_batched_tokens | integer \| None | null | |
| max_model_len | integer \| None | null | |
| tensor_parallel_size | integer \| None | 1 | |
| pipeline_parallel_size | integer \| None | 1 | |
| distributed_executor_backend | any \| None | null | |
| enable_prefix_caching | boolean \| None | null | |
| quantization | any \| None | null | |
| speculative_config | SpeculativeConfig \| None | null | |
| offload_group_size | integer \| None | 0 | |
| offload_num_in_group | integer \| None | 1 | |
| offload_prefetch_step | integer \| None | 1 | |
| offload_params | any \| None | [] | |
| disable_custom_all_reduce | boolean \| None | false | |
| kv_cache_memory_bytes | integer \| None | null | |
| compilation_config | CompilationConfig \| None | null | |
| attention | any \| None | null | |
| beam_search | any \| None | null | |
vLLM Sampling Params (vllm.sampling_params:)
| Field | Type | Default | Description |
|---|---|---|---|
| temperature | number \| None | 1.0 | |
| top_k | integer \| None | 0 | |
| top_p | number \| None | 1.0 | |
| repetition_penalty | number \| None | 1.0 | |
| min_p | number \| None | 0.0 | |
| min_tokens | integer \| None | 0 | |
| presence_penalty | number \| None | 0.0 | |
| frequency_penalty | number \| None | 0.0 | |
| ignore_eos | boolean \| None | false | |
| n | integer \| None | 1 | |
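The dotted section names used in this reference (for example vllm.engine_params:) describe nesting. The sketch below shows that mapping with a plain Python dictionary; it is not the library's loader, the on-disk config format is not shown in this reference, and the helper dotted_paths is hypothetical. The values are the defaults listed in the two vLLM tables above.

```python
def dotted_paths(section, prefix):
    """Yield (dotted_key, value) pairs for every leaf of a nested dict."""
    for key, value in section.items():
        path = f"{prefix}.{key}"
        if isinstance(value, dict):
            yield from dotted_paths(value, path)
        else:
            yield path, value


# Illustrative vllm section using defaults from the tables above.
vllm_section = {
    "engine_params": {
        "gpu_memory_utilization": 0.9,
        "tensor_parallel_size": 1,
        "enforce_eager": False,
    },
    "sampling_params": {
        "temperature": 1.0,
        "top_p": 1.0,
        "n": 1,
    },
}

paths = dict(dotted_paths(vllm_section, "vllm"))
for dotted_key, value in paths.items():
    print(f"{dotted_key} = {value}")
```

Each printed key, such as vllm.engine_params.gpu_memory_utilization, is the section name from this reference followed by the field name.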
TensorRT-LLM Engine (tensorrt:)
| Field | Type | Default | Description |
|---|---|---|---|
| engine_params | EngineParams \| None | null | |
| sampling_params | SamplingParams \| None | null | |
TensorRT-LLM Engine Params (tensorrt.engine_params:)
| Field | Type | Default | Description |
|---|---|---|---|
| max_batch_size | integer \| None | null | |
| tensor_parallel_size | integer \| None | 1 | |
| pipeline_parallel_size | integer \| None | 1 | |
| max_input_len | integer \| None | null | |
| max_seq_len | integer \| None | null | |
| max_num_tokens | integer \| None | null | |
| dtype | string \| None | auto | |
| fast_build | boolean \| None | false | |
| backend | string \| None | null | |
| quant_config | any \| None | null | |
| kv_cache_config | any \| None | null | |
| scheduler_config | any \| None | null | |
TensorRT-LLM Sampling Params (tensorrt.sampling_params:)
| Field | Type | Default | Description |
|---|---|---|---|
| temperature | number \| None | null | |
| top_k | integer \| None | null | |
| top_p | number \| None | null | |
| repetition_penalty | number \| None | null | |
| min_p | number \| None | null | |
| min_tokens | integer \| None | null | |
| n | integer \| None | 1 | |
| ignore_eos | boolean \| None | false | |
Harness Overrides (harness:)
| Field | Type | Default | Description |
|---|---|---|---|
| transformers | TransformersHarness \| None | null | |
Transformers Harness (harness.transformers:)
| Field | Type | Default | Description |
|---|---|---|---|
| batch_size | integer \| None | null | |
| torch_compile | boolean \| None | null | |
| torch_compile_mode | string \| None | null | |
| torch_compile_backend | string \| None | null | |
| allow_tf32 | boolean \| None | null | |
| autocast_enabled | boolean \| None | null | |
| autocast_dtype | 'float16' \| 'bfloat16' \| None | | |