vllm bench sweep plot_pareto
JSON CLI Arguments
When passing JSON CLI arguments, the following sets of arguments are equivalent:
- --json-arg '{"key1": "value1", "key2": {"key3": "value2"}}'
- --json-arg.key1 value1 --json-arg.key2.key3 value2
Additionally, list elements can be passed individually using the + suffix:
- --json-arg '{"key4": ["value3", "value4", "value5"]}'
- --json-arg.key4+ value3 --json-arg.key4+='value4,value5'
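The equivalences above can be sketched in Python. This is only an illustration of the documented behavior, not vLLM's actual parser: the helper name `expand_dotted_args` and its input shape (a list of key/value pairs with the `--json-arg.` prefix already stripped) are assumptions made for the example.

```python
import json


def expand_dotted_args(pairs):
    """Build a nested dict from (dotted_key, value) pairs.

    Hypothetical helper: it mimics the equivalences described above
    and is not taken from vLLM.
    """
    result = {}
    for dotted_key, value in pairs:
        is_list_item = dotted_key.endswith("+")
        parts = dotted_key.rstrip("+").split(".")
        node = result
        # Walk (and create) the intermediate dicts for each dotted segment.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        leaf = parts[-1]
        if is_list_item:
            # A trailing "+" appends; a comma-separated value adds several items.
            node.setdefault(leaf, []).extend(value.split(","))
        else:
            node[leaf] = value
    return result


nested = expand_dotted_args([("key1", "value1"), ("key2.key3", "value2")])
assert nested == json.loads('{"key1": "value1", "key2": {"key3": "value2"}}')

listed = expand_dotted_args([("key4+", "value3"), ("key4+", "value4,value5")])
assert listed == json.loads('{"key4": ["value3", "value4", "value5"]}')
```

Both assertions compare the expanded form against the JSON literal from the corresponding example, so each pair of spellings yields the same structure under these assumptions.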
Arguments
--user-count-var
- Result key that stores concurrent user count. Falls back to max_concurrent_requests if missing.
- Default: max_concurrency
--gpu-count-var
- Result key that stores GPU count. If not provided, falls back to num_gpus/gpu_count or tensor_parallel_size * pipeline_parallel_size.
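The fallback rules described for --user-count-var and --gpu-count-var could look roughly like the sketch below. The function names, the lookup order between num_gpus and gpu_count, the behavior when a provided key is absent from a record, and the default of 1 for a missing parallel size are all assumptions for illustration; the actual implementation may differ.

```python
def resolve_user_count(record, user_count_var="max_concurrency"):
    """Return the concurrent user count for one result record (hypothetical helper)."""
    if user_count_var in record:
        return record[user_count_var]
    # Documented fallback when the configured key is missing.
    return record.get("max_concurrent_requests")


def resolve_gpu_count(record, gpu_count_var=None):
    """Return the GPU count for one result record (hypothetical helper)."""
    # Assumption: a provided key that is absent from the record falls through.
    if gpu_count_var is not None and gpu_count_var in record:
        return record[gpu_count_var]
    # Assumption: num_gpus is checked before gpu_count.
    for key in ("num_gpus", "gpu_count"):
        if key in record:
            return record[key]
    # Assumption: a missing parallel size is treated as 1.
    tp = record.get("tensor_parallel_size", 1)
    pp = record.get("pipeline_parallel_size", 1)
    return tp * pp


record = {"max_concurrent_requests": 4, "tensor_parallel_size": 2, "pipeline_parallel_size": 4}
assert resolve_user_count(record) == 4
assert resolve_gpu_count(record) == 8
```

In the usage example the record has neither max_concurrency nor an explicit GPU count key, so both values come from the fallbacks.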
--label-by
- Comma-separated list of fields to annotate on Pareto frontier points.
- Default: max_concurrency,gpu_count
--dry-run
- If set, prints the figures to plot without drawing them.
- Default: False