Commit 9d2b4a70 authored by Mark McLoughlin, committed by GitHub

[V1][Metrics] Updated list of deprecated metrics in v0.8 (#14695)


Signed-off-by: Mark McLoughlin <markmc@redhat.com>
parent 0b0d6421
@@ -39,7 +39,16 @@ The following metrics are exposed:
 The following metrics are deprecated and due to be removed in a future version:
-- *(No metrics are currently deprecated)*
+- `vllm:num_requests_swapped`, `vllm:cpu_cache_usage_perc`, and
+  `vllm:cpu_prefix_cache_hit_rate` because KV cache offloading is not
+  used in V1.
+- `vllm:gpu_prefix_cache_hit_rate` is replaced by queries+hits
+  counters in V1.
+- `vllm:time_in_queue_requests` because it duplicates
+  `vllm:request_queue_time_seconds`.
+- `vllm:model_forward_time_milliseconds` and
+  `vllm:model_execute_time_milliseconds` because
+  prefill/decode/inference time metrics should be used instead.
 Note: when metrics are deprecated in version `X.Y`, they are hidden in version `X.Y+1`
 but can be re-enabled using the `--show-hidden-metrics-for-version=X.Y` escape hatch,
......
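
The diff says `vllm:gpu_prefix_cache_hit_rate` is replaced by queries+hits counters in V1, so a consumer of the metrics has to derive the rate itself. A minimal Python sketch of that derivation follows, assuming two scrapes of a pair of cumulative counters. The `PrefixCacheSample` type and the `prefix_cache_hit_rate` function are hypothetical names introduced here for illustration; the diff does not give the V1 counter names, so none are assumed.

```python
from typing import NamedTuple, Optional


class PrefixCacheSample(NamedTuple):
    """One scrape of the two cumulative counters (hypothetical container)."""

    queries: float  # cumulative value of the "queries" counter
    hits: float     # cumulative value of the "hits" counter


def prefix_cache_hit_rate(
    prev: PrefixCacheSample, curr: PrefixCacheSample
) -> Optional[float]:
    """Return the hit rate over the interval between two scrapes.

    Returns None when the rate is undefined: no new queries in the
    interval, or a counter went backwards (for example after a restart).
    """
    delta_queries = curr.queries - prev.queries
    delta_hits = curr.hits - prev.hits
    if delta_queries <= 0 or delta_hits < 0:
        return None
    return delta_hits / delta_queries


if __name__ == "__main__":
    earlier = PrefixCacheSample(queries=100, hits=40)
    later = PrefixCacheSample(queries=300, hits=190)
    # 150 new hits out of 200 new queries in the interval
    print(prefix_cache_hit_rate(earlier, later))
```

Working from counter deltas rather than a pre-computed gauge lets the consumer choose the averaging window, which is the usual reason for exposing raw counters.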