Commits · d3af8c18317c0dc008d42e4367fbb9045cfb7bf6 · OpenDAS / vllm_cscc

14 Apr, 2026 1 commit

[Core][Metrics][BugFix] Replace num_cached_tokens/num_external_computed_tokens with PrefillStats (#37460) · d3af8c18

Mark McLoughlin authored Apr 14, 2026

Related to the `Counters can only be incremented by non-negative amounts`
error raised for the `vllm:prompt_tokens_by_source_total` metric.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Or Ozeri <or@ozery.com>

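The error text quoted in this commit message is the one the Python `prometheus_client` library raises when `Counter.inc()` is given a negative amount. The sketch below illustrates that general failure mode with a stdlib-only stand-in. `NonNegativeCounter`, `tokens_to_report`, and the sample numbers are hypothetical and are not taken from vLLM or from this commit.

```python
class NonNegativeCounter:
    """Hypothetical stand-in that mimics a monotonic Prometheus-style counter."""

    def __init__(self) -> None:
        self.value = 0

    def inc(self, amount: int = 1) -> None:
        # Counters are monotonic, so a negative increment is rejected.
        if amount < 0:
            raise ValueError(
                "Counters can only be incremented by non-negative amounts."
            )
        self.value += amount


def tokens_to_report(num_prompt_tokens: int, num_already_available: int) -> int:
    """Derive an increment by subtraction (assumed bookkeeping, for illustration)."""
    return num_prompt_tokens - num_already_available


counter = NonNegativeCounter()

# Consistent inputs: the derived increment is non-negative and is accepted.
counter.inc(tokens_to_report(num_prompt_tokens=100, num_already_available=60))

# Inconsistent inputs: the subtraction goes negative and the counter rejects it.
try:
    counter.inc(tokens_to_report(num_prompt_tokens=100, num_already_available=120))
    rejected = False
except ValueError:
    rejected = True
```

The commit message does not say whether the actual bug followed this pattern, so treat the subtraction here purely as an example of how a counter increment can end up negative.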

12 Apr, 2026 1 commit
- [Core][Metrics] Remove `vllm:prompt_tokens_recomputed` metric (#38709) · 72ff142c
  Mark McLoughlin authored Apr 12, 2026
```
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
```
04 Feb, 2026 1 commit

[Metrics] Add labeled prompt token metrics for P/D disaggregation (#33290) · 4403e3ed

zhanqiuhu authored Feb 04, 2026

Add labeled Prometheus metrics to distinguish where prompt tokens come
from in P/D disaggregated deployments.

In P/D disaggregation, decode instances receive KV cache from prefill instances.
Currently, decode reports inflated prompt throughput because it counts all
prompt tokens as "computed", even though most were transferred.

This PR adds labeled metrics so users can understand actual compute work vs
transferred work:

vllm:prompt_tokens_by_source_total{source="local_compute"} # Tokens prefilled locally
vllm:prompt_tokens_by_source_total{source="external_kv_transfer"} # Tokens received via KV transfer
vllm:prompt_tokens_by_source_total{source="local_cache_hit"} # Tokens from local prefix cache
vllm:prompt_tokens_cached_total # Total cached (local + external, -1 when all

Signed-off-by: Zhanqiu Hu <zh338@cornell.edu>

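The labeled series described in this commit message can be pictured with a small stdlib-only sketch. It assumes, for illustration only, that every prompt token falls into exactly one of the three sources, so the three labels sum to the prompt length. `PromptTokenSources`, `split_by_source`, and `render` are hypothetical names, and vLLM's real accounting may differ from this rule.

```python
from dataclasses import asdict, dataclass

# Metric name and label values are taken from the commit message above.
METRIC_NAME = "vllm:prompt_tokens_by_source_total"


@dataclass(frozen=True)
class PromptTokenSources:
    local_compute: int
    external_kv_transfer: int
    local_cache_hit: int


def split_by_source(
    num_prompt_tokens: int,
    num_local_cache_hit: int,
    num_external_kv_transfer: int,
) -> PromptTokenSources:
    """Assumed rule: whatever is not cached or transferred was prefilled locally."""
    local_compute = num_prompt_tokens - num_local_cache_hit - num_external_kv_transfer
    if min(local_compute, num_local_cache_hit, num_external_kv_transfer) < 0:
        raise ValueError("token counts are inconsistent with the prompt length")
    return PromptTokenSources(
        local_compute=local_compute,
        external_kv_transfer=num_external_kv_transfer,
        local_cache_hit=num_local_cache_hit,
    )


def render(sources: PromptTokenSources) -> list[str]:
    """Format one sample line per source label, in Prometheus text style."""
    return [
        f'{METRIC_NAME}{{source="{label}"}} {value}'
        for label, value in asdict(sources).items()
    ]


# A decode-side request: most of the prompt arrived via KV transfer.
decode_request = split_by_source(
    num_prompt_tokens=1000,
    num_local_cache_hit=0,
    num_external_kv_transfer=992,
)
lines = render(decode_request)
```

With these made-up numbers, only 8 of the 1000 prompt tokens count as `local_compute`, which is the distinction the commit message says an unlabeled prompt-token count hides on decode instances.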

09 Dec, 2025 1 commit
- feat(metrics): Add prefill KV compute metric excluding cached tokens (#30189) · f1599ca5
  Victor Ziliang Peng authored Dec 08, 2025
```
Signed-off-by: Ziliang Peng <ziliang@character.ai>
```
10 Nov, 2025 1 commit
- [Metrics] Refactor LoRA state tracking (#26801) · 6f7de33b
  Mark McLoughlin authored Nov 10, 2025
```
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
```
05 Nov, 2025 1 commit
- [Feature]: Add corrupted request metric to V1 metrics system. (#27306) · e1560178
  Snehlata authored Nov 06, 2025
```
Signed-off-by: atalhens <sneh.lata@nutanix.com>
```
05 Oct, 2025 2 commits
- Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247) · d6953beb
  Harry Mellor authored Oct 05, 2025
```
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
```
- [Easy] Add str repr for IterationStats (#26232) · 78c1d5bf
  22quinn authored Oct 04, 2025
```
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
```