@@ -145,9 +145,9 @@ vLLM V1 currently excludes model architectures with the `SupportsV0Only` protoco
and the majority fall into the following categories. V1 support for these models will be added eventually.
**Embedding Models**
Initially, we will create a [separate model runner](https://github.com/vllm-project/vllm/pull/18015) to provide V1 support without conflicting with other ongoing work.
The initial support will be provided by [PR #16188](https://github.com/vllm-project/vllm/pull/16188).
Later, we will consider using [hidden states processor](https://github.com/vllm-project/vllm/issues/12249), which is based on [global logits processor](https://github.com/vllm-project/vllm/pull/13360) to enable simultaneous generation and embedding using the same engine instance in V1. [PR #16188](https://github.com/vllm-project/vllm/pull/16188) is the first step towards enabling this.
Later, we will consider using [hidden states processor](https://github.com/vllm-project/vllm/issues/12249), which is based on [global logits processor](https://github.com/vllm-project/vllm/pull/13360) to enable simultaneous generation and embedding using the same engine instance in V1.
**Mamba Models**
Models using selective state-space mechanisms (instead of standard transformer attention)