[Bugfix] LoRA : Fix the order in which the kernels process LoRAs (#16040)

Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>

3a100b92 · Varun Sundar Rabindranath · GitHub · 242a637a · 3a100b92
Commit 3a100b92, authored Apr 06, 2025 by Varun Sundar Rabindranath, committed by GitHub Apr 06, 2025 (unverified)

Showing 1 changed file with 1 addition and 1 deletion

vllm/lora/ops/triton_ops/lora_kernel_metadata.py +1 -1

--- a/vllm/lora/ops/triton_ops/lora_kernel_metadata.py
+++ b/vllm/lora/ops/triton_ops/lora_kernel_metadata.py
@@ -111,7 +111,7 @@ class LoRAKernelMeta:
        # active_lora_ids, num_tokens_per_lora
        lora_ids, num_tokens_per_lora = torch.unique(token_lora_mapping,
-                                                     sorted=False,
+                                                     sorted=True,
                                                     return_counts=True)
        self.active_lora_ids[:lora_ids.size(0)].copy_(lora_ids,
                                                      non_blocking=True)
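
Why the ordering can matter: the sketch below is a plain-Python stand-in for `torch.unique(..., return_counts=True)`, not the vLLM code itself. It assumes (this is not visible in the diff) that the per-LoRA token counts are later turned into segment offsets over token indices that were sorted by LoRA id. The helper names `unique_with_counts`, `segment_starts`, and `segments_are_consistent`, the sample mapping, and the use of `-1` for "no LoRA" are all hypothetical and chosen only for illustration. Under that assumption, the offsets line up with the sorted tokens only when the ids come back in ascending order.

```python
from collections import Counter
from itertools import accumulate


def unique_with_counts(token_lora_mapping, sort_ids):
    """Plain-Python stand-in for torch.unique(..., return_counts=True).

    With sort_ids=False the ids come back in first-seen order, which is
    just one of the orders an unsorted unique is allowed to return.
    """
    counts = Counter(token_lora_mapping)
    ids = sorted(counts) if sort_ids else list(counts)
    return ids, [counts[i] for i in ids]


def segment_starts(counts):
    # Exclusive prefix sum: the offset at which each id's run of tokens begins.
    return [0] + list(accumulate(counts))[:-1]


# Hypothetical per-token LoRA ids; -1 is assumed to mean "no LoRA".
token_lora_mapping = [2, 0, 2, -1, 0, 2]

# Token indices grouped by ascending LoRA id (Python's sort is stable).
token_order = sorted(range(len(token_lora_mapping)),
                     key=lambda t: token_lora_mapping[t])


def segments_are_consistent(ids, counts):
    # Every token inside an id's segment must actually map to that id.
    for lora_id, start, n in zip(ids, segment_starts(counts), counts):
        segment = token_order[start:start + n]
        if any(token_lora_mapping[t] != lora_id for t in segment):
            return False
    return True


sorted_ids, sorted_counts = unique_with_counts(token_lora_mapping, sort_ids=True)
unsorted_ids, unsorted_counts = unique_with_counts(token_lora_mapping, sort_ids=False)

print(segments_are_consistent(sorted_ids, sorted_counts))
print(segments_are_consistent(unsorted_ids, unsorted_counts))
```

With ascending ids the segments match the sorted token order; with first-seen order the first segment is attributed to the wrong LoRA id, which is the kind of mismatch a `sorted=True` call would rule out.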