Unverified commit ad788684, authored by Woosuk Kwon and committed by GitHub

[Misc] Remove unused slot_mapping buffer (#23502)


Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
parent e2db1164
...
@@ -254,9 +254,6 @@ class GPUModelRunner(LoRAModelRunnerMixin, KVConnectorModelRunnerMixin):
         self.seq_lens = torch.zeros(self.max_num_reqs,
                                     dtype=torch.int32,
                                     device=self.device)
-        self.slot_mapping = torch.zeros(self.max_num_tokens,
-                                        dtype=torch.int64,
-                                        device=self.device)
         # None in the first PP rank. The rest are set after load_model.
         self.intermediate_tensors: Optional[IntermediateTensors] = None
...