Commit 94fbb098 authored by namgyu-youn, committed by GitHub

[EASY] Drop duplicate KV-cache initialization (#38799)


Signed-off-by: namgyu-youn <namgyu.dev@gmail.com>
parent 419e73cd
@@ -131,9 +131,6 @@ def _init_kv_cache_quant(
         quant_config: Optional quantization configuration.
         prefix: Layer name prefix for quantization method lookup.
     """
-    quant_method = (
-        quant_config.get_quant_method(layer, prefix=prefix) if quant_config else None
-    )
     # Note [Register q/k/v/prob scales in state dict]
     # When calling model.to(device), only parameters/buffers in state dict are
...