Commit e658bf7f authored by zhuwenwen

update prefix_prefill params

parent 675bceed
...
@@ -686,7 +686,7 @@ if triton.__version__ >= "2.1.0":
                           sliding_window=None):
         cap = current_platform.get_device_capability()
-        BLOCK = 128 if cap[0] >= 8 else 32
+        BLOCK = 32 if cap[0] >= 8 else 32
         # need to reduce num. blocks when using fp32
         # due to increased use of GPU shared memory
...
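To make the effect of the changed line concrete, here is a minimal sketch comparing the two `BLOCK` expressions from the diff. The helper names `block_before` and `block_after` and the sample capability tuples are hypothetical, introduced only for illustration; plain tuples stand in for whatever `current_platform.get_device_capability()` returns. The commit does not state its motivation, so the sketch shows only what the expressions evaluate to.

```python
from typing import Tuple


def block_before(cap: Tuple[int, int]) -> int:
    """BLOCK as computed on the removed line of the diff."""
    return 128 if cap[0] >= 8 else 32


def block_after(cap: Tuple[int, int]) -> int:
    """BLOCK as computed on the added line of the diff."""
    return 32 if cap[0] >= 8 else 32


# Hypothetical (major, minor) capability values, chosen to straddle
# the cap[0] >= 8 threshold used in the diff.
sample_caps = [(7, 0), (7, 5), (8, 0), (9, 0)]

for cap in sample_caps:
    print(cap, block_before(cap), block_after(cap))
```

With the new line both branches yield 32, so the conditional behaves like the constant `BLOCK = 32` regardless of the reported capability; before the change, devices with a major capability of 8 or higher used 128.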