Commit d4bccff3 authored by xuxzh1 🎱

Optimize the performance of GPTQ

parent ee3d6944
@@ -10,7 +10,7 @@
 #include "quant/qdq_6.cuh"
 #include "quant/qdq_8.cuh"
-#define GPTQ_BLOCK_KN_SIZE 128
+#define GPTQ_BLOCK_KN_SIZE 256
 #define GPTQ_BLOCK_M_SIZE_MAX 8
 #define GPTQ_MAX_GROUPS_IN_BLOCK (GPTQ_BLOCK_KN_SIZE / 32)
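Note that GPTQ_MAX_GROUPS_IN_BLOCK is derived from GPTQ_BLOCK_KN_SIZE, so doubling the block size also doubles that value. The sketch below only reproduces the macro arithmetic at compile time. The helper names `max_groups_in_block` and `blocks_to_cover` are hypothetical and not part of the commit, and the ceiling-division block count assumes the constant is used as a tile size along a dimension, which the diff itself does not show.

```cpp
// Hypothetical helper (not in the commit): mirrors the macro
// GPTQ_MAX_GROUPS_IN_BLOCK = GPTQ_BLOCK_KN_SIZE / 32.
constexpr int max_groups_in_block(int block_kn_size) {
    return block_kn_size / 32;
}

// Hypothetical helper (assumption): number of blocks needed to cover
// `size` elements if GPTQ_BLOCK_KN_SIZE is the tile width (ceiling division).
constexpr int blocks_to_cover(int size, int block_kn_size) {
    return (size + block_kn_size - 1) / block_kn_size;
}

// Before the commit: block size 128 gives 4 groups per block.
static_assert(max_groups_in_block(128) == 4, "128 / 32 == 4");
// After the commit: block size 256 gives 8 groups per block.
static_assert(max_groups_in_block(256) == 8, "256 / 32 == 8");

// Under the tiling assumption, a dimension of 4096 needs half as many blocks.
static_assert(blocks_to_cover(4096, 128) == 32, "4096 / 128 == 32");
static_assert(blocks_to_cover(4096, 256) == 16, "4096 / 256 == 16");
```

Any array or shared-memory buffer sized by GPTQ_MAX_GROUPS_IN_BLOCK would grow accordingly, so it may be worth checking such uses when changing the block size.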