Commit d4bccff3 authored by xuxzh1's avatar xuxzh1 🎱
Browse files

Optimize the performance of GPTQ

parent ee3d6944
......@@ -10,7 +10,7 @@
#include "quant/qdq_6.cuh"
#include "quant/qdq_8.cuh"
#define GPTQ_BLOCK_KN_SIZE 128
#define GPTQ_BLOCK_KN_SIZE 256
#define GPTQ_BLOCK_M_SIZE_MAX 8
#define GPTQ_MAX_GROUPS_IN_BLOCK (GPTQ_BLOCK_KN_SIZE / 32)
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment