Move quantization to new backend (#10363)
* Move quantization logic to GGML via new backend

  This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.

* Remove "add model quantizations"

  This is no longer needed now that quantization is implemented directly in Go+GGML code.
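To illustrate the split described above (model-aware decisions in Go, numeric quantization in GGML), here is a minimal sketch of what a per-tensor type decision could look like. It is not taken from server/quantization.go, whose diff is not shown here. The names `tensorInfo` and `chooseType` are hypothetical, and the rule that only tensors with two or more dimensions are quantized is an assumption made for illustration.

```go
package main

import "fmt"

// tensorInfo is a hypothetical description of one tensor in a model file.
type tensorInfo struct {
	Name  string
	Shape []uint64
}

// chooseType returns the storage type for a tensor given the requested
// target type. Assumed rule (illustrative only): tensors with fewer than
// two dimensions, such as norms and biases, stay in F32; everything else
// is converted to the target type.
func chooseType(t tensorInfo, target string) string {
	if len(t.Shape) < 2 {
		return "F32"
	}
	return target
}

func main() {
	tensors := []tensorInfo{
		{Name: "blk.0.attn_norm.weight", Shape: []uint64{4096}},
		{Name: "blk.0.attn_q.weight", Shape: []uint64{4096, 4096}},
	}
	// In a real pipeline, the chosen type would be handed to the backend's
	// quantization routine along with the tensor data; here it is printed.
	for _, t := range tensors {
		fmt.Printf("%-26s -> %s\n", t.Name, chooseType(t, "Q4_0"))
	}
}
```

Keeping the decision in a small pure function like this makes the model-aware part easy to unit test without loading a model or calling into native code.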
server/quantization.go (new file, mode 100644)
server/quantization_test.go (new file, mode 100644)