llama/llama.cpp/src/llama-arch.cpp · 424810450f3043e97aca539f1250d149a26cd99e · OpenDAS / ollama

Move quantization to new backend (#10363) · 42481045

Daniel Hiltgen authored May 06, 2025

* Move quantization logic to GGML via new backend

This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.

* Remove "add model quantizations"

This is no longer needed now that quantization is implemented in Go+GGML code directly.


llama-arch.cpp 99.1 KB
