    Move quantization to new backend (#10363) · 42481045
    Daniel Hiltgen authored
    * Move quantization logic to GGML via new backend
    
    This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.
    
    * Remove "add model quantizations"
    
    This is no longer needed now that quantization is implemented in Go+GGML code directly.
llama-arch.cpp 99.1 KB