Daniël de Kok authored
* Switch from fbgemm-gpu w8a8 scaled matmul to vLLM/marlin-kernels

  Performance and accuracy of these kernels are on par (tested with Llama 70B and 405B). Removes a dependency and resolves some stability issues we have been seeing.

* Update test snapshots
0f346a32