"pytorch/cuda/moe_compute_kernel.cu" did not exist on "881b10c20eb2a0614e872fe31197711d2bb1873f"
gushiqiao authored
* reconstruct quantization and fix memory leak bug.
* Support lazy load inference.
* reconstruct quantization
* Fix hunyuan bugs
* deleted tmp file

---------

Co-authored-by: root <root@pt-c0b333b3a1834e81a0d4d5f412c6ffa1-worker-0.pt-c0b333b3a1834e81a0d4d5f412c6ffa1.ns-devsft-3460edd0.svc.cluster.local>
Co-authored-by: gushiqiao <gushqiaio@sensetime.com>
Co-authored-by: gushiqiao <gushiqiao@sensetime.com>
5c241f86