- 15 Apr, 2025 2 commits
- 14 Apr, 2025 10 commits
-
Zimin Li authored
-
Zimin Li authored
-
Zimin Li authored
issue/127: Optimize elementwise CUDA code by removing redundancy, change/correct kernel logic when all inputs have the same dtype
-
Zimin Li authored
issue/127: Refactor ElementwiseInfo, refactor elementwise to use workspace for storing meta, fix misc. issues
-
Zimin Li authored
issue/127: fix CUDA mixed-precision broadcasting input mismatch issue, adjust comment structure and template variable order
-
Zimin Li authored
enable_if, remove std::move() in elementwise_cpu.h, add <array> inclusion
-
Zimin Li authored
issue/127: refactor ElementwiseInfo to use utils::Result, change elementwise calculate and calculateImpl to return infiniStatus_t, add CHECK_CUDA to CUDA function calls
-
Zimin Li authored
issue/127: modify swiglu test to correctly handle broadcast scenarios, add two broadcast test cases, correct elementwise CPU mixed-precision implementation
-
Zimin Li authored
issue/127: refactor elementwise framework, complete CUDA implementation, refactor swiglu using the generic elementwise framework
-
Zimin Li authored