- 17 Mar, 2025 1 commit
-
-
Parth Sareen authored
* Updated minP to exit early by making use of the sorted token order
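
The early exit mentioned in this commit can be illustrated with a minimal sketch. This is not the code from the commit. It assumes candidates arrive sorted by probability in descending order and that min-p keeps tokens whose probability is at least `p` times the top probability. The name `min_p_sorted` and the `(token id, probability)` tuple layout are hypothetical, chosen for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: (token id, probability).
Token = Tuple[int, float]


def min_p_sorted(tokens: List[Token], p: float) -> List[Token]:
    """Keep tokens with probability >= p * (highest probability).

    Assumes `tokens` is sorted by probability, descending. Under that
    assumption the first token below the threshold marks the cut point,
    so the scan can stop there instead of visiting every token.
    """
    if not tokens:
        return tokens
    threshold = tokens[0][1] * p  # top probability is first when sorted
    for i, (_, prob) in enumerate(tokens):
        if prob < threshold:
            return tokens[:i]  # early exit: everything after is smaller
    return tokens


candidates = [(7, 0.50), (3, 0.30), (9, 0.15), (1, 0.04), (4, 0.01)]
kept = min_p_sorted(candidates, 0.1)
```

With `p = 0.1` the threshold is 0.05, so the scan stops at the fourth candidate and never inspects the fifth. On unsorted input the same filter would have to examine every token.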
-
- 13 Mar, 2025 1 commit
-
-
Parth Sareen authored
-
- 12 Mar, 2025 3 commits
-
-
ParthSareen authored
-
ParthSareen authored
-
ParthSareen authored
-
- 10 Mar, 2025 2 commits
-
-
Parth Sareen authored
-
Jeffrey Morgan authored
-
- 07 Mar, 2025 1 commit
-
-
Parth Sareen authored
This change brings in various interface cleanups and greatly improves the performance of the sampler. Tested with llama3.2 on a local machine: performance improves from ~70 tokens/s to 135 tokens/s with topK(40) enabled. Without topK, performance is ~110 tokens/s.
-
- 25 Feb, 2025 1 commit
-
-
Parth Sareen authored
-