- 31 Jul, 2024 1 commit
HandH1998 authored
-
- 30 Jul, 2024 1 commit
Tyler Michael Smith authored
-
- 21 Jul, 2024 1 commit
Alexander Matveev authored
-
- 09 Jun, 2024 1 commit
bnellnm authored
-
- 22 May, 2024 1 commit
Michael Goin authored
-
- 16 May, 2024 1 commit
Alexander Matveev authored
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
-
- 24 Apr, 2024 1 commit
alexm-nm authored
This PR fixes the Marlin kernel crash on H100 that was reported in neuralmagic#187. The crash was caused by the inline PTX assembly that issued async_copy with streaming behavior. The fix is to use the more standard PTX for async_copy (without the fractional L2 policy for "evict_first"). There is no performance difference between the standard async_copy PTX and the previous version.
-
- 01 Mar, 2024 1 commit
Robert Shaw authored
Co-authored-by: Robert Shaw <114415538+rib-2@users.noreply.github.com>
Co-authored-by: alexm <alexm@neuralmagic.com>
-