"vscode:/vscode.git/clone" did not exist on "71bcaf99e2cb2c677bf3a9addb9e8039cbcab22a"
Phuong Nguyen authored
* implemented custom call with ffi in csrc
* moved headers of misc to misc.h, add ffi.h
* ActLu and DActLu lowering with ffi_lowering
* CastTranspose with ffi_lowering
* enabled cudaGraph
* added 4d input test case to TestActivationLu
* added operand_output_aliases for CastTranspose
* added env var NVTE_JAX_WITH_FFI, default value = 1
* replace casting ActivationEnum by taking its value

Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
4b2b39b4