-
Phuong Nguyen authored
* fix ctx.aval_out indexing for workspace * add cudnn init to prepare phase of norm custom calls * add thread_local for norm registry instance --------- Signed-off-by:Phuong Nguyen <phuonguyen@nvidia.com>
0e1d9fae
* fix ctx.aval_out indexing for workspace
* add cudnn init to prepare phase of norm custom calls
* add thread_local for norm registry instance
---------
Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com>