[JAX] Fix partitioning issues in LayerNorm and LayerNormMLP layers (#1743)

* Enforce input sharding of norm primitive does not shard hidden dim

Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>

* Fix partitioning issue in dact primitive causing NaN and add better shape checks before calling TE API

Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>

* Move dact shape assertion from cpp to python

Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>

---------

Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>