-
jberchtold-nvidia authored
* Custom call tests passing Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Fix test_layer.py Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Lint Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Fix comments Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Support using amax on HighPrecision tensor if it exists instead of recomputing for current scaling Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Fix shardy issue with amax being shape 1,1,1 instead of shape (1,) Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Add higher-precision VJP tests to test_distributed_layernorm_mlp Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Cast non-quantized kernels to input dtype in VJPs Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Rename HighPrecisionTensor to NoScaleTensor Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Use NoScaleTensor in pure JAX impls where it was missing Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> * Fix tests Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> --------- Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com>
c47f329b