Reed authored
Also run Transformer inference, as well as training, in fp16 when --dtype=fp16. In TF 2, a layer can no longer run in more than one dtype, so training and inference must use the same dtype.
58340818