Fix momentum and epsilon values (#19454)
The momentum value for PyTorch and TensorFlow batch normalization layers is not equivalent. The TensorFlow value should be (1 - pytorch_momentum) in order to ensure the correct updates are applied to the running mean and running variance calculations. We wouldn't observe a difference loading a pretrained model and performing inference, but evaluation outputs would change after some training steps.
Showing
Please register or sign in to comment