Jerry Ma authored
This commit adds an FP16Model class as a successor to network_to_half. The benefits of this class are:

- Preservation of single precision for BatchNorm layers. The models generated by network_to_half() convert BatchNorm moment tensors to half precision and then back to single precision, which hurts the accuracy of the moment estimators and occasionally results in NaNs.
- Support for multi-argument nn.Modules (self-explanatory from code).
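
The precision loss behind the BatchNorm point can be illustrated without PyTorch. The following is a minimal sketch that uses only the Python standard library to round a value to IEEE 754 half precision and widen it again. It is not code from this commit: the helper `round_trip_half` and the sample statistics are hypothetical and chosen only for illustration.

```python
import struct


def round_trip_half(x: float) -> float:
    """Round x to IEEE 754 half precision, then widen it back to a Python float."""
    # The "e" format code packs a 16-bit half-precision float (Python 3.6+).
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical running statistics of the kind a BatchNorm layer tracks.
running_mean = 0.1234567
running_var = 1.0e-8  # smaller than the smallest positive half-precision value

mean_after = round_trip_half(running_mean)
var_after = round_trip_half(running_var)

# The mean keeps only about three significant decimal digits.
print(f"mean: {running_mean!r} -> {mean_after!r}")
# The variance underflows to exactly 0.0.
print(f"var:  {running_var!r} -> {var_after!r}")
```

Widening the rounded value back to a larger type does not recover the lost bits, which is why a half-then-single round trip degrades the moment estimates even though the final dtype is single precision.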
713e0fb8