"torchvision/vscode:/vscode.git/clone" did not exist on "62e3fbd82a7e6c9c6455af57cf8c4191035c90d4"
Added block diagonal feedforward layer.
This layer replaces the weight matrix of the output_dense layer with a block diagonal matrix to reduce the layer's parameter count and FLOPs. An optional linear mixing layer can be added to improve the layer's expressiveness.

PiperOrigin-RevId: 418828099
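To illustrate the idea, here is a minimal NumPy sketch of a block diagonal projection. It is not the code from this commit: the function names, the `(num_blocks, in_per_block, out_per_block)` kernel layout, and the form of the mixing step (a `num_blocks x num_blocks` matrix applied across blocks) are assumptions made for illustration only.

```python
import numpy as np


def block_diagonal_dense(x, block_kernels, mixing_kernel=None):
    """Apply a block diagonal linear map to the last axis of x.

    x:             array of shape (..., num_blocks * in_per_block)
    block_kernels: array of shape (num_blocks, in_per_block, out_per_block)
    mixing_kernel: optional (num_blocks, num_blocks) array that linearly
                   mixes the per-block outputs (hypothetical form).
    """
    num_blocks, in_per_block, out_per_block = block_kernels.shape
    if x.shape[-1] != num_blocks * in_per_block:
        raise ValueError("last axis of x must equal num_blocks * in_per_block")

    lead = x.shape[:-1]
    # Split the feature axis into independent blocks.
    x_blocks = x.reshape(lead + (num_blocks, in_per_block))
    # Each block is multiplied only by its own small kernel.
    y_blocks = np.einsum("...bi,bio->...bo", x_blocks, block_kernels)
    if mixing_kernel is not None:
        # Let information flow between blocks with a cheap linear mix.
        y_blocks = np.einsum("...bo,bc->...co", y_blocks, mixing_kernel)
    return y_blocks.reshape(lead + (num_blocks * out_per_block,))


def to_dense_kernel(block_kernels):
    """Build the equivalent full kernel, zero outside the diagonal blocks."""
    num_blocks, in_per_block, out_per_block = block_kernels.shape
    dense = np.zeros((num_blocks * in_per_block, num_blocks * out_per_block))
    for b in range(num_blocks):
        rows = slice(b * in_per_block, (b + 1) * in_per_block)
        cols = slice(b * out_per_block, (b + 1) * out_per_block)
        dense[rows, cols] = block_kernels[b]
    return dense


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8))
kernels = rng.normal(size=(2, 4, 2))  # 16 parameters, versus 8 * 4 = 32 dense
y = block_diagonal_dense(x, kernels)
```

With `num_blocks` blocks, the kernel holds `1 / num_blocks` of the parameters of a dense kernel with the same input and output widths, and the matrix multiply costs proportionally fewer FLOPs. Without the mixing step, each output block depends only on its own input block, which is the loss of expressiveness the optional mixing layer is meant to offset.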