Added block diagonal feedforward layer.
This layer replaces the weight matrix of the output_dense layer with a block diagonal matrix to reduce the layer's parameter count and FLOPs. A linear mixing layer can optionally be added to improve the layer's expressiveness.

PiperOrigin-RevId: 418828099
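As a rough illustration of the idea described above, here is a minimal NumPy sketch. It is not the code added in this commit. The function name `block_diagonal_dense`, the argument names, and the tensor shapes are all hypothetical. The form of the mixing step (a `num_blocks x num_blocks` matrix that linearly combines block outputs) is an assumption about what "linear mixing" could look like. Bias and activation are omitted.

```python
import numpy as np


def block_diagonal_dense(x, block_weights, mixing=None):
    """Apply a block diagonal weight matrix without materializing it.

    x:             (batch, num_blocks * block_in)
    block_weights: (num_blocks, block_in, block_out), one small kernel per block
    mixing:        optional (num_blocks, num_blocks) matrix that linearly mixes
                   block outputs (assumed form, for illustration only)
    """
    num_blocks, block_in, block_out = block_weights.shape
    batch = x.shape[0]
    # Split the feature axis into independent blocks.
    x_blocks = x.reshape(batch, num_blocks, block_in)
    # Each block k is multiplied only by its own kernel block_weights[k].
    y_blocks = np.einsum("bki,kio->bko", x_blocks, block_weights)
    if mixing is not None:
        # Let information flow between blocks via a small linear map.
        y_blocks = np.einsum("bko,kj->bjo", y_blocks, mixing)
    return y_blocks.reshape(batch, num_blocks * block_out)


def dense_param_count(d_in, d_out):
    # Parameters of a full dense kernel (no bias).
    return d_in * d_out


def block_diagonal_param_count(d_in, d_out, num_blocks):
    # Only the diagonal blocks are stored: num_blocks kernels of
    # shape (d_in / num_blocks, d_out / num_blocks).
    return num_blocks * (d_in // num_blocks) * (d_out // num_blocks)
```

With `num_blocks` blocks, the kernel stores `d_in * d_out / num_blocks` weights instead of `d_in * d_out`, and the matrix multiply shrinks by the same factor. The cost is that features in different blocks no longer interact, which is the gap the optional mixing step is meant to narrow.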