@@ -3,15 +3,11 @@ This project implements a method for faster and more memory-efficient RNN-T comp
## How does the pruned RNN-T work?
We first obtain pruning bounds for the RNN-T recursion using a simple joiner network that is just an addition of the encoder and decoder outputs, then we use those pruning bounds to evaluate the full, non-linear joiner network.
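The key property of the simple joiner is that its logits are a pure broadcast addition of per-frame encoder output and per-step decoder output, so no non-linear network ever has to be evaluated over the full `(T, U, V)` lattice. A minimal NumPy sketch of this idea (the shapes and variable names here are illustrative, not the project's actual API):

```python
import numpy as np

T, U, V = 4, 3, 6  # frames, decoder (label) steps, vocab size
rng = np.random.default_rng(0)
am = rng.standard_normal((T, V))  # encoder (acoustic) output, one vector per frame
lm = rng.standard_normal((U, V))  # decoder (label) output, one vector per step

# Simple joiner: plain addition via broadcasting, no projection or nonlinearity.
# The (T, U, V) tensor is fully determined by the two small (T, V) and (U, V)
# tensors, which is what makes the simple loss cheap in memory and compute.
logits = am[:, None, :] + lm[None, :, :]
assert logits.shape == (T, U, V)
assert np.allclose(logits[1, 2], am[1] + lm[2])
```

Because the simple joiner is only used to locate where the lattice gradient is concentrated, its reduced modeling power does not hurt the final loss, which is still computed with the full joiner inside the pruned region.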
The picture below displays the gradients (obtained from `rnnt_loss_simple` with `return_grad=True`) of the lattice nodes. At each time frame, only a small set of nodes has a non-zero gradient, which justifies the pruned RNN-T loss, i.e., putting a limit on the number of symbols per frame.
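One way to turn those node gradients into pruning bounds is, for each frame, to keep the contiguous window of `s_range` symbol positions whose gradients carry the most mass. A toy NumPy sketch of that selection step (`prune_ranges` and `s_range` are illustrative names, not the project's actual API):

```python
import numpy as np

def prune_ranges(grads: np.ndarray, s_range: int) -> np.ndarray:
    """For each frame t, pick the start of the contiguous window of
    s_range symbol positions with the largest total gradient magnitude.

    grads: (T, U) lattice-node gradient magnitudes from the simple loss.
    Returns: (T,) window start indices; frame t keeps symbols
    [starts[t], starts[t] + s_range).
    """
    T, U = grads.shape
    starts = np.empty(T, dtype=np.int64)
    kernel = np.ones(s_range)
    for t in range(T):
        # Sliding-window sums of gradient mass over all U - s_range + 1 windows.
        window_mass = np.convolve(np.abs(grads[t]), kernel, mode="valid")
        starts[t] = int(window_mass.argmax())
    return starts

# Toy gradients: the mass moves monotonically to later symbols over time,
# as it does along the diagonal band of a real RNN-T lattice.
g = np.array([[0.0, 0.9, 0.1, 0.0, 0.0],
              [0.0, 0.2, 0.7, 0.1, 0.0],
              [0.0, 0.0, 0.1, 0.8, 0.1]])
print(prune_ranges(g, s_range=2))  # → [1 1 2]
```

The full joiner is then evaluated only on these `T * s_range` lattice nodes instead of all `T * U`, which is where the memory and speed savings come from.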