Support intermediate sizes other than 4 * hidden_dim (#389)
* support intermediate sizes other than 4 * hidden_dim
* run precommit
* uncomment the unit tests
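
To illustrate what the change means for the feed-forward block, here is a minimal sketch, not the code from this commit. `FeedForwardConfig` and `feed_forward_shapes` are hypothetical names introduced only for this example, and the fallback to 4 * hidden_size when no intermediate size is given is an assumption based on the commit title.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FeedForwardConfig:
    """Hypothetical config for illustration; not the DeepSpeed API."""

    hidden_size: int
    # Assumption: None means "use the conventional 4 * hidden_size".
    intermediate_size: Optional[int] = None

    def __post_init__(self) -> None:
        if self.intermediate_size is None:
            self.intermediate_size = 4 * self.hidden_size
        if self.hidden_size <= 0 or self.intermediate_size <= 0:
            raise ValueError("hidden_size and intermediate_size must be positive")


def feed_forward_shapes(
    cfg: FeedForwardConfig,
) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return the (in, out) weight shapes of the two feed-forward projections."""
    up_projection = (cfg.hidden_size, cfg.intermediate_size)
    down_projection = (cfg.intermediate_size, cfg.hidden_size)
    return up_projection, down_projection


# Default: intermediate size falls back to 4 * hidden_size.
default_shapes = feed_forward_shapes(FeedForwardConfig(hidden_size=1024))

# Custom: any positive intermediate size is accepted.
custom_shapes = feed_forward_shapes(
    FeedForwardConfig(hidden_size=1024, intermediate_size=3072)
)
```

The point of the sketch is that the intermediate width becomes an independent parameter, so buffer and weight sizes have to be derived from it rather than from a hard-coded multiple of the hidden size.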
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Changed files:
csrc/transformer/ds_transformer_cuda.cpp (mode 100755 → 100644)
deepspeed/ops/transformer/transformer.py (mode 100644 → 100755)