TF: XLA-trainable DeBERTa v2 (#18546)
* fix deberta issues * add different code paths for gpu and tpu * shorter gpu take along axis * Stable Dropout without tf cond * variable must be float
Showing
Please register or sign in to comment
* fix deberta issues * add different code paths for gpu and tpu * shorter gpu take along axis * Stable Dropout without tf cond * variable must be float