- 08 Feb, 2024 1 commit
-
-
Sayak Paul authored
* attention_head_dim * debug * print more info * correct num_attention_heads behaviour * down_block_num_attention_heads -> num_attention_heads. * correct the image link in doc. * add: deprecation for num_attention_head * fix: test argument to use attention_head_dim * more fixes. * quality * address comments. * remove depcrecation.
-
- 31 Jan, 2024 1 commit
-
-
Sayak Paul authored
--------- Co-authored-by:
Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
YiYi Xu <yixu310@gmail.com>
-