-
Suraj Patil authored
* add flax gptj * no bias in attention dense * no wpe * fix rotary embeddings * fix rotary embeds * fix rotray embeds * quality * doc and quality * fix equivalence tests
4c0dd199
* add flax gptj * no bias in attention dense * no wpe * fix rotary embeddings * fix rotary embeds * fix rotray embeds * quality * doc and quality * fix equivalence tests