  1. 01 Dec, 2021 1 commit
    • FlaxGPTJ (#14396) · 4c0dd199
      Suraj Patil authored
      * add flax gptj
      
      * no bias in attention dense
      
      * no wpe
      
      * fix rotary embeddings
      
      * fix rotary embeds
      
      * fix rotray embeds
      
      * quality
      
      * doc and quality
      
      * fix equivalence tests
  2. 22 Sep, 2021 1 commit
  3. 27 Aug, 2021 1 commit
  4. 06 Jul, 2021 1 commit
    • FlaxGPTNeo (#12493) · 7a259c19
      Suraj Patil authored
      * flax gpt neo
      
      * fix query scaling
      
      * update generation test
      
      * use flax model for test
  5. 26 May, 2021 1 commit
    • Flax Generate (#11777) · 996a315e
      Patrick von Platen authored
      * fix_torch_device_generate_test
      
      * remove @
      
      * add
      
      * indexing
      
      * correct a couple of tests
      
      * fix tests
      
      * add logits processor
      
      * finish top_k, top_p, temp
      
      * add docs
      
      * correct flax prng key default
      
      * improve generate
      
      * add generation docs
      
      * add docs
      
      * make style
      
      * revert model outputs change
      
      * make style
      
      * correct typo
      
      * fix tests
      
      * fix slow test
      
      * add raise
      
      * finish generation
      Co-authored-by: Patrick von Platen <patrick@huggingface.co>
  6. 18 May, 2021 1 commit
    • FlaxGPT2 (#11556) · ca33278f
      Suraj Patil authored
      * flax gpt2
      
      * combine masks
      
      * handle shared embeds
      
      * add causal LM sample
      
      * style
      
      * add tests
      
      * style
      
      * fix imports, docs, quality
      
      * don't use cache
      
      * add cache
      
      * add cache 1st version
      
      * make use cache work
      
      * start adding test for generation
      
      * finish generation loop compilation
      
      * rewrite test
      
      * finish
      
      * update
      
      * update
      
      * apply sylvains suggestions
      
      * update
      
      * refactor
      
      * fix typo
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>