1. 16 Apr, 2021 2 commits
  2. 15 Apr, 2021 2 commits
  3. 14 Apr, 2021 2 commits
  4. 13 Apr, 2021 2 commits
  5. 12 Apr, 2021 2 commits
    • Sagemaker test docs update for framework upgrade (#11206) · f243a5ec
      Philipp Schmid authored
      * increased train_runtime for model parallelism
      
      * added documentation for framework upgrade
    • Add DeiT (PyTorch) (#11056) · 9f126097
      NielsRogge authored
      * First draft of deit
      
      * More improvements
      
      * Remove DeiTTokenizerFast from init
      
      * Conversion script works
      
      * Add DeiT to ViT conversion script
      
      * Add tests, add head model, add support for deit in vit conversion script
      
      * Update model checkpoint names
      
      * Update image_mean and image_std, set resample to bicubic
      
      * Improve docs
      
      * Docs improvements
      
      * Add DeiTForImageClassificationWithTeacher to init
      
      * Address comments by @sgugger
      
      * Improve feature extractors
      
      * Make fix-copies
      
      * Minor fixes
      
      * Address comments by @patil-suraj
      
      * All models uploaded
      
      * Fix tests
      
      * Remove labels argument from DeiTForImageClassificationWithTeacher
      
      * Fix-copies, style and quality
      
      * Fix tests
      
      * Fix typo
      
      * Multiple docs improvements
      
      * More docs fixes
  6. 09 Apr, 2021 4 commits
  7. 08 Apr, 2021 6 commits
  8. 07 Apr, 2021 3 commits
  9. 06 Apr, 2021 3 commits
  10. 05 Apr, 2021 5 commits
  11. 01 Apr, 2021 2 commits
  12. 31 Mar, 2021 4 commits
  13. 30 Mar, 2021 3 commits
    • GPT Neo few fixes (#10968) · 83d38c9f
      Suraj Patil authored
      * fix checkpoint names
      
      * auto model
      
      * fix doc
    • fix big bird gpu test (#10967) · 7772ddb4
      Patrick von Platen authored
    • GPT Neo (#10848) · 86026437
      Suraj Patil authored
      * lets begin
      
      * boom boom
      
      * fix out proj in attn
      
      * fix attention
      
      * fix local attention
      
      * add tokenizer
      
      * fix imports
      
      * autotokenizer
      
      * fix checkpoint name
      
      * cleanup
      
      * more clean-up
      
      * more cleanup
      
      * output attentions
      
      * fix attn mask creation
      
      * fix imports
      
      * config doc
      
      * add tests
      
      * add slow tests
      
      * quality
      
      * add conversion script
      
      * copyright
      
      * typo
      
      * another bites the dust
      
      * fix attention tests
      
      * doc
      
      * add embed init in convert function
      
      * fix copies
      
      * remove tokenizer
      
      * enable caching
      
      * address review comments
      
      * improve config and create attn layer list internally
      
      * more consistent naming
      
      * init hf config from mesh-tf config json file
      
      * remove neo tokenizer from doc
      
      * handle attention_mask in local attn layer
      
      * attn_layers => attention_layers
      
      * add tokenizer_class in config
      
      * fix docstring
      
      * raise if len of attention_layers is not same as num_layers
      
      * remove tokenizer_class from config
      
      * more consistent naming
      
      * fix doc
      
      * fix checkpoint names
      
      * fp16 compat
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>