  1. 15 Apr, 2021 2 commits
  2. 14 Apr, 2021 2 commits
  3. 13 Apr, 2021 2 commits
  4. 12 Apr, 2021 2 commits
    • Sagemaker test docs update for framework upgrade (#11206) · f243a5ec
      Philipp Schmid authored
      * increased train_runtime for model parallelism
      
      * added documentation for framework upgrade
    • Add DeiT (PyTorch) (#11056) · 9f126097
      NielsRogge authored
      * First draft of deit
      
      * More improvements
      
      * Remove DeiTTokenizerFast from init
      
      * Conversion script works
      
      * Add DeiT to ViT conversion script
      
      * Add tests, add head model, add support for deit in vit conversion script
      
      * Update model checkpoint names
      
      * Update image_mean and image_std, set resample to bicubic
      
      * Improve docs
      
      * Docs improvements
      
      * Add DeiTForImageClassificationWithTeacher to init
      
      * Address comments by @sgugger
      
      * Improve feature extractors
      
      * Make fix-copies
      
      * Minor fixes
      
      * Address comments by @patil-suraj
      
      * All models uploaded
      
      * Fix tests
      
      * Remove labels argument from DeiTForImageClassificationWithTeacher
      
      * Fix-copies, style and quality
      
      * Fix tests
      
      * Fix typo
      
      * Multiple docs improvements
      
      * More docs fixes
  5. 09 Apr, 2021 4 commits
  6. 08 Apr, 2021 6 commits
  7. 07 Apr, 2021 3 commits
  8. 06 Apr, 2021 3 commits
  9. 05 Apr, 2021 5 commits
  10. 01 Apr, 2021 2 commits
  11. 31 Mar, 2021 4 commits
  12. 30 Mar, 2021 5 commits
    • GPT Neo few fixes (#10968) · 83d38c9f
      Suraj Patil authored
      * fix checkpoint names
      
      * auto model
      
      * fix doc
    • fix big bird gpu test (#10967) · 7772ddb4
      Patrick von Platen authored
    • GPT Neo (#10848) · 86026437
      Suraj Patil authored
      
      
      * lets begin
      
      * boom boom
      
      * fix out proj in attn
      
      * fix attention
      
      * fix local attention
      
      * add tokenizer
      
      * fix imports
      
      * autotokenizer
      
      * fix checkpoint name
      
      * cleanup
      
      * more clean-up
      
      * more cleanup
      
      * output attentions
      
      * fix attn mask creation
      
      * fix imports
      
      * config doc
      
      * add tests
      
      * add slow tests
      
      * quality
      
      * add conversion script
      
      * copyright
      
      * typo
      
      * another bites the dust
      
      * fix attention tests
      
      * doc
      
      * add embed init in convert function
      
      * fix copies
      
      * remove tokenizer
      
      * enable caching
      
      * address review comments
      
      * improve config and create attn layer list internally
      
      * more consistent naming
      
      * init hf config from mesh-tf config json file
      
      * remove neo tokenizer from doc
      
      * handle attention_mask in local attn layer
      
      * attn_layers => attention_layers
      
      * add tokenizer_class in config
      
      * fix docstring
      
      * raise if len of attention_layers is not same as num_layers
      
      * remove tokenizer_class from config
      
      * more consistent naming
      
      * fix doc
      
      * fix checkpoint names
      
      * fp16 compat
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
    • [WIP][Flax] Add general conversion script (#10809) · 8780caa3
      Patrick von Platen authored
      
      
      * save intermediate
      
      * finish first version
      
      * delete some more
      
      * improve import
      
      * fix roberta
      
      * Update src/transformers/modeling_flax_pytorch_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/modeling_flax_pytorch_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * small corrections
      
      * apply all comments
      
      * fix deterministic
      
      * make fix-copies
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • Sagemaker test (#10925) · 604c0850
      Philipp Schmid authored
      * init
      
      * first working test
      
      * added todo for setup.py
      
      * working test for single node multi node ddp and smd
      
      * added tensorflow single node test
      
      * added directory for pytorch and tensorflow due to different requirements.txt
      
      * added directory for pytorch and tensorflow
      
      * added comment for run_glue until it is available
      
      * added output_dir to it
      
      * smaller dataset to make test running faster
      
      * adjust HP and script
      
      * adjusted parameter for tensorflow
      
      * refactored test scripts
      
      * adjusted make file
      
      * updated dlc container
      
      * commented in all tests
      
      * added both ecr images
      
      * added new master branches
      
      * debug
      
      * added new datasets version
      
      * init
      
      * strange rebase bug
      
      * removed changes
      
      * changed min version for tests to work
      
      * updated DLC
      
      * added model parallel test
      
      * removed test files
      
      * removed test files
      
      * tested with new dlc
      
      * added correct sagemaker sdk version
      
      * adjust DLCs for official one
      
      * reworked tests
      
      * quality
      
      * removed default profile added documentation to it
      
      * added step in release for sagemaker tests
      
      * reverted version for example script removed duplicated script and added install from master to requirements.txt
      
      * removed mistaken .DS_Stores from mac
      
      * fixed tests
      
      * added Sylvain's feedback
      
      * make style
      
      * added lysandre's feedback