1. 16 Mar, 2022 1 commit
  2. 15 Mar, 2022 1 commit
    • Add the XTREME-S fine-tuning example (#15985) · 99fd3eb4
      Anton Lozhkov authored
      * CTC+classification draft
      
      * CTC+classification draft
      
      * style
      
      * multilingual runs
      
      * Fix race condition during processor.from_pretrained
      
      * Merge covost experiments
      
      * Add README
      
      * Quality
      
      * Switch to .all configs
      
      * Fix typos
  3. 12 Mar, 2022 1 commit
    • [Deepspeed] add support for bf16 mode (#14569) · 580dd87c
      Stas Bekman authored
      * [WIP] add support for bf16 mode
      
      * prep for bf16
      
      * prep for bf16
      
      * fix; zero2/bf16 is ok
      
      * check bf16 is available
      
      * test fixes
      
      * enable zero3_bf16
      
      * config files
      
      * docs
      
      * split stage_dtype; merge back to non-dtype-specific config file
      
      * fix doc
      
      * cleanup
      
      * cleanup
      
      * bfloat16 => bf16 to match the PR changes
      
      * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
      
      * test fixes/skipping
      
      * move
      
      * fix
      
      * Update docs/source/main_classes/deepspeed.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * backticks
      
      * cleanup
      
      * cleanup
      
      * cleanup
      
      * new version
      
      * add note about grad accum in bf16
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  4. 10 Mar, 2022 1 commit
  5. 04 Mar, 2022 1 commit
  6. 02 Mar, 2022 1 commit
  7. 21 Feb, 2022 1 commit
  8. 15 Feb, 2022 1 commit
  9. 11 Feb, 2022 1 commit
  10. 09 Feb, 2022 1 commit
  11. 07 Feb, 2022 1 commit
  12. 31 Jan, 2022 2 commits
    • Jonatas Grosman's avatar
    • Add (M)Luke model training for Token Classification in the examples (#14880) · aa19f478
      Julien Plu authored
      * Add Luke training
      
      * Fix true label tags
      
      * Fix true label tags
      
      * Fix true label tags
      
      * Update the data collator for Luke
      
      * Some training refactor for Luke
      
      * Improve data collator for Luke
      
      * Fix import
      
      * Fix datasets concatenation
      
      * Add the --max_entity_length argument for Luke models
      
      * Remove unused code
      
      * Fix style issues
      
      * Fix style issues
      
      * Move the Luke training into a separate folder
      
      * Fix style
      
      * Fix naming
      
      * Fix filtering
      
      * Fix filtering
      
      * Fix filter
      
      * Update some preprocessing
      
      * Move luke to research_projects
      
      * Checkstyle
      
      * Address comments
      
      * Fix style
  13. 27 Jan, 2022 4 commits
  14. 24 Jan, 2022 1 commit
  15. 21 Jan, 2022 2 commits
  16. 20 Jan, 2022 2 commits
  17. 19 Jan, 2022 5 commits
  18. 18 Jan, 2022 1 commit
  19. 12 Jan, 2022 1 commit
  20. 10 Jan, 2022 1 commit
  21. 23 Dec, 2021 1 commit
  22. 13 Dec, 2021 1 commit
  23. 06 Dec, 2021 1 commit
  24. 02 Dec, 2021 1 commit
    • Add CodeParrot 🦜 codebase (#14536) · 43f953cc
      Leandro von Werra authored
      * add readme skeleton
      
      * update readme
      
      * add initialization script
      
      * add deduplication script
      
      * add codeparrot training script
      
      * add code generation evaluation
      
      * add validation loss script
      
      * add requirements
      
      * update readme
      
      * tweak readme
      
      * make style
      
      * add highlights to readme
      
      * add CLIs to scripts
      
      * add tokenizer training script
      
      * add docstring to constant length dataset
      
      * fix defaults in arguments
      
      * update readme with cli
      
      * move image to hub
      
      * tweaks of readme
      
      * fix cli commands
      
      * add author
      
      * explain env variables
      
      * fix formatting
      
      * Update examples/research_projects/codeparrot/README.md
      Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
      
      * replace generic with gpt2 tokenizer
      Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
  25. 30 Nov, 2021 1 commit
  26. 22 Nov, 2021 1 commit
  27. 19 Nov, 2021 1 commit
  28. 17 Nov, 2021 1 commit
  29. 15 Nov, 2021 1 commit
  30. 11 Nov, 2021 1 commit