  1. 31 Mar, 2023 5 commits
  2. 30 Mar, 2023 10 commits
  3. 29 Mar, 2023 14 commits
  4. 28 Mar, 2023 4 commits
  5. 27 Mar, 2023 7 commits
    • [neptune] fix checkpoint bug with relative out_dir (#22102) · 3ec7a476
      Kshiteej K authored
      
      
      * [neptune] fix checkpoint bug with relative out_dir
      
      * update imports
      
      * reformat with black
      
      * check neptune without imports
      
      * fix typing-related issue
      
      * run black on code
      
      * use os.path.sep instead of raw \
      
      * simplify imports and remove type annotation
      
      * make ruff happy
      
      * apply review suggestions
      
      ---------
      Co-authored-by: Aleksander Wojnarowicz <alwojnarowicz@gmail.com>
    • [WIP]`NLLB-MoE` Adds the moe model (#22024) · 19ade242
      Arthur authored
      * Initial commit
      
      * update modeling code
      
      * update doc
      
      * add functions necessary
      
      * fix imports
      
      * revert changes
      
      * fixup
      
      * more styling to get going
      
      * remove standalone encoder
      
      * update code
      
      * styling
      
      * fix config and model
      
      * update code and some refactoring
      
      * make more tests pass
      
      * Adding NLLB-200 - MoE - 54.5B for no language left behind
      Fixes #21300
      
      * fix more common tests
      
      * style
      
      * update testing file
      
      * update
      
      * update
      
      * Router2 doc
      
      * update check config with sparse layer
      
      * add dummy router
      
      * update current conversion script
      
      * create on the fly conversion script
      
      * Fixup
      
      * style
      
      * style 2
      
      * fix empty return
      
      * fix return
      
      * Update default config sparse layers
      
      * easier to create sparse layers
      
      * update
      
      * update conversion script
      
      * update modeling
      
      * add to toctree
      
      * styling
      
      * make ruff happy
      
      * update docstring
      
      * update conversion script
      
      * update, will break tests but implementing top2
      
      * update
      
      * local groups are supported here
      
      * Support for local groups is now removed
      
      This is because it has to work with model parallelism that we do not support
      
      * finish simplification
      
      * Fix forward
      
      * style
      
      * fixup
      
      * Update modelling and test, refactoring
      
      * update tests
      
      * remove final layer_norm as it is done in the FF
      
      * routing works! Logits test added
      
      * nit in test
      
      * remove top1router
      
      * style
      
      * make sure sparse are tested. Had to change route_tokens a little bit
      
      * add support for unsplit models when converting
      
      * fixup
      
      * style
      
      * update tests
      
      * update test
      
      * REFACTOR
      
      * encoder outputs match!
      
      * style
      
      * update testing
      
      * 🎉encoder and decoder logits match 🎉
      
      
      
      * styling
      
      * update tests
      
      * cleanup tests
      
      * fix router test and CIs
      
      * cleanup
      
      * cleanup test styling
      
      * fix tests
      
      * Finally the generation tests match!
      
      * cleanup
      
      * update test
      
      * style testing file
      
      * remove script
      
      * cleanup
      
      * more cleanup
      
      * nits
      
      * update
      
      * NLLB tokenizer is wrong and will be fixed soon
      
      * use LongTensors
      
      * update tests
      
      * revert some small changes
      
      * fix second expert sampling and batch prioritized routing
      
      * update tests
      
      * finish last tests
      
      * make ruff happy
      
      * update
      
      * ruff again
      
      * style
      
      * Update docs/source/en/model_doc/nllb-moe.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Updates based on review
      
      * style and fix import issue
      
      * nit
      
      * more nits
      
      * cleanup
      
      * styling
      
      * update test_seconde_expert_policy
      
      * fix name
      
      * last nit on the markdown examples
      
      ---------
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • Fix quality · 057e1d74
      Sylvain Gugger authored
    • Hardware Auto-Setup for Examples (#22319) · f02e3a2b
      Donny Greenberg authored
      * Add initial remote hardware auto-setup docs
      
      * Fix a few typos and clarify some language
      
      * Add missing dependency
      
      * Update self-hosted launch script with Sylvain's comments.
      
      * Formatting.
      
      * Trigger CI
      
      * Style
    • Trainer: missing None check (#22404) · 738944c9
      Joao Gante authored
      missing None check
    • [Pix2Struct] Add support to resize embeddings (#22394) · 0e708178
      NielsRogge authored
      * First draft
      
      * Fix integration test
      
      * Remove script
      
      * Fix test and typos
      
      * Fix one more test
      
      * Skip tied embeddings test
      
      * Remove line
      
      * Address comments