- 25 Jul, 2023 15 commits
-
-
Gema Parreño authored
* add example NoBadWordsLogitsProcessor * fix L764 & L767 * make style
-
Arthur authored
* draft add new model like * some cleaning of the config * nits * add nested configs * nits * update * update * added layer norms + triton kernels * consider only LPLayerNorm for now. * update * all keys match. * Update * fixing nits here and there * working forward pass. * removed einops dependency * nits * format * add alibi * byebye head mask * refactor attention * nits. * format * fix nits. * nuke ande updates * nuke tokenizer test * don't reshape query with kv heads * added a bit of documentation. * remove unneeded things * nuke more stuff * nit * logits match - same generations * rm unneeded methods * 1 remaining failing CI test * nit * fix nits * fix docs * fix docs * rm tokenizer * fixup * fixup * fixup and fix tests * fixed configuration object. * use correct activation * few minor fixes * clarify docs a bit * logits match à 1e-12 * skip and unskip a test * added some slow tests. * fix readme * add more details * Update docs/source/en/model_doc/mpt.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix configuration issues * more fixes in config * added more models * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove unneeded position ids * fix some comments * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * revert suggestion * mpt alibi + added batched generation * Update src/transformers/models/mpt/__init__.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove init config * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix nit * add another slow test * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fits in one line * some refactor because make fixup doesn't pass * add ft notebook * update md * correct doc path --------- Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Xiaoke Huang authored
Repeat per sample for SAM image embeddings
-
Harheem Kim authored
* docs: ko: hpo_train.mdx * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions
-
Arthur authored
[`generate`] Only warn users if the `generation_config`'s `max_length` is set to the default value (#25030) * check max length is default * nit * update warning: no-longer deprecate * comment in the configuration_utils in case max length's default gets changed in the future
-
Alan Ji authored
replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size` in readme of multiple-choice task (#25078) replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size` in readme of multiple-choice
-
Susnato Dhar authored
Update README_hd.md
-
Xuehai Pan authored
-
Injin Paek authored
-
Sylvain Gugger authored
* Fix last models for common tests that are too big. * Remove print statement
-
Sangam Lee authored
* docs: ko: perf_hardware.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by:
Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by:
Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by:
Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by:
Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by:
Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by:
Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by:
Hyeonseo Yun <0525yhs@gmail.com> * fix: resolve suggestions Co-authored-by:
Haewon Kim <ehdvkf02@naver.com> * Fix: manual edits * fix: manual edits * fix: manual edits * fix: manual edits * fix: fix rendering error of perf_hardware.md --------- Co-authored-by:
Hyeonseo Yun <0525yhs@gmail.com> Co-authored-by:
Haewon Kim <ehdvkf02@naver.com>
-
Haewon Kim authored
* docs: ko: tf_xla.md * feat: chatgpt draft * fix: manual edits * fix: manual edits * fix: manual edits * fix: resolve suggestions
-
Kashif Rasul authored
fix rope_scaling doc string
-
Joao Gante authored
-
Arthur authored
* Add note in doc on `RwkvStoppingCriteria` * give some breathing space to the code
-
- 24 Jul, 2023 17 commits
-
-
Sylvain Gugger authored
* Better error message when signal is not supported on OS * Address review comments
-
seank021 authored
* docs: ko: perf_train_cpu.md * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions * fix: manual edits Co-authored-by:
Haewon Kim <ehdvkf02@naver.com> --------- Co-authored-by:
Haewon Kim <ehdvkf02@naver.com>
-
Younes Belkada authored
fix 8bit corner case with Blip2 8bit
-
Nate Brake authored
compute_loss in trainer failing to label shift for PEFT model when label smoothing enabled. (#25044) * added PeftModelForCausalLM to MODEL_FOR_CAUSAL_LM_MAPPING_NAMES dict * check for PEFT model in compute_loss section --------- Co-authored-by:Nathan Brake <nbrake3@mmm.com>
-
Rinat authored
* pull and push updates * add docs * fix modeling * Add and run test * make copies * add task * fix tests and fix small issues * Checks on a Pull Request * fix docs * add desc pvt.md
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Make more test models tiny * Make more test models tiny * More models * More models
-
Sören Brunk authored
-
Zach Mueller authored
* Dispatch batches * Copy items
-
Sunmin Cho authored
* docs: ko: testing.md * feat: draft * fix: manual edits * fix: edit ko/_toctree.yml * fix: manual edits * fix: manual edits * fix: manual edits * fix: manual edits * fix: resolve suggestions
-
Sangam Lee authored
* docs: ko: performance.md * feat: chatgpt draft * fix: manual edits * fix: manual edits * Update docs/source/ko/performance.md Co-authored-by:
Kihoon Son <75935546+kihoon71@users.noreply.github.com> * Update docs/source/ko/performance.md --------- Co-authored-by:
Kihoon Son <75935546+kihoon71@users.noreply.github.com>
-
Iskren Ivov Chernev authored
* Better handling of missing SYS in llama conversation tokenizer The existing code failed to add SYS if the conversation has history without SYS, but did modify the passed conversation as it did. Rearrange the code so modifications to the conversation object are taken into account for token id generation. * Fix formatting with black * Avoid one-liners * Also fix fast tokenizer * Drop List decl
-
Lucain authored
* Support GatedRepoError + use raise from * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Use token instead of use_auth_token in error messages --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Maria Khalusova authored
* first pass at the single gpu doc * overview: improved clarity and navigation * WIP * updated intro and deepspeed sections * improved torch.compile section * more improvements * minor improvements * make style * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * feedback addressed * mdx -> md * link fix * feedback addressed --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Bharat Ramanathan authored
fix: store training args to wandb config without sanitization. Allows resuming runs by reusing the wandb config. Co-authored-by:Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>
-
Arthur authored
set default logger
-
Stas Bekman authored
* [check_config_docstrings.py] improve diagnostics * style * rephrase * fix
-
- 21 Jul, 2023 8 commits
-
-
Wonhyeong Seo authored
fix: update ko/serialization.md * chatgpt draft
-
Sylvain Gugger authored
-
Ivan Sorokin authored
* improve from_pretrained for zero3 multi gpus mode * Add check if torch.distributed.is_initialized * Revert torch.distributed --------- Co-authored-by:Stas Bekman <stas@stason.org>
-
Arthur authored
remove persistent tensor
-
Younes Belkada authored
add simple check for bnb
-
Yih-Dar authored
fix Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-