- 01 Jun, 2023 6 commits
-
-
Lysandre Debut authored
-
Adam Lewis authored
rename encode input to match docstring
-
Sylvain Gugger authored
-
Sheon Han authored
-
fxmarty authored
consistency
-
Sanchit Gandhi authored
-
- 31 May, 2023 29 commits
-
-
Sylvain Gugger authored
-
NielsRogge authored
Add first draft
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Re-enable squad test
* [all-test]
* [all-test] Fix all test command
* Fix the all-test
-
Sourab Mangrulkar authored
remove the extra `accelerator.prepare` that slipped in with multiple updates from main 😅
-
amyeroberts authored
Bug fix - flip_channel_order for channels_first
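The fix above concerns flipping channel order for images stored channels-first. As a minimal sketch of the idea (hypothetical helper name, not the actual transformers implementation): which axis gets reversed must depend on the data format, since channels sit on axis 0 for channels-first arrays and on the last axis otherwise.

```python
import numpy as np

def flip_channel_order(image: np.ndarray, channels_first: bool) -> np.ndarray:
    """Reverse the channel axis (e.g. RGB -> BGR) for either data format."""
    if channels_first:
        # shape (C, H, W): channels live on axis 0
        return image[::-1, :, :]
    # shape (H, W, C): channels live on the last axis
    return image[:, :, ::-1]
```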
-
Sylvain Gugger authored
* Try easy first
* Add an empty job
* Fix name
* Fix method
-
amyeroberts authored
Raise error if loss can't be calculated
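The pattern this commit describes can be sketched as follows (hypothetical names, not the trainer's actual code): fail loudly when labels were supplied but the model produced no loss, instead of silently returning `None`.

```python
def maybe_loss(outputs: dict, labels=None):
    # If labels were provided, a missing loss is a bug worth surfacing.
    if labels is not None and outputs.get("loss") is None:
        raise ValueError(
            "The model did not return a loss although labels were provided."
        )
    return outputs.get("loss")
```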
-
Hari authored
* add conditional statement for auxiliary loss calculation
* fix style and copies
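The conditional auxiliary-loss pattern from the commit above can be sketched like this (hypothetical function and weight, not the model's actual code): the auxiliary term is only added when auxiliary outputs were actually produced.

```python
def compute_loss(main_loss: float, aux_loss=None, aux_weight: float = 0.4) -> float:
    # Only add the auxiliary term when the model returned one.
    loss = main_loss
    if aux_loss is not None:
        loss = loss + aux_weight * aux_loss
    return loss
```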
-
Younes Belkada authored
fix RWKV 4bit
-
Zachary Mueller authored
* Upgrade safetensors
* Second table
-
Connor Henderson authored
fix: Replace `add_prefix_space` in `get_prompt_ids` with manual space for FastTokenizer compatibility (#23796)
* add ' ' replacement for add_prefix_space
* add fast tokenizer test
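The idea behind this fix can be sketched as follows (hypothetical helper, not the actual `get_prompt_ids` code): rather than relying on the slow tokenizer's `add_prefix_space` behavior, prepend a literal space to the prompt text so that slow and fast tokenizers produce the same ids.

```python
def build_prompt_text(prompt: str) -> str:
    # Prepend a literal leading space instead of depending on
    # add_prefix_space, which fast tokenizers handle differently.
    return " " + prompt.strip()
```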
-
Zachary Mueller authored
* Move import check to before state reset
* Guard better
-
Younes Belkada authored
* add warning for gpt2-like models
* more details
* adapt from suggestions
-
Sanchit Gandhi authored
* fix for ragged list
* unpin numba
* make style
* np.object -> object
* propagate changes to tokenizer as well
* np.long -> "long"
* revert tokenization changes
* check with tokenization changes
* list/tuple logic
* catch numpy
* catch else case
* clean up
* up
* better check
* trigger ci
* Empty commit to trigger CI
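For context on the `np.object -> object` items above: NumPy 1.24 removed the long-deprecated `np.object` and `np.long` aliases, so code has to use the builtin `object` (or the string `"long"`) instead. A ragged list also needs an explicit object dtype on newer NumPy, since implicit ragged-array creation raises an error:

```python
import numpy as np

# dtype=object must now be spelled with the builtin, not np.object,
# and is required explicitly for ragged (unequal-length) lists.
ragged = np.array([[1, 2, 3], [4, 5]], dtype=object)
```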
-
Xinyu Yang authored
* ensure banned_mask and indices in same device
* switch the order in which indices and banned_mask are created and create banned_mask on the proper device
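The device fix above can be sketched like this (hypothetical function, not the generation code itself): build the indices first, then create the mask on the same device as the scores, so indexing never mixes CPU and GPU tensors.

```python
import torch

def make_banned_mask(scores: torch.Tensor, banned_token_ids: list) -> torch.Tensor:
    # Create indices on the scores' device, then allocate the mask there
    # too, so the assignment below stays on a single device.
    indices = torch.tensor(banned_token_ids, device=scores.device)
    banned_mask = torch.zeros_like(scores, dtype=torch.bool)
    banned_mask[indices] = True
    return banned_mask
```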
-
Thomas Wang authored
* Support shared storage
* Really be sure we have the same storage
* Make style
* Refactor storage identifier mechanism
* Group everything into a single for loop
* Make style
* PR
* make style
* Update src/transformers/pytorch_utils.py
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
-
Calico authored
-
Sylvain Gugger authored
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
* shift torch dynamo handling to accelerate
* shift deepspeed integration and save & load utils to accelerate
* fix accelerate launcher support
* oops
* fix 🐛
* save ckpt fix
* Trigger CI
* nasty 🐛 😅
* as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate
* make tests happy
* quality ✨
* loss tracked needs to account for grad_acc
* fixing the deepspeed tests
* quality ✨
* 😅 😅 😅
* tests 😡
* quality ✨
* Trigger CI
* resolve comments and fix the issue with the previous merge from branch
* Trigger CI
* accelerate took over deepspeed integration
---------
Co-authored-by: Stas Bekman <stas@stason.org>
-
Denisa Roberts authored
* Add tf code for efficientformer
* Fix return dict bug - return last hidden state after last stage
* Fix corresponding return dict bug
* Override test tol
* Change default values of training to False
* Set training to default False X3
* Rm axis from ln
* Set init in dense projection
* Rm debug stuff
* Make style; all tests pass.
* Modify year to 2023
* Fix attention biases codes
* Update the shape list logic
* Add a batch norm eps config
* Remove extract comments in test files
* Add conditional attn and hidden states return for serving output
* Change channel dim checking logic
* Add exception for withteacher model in training mode
* Revert layer count for now
* Add layer count for conditional layer naming
* Transpose for conv happens only in main layer
* Make tests smaller
* Make style
* Update doc
* Rm from_pt
* Change to actual expect image class label
* Remove stray print in tests
* Update image processor test
* Remove the old serving output logic
* Make style
* Make style
* Complete test
-
Sylvain Gugger authored
-
Sam Passaglia authored
* add \n
* removed copied from header
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
* shift torch dynamo handling to accelerate
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
* move fsdp handling to accelerate
* fixes
* fix saving
-
Sohyun Sim authored
* docs: ko: pad_truncation.mdx
* feat: manual draft
* fix: resolve suggestions
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* move ddp prep to accelerate
* fix 😅
* resolving comments
-
Sourab Mangrulkar authored
* mixed precision support via accelerate
* fix issues
* fix for the sharded ddp case
* fix flax and tf failing tests
* refactor the place to create `Accelerator` object
* address comments by removing debugging print statements
-
- 30 May, 2023 5 commits
-
-
Abhinav Patil authored
Adds support for AutoProcessor.from_pretrained to MCTCTProcessor models
-
George authored
* Editing issue with pickle def with lambda function
* fix type
* Made helper function private
* delete tab
---------
Co-authored-by: georgebredis <9454-georgebredis@users.noreply.gitlab.aicrowd.com>
-
Arthur authored
* Better warning
* Update src/transformers/modeling_utils.py
* format line
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Vijeth Moudgalya authored
-
Arthur authored
* Update the processor when changing add_eos and add_bos
* fixup
* update
* add a test
* fix failing tests
* fixup
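The idea behind this commit can be sketched with a toy class (hypothetical, not the actual tokenizer code): when `add_bos`/`add_eos` flags change, the encoding logic must reflect the new values instead of reusing a stale cached template.

```python
class ToyTokenizer:
    """Toy sketch: special-token handling always reads the current flags."""

    def __init__(self, add_bos: bool = True, add_eos: bool = False):
        self.add_bos = add_bos
        self.add_eos = add_eos

    def encode(self, tokens):
        # Recompute special tokens from the flags on every call, so
        # flipping add_eos/add_bos immediately changes the output.
        out = (["<s>"] if self.add_bos else []) + list(tokens)
        return out + (["</s>"] if self.add_eos else [])
```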
-