- 12 Sep, 2023 6 commits
-
MinJae Kang authored
* docs: ko-llama2.md
* feat: chatGPT draft and manual edits
* feat: added inline TOC
* fix: inline TOC
* fix: resolve suggestions
* fix: resolve suggestion
* fix: resolve suggestion

Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
-
pokjay authored
* Fix issues in test_exponential_decay_length_penalty

  Fix tests which were broken and add validation of negative scores. The current test didn't take into account that ExponentialDecayLengthPenalty updates the scores in place, resulting in updates to the base tested tensor. In addition, the `gt` assert compared empty tensors due to indexing along the batch dimension. The test is currently expected to fail, to expose the ExponentialDecayLengthPenalty issues with negative scores.
* Fix ExponentialDecayLengthPenalty negative logits issue

  In cases where the scores are negative, ExponentialDecayLengthPenalty decreases the score of eos_token_id instead of increasing it. To fix this issue we compute the penalty on the absolute value and add it to the original score.
* Add examples for ExponentialDecayLengthPenalty
* Fix styling issue in ExponentialDecayLengthPenalty doc
* Apply suggestions from code review
* Style and quality fix
* Fix example outputs

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
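The negative-scores fix above is easy to show in isolation. A minimal sketch of the idea (hypothetical helper, not the merged `transformers` code):

```python
import torch

def apply_eos_penalty(scores, eos_token_id, cur_len, start, decay_factor):
    # Computing the penalty on the absolute value and *adding* it to the
    # original score raises the eos score even when all logits are negative;
    # scaling a negative score directly would push eos further down instead.
    if cur_len > start:
        penalty = torch.abs(scores[:, eos_token_id]) * (decay_factor ** (cur_len - start) - 1)
        scores[:, eos_token_id] = scores[:, eos_token_id] + penalty
    return scores

# A negative eos logit (-4.0) moves up to 1.0 instead of down:
scores = torch.tensor([[-3.0, -1.0, -4.0]])
print(apply_eos_penalty(scores, eos_token_id=2, cur_len=12, start=10, decay_factor=1.5))
```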
-
larekrow authored
-
Joao Gante authored
-
Younes Belkada authored
import tensorflow inside relevant methods in trainer_utils
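This is the standard lazy-import pattern; a sketch with a hypothetical helper (not the actual `trainer_utils` function):

```python
def list_tensorboard_logs(log_dir: str):
    # A module-level `import tensorflow` makes every `import transformers`
    # pay TF's startup cost; a local import defers it until this rarely used
    # code path actually runs.
    import tensorflow as tf

    return tf.io.gfile.listdir(log_dir)
```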
-
Arthur authored
* initial commit
* updates
* nits
* update conversion script
* update conversion script
* use path to load
* add tips etc
* some modeling logic
* modeling update
* more nits
* nits
* normal layer norm
* update config and doc
* nits
* update doc, remove unused
* update
* fix inits and stuff
* fixup
* revert wrong changes
* updates
* more nits
* add default config values to the configuration file
* fixup happy
* update
* 2 tests left
* update readmes
* more nits
* slow test and more documentation
* update readme
* fix licences
* styling
* use fast if possible when saving tokenizer
* remove todo
* remove tokenization tests
* small last nits
* Apply suggestions from code review
* nits to skip the timeout doctest
* fix integration test
* fix test
* update eos token
* update to allow fast tokenization
* styling
* fix CodeLlama as well for the updated post processor
* Apply suggestions from code review
* add more copied from statements
* update
* doc passes doctest
* remove `# final layer norm?`
* change docstring prompt
* update
* Update README.md
* don't doctest the conversion script as it requires more packages
* don't init a model in the config
* oops
* fix doctest

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 11 Sep, 2023 4 commits
-
Phuc Van Phan authored
* docs: add space to docs
* docs: remove redundant space
-
Patrick von Platen authored
* improve import time
* Update src/transformers/integrations/__init__.py
* sort import
-
Phuc Van Phan authored
-
Hang authored
Only the main process should call `_save` under DeepSpeed ZeRO-3
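A sketch of the guard this describes (hypothetical helper built on Trainer internals, so treat the attribute names as assumptions): under ZeRO-3 every rank must take part in gathering the sharded weights, but only rank 0 should write the checkpoint.

```python
import torch.distributed as dist

def save_zero3_checkpoint(trainer, output_dir):
    # All ranks enter the gather so ZeRO-3 can reassemble the full state dict...
    state_dict = trainer.accelerator.get_state_dict(trainer.model_wrapped)
    # ...but only the main process calls _save and touches the disk.
    if not dist.is_initialized() or dist.get_rank() == 0:
        trainer._save(output_dir, state_dict=state_dict)
```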
-
- 09 Sep, 2023 1 commit
-
Arthur authored
* skip failing tests until #26054 is merged
* fixup
-
- 08 Sep, 2023 5 commits
-
Arthur authored
* fix `set_infilling_processor` to properly reset
* Add docstring!
* fixups
* more details in the documentation about the tokenization
* style
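`set_infilling_processor` backs the fill-in-the-middle prompts of the fast Code Llama tokenizer. A short usage sketch of the flow it supports (checkpoint name assumed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# codellama/CodeLlama-7b-hf is an assumed checkpoint for illustration.
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")

# "<FILL_ME>" marks the span to infill; the tokenizer splits the prompt into a
# prefix and suffix around it, which is where the infilling processor is set
# (and, after the fix above, properly reset afterwards).
prompt = 'def remove_non_ascii(s: str) -> str:\n    """<FILL_ME>"""\n    return result'
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```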
-
Harheem Kim authored
* docs: ko-llama.md
* fix: chatgpt draft
* feat: manual edits
* fix: resolve suggestions
-
Angela Yi authored
* Ignore warning if tracing with dynamo
* fix import error
* separate into a function
* add test
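A sketch of the guard described in the first bullet (hypothetical helper; assumes `torch._dynamo.is_compiling()`, available in recent PyTorch releases):

```python
import logging

logger = logging.getLogger(__name__)

def _is_tracing() -> bool:
    # Warnings emitted inside a dynamo trace can cause graph breaks, so the
    # guard checks whether torch.compile is currently tracing.
    try:
        import torch._dynamo

        return torch._dynamo.is_compiling()
    except Exception:
        return False

def warn_unless_tracing(message: str) -> None:
    if not _is_tracing():
        logger.warning(message)
```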
-
Thien Tran authored
* add missing doc for activation dropout
* fix doc for SEW-D dropout
* deprecate hidden_dropout for SEW-D
-
Alexander Krauck authored
This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically:

1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used. This has been corrected to use `attention_dropout` instead of the regular `dropout`.
2. The `activation_dropout` for the activations in the feed-forward layers was missing. Instead, the regular `dropout` was used. This commit adds `activation_dropout` to the feed-forward layers.

These changes ensure the dropout implementation matches the original Graphormer and delivers empirically better performance.
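A minimal sketch (not the Graphormer code itself) of the distinction the fix restores in the feed-forward layers:

```python
import torch
import torch.nn as nn

class FeedForwardBlock(nn.Module):
    def __init__(self, dim, hidden_dim, dropout, activation_dropout):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, dim)
        # Before the fix, the regular `dropout` rate was applied here too.
        self.activation_dropout = nn.Dropout(activation_dropout)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        x = self.activation_dropout(torch.relu(self.fc1(x)))  # inner activations
        return self.dropout(self.fc2(x))                      # residual stream
```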
-
- 07 Sep, 2023 9 commits
-
dumpmemory authored
* fix loss inconsistent after resume #25340
* fix typo
* clean code
* reformatted code
* adjust code according to comments
* adjust check_dataloader_randomsampler location
* return sampler only
* handle sampler is None
* Update src/transformers/trainer_pt_utils.py (thanks @amyeroberts)

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
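The underlying problem fits in a few lines: a `RandomSampler` without a seeded generator yields a different permutation on every process start, so skipping already-consumed batches after a resume lands on different data. A minimal illustration (not the Trainer code):

```python
import torch
from torch.utils.data import DataLoader, RandomSampler

dataset = list(range(100))
# A fixed generator seed makes the permutation identical across restarts,
# so replaying the first N batches resumes on the same examples.
generator = torch.Generator().manual_seed(42)
sampler = RandomSampler(dataset, generator=generator)
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

resume_step = 5  # batches already consumed before the interruption
for step, batch in enumerate(loader):
    if step < resume_step:
        continue  # replay and discard already-seen batches
    print(step, batch)
    break
```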
-
MyungHa Kwon authored
fix typo
-
raghavanone authored
* Fix vilt config init parameter to match the ones in documentation
* Fix the documentation
-
Muskan Kumar authored
* Added HerBERT to README.md
* Update README.md to contain HerBERT (#26016)
* Resolved #26016: Updated READMEs and index.md to contain HerBERT; updated READMEs and ran `make fix-copies`
-
Sanchit Gandhi authored
* fix tokenizer
* make bs even
* fix multi gpu test
* style
* model forward
* fix torch import
* revert tok pin
-
CokeDong authored
* Add tgs metrics
* bugfix and black formatting
* workaround for tokens counting
* formatting and bugfix
* Fix
* Add opt-in for tgs metrics
* make style and fix error
* Fix doc
* fix docbuild
* hf-doc-build
* fix
* test
* Update src/transformers/training_args.py (renaming)
* Update src/transformers/training_args.py (renaming)
* Fix some symbol
* test
* Update src/transformers/trainer_utils.py (match naming patterns)
* Update src/transformers/training_args.py
* Update src/transformers/trainer.py
* Fix reviews
* Fix
* Fix black

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
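Usage sketch of the opt-in (the final flag name settled on in review is assumed to be `include_tokens_per_second`):

```python
from transformers import TrainingArguments

# Off by default: counting tokens requires an extra pass over the
# training dataloader, so the tgs (tokens/sec/GPU) metric is opt-in.
args = TrainingArguments(
    output_dir="out",
    include_tokens_per_second=True,
)
```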
-
Yih-Dar authored
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Kai authored
-
Zach Mueller authored
* Fix err
* Use version check
-
- 06 Sep, 2023 7 commits
-
Marc Sun authored
* add new arg for gptq
* add tests
* add min version autogptq
* fix order
* skip test
* fix
* Update src/transformers/modeling_utils.py
* fix style
* change model path

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
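The log doesn't name the new argument, so the sketch below only shows the established `GPTQConfig` entry point that it extends (model name illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
# Quantize on the fly with a calibration dataset; this requires auto-gptq,
# for which the commit above adds a minimum-version check.
config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m", device_map="auto", quantization_config=config
)
```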
-
Matt authored
Remove falcon from undocumented list
-
Harheem Kim authored
* docs: ko: llm_tutorial.md
* feat: chatgpt draft
* fix: manual edits
* fix: resolve suggestions
* fix: resolve suggestions
-
zspo authored
* fix some small bugs in readme
* Update docs/README.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* stash commit
* More OPT updates
* Update src/transformers/models/opt/modeling_tf_opt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Lysandre Debut authored
* Fix revision propagation
* Cleaner
-
Nino Risteski authored
fixed a typo
-
- 05 Sep, 2023 8 commits
-
tju_skywalker authored
* fix convert megatron model too large
* fix convert megatron model too large
-
Tanay Mehta authored
* add: potential fix to mega chunking in decoder only model bug
* add: decoder with chunking test
* add: input_mask passed with input_ids
-
Arthur authored
* revision did not exist
* correct revision
-
Arthur authored
* start with error too
* fix ?
* start with nit
* one more path
* use `job_name`
* mark pipeline test as slow
-
Injin Paek authored
* docs: feat: model resources for llama
* fix: resolve suggestion

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
-
Sanchit Gandhi authored
* [Wav2Vec2 Conformer] Fix inference float16
* fix test
* fix test more
* clean pipe test
-
Sourab Mangrulkar authored
deepspeed resume from ckpt fixes and adding support for deepspeed optimizer and HF scheduler (#25863)

* Add support for deepspeed optimizer and HF scheduler
* fix bug
* fix the import
* fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario
* fix loading of hf scheduler when loading deepspeed checkpoint
* fix import of `DeepSpeedSchedulerWrapper`
* add tests
* add the comment and skip the failing tests
* address comment
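A hedged sketch of the newly supported combination, a DeepSpeed-config optimizer paired with an HF scheduler (config values illustrative only):

```python
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {"stage": 2},
    # The optimizer comes from the DeepSpeed config...
    "optimizer": {"type": "AdamW", "params": {"lr": 5e-5}},
    # ...and with no "scheduler" block, the Trainer supplies its HF scheduler.
}
args = TrainingArguments(
    output_dir="out",
    deepspeed=ds_config,
    lr_scheduler_type="cosine",  # HF scheduler alongside the DS optimizer
)
```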
-
raghavanone authored
* Add TFDebertaV2ForMultipleChoice
* Import newer model in main init
* Fix import issues
* Fix copies
* Add doc
* Fix tests
* Fix copies
* Fix docstring
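A short usage sketch for the new head (checkpoint name assumed; multiple-choice heads expect inputs shaped `(batch, num_choices, seq_len)`):

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFDebertaV2ForMultipleChoice

# microsoft/deberta-v3-small is an assumed checkpoint for illustration.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-small")
model = TFDebertaV2ForMultipleChoice.from_pretrained("microsoft/deberta-v3-small")

prompt = "The new head scores each candidate continuation:"
choices = ["a multiple-choice classifier.", "a kind of cheese."]
inputs = tokenizer([prompt, prompt], choices, return_tensors="tf", padding=True)
# Add the num_choices dimension: (batch=1, num_choices=2, seq_len).
inputs = {k: tf.expand_dims(v, 0) for k, v in inputs.items()}
logits = model(**inputs).logits  # shape (1, 2); higher = more plausible
```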
-