Commits · 6d67837f06fb8e3155a5c5b0dd57cd09841bc9f9 · chenpangpang / transformers

11 Mar, 2024 1 commit

Add Fill-in-the-middle training objective example - PyTorch (#27464) · 6d67837f

Tanay Mehta authored Mar 11, 2024

* add: initial script to train clm fim

* fix: if training model from scratch, new tokens will be added and embeddings resized

* fix: fixed attention_mask errors when generating FIM data

* fix: file formatted using black

* add: run_fim_no_trainer.py and fixed some comments in run_fim.py

* add: added fim examples to the README.md and ran code fixup

* fix: little bug in both fim training scripts

* fix: remove comment from notebook and added a note on fim related params

* fix: minor typo in README

* add: suggested minor changes to README and run_fim.py

* add: gradient_accumulation_steps and gradient_checkpointing args

* add: improved model embedding resizing

* add: pad_to_multiple_of and attn_implementation params

* add: requested minor changes

* add: deepspeed zero compatibility

* add: resize embeddings layer with zero3 support for fim model initialization

6d67837f

16 Feb, 2024 1 commit
- Update all references to canonical models (#29001) · f497f564
  Lysandre Debut authored Feb 16, 2024
  * Script & Manual edition
  * Update
  f497f564
22 Mar, 2023 1 commit

add low_cpu_mem_usage option in run_clm.py example which will benefit… (#22288) · 4ccaf268

Wang, Yi authored Mar 22, 2023



* add low_cpu_mem_usage option in run_clm.py example which will benefit LLM loading
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update all the example and README under language-modeling
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

4ccaf268

30 Jan, 2023 1 commit

[`run_(clm|mlm).py` examples] add streaming dataset support (#21343) · 98d88b23

Stas Bekman authored Jan 30, 2023

* [run_clm example] add streaming dataset support

* unrefactor kwargs

* fix

* fix

* require datasets>=2.0.0

* port to mlm

98d88b23

23 Mar, 2022 1 commit

Updates the default branch from master to main (#16326) · eca77f47

Lysandre Debut authored Mar 23, 2022



* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

eca77f47

10 Feb, 2022 1 commit
- Add example batch size to all commands (#15596) · 3d5dea9b
  Patrick von Platen authored Feb 10, 2022
  
  3d5dea9b
22 Sep, 2021 1 commit

Make gradient_checkpointing a training argument (#13657) · 27d46397

Sylvain Gugger authored Sep 22, 2021



* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

27d46397

14 Jun, 2021 1 commit

[lm examples] Replicate --config_overrides addition to other LM examples (#12135) · 9de62cfb

Kumar Abhishek authored Jun 14, 2021



* [lm examples] Replicate --config_overrides addition to other LM examples

* Removing no trainer files changes

* Update README
Co-authored-by: Kumar Abhishek <kabhishek@expedia.com>

9de62cfb

25 May, 2021 1 commit

[Examples] create model with custom config on the fly (#11798) · 1b653010

Stas Bekman authored May 25, 2021



* create custom model on the fly

* better wording

* add update_from_string

* cleanup

* cleanup

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more bool options

* style

* fix logger

* add test

* add the doc

* assert on conflict of options
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

1b653010

21 Apr, 2021 1 commit

Examples reorg (#11350) · dabeb152

Sylvain Gugger authored Apr 21, 2021



* Base move

* Examples reorganization

* Update references

* Put back test data

* Move conftest

* More fixes

* Move test data to test fixtures

* Update path

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments and clean
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

dabeb152

06 Apr, 2021 1 commit
- Add Readme for language modeling scripts with accelerate (#11073) · 6ab7d1a4
  Hemil Desai authored Apr 06, 2021
  
  6ab7d1a4
19 Mar, 2021 1 commit

Expand a bit the presentation of examples (#10799) · 946400fb

Sylvain Gugger authored Mar 19, 2021



* Expand a bit the presentation of examples

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

946400fb

01 Feb, 2021 1 commit

Fit chinese wwm to new datasets (#9887) · 1682804e

wlhgtc authored Feb 01, 2021



* MOD: fit chinese wwm to new datasets

* MOD: move wwm to new folder

* MOD: format code

* Styling

* MOD add param and recover trainer
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

1682804e

22 Dec, 2020 1 commit
- Fix link to old language modeling script (#9254) · 189c1b91
  Manuel Romero authored Dec 22, 2020
  
  189c1b91
11 Dec, 2020 1 commit

Reorganize examples (#9010) · 783d7d26

Sylvain Gugger authored Dec 11, 2020



* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new structure

* Remove test

* Update READMEs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

783d7d26

19 Nov, 2020 1 commit

fix small typo (#8644) · a79a96dd

Matthias authored Nov 19, 2020

Fixed a small typo on the XLNet and permutation language modelling section

a79a96dd

18 Nov, 2020 1 commit
- Update README.md (#8635) · 28d16e7a
  Tim Isbister authored Nov 19, 2020
  
  28d16e7a
04 Nov, 2020 2 commits
- Fix path to old run_language_modeling.py script (#8302) · b1d3e95e
  Manuel Romero authored Nov 04, 2020
  
  b1d3e95e
- Fix typo in language-modeling README.md (#8287) · 734afa37
  Pengzhi Gao authored Nov 04, 2020
  
  734afa37
02 Nov, 2020 1 commit

Add line by line option to mlm/plm scripts (#8240) · e1b1b614

Sylvain Gugger authored Nov 02, 2020



* Make line by line optional in run_mlm

* Add option to disable dynamic padding

* Add option to plm too and update README

* Typos

* More typos

* Even more typos

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

e1b1b614

30 Oct, 2020 1 commit

Finalize lm examples (#8188) · cdc48ce9

Sylvain Gugger authored Oct 30, 2020



* Finish the cleanup of the language-modeling examples

* Update main README

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Propagate changes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

cdc48ce9

29 Oct, 2020 1 commit

Fix eval ref miss in Chinese WWM. (#8115) · 9a21b506

wlhgtc authored Oct 30, 2020



* ADD: add whole word mask proxy for both eng and chinese

* MOD: adjust format

* MOD: reformat code

* MOD: update import

* MOD: fix bug

* MOD: add import

* MOD: fix bug

* MOD: decouple code and update readme

* MOD: reformat code

* Update examples/language-modeling/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* change wwm to whole_word_mask

* reformat code

* reformat

* format

* Code quality

* ADD: update chinese ref readme

* MOD: small changes

* MOD: small changes2

* update readme

* fix eval ref file miss bug

* format file

* MOD: move ref code to contrib

* MOD: add delimiter check

* reformat code

* reformat code

* Update examples/language-modeling/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

9a21b506

26 Oct, 2020 1 commit

Update README.md (#8050) · 098ddc22

mohammadreza-Banaei73 authored Oct 26, 2020

--wwm can't be used as an argument to run_language_modeling.py; it should be changed to --whole_word_mask

098ddc22

22 Oct, 2020 1 commit

Add whole word mask support for lm fine-tune (#7925) · a16e568f

wlhgtc authored Oct 22, 2020



* ADD: add whole word mask proxy for both eng and chinese

* MOD: adjust format

* MOD: reformat code

* MOD: update import

* MOD: fix bug

* MOD: add import

* MOD: fix bug

* MOD: decouple code and update readme

* MOD: reformat code

* Update examples/language-modeling/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/language-modeling/run_language_modeling.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* change wwm to whole_word_mask

* reformat code

* reformat

* format

* Code quality

* ADD: update chinese ref readme

* MOD: small changes

* MOD: small changes2

* update readme
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

a16e568f

29 Jul, 2020 1 commit
- XLNet PLM Readme (#6121) · 641b873c
  Lysandre Debut authored Jul 29, 2020
  
  641b873c
25 May, 2020 1 commit
- add DistilBERT to supported models (#4558) · 50d1ce41
  Antonis Maronikolakis authored May 25, 2020
  
  50d1ce41
07 May, 2020 2 commits
- [doc] Fix broken links + remove crazy big notebook · c99fe038
  Julien Chaumond authored May 07, 2020
  
  c99fe038
- BIG Reorganize examples (#4213) · 0ae96ff8
  Julien Chaumond authored May 07, 2020
  * Created using Colaboratory
  * [examples] reorganize files
  * remove run_tpu_glue.py as superseded by TPU support in Trainer
  * Bugfix: int, not tuple
  * move files around
  0ae96ff8