- 23 May, 2022 1 commit
-
-
Loubna Ben Allal authored
* average loss over batches and accumulated steps for tracking
* fix layernorm weight decay
* use AdamW from Pytorch instead of Transformers
* add shuffling of sequences inside the batches
* add shuffling of sequences inside the batches
* add logging dir and reformat code
* fix lr tracking
* remove Mistral scaling
* keep Mistral scaling
* reformat code
* fix error
* fix error
* use shuffling function from Pytorch
* remove argument for shuffling batch sequences as it isn't optional
* update package versions and install accelerate from source
* remove unused package
* Update loss average over accumulated steps
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update loss average over accumulated steps
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* use one shuffle buffer argument
* compute avg_loss in one line
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
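The layernorm weight-decay fix and the switch to torch.optim.AdamW in this commit come down to parameter grouping. A minimal sketch of the pattern; the parameter-name patterns and the stand-in model are illustrative, not the exact ones used in the CodeParrot training script:

```python
import torch
from torch.optim import AdamW

def get_grouped_params(model, weight_decay=0.1, no_decay=("bias", "ln_", "norm")):
    """Put biases and LayerNorm weights in a group with weight_decay=0.0."""
    decay_params, no_decay_params = [], []
    for name, param in model.named_parameters():
        if any(pattern in name.lower() for pattern in no_decay):
            no_decay_params.append(param)
        else:
            decay_params.append(param)
    return [
        {"params": decay_params, "weight_decay": weight_decay},
        {"params": no_decay_params, "weight_decay": 0.0},
    ]

model = torch.nn.TransformerEncoderLayer(d_model=64, nhead=4)  # stand-in model
optimizer = AdamW(get_grouped_params(model), lr=5e-4)
```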
-
- 19 May, 2022 1 commit
-
-
ddobokki authored
-
- 18 May, 2022 3 commits
-
-
Zachary Mueller authored
Fix metric calculation in examples and setup tests to run on multi-gpu for no_trainer scripts (#17331)
* Fix length in no_trainer examples
* Add setup and teardown
* Use new accelerator config generator to automatically make tests able to run based on environment
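The "fix length" part concerns distributed evaluation: the sampler may duplicate samples in the last batch so every process gets the same count, and the gathered duplicates have to be dropped before scoring. A rough sketch of that kind of loop, with illustrative names (`metric`, `eval_dataset_len`, and the function signature are assumptions, not the scripts' exact code):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

def evaluate(model, eval_dataloader, metric, eval_dataset_len):
    model.eval()
    samples_seen = 0
    for step, batch in enumerate(eval_dataloader):
        with torch.no_grad():
            outputs = model(**batch)
        predictions = outputs.logits.argmax(dim=-1)
        predictions, references = accelerator.gather((predictions, batch["labels"]))
        # The last batch may contain duplicated samples added for even sharding;
        # truncate to the true dataset length before scoring.
        if accelerator.num_processes > 1 and step == len(eval_dataloader) - 1:
            predictions = predictions[: eval_dataset_len - samples_seen]
            references = references[: eval_dataset_len - samples_seen]
        samples_seen += references.shape[0]
        metric.add_batch(predictions=predictions, references=references)
    return metric.compute()
```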
-
Sylvain Gugger authored
-
mraunak authored
* Add information gain filtration algorithm
* Complying with black requirements
* Added author
* Fixed import order
* flake8 corrections
Co-authored-by: Javier Turek <javier.turek@intel.com>
-
- 17 May, 2022 1 commit
-
-
regisss authored
- Add --ignore_mismatched_sizes argument to classification examples
- Expand the error message when loading a model whose head dimensions are different from expected dimensions
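The new argument is forwarded to `from_pretrained`, which already supports it; a short illustration (the checkpoint and label count are made up):

```python
from transformers import AutoModelForSequenceClassification

# Loading a checkpoint whose classification head was trained with a different
# number of labels normally raises a size-mismatch error; with
# ignore_mismatched_sizes=True the mismatched head is re-initialized instead.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # illustrative checkpoint
    num_labels=10,
    ignore_mismatched_sizes=True,
)
```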
-
- 16 May, 2022 3 commits
-
-
Loubna Ben Allal authored
* add pretokenization arguments
* add pretokenization script
* add support for pretokenized data
* reformat code
* fix run command for training
* fix model call from config
* remove a package
* add comments on pretokenization in the readme
* remove explicit parallelization
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme - remove username
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme - remove username
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* keep data parallelization
* reformat code
* reformat code
* update readme
* reformat code
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
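Pretokenization here means running the tokenizer once over the whole corpus and saving the result, so training only has to stream token ids. A minimal sketch of the idea; the data file, column name, and output path are assumptions, not the actual CodeParrot configuration:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative tokenizer
dataset = load_dataset("json", data_files="code_files.jsonl", split="train")

def tokenize(examples):
    # Keep only the token ids; the raw text columns are dropped via remove_columns.
    return {"input_ids": tokenizer(examples["content"], truncation=False)["input_ids"]}

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
tokenized.save_to_disk("tokenized-data")
```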
-
Loubna Ben Allal authored
* add new preprocessing arguments
* add new filters
* add new filters to readme
* fix config and test count, update function names and docstrings
* reformat code
* update readme
* Update readme
* rename config_test filter
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename few_assignments filter
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename tokenizer in arguments
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename functions and add limit_line argument for config_test filter
* update threshold for config_test filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
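The config_test and few_assignments filters are heuristics over raw source files. The sketch below shows the general shape of such a filter; the keywords, the `limit_line` default, and the threshold are illustrative, not the values used in the CodeParrot preprocessing script:

```python
def is_config_or_test(example, limit_line=5, assignment_threshold=0.05):
    """Heuristically flag files that look like configuration or tests.

    Only the first `limit_line` lines are scanned for keywords, and files with
    a very low ratio of assignment statements are also flagged.
    """
    lines = example["content"].splitlines()
    if not lines:
        return True
    head = " ".join(lines[:limit_line]).lower()
    if "config" in head or "test" in head:
        return True
    n_assignments = sum("=" in line for line in lines)
    return n_assignments / len(lines) < assignment_threshold

# Usage with datasets: dataset.filter(lambda ex: not is_config_or_test(ex))
```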
-
Kenneth Enevoldsen authored
* fixed bug in run_mlm_flax_stream.py
  Fixed a bug caused by an update to the tokenizer keys introduced in recent transformers versions (between `4.6.2` and `4.18.0`), where additional keys were added to the tokenizer output.
* Update run_mlm_flax_stream.py
* adding missing parenthesis
* formatted to black
* remove cols from dataset instead
* reformat to black
* moved rem. columns to map
* formatted to black
Co-authored-by: KennethEnevoldsen <kennethcenevolsen@gmail.com>
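The "remove cols from dataset instead / moved rem. columns to map" part amounts to dropping the raw columns inside `map`, so whatever extra keys a newer tokenizer emits don't leak into the batches. A hedged sketch, assuming a datasets version where `IterableDataset.map` accepts `remove_columns`; the dataset, tokenizer, and column names are illustrative:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # illustrative tokenizer
dataset = load_dataset("oscar", "unshuffled_deduplicated_en", split="train", streaming=True)

def tokenize(examples):
    return tokenizer(examples["text"], return_special_tokens_mask=True)

# Dropping the original columns in map keeps only what the data collator
# expects, regardless of which extra keys the tokenizer version returns.
tokenized = dataset.map(tokenize, batched=True, remove_columns=["text", "id"])
```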
-
- 12 May, 2022 2 commits
-
-
Sylvain Gugger authored
* Black preview
* Fixup too!
* Fix check copies
* Use the same version as the CI
* Bump black
-
Lysandre Debut authored
-
- 09 May, 2022 1 commit
-
-
Zachary Mueller authored
-
- 04 May, 2022 4 commits
-
-
Zachary Mueller authored
-
dependabot[bot] authored
Bumps [notebook](http://jupyter.org) from 6.4.1 to 6.4.10.
---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bumps [notebook](http://jupyter.org) from 6.4.1 to 6.4.10.
---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Thomas Wang authored
-
- 03 May, 2022 1 commit
-
-
Pavel Belevich authored
-
- 02 May, 2022 3 commits
-
-
Zachary Mueller authored
* Update all examples to properly calculate progress bar
-
Zachary Mueller authored
* Propagate and fix imports
-
yujun authored
* add torch.no_grad when in eval mode
* make style quality
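The change wraps the forward pass of the evaluation loop in `torch.no_grad()`. A minimal sketch of the pattern; the loop and variable names are illustrative:

```python
import torch

def evaluate(model, dataloader):
    model.eval()
    losses = []
    for batch in dataloader:
        # No gradients are needed at evaluation time; this saves memory
        # and avoids building the autograd graph.
        with torch.no_grad():
            outputs = model(**batch)
        losses.append(outputs.loss.detach())
    return torch.stack(losses).mean()
```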
-
- 28 Apr, 2022 2 commits
-
-
Zachary Mueller authored
-
conan1024hao authored
* Add parameter --config_overrides for run_mlm_wwm.py
* linter
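In the other run_mlm scripts this argument is applied with `PretrainedConfig.update_from_string`; presumably the same mechanism is used here. A short sketch of how such an override is consumed (checkpoint and override values are illustrative):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("bert-base-uncased")  # illustrative checkpoint
# --config_overrides takes a comma-separated list of key=value pairs;
# update_from_string applies them to the loaded config.
config.update_from_string("hidden_dropout_prob=0.2,attention_probs_dropout_prob=0.2")
```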
-
- 27 Apr, 2022 5 commits
-
-
Zachary Mueller authored
* Fixup all examples
-
Sylvain Gugger authored
* Fix multiple deletions of the same files in save_pretrained
* Add is_main_process argument
-
Leonid Boytsov authored
1. Fixes evaluation errors popping up when you train/eval on SQuAD v2 (one was newly encountered, and one was previously reported in "Running SQuAD 1.0 sample command raises IndexError #15401" but not completely fixed).
2. Removes boolean arguments that don't use store_true. Please don't use these: ANY non-empty string is converted to True in this case, which is clearly not the desired behavior (and it creates a LOT of confusion).
3. All no-trainer test scripts now save metric values in the same way (with the right `eval_` prefix), which is consistent with the trainer-based versions.
4. Adds forgotten model.eval() in the no-trainer versions. This improved some results, but not all (see the F1 scores and the discussion below).
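Point 2 is the classic argparse pitfall; a small illustration of why `type=bool` misbehaves (the `--buggy_flag` name is made up, `--version_2_with_negative` is the SQuAD v2 switch the example scripts use):

```python
import argparse

parser = argparse.ArgumentParser()
# Anti-pattern: argparse calls bool() on the raw string, and any non-empty
# string ("False", "0", "no") is truthy, so the flag silently becomes True.
parser.add_argument("--buggy_flag", type=bool, default=False)
# Preferred: an explicit on/off switch.
parser.add_argument("--version_2_with_negative", action="store_true")

args = parser.parse_args(["--buggy_flag", "False"])
print(args.buggy_flag)               # True, not False
print(args.version_2_with_negative)  # False unless the flag is passed
```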
-
NielsRogge authored
* Add first draft
* Improve script and README
* Improve README
* Apply suggestions from code review
* Improve script, add link to resulting model
* Add corresponding test
* Adjust learning rate
-
Anton Lozhkov authored
* Avoid repeated per-lang filtering
* Language groups and logits preprocessing
* Style
-
- 25 Apr, 2022 2 commits
-
-
-
Sanchit Gandhi authored
-
- 21 Apr, 2022 1 commit
-
-
Loubna Ben Allal authored
* add tflops logging and fix grad accumulation
* add accelerate tracking and checkpointing
* scale loss of last batch correctly
* fix typo
* compress loss computation
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* add resume from checkpoint argument
* add load_state accelerate from checkpoint, register lr scheduler and add tflops function
* reformat code
* reformat code
* add condition on path for resume checkpoint
* combine if conditions
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* add source for tflops formula
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
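The checkpointing side of this commit relies on Accelerate's state saving. A minimal sketch of save/resume with a registered scheduler; the paths and helper names are illustrative, not the CodeParrot script's own:

```python
import os
from accelerate import Accelerator

accelerator = Accelerator()
# model, optimizer, dataloader, lr_scheduler = accelerator.prepare(...)  # done elsewhere
# accelerator.register_for_checkpointing(lr_scheduler)  # if the scheduler isn't passed to prepare()

output_dir = "checkpoints"  # illustrative path

def save_checkpoint(step):
    accelerator.save_state(os.path.join(output_dir, f"step_{step}"))

def maybe_resume(resume_from_checkpoint):
    # Only resume when a checkpoint directory actually exists.
    if resume_from_checkpoint and os.path.isdir(resume_from_checkpoint):
        accelerator.load_state(resume_from_checkpoint)
```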
-
- 20 Apr, 2022 1 commit
-
-
Zachary Mueller authored
-
- 19 Apr, 2022 5 commits
-
-
Jeevesh Juneja authored
* Correct logging of eval metric to TensorBoard
  An empty dictionary ``eval_metrics`` was being logged; it is replaced by ``eval_metric``, which is the output dictionary of ``metric.compute()``.
* Remove unused variable
-
NielsRogge authored
* Add first draft
* Improve README and run fixup
* Make script aligned with other scripts, improve README
* Improve script and add test
* Remove print statement
* Apply suggestions from code review
* Add num_labels to make test pass
* Improve README
-
Wonjae Kim authored
-
Suraj Patil authored
* begin do_init
* add params_shape_tree
* raise error if params are accessed when do_init is False
* don't allow do_init=False when keys are missing
* make shape tree a property
* assign self._params at the end
* add test for do_init
* add do_init arg to all flax models
* fix param setting
* disable do_init for composite models
* update test
* add do_init in FlaxBigBirdForMultipleChoice
* better names and errors
* improve test
* style
* add a warning when do_init=False
* remove extra if
* set params after _required_params
* add test for from_pretrained
* do_init => _do_init
* change warning to info
* fix typo
* add params in init_weights
* add params to gpt neo init
* add params to init_weights
* update do_init test
* Trigger CI
* Apply suggestions from code review
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update template
* trigger CI
* style
* style
* fix template
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
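The new flag lets Flax models load without materializing randomly initialized weights. As far as the commit titles suggest, the resulting usage pattern looks roughly like this; the checkpoint name and input shape are illustrative:

```python
import jax.numpy as jnp
from transformers import FlaxBertModel

# With _do_init=False, from_pretrained returns the module and the parameters
# separately and skips the random initialization of any missing weights.
model, params = FlaxBertModel.from_pretrained("bert-base-uncased", _do_init=False)

# Parameters are then handled explicitly and passed at call time.
input_ids = jnp.ones((1, 8), dtype="i4")
outputs = model(input_ids, params=params)
```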
-
NielsRogge authored
* Add first draft from previous PR
* First draft
* Improve README and remove num_labels
* Make script more aligned with other scripts
* Improve README and apply suggestion from code review
-
- 15 Apr, 2022 1 commit
-
-
NielsRogge authored
-
- 14 Apr, 2022 1 commit
-
-
NielsRogge authored
* Improve README
* Make dataset_name argument optional
* Improve local data
* Fix bug
* Improve README some more
* Apply suggestions from code review
* Improve README
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
- 13 Apr, 2022 2 commits
-
-
Zachary Mueller authored
* Change tracking to store_true
* Remove step param and use it in the log dictionary directly
* use vars(args) when passing args to init_trackers
* Include tracking tests since tensorboard is already a dep
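These bullets map onto Accelerate's built-in trackers. A hedged sketch of the pattern; the project name, logging directory, and logged values are illustrative, and the `logging_dir` argument name has varied between accelerate versions:

```python
import argparse
from accelerate import Accelerator

parser = argparse.ArgumentParser()
parser.add_argument("--with_tracking", action="store_true")  # store_true, as in the commit
parser.add_argument("--learning_rate", type=float, default=5e-5)
args = parser.parse_args()

accelerator = (
    Accelerator(log_with="tensorboard", logging_dir="runs")
    if args.with_tracking
    else Accelerator()
)

if args.with_tracking:
    # vars(args) turns the namespace into a plain dict of hyperparameters.
    accelerator.init_trackers("example_project", config=vars(args))
    accelerator.log({"train_loss": 0.0, "epoch": 0}, step=0)
```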
-
Tu Vu authored
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Delete strata
-