Commits · c79bbc3ba54a81dab2eac13d89f264ed64cb2460 · chenpangpang / transformers


27 Apr, 2022 9 commits

Fix multiple deletions of the same files in save_pretrained (#16947) · c79bbc3b
Sylvain Gugger authored Apr 27, 2022
```
* Fix multiple deletions of the same files in save_pretrained

* Add is_main_process argument
```
c79bbc3b
Fix add-new-model-like when model doesn't support all frameworks (#16966) · bfbec177
Sylvain Gugger authored Apr 27, 2022

bfbec177
Update custom_models.mdx (#16964) · cf8a7c24
Mishig Davaadorj authored Apr 27, 2022
```
BertModelForSequenceClassification -> BertForSequenceClassification
```
cf8a7c24
Fix `distributed_concat` with scalar tensor (#16963) · 5896b3ec
Antoni Baum authored Apr 27, 2022
```
* Fix `distributed_concat` with scalar tensor

* Update trainer_pt_utils.py
```
5896b3ec

[HF Argparser] Fix parsing of optional boolean arguments (#16946) · 084c38c5

NielsRogge authored Apr 27, 2022



* Add fix

* Apply suggestion from code review
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

084c38c5

Misc. fixes for Pytorch QA examples: (#16958) · c82e017a

Leonid Boytsov authored Apr 27, 2022

1. Fixes evaluation errors that appear when you train/eval on SQuAD v2: one newly encountered, and one previously reported in "Running SQuAD 1.0 sample command raises IndexError" (#15401) but not completely fixed.
2. Removes boolean arguments that don't use `store_true`. Please don't use these: ANY non-empty string is converted to True in this case, which is clearly not the desired behavior and creates a LOT of confusion.
3. All no-trainer test scripts now save metric values in the same way (with the correct `eval_` prefix), consistent with the trainer-based versions.
4. Adds the forgotten `model.eval()` call in the no-trainer versions. This improved some results, but not all of them; see the F1 scores and the discussion at the end of the pull request (#16958).
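
The pitfall in item 2 can be reproduced with plain `argparse`. This is a minimal sketch, not code from the QA example scripts; the flag names `--flag_as_bool` and `--flag_as_switch` are hypothetical.

```python
import argparse

parser = argparse.ArgumentParser()

# Pitfall: type=bool calls bool() on the raw string, and bool("False") is True
# because every non-empty string is truthy.
parser.add_argument("--flag_as_bool", type=bool, default=False)

# Safer: a switch that is False unless the flag is present on the command line.
parser.add_argument("--flag_as_switch", action="store_true")

buggy = parser.parse_args(["--flag_as_bool", "False"])
absent = parser.parse_args([])
present = parser.parse_args(["--flag_as_switch"])

print(buggy.flag_as_bool)      # True, even though the user typed "False"
print(absent.flag_as_switch)   # False
print(present.flag_as_switch)  # True
```

With `store_true`, the value depends only on whether the flag was passed, so there is no string to misinterpret.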

c82e017a

Fix HubertRobustTest PT/TF equivalence test on GPU (#16943) · 49d5bcb0
Yih-Dar authored Apr 27, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
49d5bcb0

Add semantic script, trainer (#16834) · 479fdc49

NielsRogge authored Apr 27, 2022

* Add first draft

* Improve script and README

* Improve README

* Apply suggestions from code review

* Improve script, add link to resulting model

* Add corresponding test

* Adjust learning rate

479fdc49

[Research] Speed up evaluation for XTREME-S (#16785) · a4a88fa0
Anton Lozhkov authored Apr 27, 2022
```
* Avoid repeated per-lang filtering

* Language groups and logits preprocessing

* Style
```
a4a88fa0

26 Apr, 2022 7 commits
- use original loaded keys to find mismatched keys (#16920) · 2d91e3c3
  Yongliang Shen authored Apr 27, 2022
  
  2d91e3c3
- Fix RuntimeError message format (#16906) · d365f507
  nikkie authored Apr 27, 2022
  
  d365f507
- documentation: some minor clean up (#16850) · 10dfa126
  Yang Ming authored Apr 27, 2022
  
  10dfa126
- Add onnx config for RoFormer (#16861) · aaee4038
  Krishna Sirumalla authored Apr 26, 2022
```
* add roformer onnx config
```
  aaee4038
- FIx Iterations for decoder (#16934) · 8afaaa26
  Ahmed Elnaggar authored Apr 26, 2022
```
FIx Iterations for decoder
```
  8afaaa26
- apply torch int div to layoutlmv2 (#15457) · fa322474
  Manuel authored Apr 26, 2022
```
* apply torch int div

* black linting fixup

* update path to torch_int_div

* clarify imports
```
  fa322474
- Limit the use of PreTrainedModel.device (#16935) · 344b9fb0
  Sylvain Gugger authored Apr 25, 2022
```
* Limit the use of PreTrainedModel.device

* Fix
```
  344b9fb0
25 Apr, 2022 11 commits

Fix issue probably-meant-fstring found at https://codereview.doctor (#16913) · 65687520
code-review-doctor authored Apr 25, 2022

65687520
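
For context, a "probably meant f-string" issue is a string literal that contains `{...}` placeholders but lacks the `f` prefix. A minimal sketch follows; the variable name and message are hypothetical and not taken from the fixed file.

```python
model_name = "bert-base-uncased"  # hypothetical value, for illustration only

# Without the f prefix the braces are kept literally and nothing is interpolated.
broken = "Could not load {model_name}"

# With the f prefix the placeholder is replaced by the variable's value.
fixed = f"Could not load {model_name}"

print(broken)
print(fixed)
```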
Replace deprecated logger.warn with warning (#16876) · fea94d67
Sanchit Gandhi authored Apr 25, 2022

fea94d67
TF: XLA stable softmax (#16892) · e03966e4
Joao Gante authored Apr 25, 2022
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
e03966e4
added deit onnx config (#16887) · 8246caf3
Rushi Chaudhari authored Apr 25, 2022
```
* added deit onnx config
```
8246caf3
TF: XLA Logits Warpers (#16899) · 9331b379
Joao Gante authored Apr 25, 2022
```
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
```
9331b379
TF: XLA logits processors - minimum length, forced eos, and forced bos (#16912) · 809dac48
Joao Gante authored Apr 25, 2022
```
* XLA min len, forced eos, and forced bos
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
```
809dac48
Fix RemBertTokenizerFast (#16933) · f6210c49
Yih-Dar authored Apr 25, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
f6210c49

Fix PyTorch RAG tests GPU OOM (#16881) · 32adbb26

Yih-Dar authored Apr 25, 2022



* add torch.cuda.empty_cache in some PT RAG tests

* torch.cuda.empty_cache in tearDownModule()

* tearDown()

* add gc.collect()
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
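
The cleanup pattern named in the bullets above might look roughly like the sketch below. It is not the actual RAG test code: the test class and its body are placeholders, and the PyTorch import is made optional so the sketch also runs on machines without it.

```python
import gc
import unittest

try:
    import torch
except ImportError:  # assumption: keep the sketch runnable without PyTorch
    torch = None


class ExampleGpuTest(unittest.TestCase):
    def tearDown(self):
        super().tearDown()
        # Collect unreachable Python objects first so the tensors they hold can be freed.
        gc.collect()
        # Then release cached GPU memory held by PyTorch's allocator, if a GPU is in use.
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()

    def test_placeholder(self):
        data = [0.0] * 1024  # stands in for a model forward pass
        self.assertEqual(len(data), 1024)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleGpuTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running the cleanup in `tearDown()` means it happens after every test method, rather than once per module.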

32adbb26

Add missing ckpt in config docs (#16900) · 3e47d19c

Yih-Dar authored Apr 25, 2022



* add missing ckpt in config docs

* add more missing ckpt in config docs

* fix wrong ckpts

* fix realm ckpt

* fix s2t2

* fix xlm_roberta ckpt

* Fix for deberta v2

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* use only one checkpoint for DPR

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

3e47d19c

Fix doc test quicktour dataset (#16929) · 3a71e94a
Patrick von Platen authored Apr 25, 2022
```
* fix doc test

* fix doc test
Co-authored-by: Patrick <patrick@pop-os.localdomain>
```
3a71e94a
add bigbird typo fixes (#16897) · 508baf19
Thomas Chaigneau authored Apr 25, 2022
```
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
```
508baf19

23 Apr, 2022 1 commit
- [DocTests] Fix some doc tests (#16889) · 72728be3
  Patrick von Platen authored Apr 23, 2022
```
* [DocTests] Fix some doc tests

* hacky fix

* correct
```
  72728be3
22 Apr, 2022 7 commits

Changes in create_optimizer to support tensor parallelism with SMP (#16880) · 22fc93c4

cavdard authored Apr 22, 2022



* changes in create optimizer to support tensor parallelism with SMP

* Update src/transformers/trainer.py

Convert if check to one line.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Cavdar <dcavdar@a07817b12d7e.ant.amazon.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

22fc93c4

TF: XLA repetition penalty (#16879) · 99c8226b
Joao Gante authored Apr 22, 2022

99c8226b
Add OnnxConfig for ConvBERT (#16859) · ec81c11a
Thomas Chaigneau authored Apr 22, 2022
```
* add OnnxConfig for ConvBert
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
```
ec81c11a

Add doc tests for Albert and Bigbird (#16774) · 0d1cff11

Minh Chien Vu authored Apr 23, 2022



* Add doctest BERT

* make fixup

* fix typo

* change checkpoints

* make fixup

* define doctest output value, update doctest for mobilebert

* solve fix-copies

* update QA target start index and end index

* change checkpoint for docs and reuse defined variable

* Update src/transformers/models/bert/modeling_tf_bert.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* make fixup

* Add Doctest for Albert and Bigbird

* make fixup

* overwrite examples for Albert and Bigbird

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update longer examples for Bigbird

* using examples from squad_v2

* print out example text

* change name token-classification-big-bird checkpoint to random
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

0d1cff11

Minor fixes/improvements in `convert_file_size_to_int` (#16891) · 9fa88172
Mario Šaško authored Apr 22, 2022
```
* Minor improvements to `convert_file_size_to_int`

* Add <unit>bit version to kilos and megas

* Minor fix
```
9fa88172
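
To illustrate what a "<unit>bit" variant means for a size string, here is a rough sketch of such a conversion. It is an assumption-laden stand-in, not the library's `convert_file_size_to_int`: the name `size_to_bytes` is hypothetical, and a lowercase trailing `b` is assumed to mean bits.

```python
def size_to_bytes(size):
    """Hypothetical helper: convert an int or a string such as "5MB" into bytes."""
    if isinstance(size, int):
        return size

    binary_units = {"KIB": 2**10, "MIB": 2**20, "GIB": 2**30}
    decimal_units = {"KB": 10**3, "MB": 10**6, "GB": 10**9}
    upper = size.upper()

    # Binary units (KiB, MiB, GiB) are checked first.
    for suffix, factor in binary_units.items():
        if upper.endswith(suffix):
            return int(size[: -len(suffix)]) * factor

    for suffix, factor in decimal_units.items():
        if upper.endswith(suffix):
            value = int(size[: -len(suffix)]) * factor
            # Assumption: a lowercase "b" denotes bits, so divide by 8 to get bytes.
            return value // 8 if size.endswith("b") else value

    raise ValueError(f"Unrecognized size string: {size!r}")


print(size_to_bytes("5MB"))   # 5000000
print(size_to_bytes("8Kb"))   # 1000
print(size_to_bytes("1GiB"))  # 1073741824
```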
TF: rework XLA generate tests (#16866) · 6d90d76f
Joao Gante authored Apr 22, 2022

6d90d76f

Add missing entries in mappings (#16857) · 3b1bbefc

Yih-Dar authored Apr 22, 2022



* add missing entries in some mappings
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

3b1bbefc

21 Apr, 2022 5 commits

New features for CodeParrot training script (#16851) · d9184131

Loubna Ben Allal authored Apr 21, 2022



* add tflops logging and fix grad accumulation

* add accelerate tracking and checkpointing

* scale loss of last batch correctly

* fix typo

* compress loss computation
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add resume from checkpoint argument

* add load_state accelerate from checkpoint, register lr scheduler and add tflops function

* reformat code

* reformat code

* add condition on path for resume checkpoint

* combine if conditions
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* add source for tflops formula
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

d9184131

Fix doctest list (#16878) · eef2422e
Yih-Dar authored Apr 21, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
eef2422e

Fix GPT-J onnx conversion (#16780) · 0b1e0fcf

Thomas Chaigneau authored Apr 21, 2022



* add gptj to TOKENIZER_MAPPING_NAMES

* fix int32 to float to avoid problem in onnx

* Update src/transformers/models/gptj/modeling_gptj.py
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

0b1e0fcf

Use ACT2FN to fetch ReLU activation (#16874) · bae9b645

Eldar Kurtic authored Apr 21, 2022

- all activations should be fetched through ACT2FN
- it returns ReLU as an `nn.Module`, which allows attaching hooks to the activation function and makes it show up in the output of `print(model)`
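
Why an object-based activation helps can be shown with a toy stand-in in plain Python. This sketch does not use PyTorch or the real `ACT2FN`; `ToyReLU` and `TOY_ACT2FN` are hypothetical names that only mimic the idea of a module that stores hooks and has a printable representation.

```python
class ToyReLU:
    """Toy stand-in for a module-style activation (not torch.nn.ReLU)."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Hooks live on the object; a bare function has nowhere to keep them.
        self._forward_hooks.append(hook)

    def __call__(self, values):
        output = [max(0.0, v) for v in values]
        for hook in self._forward_hooks:
            hook(self, values, output)
        return output

    def __repr__(self):
        # An object can describe itself when its container is printed.
        return "ToyReLU()"


# Hypothetical registry mapping activation names to activation objects.
TOY_ACT2FN = {"relu": ToyReLU()}

activation = TOY_ACT2FN["relu"]
seen = []
activation.register_forward_hook(lambda module, inputs, output: seen.append(output))

result = activation([-1.0, 2.0])
print(activation)  # ToyReLU()
print(result)      # [0.0, 2.0]
```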

bae9b645

Return input_ids in ImageGPT feature extractor (#16872) · cb555af2
Sylvain Gugger authored Apr 21, 2022

cb555af2