Commits · 7f08dbd10a1e6e08c77aa907fa211f899ae1e876 · chenpangpang / transformers

10 Jun, 2021 5 commits

Update README.md to cover the TF GLUE example. · 7f08dbd1
Matt authored Jun 10, 2021

7f08dbd1
Fix quality · d72e5a3a
Sylvain Gugger authored Jun 10, 2021

d72e5a3a

New TF GLUE example (#12028) · 73a53265

Matt authored Jun 10, 2021



* Pushing partially-complete new GLUE example

* First draft of the new TF GLUE example! Needs a little more testing to be sure but it's almost ready.

* Fix to the fit() call

* Bugfixes, making sure TPU and multi-GPU support is ready

* Remove logger line that depends on Pytorch

* Style pass

* Deleting old TF GLUE example

* Include label2id and id2label in the saved model config

* Don't clobber the existing model.config.label2id

* Style fixes

* Update examples/tensorflow/text-classification/run_glue.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

73a53265

CLIPFeatureExtractor should resize images with kept aspect ratio (#11994) · 9d2cee8b

Tobias Norlund authored Jun 10, 2021



* Resize with kept aspect ratio

* Fixed failed test

* Overload center_crop and resize methods instead

* resize should handle non-PIL images

* update slow test

* Tensor => tensor
Co-authored-by: patil-suraj <surajp815@gmail.com>

9d2cee8b

Add text_column_name and label_column_name to run_ner and run_ner_no_trainer args (#12083) · 472a8676
kumapo authored Jun 10, 2021
```
* Add text_column_name and label_column_name to run_ner args

* Minor fix: grouping for text and label column name
```
472a8676

09 Jun, 2021 8 commits

[Wav2Vec2ForPretraining] Correct checkpoints wav2vec2 & fix tests (#12089) · bc6f51e5
Patrick von Platen authored Jun 09, 2021
```
* fix_torch_device_generate_test

* remove @

* fix tests
```
bc6f51e5
rm require_version_examples (#12088) · 61e19198
Stas Bekman authored Jun 09, 2021

61e19198
pass decay_mask fn to optimizer (#12087) · d1500d91
Suraj Patil authored Jun 09, 2021

d1500d91

Wav2Vec2 Pretraining (#11306) · d472bd7b

Anton Lozhkov authored Jun 09, 2021



* Working quantizer forward

* Working quantizer forward

* Clean up unused model parts, test reproducibility

* Working quantizer forward

* Clean up unused model parts, test reproducibility

* Remove custom outputs from the shared ones

* correct conversion

* correct bug

* add first pretrain script

* save intermediate

* static shapes

* save intermediate

* finish first pretrain script version

* more refactor

* remove wandb

* refactor more

* improve test

* correct perplexity compute bug

* finish model implementation

* add to docs

* finish docs

* finish pretraining script

* finish pretraining script

* remove wandb

* finish PR for merge

* finish config

* finish

* make deepspeed work

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

* fix flaky test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

d472bd7b

[test] support more than 2 gpus (#12074) · b1a8aa94
Stas Bekman authored Jun 09, 2021
```
* support more than 2 gpus

* style
```
b1a8aa94

Add DETR (#11653) · d3eacbb8

NielsRogge authored Jun 09, 2021



* Squash all commits of modeling_detr_v7 branch into one

* Improve docs

* Fix tests

* Style

* Improve docs some more and fix most tests

* Fix slow tests of ViT, DeiT and DETR

* Improve replacement of batch norm

* Restructure timm backbone forward

* Make DetrForSegmentation support any timm backbone

* Fix name of output

* Address most comments by @LysandreJik

* Give better names for variables

* Conditional imports + timm in setup.py

* Address additional comments by @sgugger

* Make style, add require_timm and require_vision to tests

* Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone

* Add png files to fixtures

* Fix type hint

* Add timm to workflows

* Add `BatchNorm2d` to the weight initialization

* Fix retain_grad test

* Replace model checkpoints by Facebook namespace

* Fix name of checkpoint in test

* Add user-friendly message when scipy is not available

* Address most comments by @patrickvonplaten

* Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner

* Better initialization

* Scipy is necessary to get sklearn metrics

* Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel

* Make style

* Improve docs and add 2 community notebooks
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

d3eacbb8

sync LayerDrop for Wav2Vec2Encoder + tests (#12076) · d14e0af2
Stas Bekman authored Jun 09, 2021

d14e0af2
Update run_ner.py with id2label config (#12001) · 82a2b76c
Koichi Yasuoka authored Jun 09, 2021

82a2b76c

08 Jun, 2021 11 commits

typo · 0e82f0cb
Stas Bekman authored Jun 08, 2021

0e82f0cb

[Deepspeed Wav2vec2] integration (#11638) · 11d86d3d

Stas Bekman authored Jun 08, 2021

* wip

* wip - but working with https://github.com/microsoft/DeepSpeed/pull/1044

* cleanup

* workaround

* working 5/8 modes

* solve fp32 distributed zero3

* style

* sync

* sync

* rework

* deprecation

* cleanup

* https://github.com/microsoft/DeepSpeed/pull/1044 pr was merged

* clean up

* add a guide

* more prose

* more prose

* fix

* more prose

* sub_group_size was too big

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor

* bug fix

* make the true check explicit

* new deepspeed release
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

11d86d3d

[Deepspeed] various fixes (#12058) · 32290d87

Stas Bekman authored Jun 08, 2021

* replace deprecated config

* sub_group_size was too big

* complete deprecation removal

32290d87

Properly indent block_size (#12070) · fd690283
Sylvain Gugger authored Jun 08, 2021

fd690283

Add torch to requirements.txt in language-modeling (#12040) · 49bee0ae

cdleong authored Jun 08, 2021



* Add torch to requirements.txt in language-modeling

* Update examples/pytorch/language-modeling/requirements.txt
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

49bee0ae

Replace legacy tensor.Tensor with torch.tensor/torch.empty (#12027) · f5eec0d8
Mario Šaško authored Jun 08, 2021
```
* Replace legacy torch.Tensor constructor with torch.{tensor, empty}

* Remove torch.Tensor in examples
```
f5eec0d8

updated the original RAG implementation to be compatible with latest Pytorch-Lightning (#11806) · e33085d6

Shamane Siri authored Jun 09, 2021

* updated the original RAG implementation to be compatible with the latest PL version

* updated the requirements.txt file

* execute make style

* code quality test

* code quality

* conflict resolved in requirements.txt

* code quality

* changed the MyDDP class name to CustomDDP

e33085d6

Fix tapas issue (#12063) · 70f88eec

NielsRogge authored Jun 08, 2021

* Fix scatter function to be compatible with torch-scatter 2.7.0

* Allow test again

70f88eec

Fix integration tests (#12066) · e56e3140
NielsRogge authored Jun 08, 2021

e56e3140
skip failing test (#12059) · 4abc6dd6
Stas Bekman authored Jun 07, 2021

4abc6dd6
adds metric prefix. (#12057) · e363e1d9
Russell Klopfer authored Jun 07, 2021
```
* adds metric prefix.

* update tests to include prefix
```
e363e1d9
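
A metric prefix keeps results from different evaluation passes apart when they end up in the same log. The sketch below is illustrative only: the function name `add_metric_prefix` and the `eval` default are assumptions, not the code from this commit.

```python
def add_metric_prefix(metrics, prefix="eval"):
    """Return a copy of `metrics` with `<prefix>_` prepended to unprefixed keys."""
    prefixed = {}
    for key, value in metrics.items():
        # Keys that already carry the prefix are kept as-is, so the call is idempotent.
        if key.startswith(f"{prefix}_"):
            prefixed[key] = value
        else:
            prefixed[f"{prefix}_{key}"] = value
    return prefixed
```

Calling it with `prefix="test"` on a prediction pass would give keys such as `test_f1` instead of `eval_f1`.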

07 Jun, 2021 7 commits

Add optional grouped parsers description to HfArgumentParser (#12042) · 8994c1e4
Peter Izsak authored Jun 07, 2021
```
* Adding optional argument group to HfArgumentParser

* Minor

* remove whitespace

* Minor styling
```
8994c1e4
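
`HfArgumentParser` builds on the standard library's `argparse`, so the idea of grouped arguments can be sketched with plain `argparse.ArgumentParser.add_argument_group`. The group titles and argument names below are made up for illustration; how the commit wires groups to dataclasses is not shown here.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="train")
    # Each group gets its own titled section in the --help output.
    model_group = parser.add_argument_group("Model arguments", "Options that select the model.")
    model_group.add_argument("--model_name", default="bert-base-uncased")
    data_group = parser.add_argument_group("Data arguments", "Options that describe the dataset.")
    data_group.add_argument("--max_length", type=int, default=128)
    return parser


parser = build_parser()
args = parser.parse_args(["--max_length", "64"])
help_text = parser.format_help()
```

Grouping changes only how the help text is organized; parsed values still land in one flat namespace.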

Extend pipelines for automodel tuples (#12025) · 2056f26e

Nicolas Patry authored Jun 07, 2021



* fix_torch_device_generate_test

* remove @

* finish

* refactor

* add test

* fix test

* Attempt at simplification.

* Small fix.

* Fixing non existing AutoModel for TF.

* Naming.

* Remove extra condition.
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

2056f26e

Fixes bug that appears when using QA bert and distillation. (#12026) · f8bd8c6c

François Lagunas authored Jun 07, 2021

* Fixing bug that appears when using distillation (and potentially other uses).
During the backward pass PyTorch complains with:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
This happens because the QA model code modifies the start_positions and end_positions input tensors using the clamp_ function; as a consequence the teacher and the student both modify the inputs, and the backward pass fails.

* Fixing all models QA clamp_ bug.

f8bd8c6c
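
The underlying hazard is an in-place operation on inputs that two models share. The sketch below is a plain-Python analogy in which lists stand in for tensors; it does not reproduce the autograd error, and the helper names are hypothetical. It only shows why an out-of-place clamp leaves the caller's data intact.

```python
def clamp_in_place(positions, low, high):
    # Mutates the caller's list, as an in-place tensor op would mutate a shared tensor.
    for i, p in enumerate(positions):
        positions[i] = max(low, min(p, high))
    return positions


def clamp_copy(positions, low, high):
    # Builds a new list and leaves the caller's data untouched.
    return [max(low, min(p, high)) for p in positions]


shared = [3, 700, -1]
student_view = clamp_copy(shared, 0, 512)  # `shared` is still the original input here
```

With the copying variant, a teacher and a student can each clamp the same positions without changing what the other one sees.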

[JAX] Bump jax lib (#12053) · 59f75d53
Patrick von Platen authored Jun 07, 2021
```
* fix_torch_device_generate_test

* remove @

* bump up jax lib
```
59f75d53
fix docs of past_key_values (#12049) · 185122ef
Suraj Patil authored Jun 07, 2021

185122ef
fix deberta 2 tokenizer integration test (#12017) · 3857f2b4
Philip May authored Jun 07, 2021

3857f2b4
Fixed Typo in modeling_bart.py (#12035) · 20b6f3b8
Shiva Pundir authored Jun 07, 2021
```
* Fixed Typo in modeling_bart.py - Issue #11895

* Fixed Typo in modeling_bart.py
```
20b6f3b8

04 Jun, 2021 2 commits

[TrainerArguments] format and sort __repr__, add __str__ (#12018) · 1f335aef
Stas Bekman authored Jun 04, 2021
```
* format and sort __repr__, add __str__

* typo

* use __str__ directly

* alias __repr__ = __str__
```
1f335aef
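
The bullet points above describe a sorted, formatted `__str__` with `__repr__` aliased to it. A minimal sketch of that pattern on a hypothetical dataclass (`DemoArguments` and its fields are invented for illustration and are not the real training-arguments class):

```python
from dataclasses import asdict, dataclass


@dataclass
class DemoArguments:
    # Hypothetical stand-in for a training-arguments dataclass.
    output_dir: str = "out"
    learning_rate: float = 5e-5
    do_train: bool = False

    def __str__(self):
        # Sort fields by name and print one per line.
        fields = sorted(asdict(self).items())
        body = "\n".join(f"{name}={value}," for name, value in fields)
        return f"{self.__class__.__name__}(\n{body}\n)"

    # Alias so repr() and str() produce the same text.
    __repr__ = __str__


demo = DemoArguments()
text = str(demo)
```

Because `__repr__` is defined in the class body, the `dataclass` decorator does not replace it with its generated one.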

[Deepspeed] Assert on mismatches between ds and hf args (#12021) · 2c73b930

Stas Bekman authored Jun 04, 2021



* wip

* add mismatch validation + test

* renames

* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* renames
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

2c73b930

03 Jun, 2021 2 commits

[Flax] Refactor MLM (#12013) · 242ec31a

Patrick von Platen authored Jun 03, 2021



* fix_torch_device_generate_test

* remove @

* finish refactor
Co-authored-by: Patrick von Platen <patrick@huggingface.co>

242ec31a

Fix weight decay masking in `run_flax_glue.py` (#11964) · 4674061b

Nicholas Vadivelu authored Jun 03, 2021



* Fix weight decay masking in `run_flax_glue.py`

Issues with the previous implementation:
- The `dict` from `traverse_util.flatten_dict` has keys which are tuples of strings, not one long string with the path separated by periods.
- `optax.masked` applies the transformation wherever the mask is True, so the masks are flipped.
- Flax's LayerNorm calls the scale parameter `scale` not `weight`

* Fix formatting with black

* adapt results
Co-authored-by: Patrick von Platen <patrick@huggingface.co>

4674061b
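
The three issues listed in this commit message can be sketched without JAX installed. Below, `flatten_dict` is a small stand-in written for this example that mimics the tuple-of-strings keys the message attributes to `traverse_util.flatten_dict`; the parameter tree and the `decay_mask` name are assumptions. The real script would also need to turn the flat mask back into a tree before handing it to the optimizer, which this sketch skips.

```python
def flatten_dict(tree, prefix=()):
    """Stand-in for a tree flattener: keys become tuples of strings."""
    flat = {}
    for name, value in tree.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            flat.update(flatten_dict(value, path))
        else:
            flat[path] = value
    return flat


def decay_mask(params):
    """True means 'apply weight decay here', the convention the commit describes for optax.masked."""
    flat = flatten_dict(params)
    # Skip biases and LayerNorm scales; compare the last path component, not a dotted string.
    return {path: path[-1] not in ("bias", "scale") for path in flat}


params = {
    "dense": {"kernel": 1.0, "bias": 0.0},
    "LayerNorm": {"scale": 1.0, "bias": 0.0},
}
mask = decay_mask(params)
```

Only the `kernel` entry ends up marked for decay, which matches the intent of excluding biases and normalization scales.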

02 Jun, 2021 5 commits
- [deepspeed] add nvme test skip rule (#11997) · 61c50634
  Stas Bekman authored Jun 02, 2021
```
* add nvme skip rule

* fix
```
  61c50634
- [deepspeed] Move code and doc into standalone files (#11984) · 640318be
  Stas Bekman authored Jun 02, 2021
```
* move code and docs

* style

* moved

* restore
```
  640318be
- Update return introduction (#11976) · d6d747cb
  Kou Yong Kang authored Jun 03, 2021
```
Make it clear that the `forward` method now returns a dict instead of a tuple.

Fix style
```
  d6d747cb
- [docs] fix xref to `PreTrainedModel.generate` (#11049) · d406a272
  Stas Bekman authored Jun 02, 2021
```
* fix xref to generate

* do the same for search methods

* style

* style
```
  d406a272
- Fix examples (#11990) · 123b597f
  Gunjan Chhablani authored Jun 02, 2021
  
  123b597f