"tests/vscode:/vscode.git/clone" did not exist on "93cd94b79d62a8a8b6f4f20e51fda95e6daa4d3a"
- 23 Jun, 2022 3 commits
-
-
Fx039482 authored
* Update modeling_yoso.py
* make fixup
* Update modeling_yoso.py. That should be it; copied from the previous PR
-
Joao Gante authored
-
Quentin authored
* add: check labels for detr object detection doctests
* add: check shapes
* add: add detr to documentation_tests.py
* fix: make fixup output
* fix: add a comment
-
- 22 Jun, 2022 5 commits
-
-
Sylvain Gugger authored
* Offload fixes
* Add a test
-
Joao Gante authored
* use create_commit
* better commit message and description
* touch setup.py to trigger cache update
* add hub version gating
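For reference, a minimal sketch of the `create_commit` API from `huggingface_hub` that this change switches to; the repo id, file names, and messages below are illustrative, not taken from the PR.

```python
from huggingface_hub import CommitOperationAdd, HfApi

# Hedged sketch: upload a file in a single atomic commit via create_commit.
# repo_id, paths, and messages are illustrative.
api = HfApi()
api.create_commit(
    repo_id="username/my-model",
    operations=[
        CommitOperationAdd(
            path_in_repo="pytorch_model.bin",
            path_or_fileobj="./pytorch_model.bin",
        ),
    ],
    commit_message="Upload model weights",
    commit_description="Longer description attached to the commit.",
)
```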
-
Arthur authored
-
Eran Hirsch authored
Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` (#17805)
* Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict`
* Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it
* Remove `self._num_beams` from trainer classes
* Run fixup; fix "Constraint" not exposed; fix synced_gpus to actually read from param
* Use kwargs
* Copy kwargs before making changes to it
* Fix style issues (unused imports)
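A hedged sketch of how generation arguments such as `logits_processor` can be forwarded through `Seq2SeqTrainer.evaluate` after this change; `trainer` and `eval_dataset` are assumed to be set up elsewhere, and the processor arguments are illustrative.

```python
from transformers import LogitsProcessorList, MinLengthLogitsProcessor

# Hedged sketch: pass generation kwargs, including a custom logits processor,
# through evaluate(). `trainer` and `eval_dataset` are assumed to exist.
logits_processor = LogitsProcessorList(
    [MinLengthLogitsProcessor(min_length=10, eos_token_id=1)]
)
metrics = trainer.evaluate(
    eval_dataset=eval_dataset,
    max_length=64,
    num_beams=4,
    logits_processor=logits_processor,
)
```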
-
Arthur authored
-
- 21 Jun, 2022 13 commits
-
-
unifyh authored
- Fix `top_k_top_p_filtering` not passing `filter_value` to `TopPLogitsWarper`, which caused any top-p-filtered logits to become -inf instead of the specified value
- Add a corresponding test
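A small sketch of the fixed behaviour, assuming the public `top_k_top_p_filtering` helper; the logits and threshold values are illustrative.

```python
import torch
from transformers import top_k_top_p_filtering

# Illustrative sketch: after the fix, positions removed by top-p filtering take
# the supplied filter_value (-1e4 here) rather than always becoming -inf.
logits = torch.tensor([[4.0, 3.0, 1.0, 0.5]])
filtered = top_k_top_p_filtering(logits.clone(), top_p=0.6, filter_value=-1e4)
print(filtered)
```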
-
Kyungmin Lee authored
-
Bram Vanroy authored
* Improve error message (Union not allowed)
* make style
* Update src/transformers/hf_argparser.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Thomas Wang authored
* Add final_layer_norm to OPT model
* Add JAX and TF version
* Fix Keras name
* Woops
* Allow for non breaking change
* Apply suggestions from code review
* add tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Zachary Mueller authored
-
Sylvain Gugger authored
-
Arthur authored
* initial commit
* update modeling tf utils
* quality
* clean and update args
* update
* remove potential bug
* code quality
* update
* update max shard
* update tests for sharding from pretrained
* fix remaining test
* make style
* h5py if tf available
* update and fix test
* fix test
* style
* modified push to hub to support shard for TF
* quick fix
* update code
* merge branch main and style
* Apply suggestions from code review
* update based on reviews
* update doc
* update and style
* Apply suggestions from code review
* Update based on reviews
* fix typo
* style
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
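A hedged sketch of the sharded TF checkpoint saving this PR adds; the checkpoint name and shard size below are illustrative.

```python
from transformers import TFAutoModel

# Hedged sketch of sharded saving for TF models: weights above max_shard_size
# are split across multiple .h5 shards, and from_pretrained reloads them
# transparently. Checkpoint name and shard size are illustrative.
model = TFAutoModel.from_pretrained("bert-base-cased")
model.save_pretrained("./tf-bert-sharded", max_shard_size="200MB")
reloaded = TFAutoModel.from_pretrained("./tf-bert-sharded")
```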
-
Lysandre Debut authored
* Prepare CI for v0.8.0
* pin hfh (revert before merge)
* Revert "pin hfh (revert before merge)". This reverts commit a0103140e1c77b810ffcb735192968bc03be3e1f.
* Test rc3
* Test latest rc
* Unpin to the RC
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
-
Sylvain Gugger authored
-
Anugunj Naman authored
* added use_backbone_pretrained
* style fixes
* update
* Update detr.mdx
* Update detr.mdx
* Update detr.mdx
* update using doc py
* Update detr.mdx
* Update src/transformers/models/detr/configuration_detr.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
NielsRogge authored
* Fix docstrings and variable names
* Rename x to something better
* Improve messages
* Fix docstrings and add test for greyscale images
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
mrbean authored
* add onnx support for debertav2
* debertav2 -> deberta-v2 in onnx features file
* remove causal lm
* add deberta-v2-xlarge to onnx tests
* use self.type().dtype() in xsoftmax
* remove hack for deberta
* remove unused imports
* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py
* use generate dummy inputs
* linter
* add imports
* add support for deberta v1 as well
* deberta does not support multiple choice
* Update src/transformers/models/deberta/configuration_deberta.py
* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py
* one line ordered dict
* fire build
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>
-
Patrick von Platen authored
* Add UL2
* Correct naming
* sort better
* up
* apply Sylvain's suggestion
Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com>
-
- 20 Jun, 2022 5 commits
-
-
Brad Jascob authored
On line 180, `torch.tensor(-1.0, xxx)` gives the error "TypeError: 'float' object cannot be interpreted as an integer". This is because the dtype here is `int64`; for `dtype=int64`, the value needs to simply be `-1`. This impacts the long-t5-tglobal-x model. It does not impact the long-t5-local-x version, which does not appear to call this line.
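In isolation, the fix looks like this (a standalone illustration; the real call sits inside the LongT5 relative-position code):

```python
import torch

# Standalone illustration of the fix above: with an integer dtype, the value
# must be a Python int rather than a float.
value = torch.tensor(-1, dtype=torch.int64)   # works
print(value)  # tensor(-1)
# At the call site described above, passing -1.0 instead raised
# "TypeError: 'float' object cannot be interpreted as an integer".
```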
-
Yih-Dar authored
* Use torch.finfo(self.dtype).min
* for GPTNeoX
* for Albert
* For Splinter
* Update src/transformers/models/data2vec/modeling_data2vec_audio.py
* fix -inf used in Bart-like models
* Fix a few remaining -inf
* more fix
* clean up
* For CLIP
* For FSMT
* clean up
* fix test
* Add dtype argument and use it for LayoutLMv3
* update FlaxLongT5Attention
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
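The pattern adopted across these models, as a short sketch (tensor shapes and values are illustrative):

```python
import torch

# Sketch of the masking pattern above: fill masked positions with the smallest
# finite value for the current dtype instead of float("-inf"), so half-precision
# runs stay finite.
dtype = torch.float16
mask_value = torch.finfo(dtype).min            # about -65504 for float16
scores = torch.zeros(2, 4, dtype=dtype)
padding_mask = torch.tensor([[0, 1, 1, 0], [1, 0, 0, 1]], dtype=torch.bool)
scores = scores.masked_fill(padding_mask, mask_value)
print(scores)
```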
-
Sylvain Gugger authored
* Fix cache for GPT-Neo-X
* Add more tests
-
Stas Bekman authored
* deprecate is_torch_bf16_available
* address suggestions
-
Joao Gante authored
* Also propagate changes to blenderbot, blenderbot_small, marian, mbart, and pegasus
-
- 17 Jun, 2022 3 commits
-
-
Swetha Mandava authored
* Fix eval to compute rouge correctly for rouge_score
* styling
* moving sentence tokenization to utils from run_eval
* saving ckpt in mlflow
* use existing format of args
* fix documentation
Co-authored-by: Swetha Mandava <smandava@nvidia.com>
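A hedged sketch of the sentence handling this fix relies on: rouge_score computes rougeLsum per sentence, so predictions and references need one sentence per line. Use of nltk here is an assumption based on the commit description.

```python
import nltk

nltk.download("punkt", quiet=True)

# Hedged sketch: insert newlines between sentences so rougeLsum is computed
# sentence by sentence, as rouge_score expects.
def add_newlines_between_sentences(text: str) -> str:
    return "\n".join(nltk.sent_tokenize(text.strip()))

print(add_newlines_between_sentences("First sentence. Second sentence."))
```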
-
Sourab Mangrulkar authored
* Migrate HFDeepSpeedConfig from trfrs to accelerate
* add `accelerate` to testing dep
* addressing comments
* addressing comments: using `_shared_state` and avoiding object creation. This is necessary as `notebook_launcher` in `launchers.py` checks `len(AcceleratorState._shared_state)>0` to throw an error.
* resolving comments: 1. Use simple API from accelerate to manage the deepspeed config integration 2. Update the related documentation
* reverting changes and addressing comments
* docstring correction
* addressing nits
* addressing nits
* addressing nits 3
* bumping up the accelerate version to 0.10.0
* resolving import
* update setup.py to include deepspeed dependencies
* Update dependency_versions_table.py
* fixing imports
* reverting changes to CI dependencies for "run_tests_pipelines_tf*" tests; these changes didn't help with resolving the failures and I believe this needs to be addressed in another PR.
* removing `accelerate` as hard dependency; resolves issues related to CI tests
* adding `accelerate` as dependency for building docs; resolves failure in the Build PR Documentation test
* adding `accelerate` as dependency in "dev" to resolve doc build issue
* resolving comments: 1. adding `accelerate` to extras["all"] 2. Including a check for accelerate too before importing HFDeepSpeedConfig from there
* resolving comments
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
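A hedged sketch of the resulting import layering; the exact import paths and the fallback behaviour are assumptions based on the commit description, not a definitive API reference.

```python
# Hedged sketch: after the migration, the DeepSpeed config wrapper lives in
# accelerate, with transformers keeping a thin wrapper around it. Import paths
# below are assumptions based on the commit description.
try:
    from accelerate.utils import HfDeepSpeedConfig
except ImportError:
    from transformers.deepspeed import HfDeepSpeedConfig

ds_config = {"zero_optimization": {"stage": 2}}
dschf = HfDeepSpeedConfig(ds_config)  # keep a reference so the config stays "live"
```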
-
greg2451 authored
-
- 16 Jun, 2022 4 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Refine BF16 check in CPU/GPU
* Fixes
* Renames
-
Arthur authored
* fix the naming
* from pt in test for now
* make style
* slow test and removed from_pt
-
Jiayi Pan authored
VisualBERT uses the bert-base-uncased tokenizer; therefore, instead of {mask}, the mask token should be [MASK].
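For reference, the tokenizer in question and its mask token (checkpoint name as stated above):

```python
from transformers import BertTokenizer

# The tokenizer VisualBERT reuses; its mask token is the literal string "[MASK]".
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
print(tokenizer.mask_token)                     # [MASK]
print(tokenizer("a [MASK] example")["input_ids"])
```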
-
- 15 Jun, 2022 3 commits
-
-
Stas Bekman authored
-
Joao Gante authored
* Add flag to push weights directly into main
-
Stefan Schweter authored
* rembert: fix python codeblock
* rembert: use correct google/rembert checkpoint name in documentation
* rembert: use correct google/rembert checkpoint name in TF documentation
-
- 14 Jun, 2022 4 commits
-
-
Michael Benayoun authored
* Function refactor
* Update src/transformers/utils/fx.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Hailey Schoelkopf authored
* add new bloom classes
* (feat) add bloom classification tests; make style
* style: change import in test
* add some typehints to bloom classes
* merge main into branch
* fix: input checking in bloom seq classification
* fix tests
* change model class tests
* fix few tests (more tests should pass; one test left)
* make token classifier return hidden states
* style: make BLOOM typehints consistent
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
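A hedged sketch of the new classification head in use; the checkpoint name and label count are illustrative, and the classification head itself is freshly initialized rather than pretrained.

```python
from transformers import AutoTokenizer, BloomForSequenceClassification

# Hedged sketch of the BLOOM sequence classification class added here;
# checkpoint name and num_labels are illustrative.
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = BloomForSequenceClassification.from_pretrained(
    "bigscience/bloom-560m", num_labels=2
)
inputs = tokenizer("BLOOM now has a sequence classification head.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (1, 2)
```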
-
amyeroberts authored
* Swin models call TFSwinMainLayer
* Tidy up
-
Patrick von Platen authored
-