Commits · 80468251bc3771d53427f77aa2dc9d49a55d2bf0 · chenpangpang / transformers

11 Aug, 2022 5 commits

Change BartLearnedPositionalEmbedding's forward method signature to support... · 80468251

Dan Jones authored Aug 11, 2022


Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486)

* changing BartLearnedPositionalEmbedding forward signature and references to it

* removing debugging dead code (thanks style checker)

* blackened modeling_bart file

* removing copy inconsistencies via make fix-copies

* changing references to copied signatures in Bart variants

* make fix-copies once more

* using expand over repeat (thanks @michaelbenayoun)

* expand instead of repeat for all model copies
Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>

80468251

Skip broken tests · 3f0707b2
Sylvain Gugger authored Aug 11, 2022

3f0707b2

Fix LayoutLMv3 documentation (#17932) · 4c8ec66a

Wonseok Lee (Jack) authored Aug 11, 2022

* fix typos

* fix sequence_length docs of LayoutLMv3Model

* delete trailing white spaces

* fix layoutlmv3 docs more

* apply make fixup & quality

* change to two versions of input docstring

* apply make fixup & quality

4c8ec66a

Fix resizing bug in OWL-ViT (#18573) · f762f373

Alara Dirik authored Aug 11, 2022

* Fixes resizing bug in OWL-ViT
* Defaults to square resize if size is set to an int
* Sets do_center_crop default value to False

f762f373
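
The "square resize if size is set to an int" default from the commit above can be sketched as follows. This is a minimal illustration, not the OWL-ViT feature extractor code: the helper name `normalize_size` and the `(height, width)` ordering are assumptions made for the example.

```python
# Hypothetical helper; the name and the (height, width) ordering are assumptions.
def normalize_size(size):
    """Return a (height, width) pair; a bare int is treated as a square target."""
    if isinstance(size, int):
        # An int such as 768 becomes the square (768, 768).
        return (size, size)
    # Otherwise expect an explicit two-element sequence and pass it through.
    height, width = size
    return (height, width)
```

Under these assumptions, `normalize_size(768)` yields a square pair, while an explicit pair such as `(480, 640)` is returned unchanged.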

Segformer TF: fix output size in documentation (#18572) · 76568d24

Maxime G authored Aug 11, 2022



* Segformer TF: fix output size in doc

* Segformer pytorch: fix output size in doc
Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>

76568d24

10 Aug, 2022 11 commits

fix string (#18568) · 051311ff
Michael Wyatt authored Aug 10, 2022

051311ff
raise atol for MT5OnnxConfig (#18560) · 9a9a525b
Yih-Dar authored Aug 10, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

9a9a525b

Adds CLIP to models exportable with ONNX (#18515) · f62cb831

Dhruv Karan authored Aug 11, 2022



* onnx config for clip

* default opset as 14

* changes from the original repo

* input values order fix

* outputs fix

* remove unused import

* ran make fix-copies

* black format

* review comments: forward ref, import fix, model change revert, .to cleanup

* make style

* formatting fixes

* revert groupvit

* comment for cast to int32

* comment fix

* make .T as .t() for onnx conversion

* ran make fix-copies

* remove unneeded comment
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies

* remove comment
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

f62cb831

Properly move cache when it is not in default path (#18563) · 50949fab
Sylvain Gugger authored Aug 10, 2022

50949fab
Update philosophy to include other preprocessing classes (#18550) · 6936e7c4
Steven Liu authored Aug 10, 2022

* 📝 update philosophy to include other preprocessing classes

* 🖍 apply feedbacks

6936e7c4

`pipeline` support for `device="mps"` (or any other string) (#18494) · 9d4a4550

Julien Chaumond authored Aug 10, 2022



* `pipeline` support for `device="mps"` (or any other string)

* Simplify `if` nesting

* Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix? @sgugger

* passing `attr=None` is not the same as not passing `attr` 🤯
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

9d4a4550
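
The remark in the commit above that passing `attr=None` is not the same as not passing `attr` describes general Python behavior. The sketch below illustrates it with a sentinel default. It is not the code changed in the commit, and the names `_MISSING` and `describe_device` are hypothetical.

```python
# Hypothetical illustration; none of these names come from transformers.
_MISSING = object()  # unique sentinel meaning "the argument was not supplied"


def describe_device(device=_MISSING):
    """Distinguish an omitted argument from an explicit None."""
    if device is _MISSING:
        return "not passed: use the default device"
    if device is None:
        return "explicit None: caller asked for no device"
    return f"explicit value: {device}"
```

A plain `device=None` default could not tell the first two cases apart, which is why a dedicated sentinel object is used here.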

Use commit hash to look in cache instead of calling head (#18534) · 0d0aada5

Sylvain Gugger authored Aug 10, 2022



* Use commit hash to look in cache instead of calling head

* Add tests

* Add attr for local configs too

* Stupid typos

* Fix tests

* Update src/transformers/utils/hub.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address Julien's comments
Co-authored-by: Julien Chaumond <julien@huggingface.co>

0d0aada5

TF Examples Rewrite (#18451) · 6eb51450

Matt authored Aug 10, 2022

* Finished QA example

* Dodge a merge conflict

* Update text classification and LM examples

* Update NER example

* New Keras metrics WIP, fix NER example

* Update NER example

* Update MC, summarization and translation examples

* Add XLA warnings when shapes are variable

* Make sure batch_size is consistently scaled by num_replicas

* Add PushToHubCallback to all models

* Add docs links for KerasMetricCallback

* Add docs links for prepare_tf_dataset and jit_compile

* Correct inferred model names

* Don't assume the dataset has 'lang'

* Don't assume the dataset has 'lang'

* Write metrics in text classification

* Add 'framework' to TrainingArguments and TFTrainingArguments

* Export metrics in all examples and add tests

* Fix training args for Flax

* Update command line args for translation test

* make fixup

* Fix accidentally running other tests in fp16

* Remove do_train/do_eval from run_clm.py

* Remove do_train/d...

6eb51450

Preserve hub-related kwargs in AutoModel.from_pretrained (#18545) · d7e2d7b4
Sylvain Gugger authored Aug 10, 2022

* Preserve hub-related kwargs in AutoModel.from_pretrained

* Fix tests

* Remove debug statement

d7e2d7b4

TF: XLA-trainable DeBERTa v2 (#18546) · 34aad0da

Joao Gante authored Aug 10, 2022

* fix deberta issues

* add different code paths for gpu and tpu

* shorter gpu take along axis

* Stable Dropout without tf cond

* variable must be float

34aad0da

`bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) · 4a51075a

Younes Belkada authored Aug 10, 2022



* first commit

* correct replace function

* add final changes

- works like a charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* remove the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

- add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* make style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more docstring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* modify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that test pass

* change test value

- change test value to the correct output
- torch bmm changed to baddmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n` by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

4a51075a

09 Aug, 2022 9 commits

📝 update documentation build section (#18548) · 8cf4a6f0
Steven Liu authored Aug 09, 2022

8cf4a6f0
Clean up comment · 38a67459
Sylvain Gugger authored Aug 09, 2022

38a67459

Restore _init_weights value in no_init_weights (#18504) · 5e2f3737

YouJiacheng authored Aug 10, 2022



* Recover _init_weights value in no_init_weights

For potential nested use. 
In addition, users might modify private no_init_weights as well.

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove private variable change check
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

5e2f3737

📝 update metric with evaluate (#18535) · 0c183cc2
Steven Liu authored Aug 09, 2022

0c183cc2

Adding a new `align_to_words` param to qa pipeline. (#18010) · 9f5fe635

Nicolas Patry authored Aug 09, 2022



* Adding a new `align_to_words` param to qa pipeline.

* Update src/transformers/pipelines/question_answering.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Import protection.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

9f5fe635

BART - Fix attention mask device issue on copied models (#18540) · ab2006e3

Younes Belkada authored Aug 09, 2022

* attempt to fix attn mask device

* fix bart `_prepare_decoder_attention_mask`

- add correct device
- run `make fix-copies` to propagate the fix

ab2006e3

Minor update of `run_call_with_unpacked_inputs` (#18541) · 6bea7b81

Yih-Dar authored Aug 09, 2022


Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

6bea7b81

Add mt5 onnx config (#18394) · 8cb5ecd9

Thomas Chaigneau authored Aug 09, 2022

* update features

* MT5OnnxConfig added, with updated tests and docs

* fix imports

* fix onnx_config_cls for mt5

Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>

8cb5ecd9

fix: data2vec-vision Onnx ready-made configuration. (#18427) · fe785730

Niklas Hansson authored Aug 09, 2022

* feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization

* fix: wrong config

fe785730

08 Aug, 2022 15 commits

Let's not cast them all (#18471) · ab62a23d

Younes Belkada authored Aug 08, 2022



* add correct dtypes when checking for params dtype

* forward contrib credits

* Update src/transformers/modeling_utils.py
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

* more comments

- added more comments on why we cast only floating point parameters

* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

ab62a23d

Spanish translation of summarization.mdx (#15947) (#18477) · 499450ed

AguilaCudicio authored Aug 08, 2022



* Add Spanish translation of summarization.mdx

* Apply suggestions from code review
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

499450ed

Add Spanish translation of converting_tensorflow_models.mdx (#18512) · ed70f242

Ian Castillo authored Aug 08, 2022

* Add file in spanish docs to be translated

* Finish translation to Spanish

* Improve Spanish wording

* Add suggested changes from review

ed70f242

Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473) · a765b68a

Rasmus Arpe Fogh Jensen authored Aug 08, 2022

* Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script

* make fixup changes

* PR comments

* changed input to Accelerator based on PR comment, ran make fixup

* Added comment explaining the sync_gradients statement

* Fixed lr scheduler max steps

* Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper

* Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper

* Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script

* make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py

* removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script

a765b68a

Update perf_train_gpu_one.mdx (#18532) · f1f5de31
Mishig Davaadorj authored Aug 08, 2022

f1f5de31

[VideoMAE] Add model to doc tests (#18523) · 82bb6826

NielsRogge authored Aug 08, 2022



* Add videomae to doc tests

* Add pip install decord
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

82bb6826

Add example of multimodal usage to pipeline tutorial (#18498) · 3632531e

Steven Liu authored Aug 08, 2022

* 📝 add example of multimodal usage to pipeline tutorial

* 🖍 apply feedbacks

* 🖍 apply niels feedback

3632531e

✨ update to use interlibrary links instead of Markdown (#18500) · 36b37990
Steven Liu authored Aug 08, 2022

36b37990
unpin resampy (#18527) · ec8d2624
Yih-Dar authored Aug 08, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

ec8d2624
New cache fixes: add safeguard before looking in folders (#18522) · 47e16762
Sylvain Gugger authored Aug 08, 2022

47e16762
Specify en in doc-builder README example (#18526) · 74959240
Ankur Goyal authored Aug 08, 2022

Co-authored-by: Ankur Goyal <ankur@impira.com>

74959240
Remove debug statement · aff5117f
Sylvain Gugger authored Aug 08, 2022

aff5117f

Fix compatibility with 1.12 (#17925) · 70b0d4e1

Sylvain Gugger authored Aug 08, 2022



* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* fix torch.onnx.symbolic_opset12 import

* Reject bad version
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

70b0d4e1

update fsdp docs (#18521) · 2fecde74
Sourab Mangrulkar authored Aug 08, 2022

* updating fsdp documentation

* typo fix

2fecde74
Clean up hub (#18497) · 377cdded
Sylvain Gugger authored Aug 08, 2022

* Clean up utils.hub

* Remove imports

* More fixes

* Last fix

377cdded