- 12 Aug, 2022 8 commits
-
-
Younes Belkada authored
* Support seq2seq models for `bitsandbytes` integration - the `bitsandbytes` integration now supports seq2seq models - check whether a model has tied weights as an additional check * small modification - tie the weights before looking for tied weights!
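For context, a minimal sketch of what this enables (assuming `bitsandbytes` and `accelerate` are installed and a CUDA GPU is available; the checkpoint is only illustrative):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Illustrative checkpoint; any seq2seq architecture should follow the same path.
name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(name)
# load_in_8bit quantizes the linear layers with bitsandbytes;
# device_map="auto" lets accelerate place the weights on the available GPU(s).
model = AutoModelForSeq2SeqLM.from_pretrained(name, device_map="auto", load_in_8bit=True)

inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").to(0)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0], skip_special_tokens=True))
```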
-
Joao Gante authored
* validate generate model_kwargs * generate tests -- not all models have an attn mask
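A rough illustration of the new behaviour (a sketch, not taken from the PR's tests): a misspelled or unused keyword passed to `generate` should now be rejected instead of silently ignored.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello", return_tensors="pt")

model.generate(**inputs, max_new_tokens=5)  # valid kwargs still work as before
model.generate(**inputs, num_beans=4)       # typo for num_beams -> raises a ValueError about unused model_kwargs
```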
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Sourab Mangrulkar authored
-
Stas Bekman authored
-
Wang, Yi authored
* update doc for perf_train_cpu_many, add mpi introduction Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Update docs/source/en/perf_train_cpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_cpu_many.mdx Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Ian Castillo authored
* Add type hints for Vilt models * Add missing return type for TokenClassification class
-
Arthur authored
* initial commit * add small test * add cross pt tf flag to test * fix quality * style * update test with new repo * fix failing test * update * fix wrong param ordering * style * update based on review * update related to recent new caching mechanism * quality
* Update based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com> * quality and style
* Update src/transformers/modeling_flax_utils.py Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 11 Aug, 2022 13 commits
-
-
amyeroberts authored
-
Alara Dirik authored
-
dependabot[bot] authored
Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix docstrings with last version of hf-doc-builder styler * Remove empty Parameter block
-
Michael Benayoun authored
* Support audio classification architectures for label generation, and add a flag to enable or suppress warnings * Use ENV_VARS_TRUE_VALUES
-
iiLaurens authored
* Fix critical trace warnings to allow ONNX export * Force input to `sqrt` to be float type * Cleanup code * Remove unused import statement * Update model sew * Small refactor Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Use broadcasting instead of repeat * Implement suggestion Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Match deberta v2 changes in sew_d * Improve code quality * Update code quality * Consistency of small refactor * Match changes in sew_d
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
-
flozi00 authored
* Create _config.py * Create _toctree.yml * Create index.mdx not sure about "du / ihr" or "sie" * Create quicktour.mdx * Update _toctree.yml * Update build_documentation.yml * Update build_pr_documentation.yml * fix build * Update index.mdx * Update quicktour.mdx * Create installation.mdx * Update _toctree.yml
-
Dan Jones authored
Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486) * changing BartLearnedPositionalEmbedding forward signature and references to it * removing debugging dead code (thanks style checker) * blackened modeling_bart file * removing copy inconsistencies via make fix-copies * changing references to copied signatures in Bart variants * make fix-copies once more * using expand over repeat (thanks @michaelbenayoun) * expand instead of repeat for all model copies Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>
-
Sylvain Gugger authored
-
Wonseok Lee (Jack) authored
* fix typos * fix sequence_length docs of LayoutLMv3Model * delete trailing white spaces * fix layoutlmv3 docs more * apply make fixup & quality * change to two versions of input docstring * apply make fixup & quality
-
Alara Dirik authored
* Fixes resizing bug in OWL-ViT * Defaults to square resize if size is set to an int * Sets do_center_crop default value to False
-
Maxime G authored
* Segformer TF: fix output size in doc * Segformer pytorch: fix output size in doc Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>
-
- 10 Aug, 2022 11 commits
-
-
Michael Wyatt authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Dhruv Karan authored
* onnx config for clip * default opset as 14 * changes from the original repo * input values order fix * outputs fix * remove unused import * ran make fix-copies * black format
* review comments: forward ref, import fix, model change revert, .to cleanup * make style * formatting fixes * revert groupvit * comment for cast to int32 * comment fix * make .T as .t() for onnx conversion * ran make fix-copies * remove unneeded comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix copies * remove comment
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
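One way to exercise the new CLIP ONNX config from Python might look like this (a sketch only; the checkpoint and output path are illustrative, and the `onnx`/`onnxruntime` extras must be installed):

```python
from pathlib import Path

from transformers import AutoProcessor, CLIPModel
from transformers.onnx import FeaturesManager, export

ckpt = "openai/clip-vit-base-patch32"  # illustrative checkpoint
model = CLIPModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)

# Look up the ONNX config registered for CLIP and export at its default opset (14 per this PR).
onnx_config_cls = FeaturesManager.get_config("clip", feature="default")
onnx_config = onnx_config_cls(model.config)
onnx_inputs, onnx_outputs = export(processor, model, onnx_config, onnx_config.default_onnx_opset, Path("clip.onnx"))
print(onnx_inputs, onnx_outputs)
```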
-
Sylvain Gugger authored
-
Steven Liu authored
* 📝 update philosophy to include other preprocessing classes * 🖍️ apply feedback
-
Julien Chaumond authored
* `pipeline` support for `device="mps"` (or any other string) * Simplify `if` nesting * Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix? @sgugger * passing `attr=None` is not the same as not passing `attr` 🤯
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
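For reference, the kind of call this unlocks (a sketch; it assumes a machine with an MPS backend, e.g. Apple Silicon with a recent PyTorch build):

```python
from transformers import pipeline

# The device can now be given as any torch device string ("mps", "cuda:0", "cpu", ...).
generator = pipeline("text-generation", model="gpt2", device="mps")
print(generator("Hello, my name is", max_new_tokens=10)[0]["generated_text"])
```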
-
Sylvain Gugger authored
* Use commit hash to look in cache instead of calling head * Add tests * Add attr for local configs too * Stupid typos * Fix tests * Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Address Julien's comments
Co-authored-by: Julien Chaumond <julien@huggingface.co>
-
Matt authored
* Finished QA example * Dodge a merge conflict * Update text classification and LM examples * Update NER example * New Keras metrics WIP, fix NER example * Update NER example * Update MC, summarization and translation examples * Add XLA warnings when shapes are variable * Make sure batch_size is consistently scaled by num_replicas * Add PushToHubCallback to all models * Add docs links for KerasMetricCallback * Add docs links for prepare_tf_dataset and jit_compile * Correct inferred model names * Don't assume the dataset has 'lang' * Don't assume the dataset has 'lang' * Write metrics in text classification * Add 'framework' to TrainingArguments and TFTrainingArguments * Export metrics in all examples and add tests * Fix training args for Flax * Update command line args for translation test * make fixup * Fix accidentally running other tests in fp16 * Remove do_train/do_eval from run_clm.py * Remove do_train/do_eval from run_mlm.py * Add tensorflow tests to circleci * Fix circleci
* Update examples/tensorflow/language-modeling/run_mlm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/test_tensorflow_examples.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/translation/run_translation.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/token-classification/run_ner.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Fix save path for tests * Fix some model card kwargs * Explain the magical -1000 * Actually enable tests this time * Skip text classification PR until we fix shape inference * make fixup
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
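A condensed sketch of the pattern these examples converge on; the checkpoint, dataset, and hyperparameters below are illustrative, not taken from the PR:

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoTokenizer, KerasMetricCallback, PushToHubCallback,
                          TFAutoModelForSequenceClassification)

checkpoint = "distilbert-base-uncased"  # illustrative
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

raw = load_dataset("glue", "sst2")
tokenized = raw.map(lambda batch: tokenizer(batch["sentence"], truncation=True), batched=True)

# prepare_tf_dataset builds batched, collated tf.data pipelines straight from the tokenized dataset.
train_set = model.prepare_tf_dataset(tokenized["train"], batch_size=16, shuffle=True, tokenizer=tokenizer)
eval_set = model.prepare_tf_dataset(tokenized["validation"], batch_size=16, shuffle=False, tokenizer=tokenizer)

def compute_metrics(eval_predictions):
    logits, labels = eval_predictions
    if isinstance(logits, dict):  # some models return a dict of outputs
        logits = logits["logits"]
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

callbacks = [
    KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=eval_set),
    # PushToHubCallback("out_dir", tokenizer=tokenizer),  # optional: stream checkpoints to the Hub
]
model.compile(optimizer="adam")  # no loss given: the model falls back to its internal loss
model.fit(train_set, validation_data=eval_set, epochs=1, callbacks=callbacks)
```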
-
Sylvain Gugger authored
* Preserve hub-related kwargs in AutoModel.from_pretrained * Fix tests * Remove debug statement
-
Joao Gante authored
* fix deberta issues * add different code paths for gpu and tpu * shorter gpu take along axis * Stable Dropout without tf cond * variable must be float
-
Younes Belkada authored
* first commit * correct replace function * add final changes - works like a charm! - cannot implement tests yet - tested * clean up a bit * add bitsandbytes dependencies * working version - added import function - added bitsandbytes utils file * small fix * small fix - fix import issue * fix import issues * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor a bit - move bitsandbytes utils to utils - change comments on functions * reformat docstring - reformat docstring on init_empty_weights_8bit * Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* revert bad formatting * change to bitsandbytes * refactor a bit - remove init8bit since it is useless * more refactoring - fixed init empty weights issue - added threshold param * small hack to make it work * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * remove the small hack * modify utils file * make style + refactor a bit * correctly create device map * add correct dtype for device map creation * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply suggestions - remove with torch.grad - do not rely on Python bool magic! * add docstring - add docstring for new kwargs * add docstring - comment `replace_8bit_linear` function - fix weird formatting * - added more documentation - added new utility function for memory footprint tracking - colab demo to add * few modifs - typo doc - force cast into float16 when load_in_8bit is enabled * added colab link * add test architecture + docstring a bit * refactor testing class a bit * make style + refactor a bit * enhance checks - add more checks - start writing saving test * clean up a bit * make style * add more details on doc * add more tests - still needs to fix 2 tests * replace by "or" - could not fix it from GitHub GUI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor testing code a bit + add readme * make style * fix import issue * Update src/transformers/modeling_utils.py Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* add few comments * add more docstring + make style * more docstring * raise error when loaded in 8bit * make style * add warning if loaded on CPU * add small sanity check * fix small comment * add bitsandbytes on dockerfile * Improve documentation - improve documentation from comments * add few comments * slow tests pass on the VM but not on the CI VM * Fix merge conflict * make style * another test should pass on a multi gpu setup * fix bad import in testing file * Fix slow tests - remove dummy batches - no more CUDA illegal memory errors * modify dockerfile * Update docs/source/en/main_classes/model.mdx * Update Dockerfile * Update model.mdx * Update Dockerfile * Apply suggestions from code review * few modifications - lm head can stay on disk/cpu - change model name so that test pass * change test value - change test value to the correct output - torch bmm changed to baddbmm in bloom modeling when merging * modify installation guidelines * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* replace `n` by `name` * merge `load_in_8bit` and `low_cpu_mem_usage` * first try - keep the lm head in full precision * better check - check the attribute `base_model_prefix` instead of computing the number of parameters * added more tests * Update src/transformers/utils/bitsandbytes.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit * improve documentation - fix typos for installation - change title in the documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
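Roughly, the workflow the PR enables (a sketch; it assumes `bitsandbytes`, `accelerate`, and a CUDA GPU, and the checkpoint name is only an example):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "bigscience/bloom-1b7"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
# Linear layers are loaded as 8-bit bitsandbytes modules; accelerate dispatches them across devices.
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto", load_in_8bit=True)

# get_memory_footprint() is the memory-tracking utility mentioned in the commit message (size in bytes).
print(f"{model.get_memory_footprint() / 2**30:.2f} GiB")

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(0)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```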
-
- 09 Aug, 2022 8 commits
-
-
Steven Liu authored
-
Sylvain Gugger authored
-
YouJiacheng authored
* Recover _init_weights value in no_init_weights, for potential nested use. In addition, users might modify the private no_init_weights as well. * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Remove private variable change check
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Steven Liu authored
-
Nicolas Patry authored
* Adding a new `align_to_words` param to qa pipeline. * Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Import protection.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
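To illustrate the new parameter (a sketch; the checkpoint and inputs are just examples):

```python
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
answer = qa(
    question="Where do I live?",
    context="My name is Wolfgang and I live in Berlin.",
    align_to_words=False,  # keep raw token spans instead of snapping answers to word boundaries
)
print(answer)
```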
-
Younes Belkada authored
* attempt to fix attn mask device * fix bart `_prepare_decoder_attention_mask` - add correct device - run `make fix-copies` to propagate the fix
-
Yih-Dar authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Thomas Chaigneau authored
* update features * MT5OnnxConfig added and updated with tests and docs * fix imports * fix onnx_config_cls for mt5 Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>
-