1. 10 Aug, 2022 5 commits
    • Use commit hash to look in cache instead of calling head (#18534) · 0d0aada5
      Sylvain Gugger authored
      
      
      * Use commit hash to look in cache instead of calling head
      
      * Add tests
      
      * Add attr for local configs too
      
      * Stupid typos
      
      * Fix tests
      
      * Update src/transformers/utils/hub.py
      Co-authored-by: Julien Chaumond <julien@huggingface.co>
      
      * Address Julien's comments
      Co-authored-by: Julien Chaumond <julien@huggingface.co>
      0d0aada5
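      A hedged sketch of the behaviour this change concerns, assuming a transformers version that includes it; the commit SHA below is a placeholder, not a real revision:

      from transformers import AutoConfig

      # Placeholder commit SHA, purely for illustration; use a real revision from the model repo.
      COMMIT_SHA = "0123456789abcdef0123456789abcdef01234567"

      # With this change, from_pretrained calls that pin `revision` to a commit hash can be
      # resolved from the local cache by that hash instead of issuing a HEAD call to the Hub.
      config = AutoConfig.from_pretrained("bert-base-uncased", revision=COMMIT_SHA)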
    • TF Examples Rewrite (#18451) · 6eb51450
      Matt authored
      
      
      * Finished QA example
      
      * Dodge a merge conflict
      
      * Update text classification and LM examples
      
      * Update NER example
      
      * New Keras metrics WIP, fix NER example
      
      * Update NER example
      
      * Update MC, summarization and translation examples
      
      * Add XLA warnings when shapes are variable
      
      * Make sure batch_size is consistently scaled by num_replicas
      
      * Add PushToHubCallback to all models
      
      * Add docs links for KerasMetricCallback
      
      * Add docs links for prepare_tf_dataset and jit_compile
      
      * Correct inferred model names
      
      * Don't assume the dataset has 'lang'
      
      * Don't assume the dataset has 'lang'
      
      * Write metrics in text classification
      
      * Add 'framework' to TrainingArguments and TFTrainingArguments
      
      * Export metrics in all examples and add tests
      
      * Fix training args for Flax
      
      * Update command line args for translation test
      
      * make fixup
      
      * Fix accidentally running other tests in fp16
      
      * Remove do_train/do_eval from run_clm.py
      
      * Remove do_train/do_eval from run_mlm.py
      
      * Add tensorflow tests to circleci
      
      * Fix circleci
      
      * Update examples/tensorflow/language-modeling/run_mlm.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update examples/tensorflow/test_tensorflow_examples.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update examples/tensorflow/translation/run_translation.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update examples/tensorflow/token-classification/run_ner.py
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * Fix save path for tests
      
      * Fix some model card kwargs
      
      * Explain the magical -1000
      
      * Actually enable tests this time
      
      * Skip text classification PR until we fix shape inference
      
      * make fixup
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      6eb51450
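      A minimal Keras training sketch along the lines of the rewritten examples, assuming TensorFlow is installed and `tokenized_train` / `tokenized_eval` are pre-tokenized datasets.Dataset objects with a label column (the checkpoint and output directory names are illustrative):

      import tensorflow as tf
      from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
      from transformers.keras_callbacks import KerasMetricCallback, PushToHubCallback

      tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
      model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

      # prepare_tf_dataset handles batching, padding and column selection for Keras.
      train_set = model.prepare_tf_dataset(tokenized_train, batch_size=16, shuffle=True, tokenizer=tokenizer)
      eval_set = model.prepare_tf_dataset(tokenized_eval, batch_size=16, shuffle=False, tokenizer=tokenizer)

      def compute_metrics(eval_predictions):
          logits, labels = eval_predictions
          return {"accuracy": float((logits.argmax(axis=-1) == labels).mean())}

      callbacks = [
          # Runs compute_metrics on eval_set at the end of each epoch and logs the result.
          KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=eval_set),
          # Pushes checkpoints to the Hub; requires `huggingface-cli login`, drop it to train locally.
          PushToHubCallback(output_dir="tf-text-classification", tokenizer=tokenizer),
      ]

      # No explicit loss: the model falls back to its internal loss; jit_compile enables XLA.
      model.compile(optimizer=tf.keras.optimizers.Adam(3e-5), jit_compile=True)
      model.fit(train_set, validation_data=eval_set, epochs=3, callbacks=callbacks)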
    • Preserve hub-related kwargs in AutoModel.from_pretrained (#18545) · d7e2d7b4
      Sylvain Gugger authored
      * Preserve hub-related kwargs in AutoModel.from_pretrained
      
      * Fix tests
      
      * Remove debug statement
      d7e2d7b4
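      A small illustration of the call pattern this fix preserves (the model name and kwargs are examples only): hub-related kwargs passed to the Auto class should reach the concrete model's from_pretrained unchanged.

      from transformers import AutoModel

      # `revision` (and similar hub kwargs such as `use_auth_token`) are forwarded by AutoModel
      # to the resolved model class rather than being dropped during dispatch.
      model = AutoModel.from_pretrained("bert-base-uncased", revision="main")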
    • TF: XLA-trainable DeBERTa v2 (#18546) · 34aad0da
      Joao Gante authored
      * fix deberta issues
      
      * add different code paths for gpu and tpu
      
      * shorter gpu take along axis
      
      * Stable Dropout without tf cond
      
      * variable must be float
      34aad0da
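      A hedged sketch of what "XLA-trainable" means in practice here: training a TF DeBERTa-v2 model with jit_compile=True. A tiny randomly initialized config and random tensors stand in for a real checkpoint and dataset:

      import tensorflow as tf
      from transformers import DebertaV2Config, TFDebertaV2ForSequenceClassification

      # Tiny config purely to exercise XLA training; a real setup would use from_pretrained.
      config = DebertaV2Config(vocab_size=1000, hidden_size=128, num_hidden_layers=2,
                               num_attention_heads=4, intermediate_size=256, num_labels=2)
      model = TFDebertaV2ForSequenceClassification(config)

      input_ids = tf.random.uniform((16, 32), maxval=1000, dtype=tf.int32)
      batch = {"input_ids": input_ids,
               "attention_mask": tf.ones_like(input_ids),
               "labels": tf.random.uniform((16,), maxval=2, dtype=tf.int32)}

      # jit_compile=True runs training under XLA, which this fix makes possible for DeBERTa v2.
      model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), jit_compile=True)
      model.fit(batch, epochs=1, batch_size=8)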
    • `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901) · 4a51075a
      Younes Belkada authored
      
      
      * first commit
      
      * correct replace function
      
      * add final changes
      
      - works like a charm!
      - cannot implement tests yet
      - tested
      
      * clean up a bit
      
      * add bitsandbytes dependencies
      
      * working version
      
      - added import function
      - added bitsandbytes utils file
      
      * small fix
      
      * small fix
      
      - fix import issue
      
      * fix import issues
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit
      
      - move bitsandbytes utils to utils
      - change comments on functions
      
      * reformat docstring
      
      - reformat docstring on init_empty_weights_8bit
      
      * Update src/transformers/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * revert bad formatting
      
      * change to bitsandbytes
      
      * refactor a bit
      
      - remove init8bit since it is useless
      
      * more refactoring
      
      - fixed init empty weights issue
      - added threshold param
      
      * small hack to make it work
      
      * Update src/transformers/modeling_utils.py
      
      * Update src/transformers/modeling_utils.py
      
      * remove the small hack
      
      * modify utils file
      
      * make style + refactor a bit
      
      * create device map correctly
      
      * add correct dtype for device map creation
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply suggestions
      
      - remove with torch.grad
      - do not rely on Python bool magic!
      
      * add docstring
      
       - add docstring for new kwargs
      
      * add docstring
      
      - comment `replace_8bit_linear` function
      - fix weird formatting
      
      * added more documentation
      - added new utility function for memory footprint tracking
      - colab demo to add
      
      * a few modifications
      
      - fix doc typo
      - force cast into float16 when load_in_8bit is enabled
      
      * added colab link
      
      * add test architecture + docstring a bit
      
      * refactor a bit testing class
      
      * make style + refactor a bit
      
      * enhance checks
      
      - add more checks
      - start writing saving test
      
      * clean up a bit
      
      * make style
      
      * add more details on doc
      
      * add more tests
      
      - still needs to fix 2 tests
      
      * replace by "or"
      
      - could not fix it from GitHub GUI
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * refactor a bit testing code + add readme
      
      * make style
      
      * fix import issue
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      
      * add few comments
      
      * add more doctring + make style
      
      * more docstring
      
      * raise error when loaded in 8bit
      
      * make style
      
      * add warning if loaded on CPU
      
      * add small sanity check
      
      * fix small comment
      
      * add bitsandbytes on dockerfile
      
      * Improve documentation
      
      - improve documentation from comments
      
      * add few comments
      
      * slow tests pass on the VM but not on the CI VM
      
      * Fix merge conflict
      
      * make style
      
      * another test should pass on a multi gpu setup
      
      * fix bad import in testing file
      
      * Fix slow tests
      
      - remove dummy batches
      - no more CUDA illegal memory errors
      
      * modify dockerfile
      
      * Update docs/source/en/main_classes/model.mdx
      
      * Update Dockerfile
      
      * Update model.mdx
      
      * Update Dockerfile
      
      * Apply suggestions from code review
      
      * few modifications
      
      - lm head can stay on disk/cpu
      - change model name so that tests pass
      
      * change test value
      
      - change test value to the correct output
      - torch bmm changed to baddbmm in bloom modeling when merging
      
      * modify installation guidelines
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * replace `n` by `name`
      
      * merge `load_in_8bit` and `low_cpu_mem_usage`
      
      * first try - keep the lm head in full precision
      
      * better check
      
      - check the attribute `base_model_prefix` instead of computing the number of parameters
      
      * added more tests
      
      * Update src/transformers/utils/bitsandbytes.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit
      
      * improve documentation
      
      - fix typos for installation
      - change title in the documentation
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      4a51075a
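      A hedged usage sketch of the 8-bit loading path added here; it assumes bitsandbytes and accelerate are installed and a CUDA GPU is available (loading on CPU only warns, per the commits above). The checkpoint name is just an example:

      from transformers import AutoModelForCausalLM, AutoTokenizer

      model_name = "bigscience/bloom-560m"  # example checkpoint
      tokenizer = AutoTokenizer.from_pretrained(model_name)

      # load_in_8bit replaces nn.Linear layers with bitsandbytes Linear8bitLt modules;
      # device_map="auto" lets accelerate place the weights (the lm head keeps higher precision).
      model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_8bit=True)

      # Memory-footprint utility added alongside the integration.
      print(model.get_memory_footprint())

      inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda:0")
      print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0], skip_special_tokens=True))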
  2. 09 Aug, 2022 7 commits
  3. 08 Aug, 2022 7 commits
  4. 06 Aug, 2022 1 commit
  5. 05 Aug, 2022 9 commits
  6. 04 Aug, 2022 7 commits
    • Yih-Dar
    • Add VideoMAE (#17821) · f9a0008d
      NielsRogge authored
      
      
      * First draft
      
      * Add VideoMAEForVideoClassification
      
      * Improve conversion script
      
      * Add VideoMAEForPreTraining
      
      * Add VideoMAEFeatureExtractor
      
      * Improve VideoMAEFeatureExtractor
      
      * Improve docs
      
      * Add first draft of model tests
      
      * Improve VideoMAEForPreTraining
      
      * Fix base_model_prefix
      
      * Make model take pixel_values of shape (B, T, C, H, W)
      
      * Add loss computation of VideoMAEForPreTraining
      
      * Improve tests
      
      * Improve model tests
      
      * Make all tests pass
      
      * Add VideoMAE to main README
      
      * Add tests for VideoMAEFeatureExtractor
      
      * Add integration test
      
      * Improve conversion script
      
      * Rename patch embedding class
      
      * Remove VideoMAELayer from init
      
      * Update design of patch embeddings
      
      * Improve comments
      
      * Improve conversion script
      
      * Improve conversion script
      
      * Add conversion of pretrained model
      
      * Add loss verification of pretrained model
      
      * Add loss verification of unnormalized targets
      
      * Add integration test for pretraining model
      
      * Apply suggestions from code review
      
      * Fix bug to make feature extractor resize only shorter edge
      
      * Address more comments
      
      * Improve normalization of videos
      
      * Add doc examples
      
      * Move constants to dedicated script
      
      * Remove scripts
      
      * Transfer checkpoints, fix docs
      
      * Update script
      
      * Update image mean and std
      
      * Fix doc tests
      
      * Set return_tensors to NumPy by default
      
      * Revert the previous change
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      f9a0008d
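      A short hedged sketch of the classification usage implied above, highlighting the (batch, frames, channels, height, width) pixel_values layout; the checkpoint name is an assumption and random pixels stand in for a real video clip:

      import torch
      from transformers import VideoMAEForVideoClassification

      # Checkpoint name assumed for illustration; any VideoMAE classification checkpoint works.
      model = VideoMAEForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")

      # (B, T, C, H, W): one clip of 16 frames at 224x224, as set up in this PR.
      pixel_values = torch.randn(1, 16, 3, 224, 224)

      with torch.no_grad():
          logits = model(pixel_values=pixel_values).logits
      print(logits.argmax(-1))  # predicted class index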
    • Thomas Wang
    • Sylvain Gugger · df28de05
    • HFTracer.trace can now take callables and torch.nn.Module (#18457) · c74befc9
      Michael Benayoun authored
      * Enable HFTracer to trace with custom dummy inputs instead of pre-computed ones
      
      * Add HFTracer.trace docstring, and make it possible to handle callable and torch.nn.Module in general
      
      * Remove pdb comment
      
      * Apply suggestions
      c74befc9
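      A heavily hedged sketch of tracing a plain torch.nn.Module with HFTracer after this change; the `dummy_inputs` keyword follows the commit messages above, and the exact name and signature may differ between versions:

      import torch
      from transformers.utils.fx import HFTracer

      class TinyModule(torch.nn.Module):
          def __init__(self):
              super().__init__()
              self.linear = torch.nn.Linear(4, 4)

          def forward(self, x):
              return torch.relu(self.linear(x))

      module = TinyModule()
      tracer = HFTracer()
      # Per this PR, trace accepts a generic nn.Module (or callable) plus custom dummy inputs
      # instead of only pre-computed inputs for known transformers architectures.
      graph = tracer.trace(module, dummy_inputs={"x": torch.randn(2, 4)})
      torch.fx.GraphModule(module, graph).graph.print_tabular()  # inspect the captured ops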
    • change shape to support dynamic batch input in tf.function XLA generate for tf serving (#18372) · fc1d841b
      nlpcat authored
      
      
      * change shape to support dynamic batch input in tf.generate
      
      * add tests
      Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
      fc1d841b
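      A hedged sketch of the serving pattern this change targets: wrapping XLA generation in a tf.function whose input signature leaves the batch (and sequence) dimension as None, so a saved model can accept variable batch sizes. Model, lengths and generation settings are illustrative:

      import tensorflow as tf
      from transformers import AutoTokenizer, TFAutoModelForCausalLM

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
      model = TFAutoModelForCausalLM.from_pretrained("gpt2")

      @tf.function(
          input_signature=[
              tf.TensorSpec((None, None), tf.int32, name="input_ids"),      # dynamic batch + length
              tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
          ],
          jit_compile=True,  # XLA-compiled generation
      )
      def serving_generate(input_ids, attention_mask):
          return model.generate(input_ids=input_ids, attention_mask=attention_mask,
                                max_new_tokens=32, pad_token_id=tokenizer.pad_token_id)

      inputs = tokenizer(["Hello world", "Dynamic batch test"], return_tensors="tf", padding=True)
      print(serving_generate(inputs["input_ids"], inputs["attention_mask"]))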
    • [BLOOM] Clean modeling code (#18344) · b69a62d5
      Thomas Wang authored
      
      
      * Cleanup some code
      
      * Improve signatures
      
      * Try to reduce the number of reshape/copies
      
      * I don't think we actually need the layer_num scaling trick
      
      * No need for duplication
      
      * Try to fix beam_search
      
      * Fix beam search
      
      * Removing layer num normalization seems to be breaking
      
      * Not sure self.layer_number normalization actually matters
      
      * Try and be backward compatible
      
      * Try to fix beam_search
      
      * Revert attempt to be backward compatible
      
      * Improve documentation on past_key_values format
      
      * Optimize the device allocation in case of hidden_states in multiple devices
      
      * No need to manually cast the values to a specific device
      
      * Rename with long version of variables
      
      * Improve type hinting
      
      * Add comment that explains that some methods return views
      
      * Actually I think the attention casting only makes sense when we use torch.float16
      
      * We don't actually need layer_number to be passed anymore
      
      * Fix FX test
      
      * Bypass torch.baddbmm
      
      * Apply suggestions from code review
      
      * Add comment about support for torchScript v1.11
      
      * fix ONNX support for bloom (#18456)
      Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
      Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
      b69a62d5
  7. 03 Aug, 2022 4 commits