- 16 Dec, 2022 3 commits
-
Matt authored
-
Nicolas Patry authored
* Revert "Fixing object detection with `layoutlm` (#20776)"
  This reverts commit fca66abe.
* Better fix for layoutlm object detection.
* Style.
-
Younes Belkada authored
skip feature extraction test if in `IMAGE_PROCESSOR_MAPPING`
-
- 15 Dec, 2022 9 commits
-
Yih-Dar authored
Recompile apex in DeepSpeed CI image
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
amyeroberts authored
* Move convert_to_rgb to image_transforms module
* Fix tests
-
Joao Gante authored
* generate from config mvp
* fix failing tests
* max_time test
* Load default gen config at model load time; Update docs
* further documentation; add tests
* adapt rag to the new structure
* handle models not instantiated with from_pretrained (like in tests)
* better default generation config
* add can_generate fn
* handle legacy use case of ad hoc model config changes
* initialize gen config from config in individual methods, if gen config is none
* fix _get_decoder_start_token_id when called outside GenerationMixin
* correct model config load order (set attr > model config > decoder config)
* update rag to match latest changes
* Apply suggestions from code review
* load gen config from model config in model.from_pretrained
* fix can_generate fn
* handle generate calls without a previous from_pretrained (e.g. tests)
* add legacy behavior (and a warning)
* lower logger severity

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
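A minimal sketch of the generation-config flow this PR introduces: a `GenerationConfig` is loaded alongside the model at `from_pretrained` time and can be overridden per call. The hyperparameter values below are illustrative assumptions, not the library defaults.
```
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")  # default generation config is loaded here

# A standalone generation config that overrides the model's default for this call
gen_config = GenerationConfig(max_new_tokens=20, do_sample=True, top_k=50)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```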
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Nicolas Patry authored
* Fixing object detection with layoutlm.
* Fixup.
-
Younes Belkada authored
fix failing `pipeline` test
-
Lars Mennen authored
* Workaround for #20287: FlanT5-XXL 8bit support
* Make fix-copies
* revert unrelated change
* Don't apply to longt5 and switch transformers
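A hedged sketch of the use case this workaround targets: loading FlanT5-XXL with int8 weights. It assumes `bitsandbytes` and `accelerate` are installed; the checkpoint is the public google/flan-t5-xxl repo.
```
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xxl")
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-xxl",
    load_in_8bit=True,   # quantize weights to int8 via bitsandbytes
    device_map="auto",   # spread layers across the available devices
)

inputs = tokenizer("Translate to German: How old are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```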
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Nicolas Patry authored
* Even more validation.
* Fixing order.
-
- 14 Dec, 2022 8 commits
-
NielsRogge authored
* Add Swin backbone
* Remove line
* Add code example

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
amyeroberts authored
* Replaces xxx_required with requires_backends
* Fixup
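A rough illustration of the pattern this commit standardizes on: instead of per-backend `xxx_required` decorators, `requires_backends` raises a uniform ImportError with an installation hint when an optional dependency is missing. The class below is a made-up example.
```
from transformers.utils import requires_backends

class MyImageHelper:
    def __init__(self):
        # Raises a clear error unless the "vision" backend (Pillow) is installed
        requires_backends(self, ["vision"])
```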
-
Arthur authored
* weight -> weights
* model embedding resize does not work with both v2 and normal
* remove useless test
-
casuallyName authored
fix: Fix the issue where Trainer could not use the use_legacy_prediction_loop argument. Resolves the AttributeError: 'PredictionOutput' object has no attribute 'num_samples' raised when use_legacy_prediction_loop is set and prediction_loop is used for prediction during the predict stage.
Co-authored-by: ZhouHang <zhouhang@idataway.com>
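A sketch of the call path this fixes, using `use_legacy_prediction_loop`, the relevant TrainingArguments flag; the tiny testing checkpoint and toy dataset are illustrative assumptions.
```
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-bert")
model = AutoModelForSequenceClassification.from_pretrained("hf-internal-testing/tiny-random-bert")

encodings = tokenizer(["good", "bad"], padding=True)
test_dataset = [
    {"input_ids": torch.tensor(ids), "attention_mask": torch.tensor(mask)}
    for ids, mask in zip(encodings["input_ids"], encodings["attention_mask"])
]

args = TrainingArguments(output_dir="out", use_legacy_prediction_loop=True)
trainer = Trainer(model=model, args=args)
predictions = trainer.predict(test_dataset)  # previously raised the num_samples AttributeError
print(predictions.predictions.shape)
```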
-
NielsRogge authored
* Improve tests
* Improve TF tests
* Apply suggestion
* Fix test

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
amyeroberts authored
-
- 13 Dec, 2022 8 commits
-
Yih-Dar authored
Uninstall torch_tensorrt for now
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Nicolas Patry authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Hazrul Akmal authored
* added model resources for xlm-roberta
* added model resources for xlm-roberta
* resolve suggested changes
* add resources to xlm-roberta
-
NielsRogge authored
* Add first draft
* Add out_features attribute to config
* Add corresponding test
* Add Dinat backbone
* Add BackboneMixin
* Add Backbone mixin, improve tests
* Fix embeddings
* Fix bug
* Improve backbones
* Fix Nat backbone tests
* Fix Dinat backbone tests
* Apply suggestions

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
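A hedged sketch of the backbone API these additions build out: `AutoBackbone` loads a vision model that returns intermediate feature maps, with `out_features` selecting the stages. The checkpoint name and stage names below are assumptions for illustration.
```
import torch
from transformers import AutoBackbone

backbone = AutoBackbone.from_pretrained(
    "microsoft/swin-tiny-patch4-window7-224",
    out_features=["stage2", "stage4"],  # which intermediate stages to return
)

pixel_values = torch.rand(1, 3, 224, 224)
outputs = backbone(pixel_values)
for feature_map in outputs.feature_maps:
    print(feature_map.shape)
```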
-
dhansmair authored
`image = to_channel_dimension_format(image, ChannelDimension.LAST)` is redundant, as the same conversion is also applied in to_pil_image(). This redundant call actually makes training fail in rare cases. The problem can be reproduced with the following code snippet:
```
import torch
from transformers.models.clip import CLIPFeatureExtractor

vision_processor = CLIPFeatureExtractor.from_pretrained('openai/clip-vit-large-patch14')
images = [
    torch.rand(size=(3, 2, 10), dtype=torch.float),
    torch.rand(size=(3, 10, 1), dtype=torch.float),
    torch.rand(size=(3, 1, 10), dtype=torch.float),
]
for image in images:
    processed_image = vision_processor(images=image, return_tensors="pt")['pixel_values']
    print(processed_image.shape)
    assert processed_image.shape == torch.Size([1, 3, 224, 224])
```
The last image has a height of 1 pixel. The second call to to_channel_dimension_format() will transpose the image, and the height dimension is wrongly treated as the channels dimension afterwards. Because of this, the following normalize() step results in an exception.
-
Matt authored
* Fix AdamWeightDecay for TF
* Fix AdamWeightDecay for TF
* make fixup
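For context, a hedged sketch of how the fixed optimizer is typically constructed: `create_optimizer` returns an AdamWeightDecay instance together with its learning-rate schedule. The hyperparameters below are illustrative.
```
from transformers import create_optimizer

optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,
    num_train_steps=1000,
    num_warmup_steps=100,
    weight_decay_rate=0.01,  # handled by AdamWeightDecay
)
# model.compile(optimizer=optimizer)  # then train as usual with Keras
```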
-
Yih-Dar authored
* Fix the pipeline test regarding TF
* Fix the pipeline test regarding TF
* update comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Younes Belkada authored
* add `keep_in_fp32_modules` support
* pass it as class attribute
* few modifs - make tests `slow` - fix logic
* better logic
* fix failing test
* `bfloat16` support
* Update src/transformers/modeling_utils.py
* fix
* simplify tests
* simplify tests
* fix test
* modify message
* more checks
* fix failing tests
* add more conditions - add `is_accelerate_available` - fixes pipeline tests that failed
* add suggestions
* Update src/transformers/modeling_utils.py
* fix failing `bnb` test
* add last safety checker

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
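A hedged illustration of the feature: a model class can list module-name fragments that must stay in float32 even when the rest of the model is loaded in half precision (T5's `wo` projection is the motivating case from #20287; the exact attribute value and module path below are assumptions).
```
import torch
from transformers import T5ForConditionalGeneration

# T5 declares, roughly: _keep_in_fp32_modules = ["wo"]
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-small", torch_dtype=torch.float16
)

# Matching modules keep float32 weights despite torch_dtype=float16
wo = model.encoder.block[0].layer[1].DenseReluDense.wo
print(wo.weight.dtype)  # expected: torch.float32
```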
-
- 12 Dec, 2022 12 commits
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
amyeroberts authored
* Add decorator for flaky tests
* Fix up
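A small sketch of the decorator's intended use, assuming it is `is_flaky` from transformers.testing_utils with a `max_attempts` parameter; the test body is a toy stand-in.
```
import random

from transformers.testing_utils import is_flaky

@is_flaky(max_attempts=5)
def test_sometimes_nondeterministic():
    # Re-run up to max_attempts times before reporting a failure
    assert random.random() < 0.9
```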
-
Sylvain Gugger authored
* Disambiguate test for required_input in tokenization base file.
* Add test for size
-
Sylvain Gugger authored
-
Ariel Ekgren authored
* Add templates for gpt-sw3
* Add templates for gpt-sw3
* Added sentencepiece tokenizer
* intermediate commit with many changes
* fixed conflicts
* Init commit for tokenization port
* Tokenization progress
* Remove fast tokenizer
* Clean up and rename spm.model -> spiece.model
* Remove TF -> PT conversion script template, Clean up Megatron -> PT script
* Optimize encode & decode performance
* added new attention
* added new attention
* attention for gpt-sw3 working
* attention good
* Cache is now working
* fixed attention mask so that it works with causal attention
* fixed badbmm bug for cpu and caching
* updated config with correct parameters
* Refactor and leave optimizations as separate functions to avoid breaking expected functionality
* Fix special tokens mapping for both tokenizers
* cleaning up of code and comments
* HF compatible attention outputs
* Tokenizer now passing tests, add documentation
* Update documentation
* reverted back to base implementation after checking that it is identical to pretrained model
* updated gpt-sw3 config
* updated conversion script
* aligned parameters with gpt-sw3 config
* changed default scale_attn_by_inverse_layer_idx to true
* removed flag from conversion script
* added temporary model path
* reverted back to functioning convert script
* small changes to default config
* updated tests for gpt-sw3
* make style, make quality, minor cleanup
* Change local paths to testing online repository
* Change name: GptSw3 -> GPTSw3
* Remove GPTSw3TokenizerFast references
* Use official model repository and add more model sizes
* Added reference to 6.7b model
* Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel
* Remove pointers to non-existing TFGPTSw3
* Add GPTSw3 to docs/_toctree.yml
* Remove TF artifacts from GPTSw3 in __init__ files
* Update README:s with 'make fix-copies'
* Add 20b model to archive list
* Add documentation for GPT-Sw3
* Fix typo in documentation for GPT-Sw3
* Do 'make fix-copies' again after having updated docs
* Fix some typos in docs
* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py
* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py
* Update src/transformers/models/gpt_sw3/__init__.py
* Update src/transformers/models/gpt_sw3/__init__.py
* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py
* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
* Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py
* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
* Resolve comments from PR feedback
* Resolve more comments from PR feedback, also set use_cache=True in convert script
* Add '# Copied from' comments for GPTSw3 modeling
* Set 'is_parallelizable = False'
* Remove '# Copied from' where code was modified and add 'with x->y' when appropriate
* Remove parallelize in mdx
* make style, make quality
* Update GPTSw3Config default values and corresponding documentation
* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py
* Update src/transformers/models/gpt_sw3/__init__.py
* Clean up and protect GPTSw3Tokenizer imports with is_sentencepiece_available
* Make style, make quality
* Add dummy object for GPTSw3Tokenizer via 'make fix-copies'
* make fix-copies
* Remove GPTSw3 modeling classes
* make style, make quality
* Add GPTSw3 auto-mappings for other GPT2 heads
* Update docs/source/en/model_doc/gpt-sw3.mdx
* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py
* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py
* Remove old TODO-comment
* Add example usage to GPTSw3Tokenizer docstring
* make style, make quality
* Add implementation details and example usage to gpt-sw3.mdx

Co-authored-by: JoeyOhman <joeyoh@kth.se>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
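A hedged usage sketch for the new model: GPT-SW3 reuses the GPT2 architecture with its own sentencepiece tokenizer, so it loads through the auto classes. The checkpoint name below is an assumption for illustration.
```
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AI-Sweden-Models/gpt-sw3-126m")
model = AutoModelForCausalLM.from_pretrained("AI-Sweden-Models/gpt-sw3-126m")

inputs = tokenizer("Träd är fina för att", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True)
print(tokenizer.decode(outputs[0]))
```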
-
amyeroberts authored
* Add require_vision decorator
* Fixup
* Use requires_backends
* Add requires_backends to utils functions
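A short sketch of the decorator in a test file; `require_vision` lives in transformers.testing_utils and skips the test when Pillow is unavailable. The test body is a toy stand-in.
```
from transformers.testing_utils import require_vision

@require_vision
def test_image_processing():
    # Runs only when the vision backend (Pillow) is installed
    from PIL import Image
    assert Image.new("RGB", (2, 2)).size == (2, 2)
```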
-
Steven Liu authored
* clarify docstring
* make style
-
Matt authored
* Convert tokenizer outputs for Keras in doc example
* Also fix the German example
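A hedged sketch of the doc change: Keras' fit() expects a plain dict or arrays, so the BatchEncoding returned by the tokenizer is converted with dict(). The model and data here are illustrative.
```
import numpy as np
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased")

tokenized = tokenizer(["I love this!", "Terrible."], padding=True, return_tensors="np")
labels = np.array([1, 0])

model.compile(optimizer="adam")  # transformers TF models supply a default loss
model.fit(dict(tokenized), labels)  # dict() converts the BatchEncoding for Keras
```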
-
Juanjo do Olmo authored
* Create and translate to Spanish debugging.mdx
* solved typo error in a header
* Update debugging.mdx
* Update debugging.mdx
* Update docs/source/es/debugging.mdx
* Update docs/source/es/debugging.mdx
* Update docs/source/es/debugging.mdx
* Update docs/source/es/debugging.mdx
* Update docs/source/es/debugging.mdx
* Update _toctree.yml

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sourab Mangrulkar authored
-
stanleycai95 authored
-