- 04 Aug, 2022 5 commits
-
Sylvain Gugger authored
-
Kian Sierra McGettigan authored
* swag_no_trainer updated with gather_metrics
* Removed unused variable samples_seen
-
Michael Benayoun authored
* Enable HFTracer to trace with custom dummy inputs instead of pre-computed ones
* Add HFTracer.trace docstring, and make it possible to handle callable and torch.nn.Module in general
* Remove pdb comment
* Apply suggestions
-
nlpcat authored
* change shape to support dynamic batch input in tf.generate
* add tests

Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>
-
Thomas Wang authored
* Cleanup some code
* Improve signatures
* Try to reduce the number of reshape/copies
* I don't think we actually need the layer_num scaling trick
* No need for duplication
* Try to fix beam_search
* Fix beam search
* Removing layer num normalization seems to be breaking
* Not sure self.layer_number normalization actually matters
* Try and be backward compatible
* Try to fix beam_search
* Revert attempt to be backward compatible
* Improve documentation on past_key_values format
* Optimize the device allocation in case of hidden_states in multiple devices
* No need to manually cast the values to a specific device
* Rename with long version of variables
* Improve type hinting
* Add comment that explains that some methods return views
* Actually I think the attention casting only makes sense when we use torch.float16
* We don't actually need layer_number to be passed anymore
* Fix FX test
* Bypass torch.baddbmm
* Apply suggestions from code review
* Add comment about support for TorchScript v1.11
* fix ONNX support for bloom (#18456)

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
-
- 03 Aug, 2022 10 commits
-
LSinev authored
Comparisons like `version.parse(torch.__version__) > version.parse("1.6")` are True for torch==1.6.0+cu101 or torch==1.6.0+cpu; `version.parse(version.parse(torch.__version__).base_version)` is preferred (and available in pytorch_utils.py).
-
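The pitfall described above can be sketched with `packaging.version` (a minimal illustration; the version string is just an example of a local-version build):

```python
from packaging import version

v = "1.6.0+cu101"  # e.g. torch.__version__ for a CUDA build

# Naive comparison: the "+cu101" local segment sorts *above* the plain
# release, so a 1.6.0 build is wrongly reported as newer than 1.6.
print(version.parse(v) > version.parse("1.6"))  # True

# Stripping to base_version compares only the release "1.6.0".
base = version.parse(version.parse(v).base_version)
print(base > version.parse("1.6"))   # False
print(base >= version.parse("1.6"))  # True
```

This is why a shared helper built on `base_version` is preferable to ad-hoc `version.parse(torch.__version__)` comparisons scattered across the codebase.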
Sayak Paul authored
* fix: keras fit tests for segformer tf and minor refactors.
* refactor: test_keras_fit to make it simpler using the existing one.
* fix: styling issues.
-
Alara Dirik authored
-
Daniel Suess authored
* Fix failing test_xla_generate_slow tests
* Fix failing speech-to-text xla_generate tests
-
Omar Sanseviero authored
* Update pinned hhub version
* Make style
-
Ritik Nandwal authored
* Update no_trainer script for image-classification
* Update no_trainer scripts for language-modeling examples
* Remove unused variable
* Removing truncation from losses array for language modeling examples
-
Ian Castillo authored
* Add file in spanish docs to be translated
* Translate first two sections to Spanish
* Translate four additional sections to Spanish
* Finish translation to Spanish
* Improve writing style in Spanish
* Add suggested changes from reviewer
-
Gary Miguel authored
* support ONNX export of XDropout in deberta{,_v2}
* black
* copy to sew_d
* add test
* isort
* use pytest.mark.filterwarnings
* review comments
-
Steven Liu authored
This PR moves GroupViT and LXMert to their correct sections. As pointed out by @NielsRogge and @LysandreJik, GroupViT and LXMert are both multimodal models.
-
Sourab Mangrulkar authored
-
- 02 Aug, 2022 10 commits
-
Christopher Akiki authored
The current wording makes it sound as if the programming languages are part of the 46 natural languages.
-
David authored
* Update pipeline word heuristic to work with whitespace in token offsets. This change checks for whitespace in the input string at either the character preceding the token or in the first character of the token. It works with tokenizers that return offsets excluding whitespace between words or with offsets including whitespace. Fixes #18111
* Use smaller model, ensure expected tokenization
* Re-run CI (please squash)
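The heuristic described above can be illustrated with a small standalone function (the name `starts_new_word` is hypothetical; the actual pipeline code differs):

```python
def starts_new_word(text: str, start: int) -> bool:
    """A token starts a new word if there is whitespace either at the
    character preceding its offset or at its first character. This covers
    tokenizers whose offsets exclude the space between words as well as
    tokenizers whose offsets include it."""
    if start == 0:
        return True  # beginning of the input always starts a word
    return text[start - 1].isspace() or text[start].isspace()

text = "hello world"
print(starts_new_word(text, 0))  # True  (start of string)
print(starts_new_word(text, 6))  # True  ("w" is preceded by a space)
print(starts_new_word(text, 5))  # True  (offset points at the space itself)
print(starts_new_word(text, 3))  # False (mid-word "l")
```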
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
João Lages authored
* improve generate docstring
* Remove 'defaults to None' comment
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
* update maskformer docs
* fix typo
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Piotr Dabkowski authored
`torch.Tensor` creates an uninitialized tensor (as via `torch.empty`); this leads to nondeterministic behavior, poor initialization, and NaNs if you have an unlucky init. The paper does not specify the initialization for bias terms, so zero seems like a good choice: no bias initially. `torch.Tensor` memory is usually populated with zeros, so this fix will be close to the intended behavior:
```
>>> torch.Tensor(100, 100).sum()
tensor(0.)
>>> torch.Tensor(100, 100).sum()
tensor(nan)
>>> torch.Tensor(100, 100).sum()
tensor(0.)
```
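A minimal sketch of the deterministic alternative (the parameter name and size are illustrative, not the actual model code):

```python
import torch
from torch import nn

hidden_size = 100  # illustrative size

# Problematic pattern: torch.Tensor allocates uninitialized memory,
# so parameter values (and any downstream loss) are nondeterministic.
# bias = nn.Parameter(torch.Tensor(hidden_size))

# Deterministic alternative in the spirit of the fix: start from zeros.
bias = nn.Parameter(torch.zeros(hidden_size))
assert torch.isfinite(bias).all()  # never NaN, regardless of allocator state
```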
-
- 01 Aug, 2022 15 commits
-
Yassine authored
-
Kelvin Kong authored
* Added option for users to modify config parameter used by pytesseract during feature extraction
  - Added optional 'tess_config' kwarg when setting up LayoutLMV2 processor that is used by pytesseract during feature extraction
  - E.g. can be used to modify psm values by setting tess_config to '--psm 7'
  - Different psm values significantly influence the output of layoutlmv2
* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Updated variable names to be more explicit
* Fixed styles
* Added option for users to modify config parameter when calling pytesseract during feature extraction
  - Added option to set "tesseract_config" parameter during LayoutLMV3 processor initialization
  - Can be used to modify PSM values, e.g. by setting tesseract_config="--psm 6"
* Removed from function signature

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
Steven Liu authored
* 📝 split up model list
* Adapt script to reorg
* apply niels feedback

Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
-
Sylvain Gugger authored
* Rewrite push_to_hub to use upload_files
* Adapt the doc a bit
* Address review comments and clean doc
-
Duong A. Nguyen authored
* add bart pretraining flax script
* fixup
* add bart pretraining flax script
* add BART to README
* add BART to README
* add BART to README
* add BART to README
* add BART to README
* add bos eos document
* Update README.md
* Update README.md
* Update examples/flax/language-modeling/run_bart_dlm_flax.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* final
* final
* final
* remove use_auth_token in from_config

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix ROUGE add example check and update README
* Stay consistent in values
-
Ikuya Yamada authored
* add LUKE models for downstream tasks
* add new LUKE models to docs
* fix typos
* remove commented lines
* exclude None items from tuple return values
-
NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Sylvain Gugger authored
* Add balanced strategies for device_map in from_pretrained
* Add safeguards for Accelerate version
* Update src/transformers/modeling_utils.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Style

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
-
NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
Arthur authored
-
Sylvain Gugger authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
YouJiacheng authored
Fix #18385. I don't know whether `use_auth_token`, `cache_dir` and `local_files_only` should be passed to `(cls.slow_tokenizer_class)._from_pretrained`, but I guess they should.
-
amyeroberts authored
-