Commits · 258480864d3b5771ab07800912515bc1e859b7a3 · chenpangpang / transformers

09 Feb, 2022 15 commits

update serving_output for some TF models (#15568) · 25848086
Yih-Dar authored Feb 09, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
25848086
Fix tests hub failure (#15580) · 315e6740
Sylvain Gugger authored Feb 09, 2022
```
* Expose hub test problem

* Fix tests
```
315e6740
Fix quality · b1ba03e0
Sylvain Gugger authored Feb 09, 2022

b1ba03e0
Trigger doc build · eed3186b
Sylvain Gugger authored Feb 09, 2022

eed3186b

Constrained Beam Search [without disjunctive decoding] (#15416) · 2b5603f6

Chan Woo Kim authored Feb 10, 2022



* added classes to get started with constrained beam search

* in progress, think i can directly force tokens now but not yet with the round robin

* think now i have total control, now need to code the bank selection

* technically works as desired, need to optimize and fix design choices leading to undesirable outputs

* complete PR #1 without disjunctive decoding

* removed incorrect tests

* Delete k.txt

* Delete test.py

* Delete test.sh

* revert changes to test scripts

* genutils

* full implementation with testing, no disjunctive yet

* shifted docs

* passing all tests realistically ran locally

* removing accidentally included print statements

* fixed source of error in initial PR test

* fixing the get_device() vs device trap

* fixed documentation docstrings about constrained_beam_search

* fixed tests failing for Speech2TextModel's floating point inputs

* fix cuda long tensor

* added examples and testing for them and found & fixed a bug in beam_search and constrained_beam_search

* deleted accidentally added test halting code with assert False

* code reformat

* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/test_generation_utils.py

* fixing based on comments on PR

* took out the testing code that should work but fails without the beam search modification; style changes

* fixing comments issues

* docstrings for ConstraintListState

* typo in PhrasalConstraint docstring

* docstrings improvements
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

2b5603f6

Add implementation of typical sampling (#15504) · 0113aae5

Clara Meister authored Feb 09, 2022

* typical decoding

* changing arg name

* add test config params

* forgotten arg rename

* fix edge case where scores are same

* test for typical logits warper

* code quality fixes

0113aae5
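
A minimal sketch of the idea behind the typical sampling commit above, written in plain Python rather than the library's tensor code. It assumes the usual formulation of typical sampling: rank tokens by how close their surprisal (-log p) is to the entropy of the next-token distribution, then keep the closest tokens until their cumulative probability reaches a target mass. The name `typical_filter` and the `mass` parameter are illustrative assumptions, not the library's API.

```python
import math


def typical_filter(logits, mass=0.9):
    """Return the sorted indices of tokens kept by a typical-sampling filter.

    Hypothetical helper for illustration; not the transformers implementation.
    """
    # Numerically stable log-softmax.
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    log_probs = [x - log_z for x in logits]
    probs = [math.exp(lp) for lp in log_probs]

    # Entropy of the next-token distribution.
    entropy = -sum(p * lp for p, lp in zip(probs, log_probs))

    # Rank tokens by |surprisal - entropy|, closest first.
    order = sorted(range(len(logits)), key=lambda i: abs(-log_probs[i] - entropy))

    # Keep the closest tokens until the requested probability mass is covered.
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= mass:
            break
    return sorted(kept)


# Probabilities 0.5, 0.25, 0.125, 0.125 expressed as logits.
logits = [math.log(p) for p in (0.5, 0.25, 0.125, 0.125)]
print(typical_filter(logits, mass=0.2))  # [1]
print(typical_filter(logits, mass=0.7))  # [0, 1]
```

With a small `mass`, the most probable token (index 0) is dropped because its surprisal is further from the entropy than that of index 1. This is the main behavioural difference from top-p filtering, which always keeps the most probable token first.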

[Flax tests/FlaxBert] make from_pretrained test faster (#15561) · f588cf40
Suraj Patil authored Feb 09, 2022

f588cf40
Upgrade click version (#15579) · 70292409
Lysandre Debut authored Feb 09, 2022

70292409
Add Wav2Vec2 Adapter Weights to Flax (#15566) · 9e00566b
Sanchit Gandhi authored Feb 09, 2022
```
* Add Wav2Vec2 Adapter Weights to Flax

* Suggested changes
```
9e00566b
Make sure custom configs work with Transformers (#15569) · 1f60bc46
Sylvain Gugger authored Feb 09, 2022
```
* Make sure custom configs work with Transformers

* Apply code review suggestions
```
1f60bc46
Upgrade black to version ~=22.0 (#15565) · 7732d0fe
Lysandre Debut authored Feb 09, 2022
```
* Upgrade black to version ~=22.0

* Check copies

* Fix code
```
7732d0fe

add model scaling section (#15119) · d923f762

Leandro von Werra authored Feb 09, 2022



* add model scaling section

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* integrate reviewer feedback

* initialize GPU properly

* add note about BnB optimizer

* move doc from `scaling.mdx` to `performance.mdx`

* integrate reviewer feedback

* revert section levels
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

d923f762

PoC for a ProcessorMixin class (#15549) · b5c6fdec

Sylvain Gugger authored Feb 09, 2022



* PoC for a ProcessorMixin class

* Documentation

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Roll out to other processors

* Add base feature extractor class in init

* Use args and kwargs
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

b5c6fdec

logger.warn --> logger.warning (#15572) · ba3f9a71

Yih-Dar authored Feb 09, 2022



* change logger.warn to logger.warning

* make style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

ba3f9a71
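
For context on the rename above: in Python's standard `logging` module, `Logger.warn` is a deprecated alias of `Logger.warning`. A small sketch of the preferred call follows. The logger name `example` and the `ListHandler` class are made up for illustration.

```python
import logging


class ListHandler(logging.Handler):
    """Collect (level, message) pairs in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelname, record.getMessage()))


logger = logging.getLogger("example")  # hypothetical logger name
logger.setLevel(logging.WARNING)
logger.propagate = False  # keep the example output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

# Preferred spelling; arguments are interpolated lazily by the logging module.
logger.warning("use_cache=%s is ignored", True)
print(handler.messages)  # [('WARNING', 'use_cache=True is ignored')]
```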

[Flax tests] fix test_model_outputs_equivalence (#15571) · a6885db9
Suraj Patil authored Feb 09, 2022
```
* fix test_model_outputs_equivalence

* fix tuple outputs for blenderbot
```
a6885db9

08 Feb, 2022 6 commits

📝 Add codecarbon callback to docs (#15563) · fcb4f11c
Nathan Raw authored Feb 08, 2022

fcb4f11c

feat(flax): allow encoder_outputs in generate (#15554) · 077c00c0

Boris Dayma authored Feb 08, 2022

* feat(flax): allow encoder_outputs in generate

* doc(flax): encoder_outputs in generate

* fix: style

* fix: style

077c00c0

Add TFSpeech2Text (#15113) · 8406fa6d

Joao Gante authored Feb 08, 2022

* Add wrapper classes

* convert inner layers to tf

* Add TF Encoder and Decoder layers

* TFSpeech2Text models

* Loadable model

* TF model with same outputs as PT model

* test skeleton

* correct tests and run the fixup

* correct attention expansion

* TFSpeech2Text past_key_values with TF format

8406fa6d

Force use_cache to be False in PyTorch (#15385) · 6a5472a8

Yih-Dar authored Feb 08, 2022



* use_cache = False for PT models if labels is passed

* Fix for BigBirdPegasusForConditionalGeneration

* add warning if users specify use_cache=True

* Use logger.warning instead of warnings.warn
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

6a5472a8
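
The behaviour described in the bullets above (disable caching when `labels` is passed, and warn if the caller explicitly asked for `use_cache=True`) can be sketched as follows. This is an illustration under assumptions, not the code from the commit: the helper name `resolve_use_cache` and the warning text are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_use_cache(labels, use_cache):
    """Return the effective use_cache value (hypothetical helper).

    When labels are provided, caching is forced off, and a warning is
    logged if the caller explicitly requested use_cache=True.
    """
    if labels is not None:
        if use_cache:
            logger.warning("`use_cache` is set to False because `labels` is provided.")
        return False
    return use_cache


print(resolve_use_cache(labels=[1, 2, 3], use_cache=True))  # False
print(resolve_use_cache(labels=None, use_cache=True))       # True
```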

[GPTJ] fix docs (#15558) · 0acd84f7
Suraj Patil authored Feb 08, 2022

0acd84f7

electra is added to onnx supported model (#15084) · 87d08afb

aaron authored Feb 08, 2022



* electra is added to onnx supported model

* add google/electra-base-generator for test onnx module
Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>

87d08afb

07 Feb, 2022 12 commits

FX tracing improvement (#14321) · 0fe17f37

Michael Benayoun authored Feb 07, 2022

* Change the way tracing happens, enabling dynamic axes out of the box

* Update the tests and modeling xlnet

* Add the non-recording of leaf modules to avoid recording more values for the methods to record than what will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).

* Comments and making tracing work for gpt-j and xlnet

* Refactor things related to num_choices (and batch_size, sequence_length)

* Update fx to work on PyTorch 1.10

* Postpone autowrap_function feature usage for later

* Add copyrights

* Remove unnecessary file

* Fix issue with add_new_model_like

* Apply suggestions

0fe17f37

Create a custom model guide (#15489) · 552f8d30

Steven Liu authored Feb 07, 2022

* 📝 add config section

* 📝 finish first draft

* 📝 add feature extractor and processor

* 🖍 apply feedback from review

* 📝 minor edits

* last review

552f8d30

Make TF Wav2Vec2 outputs the same as PT's version (#15530) · ad1d3c4d

Yih-Dar authored Feb 07, 2022



* fix outputs

* fix for CTC

* fix doc

* make style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

ad1d3c4d

Fix TF T5/LED missing cross attn in return values (#15511) · 131e2584

Yih-Dar authored Feb 07, 2022



* add cross attn to outputs

* add cross attn to outputs for TFLED

* add undo padding

* remove unused import

* fix style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

131e2584

Remove Longformers from ONNX-supported models (#15273) · 6775b211
lewtun authored Feb 07, 2022

6775b211

Wav2Vec2 models must either throw or deal with add_adapter (#15409) · 7a1412e1

François REMY authored Feb 07, 2022



* Wav2Vec2 models must either throw or deal with add_adapter
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add pre-add_adapter backwards compatibility

* Add pre-add_adapter backwards compatibility

* Fix issue in tests/test_modeling_wav2vec2.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

7a1412e1

Add ASR CTC streaming example (#15309) · a459f7f9

Anton Lozhkov authored Feb 07, 2022



* Single-epoch run

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Infinite dataset

* Trainer fix + distributed benchmark

* Benchmark fix

* unused import

* interleaved splits

* interleaved splits

* has_length util

* Move to research projects

* Leftover Sized checks

* Bump min version

* Unused import

* Revert trainer changes
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

a459f7f9

[Trainer] Deeper length checks for IterableDatasetShard (#15539) · 75b13f82

Anton Lozhkov authored Feb 07, 2022



* Unused import

* Make `has_length()` torch-independent to use in callbacks

* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

75b13f82
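
A torch-independent length check of the kind the `has_length()` bullet above describes can be sketched with nothing but `len()`. This is a minimal illustration, assuming the goal is simply "does `len(dataset)` work", and is not necessarily the exact code in `src/transformers/trainer_utils.py`.

```python
def has_length(dataset):
    """Return True if len(dataset) is available, without importing torch."""
    try:
        return len(dataset) is not None
    except TypeError:
        # Objects that do not implement __len__ raise TypeError.
        return False


def stream():
    """A generator stands in for an iterable dataset with no known length."""
    yield from range(3)


print(has_length([1, 2, 3]))  # True
print(has_length(stream()))   # False
```

Because the check relies only on the `len()` protocol, it can be called from code paths (such as callbacks) that should not depend on a specific framework being installed.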

Add ConvNeXT (#15277) · 84eec9e6

NielsRogge authored Feb 07, 2022



* First draft

* Add conversion script

* Improve conversion script

* Improve docs and implement tests

* Define model output class

* Fix tests

* Fix more tests

* Add model to README

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply more suggestions from code review

* Apply suggestions from code review

* Rename dims to hidden_sizes

* Fix equivalence test

* Rename gamma to gamma_parameter

* Clean up conversion script

* Add ConvNextFeatureExtractor

* Add corresponding tests

* Implement feature extractor correctly

* Make implementation cleaner

* Add ConvNextStem class

* Improve design

* Update design to also include encoder

* Fix gamma parameter

* Use sample docstrings

* Finish conversion, add center cropping

* Replace nielsr by facebook, make feature extractor tests smaller

* Fix integration test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

84eec9e6

[torch_int_div] Correct true division in generation (#15498) · c47d2592
Patrick von Platen authored Feb 07, 2022
```
* [torch_int_div] Correct true division in generation

* up

* up
```
c47d2592
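
To illustrate why integer division matters in generation: beam search commonly picks candidates from scores flattened over (beam, vocabulary), so a flat index has to be split back into a beam index and a token id. The sketch below uses plain Python integers instead of tensors and assumes that flattened layout. The helper name `split_flat_index` is hypothetical.

```python
def split_flat_index(flat_index, vocab_size):
    """Split an index into flattened (beam, vocab) scores into (beam, token)."""
    # divmod performs floor division and modulo in one step.
    beam_index, token_id = divmod(flat_index, vocab_size)
    return beam_index, token_id


vocab_size = 10
flat_index = 27

# True division gives a fractional value that cannot be used as an index.
print(flat_index / vocab_size)                   # 2.7
# Floor division recovers the beam, modulo recovers the token.
print(split_flat_index(flat_index, vocab_size))  # (2, 7)
```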
[ASR pipeline] correct asr pipeline for seq2seq models (#15541) · 5f1918a4
Patrick von Platen authored Feb 07, 2022

5f1918a4
Revert "Handle PyTorch to Flax conversion of 1D convolutions (#15519)" (#15540) · e02bdce7
Patrick von Platen authored Feb 07, 2022
```
This reverts commit 854a0d52.
```
e02bdce7

04 Feb, 2022 6 commits

[deepspeed docs] DeepSpeed ZeRO Inference (#15486) · 8ce13306

Stas Bekman authored Feb 04, 2022



* [deepspeed docs] DeepSpeed ZeRO Inference

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* tweak

* deal with black

* extra cleanup, better comments
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

8ce13306

Standardize semantic segmentation models outputs (#15469) · ac6aa10f

Sylvain Gugger authored Feb 04, 2022



* Standardize instance segmentation models outputs

* Rename output

* Update src/transformers/modeling_outputs.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add legacy argument to the config and model forward

* Update src/transformers/models/beit/modeling_beit.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Copy fix in Segformer
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

ac6aa10f

[deepspeed docs] Megatron-Deepspeed info (#15488) · 31be2f45
Stas Bekman authored Feb 04, 2022

31be2f45

Fix TFRemBertEncoder all_hidden_states (#15510) · bbe9c698

Yih-Dar authored Feb 04, 2022



* fix

* fix test

* remove expected_num_hidden_layers
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

bbe9c698

Handle PyTorch to Flax conversion of 1D convolutions (#15519) · 854a0d52
Sanchit Gandhi authored Feb 04, 2022

854a0d52
use kwargs (#15509) · 486260c6
Yih-Dar authored Feb 04, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
486260c6

03 Feb, 2022 1 commit

Remove loss from some flax models docs & examples (#15492) · 525dbbf8

Yih-Dar authored Feb 03, 2022



* Remove return_loss from Flax models

* fix more

* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

525dbbf8