- 08 Feb, 2022 6 commits
-
Nathan Raw authored
-
Boris Dayma authored
* feat(flax): allow encoder_outputs in generate * doc(flax): encoder_outputs in generate * fix: style * fix: style
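For context, a minimal sketch of what this feature enables, assuming a Flax seq2seq model with separate `encode`/`generate` calls (the model choice and exact call pattern below are assumptions, not taken from the commit):
```python
from transformers import AutoTokenizer, FlaxT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = FlaxT5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello", return_tensors="np")
# run the encoder once ...
encoder_outputs = model.encode(input_ids=inputs["input_ids"])
# ... then hand its outputs to generate(), skipping a redundant encoder pass
out = model.generate(inputs["input_ids"], encoder_outputs=encoder_outputs)
print(tokenizer.decode(out.sequences[0], skip_special_tokens=True))
```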
-
Joao Gante authored
* Add wrapper classes * convert inner layers to tf * Add TF Encoder and Decoder layers * TFSpeech2Text models * Loadable model * TF model with same outputs as PT model * test skeleton * correct tests and run the fixup * correct attention expansion * TFSpeech2Text past_key_values with TF format
-
Yih-Dar authored
* use_cache = False for PT models if labels is passed * Fix for BigBirdPegasusForConditionalGeneration * add warning if users specify use_cache=True * Use logger.warning instead of warnings.warn Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
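An illustrative sketch (not the exact upstream diff) of the guard this change describes: with `labels` present the model is training, so caching past key values is pointless and is switched off, warning the user if they explicitly asked for it:
```python
import logging

logger = logging.getLogger(__name__)

def forward(self, input_ids=None, labels=None, use_cache=None, **kwargs):
    if labels is not None:
        if use_cache:
            logger.warning("The `use_cache` argument is changed to `False` since `labels` is provided.")
        use_cache = False
    # ... rest of the forward pass unchanged ...
```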
-
Suraj Patil authored
-
aaron authored
* electra is added to ONNX supported models * add google/electra-base-generator for testing the onnx module Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
-
- 07 Feb, 2022 12 commits
-
Michael Benayoun authored
* Change the way tracing happens, enabling dynamic axes out of the box * Update the tests and modeling xlnet * Add the non-recording of leaf modules to avoid recording more values for the methods to record than what will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors) * Comments and making tracing work for gpt-j and xlnet * Refactor things related to num_choices (and batch_size, sequence_length) * Update fx to work on PyTorch 1.10 * Postpone autowrap_function feature usage for later * Add copyrights * Remove unnecessary file * Fix issue with add_new_model_like * Apply suggestions
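A hedged sketch of what dynamic axes mean in practice, assuming the `transformers.utils.fx.symbolic_trace` helper (the argument names are assumptions): no fixed `batch_size`/`sequence_length` has to be baked into the trace anymore.
```python
from transformers import BertConfig, BertForSequenceClassification
from transformers.utils.fx import symbolic_trace

# a tiny randomly initialized model keeps the example fast
model = BertForSequenceClassification(BertConfig(num_hidden_layers=1))
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
# the resulting GraphModule accepts any batch size / sequence length
```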
-
Steven Liu authored
* 📝 add config section * 📝 finish first draft * 📝 add feature extractor and processor * 🖍 apply feedback from review * 📝 minor edits * last review
-
Yih-Dar authored
* fix outputs * fix for CTC * fix doc * make style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* add cross attn to outputs * add cross attn to outputs for TFLED * add undo padding * remove unused import * fix style Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
lewtun authored
-
François REMY authored
* Wav2Vec2 models must either throw or deal with add_adapter
* Add pre-add_adapter backwards compatibility
* Fix issue in tests/test_modeling_wav2vec2.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Anton Lozhkov authored
* Single-epoch run
* Apply suggestions from code review
* Infinite dataset
* Trainer fix + distributed benchmark
* Benchmark fix
* Unused import
* Interleaved splits
* has_length util
* Move to research projects
* Leftover Sized checks
* Bump min version
* Revert trainer changes
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Anton Lozhkov authored
* Unused import
* Make `has_length()` torch-independent to use in callbacks
* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
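A plausible sketch of the torch-independent `has_length()` helper the message describes, relying only on Python's `len()` protocol so callbacks can call it without importing torch:
```python
def has_length(dataset) -> bool:
    """Return True if `len(dataset)` is well defined and does not raise."""
    try:
        return len(dataset) is not None
    except TypeError:
        # e.g. an iterable-style dataset whose __len__ raises TypeError
        return False
```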
-
NielsRogge authored
* First draft
* Add conversion script
* Improve conversion script
* Improve docs and implement tests
* Define model output class
* Fix tests
* Fix more tests
* Add model to README
* Apply suggestions from code review
* Apply more suggestions from code review
* Rename dims to hidden_sizes
* Fix equivalence test
* Rename gamma to gamma_parameter
* Clean up conversion script
* Add ConvNextFeatureExtractor
* Add corresponding tests
* Implement feature extractor correctly
* Make implementation cleaner
* Add ConvNextStem class
* Improve design
* Update design to also include encoder
* Fix gamma parameter
* Use sample docstrings
* Finish conversion, add center cropping
* Replace nielsr by facebook, make feature extractor tests smaller
* Fix integration test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Patrick von Platen authored
* [torch_int_div] Correct true division in generation * up * up
-
Patrick von Platen authored
-
Patrick von Platen authored
This reverts commit 854a0d52.
-
- 04 Feb, 2022 6 commits
-
Stas Bekman authored
* [deepspeed docs] DeepSpeed ZeRO Inference
* Apply suggestions from code review
* tweak
* deal with black
* extra cleanup, better comments
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Standardize instance segmentation models outputs
* Rename output
* Update src/transformers/modeling_outputs.py
* Add legacy argument to the config and model forward
* Update src/transformers/models/beit/modeling_beit.py
* Copy fix in Segformer
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Stas Bekman authored
-
Yih-Dar authored
* fix * fix test * remove expected_num_hidden_layers Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Sanchit Gandhi authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 03 Feb, 2022 8 commits
-
Yih-Dar authored
* Remove return_loss from Flax models * fix more * fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Stas Bekman authored
-
davidleonfdez authored
* Add preprocess_logits_for_metrics Trainer param * Compute accuracy in LM examples * Improve comments
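A self-contained sketch of the new `preprocess_logits_for_metrics` Trainer parameter (the toy model and dataset are placeholders): shrinking logits to argmax ids before they are gathered keeps evaluation memory low when computing accuracy:
```python
import torch
from torch.utils.data import Dataset
from transformers import GPT2Config, GPT2LMHeadModel, Trainer, TrainingArguments

class ToyDataset(Dataset):
    def __len__(self):
        return 8
    def __getitem__(self, i):
        ids = torch.arange(16) % 50  # toy token ids
        return {"input_ids": ids, "labels": ids.clone()}

def preprocess_logits_for_metrics(logits, labels):
    if isinstance(logits, tuple):  # models may also return past_key_values
        logits = logits[0]
    return logits.argmax(dim=-1)  # never accumulate vocab-sized logits

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    preds, labels = preds[:, :-1], labels[:, 1:]  # shift for next-token prediction
    mask = labels != -100
    return {"accuracy": float((preds == labels)[mask].mean())}

model = GPT2LMHeadModel(GPT2Config(n_layer=1, n_head=1, n_embd=8, vocab_size=50))
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_eval_batch_size=4),
    eval_dataset=ToyDataset(),
    compute_metrics=compute_metrics,
    preprocess_logits_for_metrics=preprocess_logits_for_metrics,
)
print(trainer.evaluate())
```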
-
Stas Bekman authored
* [deepspeed] fix a bug in a test * consistency
-
NielsRogge authored
* Add general docstrings * Remove legacy docstrings * Add BEiT * Add DEiT * Add SegFormer * Fix beit output class * Fix missing return_dict
-
Patrick von Platen authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 02 Feb, 2022 8 commits
-
CHI LIU authored
* Correct eos_token_id set in generate * Set eos_token_id in test
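A brief usage sketch of the argument this fix touches: `eos_token_id` can be overridden per `generate()` call instead of relying on the model config default, and generation halts as soon as that token is produced:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0]))
```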
-
SaulLu authored
* change truncation_side in init of `PreTrainedTokenizerBase`
* add test
* Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`" (reverts commit 7a98b87962d2635c7e4d4f00db3948b694624843)
* fix kwargs
* Revert "fix kwargs" (reverts commit 67b0a5270e8cf1dbf70e6b0232e94c0452b6946f)
* Update tests/test_tokenization_common.py
* delete truncation_side variable
* reorganize test
* format
* complete doc
* Revert "Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`"" (reverts commit d5a10a7e2680539e5d9e98ae5d896c893d224b80)
* fix typo
* fix typos to render documentation
* Revert "Revert "Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`""" (reverts commit 16cf58811943a08f43409a7c83eaa330686591d0)
* format
Co-authored-by: LSinev <LSinev@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
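A short usage sketch of the new option: like `padding_side`, `truncation_side` can now be passed at init time, so overflowing tokens are cut from the left and the most recent tokens are kept:
```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased", truncation_side="left")
enc = tok("a very long dialogue history that will not fit", truncation=True, max_length=8)
# the *beginning* of the sequence was truncated
print(tok.convert_ids_to_tokens(enc["input_ids"]))
```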
-
Sylvain Gugger authored
* Playing * Properly set labels in model config for token classification example * Port to run_ner_no_trainer * Quality
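An illustrative sketch of "properly set labels in model config": fill `id2label`/`label2id` before instantiating the model so users see real tag names instead of `LABEL_0`/`LABEL_1` (the tag set below is a made-up example):
```python
from transformers import AutoConfig, AutoModelForTokenClassification

label_list = ["O", "B-PER", "I-PER"]  # assumed example tag set
config = AutoConfig.from_pretrained(
    "bert-base-cased",
    num_labels=len(label_list),
    id2label={i: l for i, l in enumerate(label_list)},
    label2id={l: i for i, l in enumerate(label_list)},
)
model = AutoModelForTokenClassification.from_pretrained("bert-base-cased", config=config)
```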
-
Ayush Chaurasia authored
# Add support for W&B hyperparameter sweep
This PR:
* allows using wandb for running hyperparameter search.
* The runs are visualized on the W&B sweeps dashboard.
* This supports running sweeps on parallel devices, all reporting to the same central dashboard.
### Usage
**To run a new hyperparameter search:**
```
trainer.hyperparameter_search(
    backend="wandb",
    project="transformers_sweep",  # name of the project
    n_trials=5,
    metric="eval/loss",  # metric to be optimized, default 'eval/loss'. A warning is raised if the passed metric is not found.
)
```
This outputs a sweep id, e.g. `my_project/sweep_id`.
**To run sweeps on parallel devices:** just pass the sweep id you want to run in parallel:
```
trainer.hyperparameter_search(backend="wandb", sweep_id="my_project/sweep_id")
```
-
Sylvain Gugger authored
-
bugface authored
* fix error posted in issue #15448
* clean up - remove commented line
Signed-off-by: bugface <alexgre@ufl.edu>
-
Sylvain Gugger authored
* Allow dynamic modules to use relative imports
* Work for configs
* Fix last merge conflict
* Save code of registered custom objects
* Map strings to strings
* Fix test
* Add tokenizer
* Rework tests
* Tests
* Ignore fixtures py files for tests
* Tokenizer test + fix collection
* With full path
* Rework integration
* Fix typo
* Remove changes in conftest
* Test for tokenizers
* Add documentation
* Update docs/source/custom_models.mdx
* Add file structure and file content
* Add more doc
* Style
* Address review comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
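A hedged sketch of the workflow "Save code of registered custom objects" enables: register a custom config/model for an auto class so their defining files ship with the checkpoint, then load them back with `trust_remote_code` (class and repo names below are hypothetical):
```python
from transformers import PretrainedConfig, PreTrainedModel

class MyConfig(PretrainedConfig):
    model_type = "my-model"

class MyModel(PreTrainedModel):
    config_class = MyConfig

# saving/pushing the model now also copies the files defining MyConfig/MyModel
MyConfig.register_for_auto_class()
MyModel.register_for_auto_class("AutoModel")

# later, anyone can load it, explicitly opting in to run the bundled code:
# model = AutoModel.from_pretrained("user/my-model", trust_remote_code=True)
```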
-
Nicolas Patry authored
* Adding support for `microphone` streaming within pipeline.
  - Uses `ffmpeg` to get microphone data.
  - Makes sure alignment is made to `size_of_sample`.
  - Works by sending `{"raw": ..data.., "stride": (n, left, right), "partial": bool}` directly to the pipeline, enabling partial results to be streamed while still getting inference.
  - Lets `partial` information flow through the pipeline so the caller can get it back and choose whether to display the text or not.
  - The striding reconstitution is bound to have errors since CTC does not keep previous state; currently most of the errors are that we don't know whether there is a space between two chunks. Since we have some left-striding info, we could use it during decoding to decide what to do with those spaces and maybe even extra letters (if the stride is long enough, it is bound to cover at least a few symbols).
  - Fixing tests. Protecting with `require_torch`. `raw_ctc` support for nicer demo. Post-rebase fixes. Revamp to split raw_mic_data from its live chunking (requires a refactor to make everything a bit cleaner). Automatic resampling. Small fixes.
* Post rebase fix (need to let super handle more logic, reorder args)
* Update docstrings
* Docstring format
* Remove print
* Prevent flow of `input_values`
* Fixing `stride` too
* Fixing the PR by removing `raw_ctc`
* Better docstrings
* Fixing init
* Update src/transformers/pipelines/audio_utils.py
* Update tests/test_pipelines_automatic_speech_recognition.py
* Quality
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
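A hedged sketch of the streaming usage this PR describes, assuming the `ffmpeg_microphone_live` helper in `audio_utils` and a working `ffmpeg` install (argument names are inferred from the commit message, not verified):
```python
from transformers import pipeline
from transformers.pipelines.audio_utils import ffmpeg_microphone_live

asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")

mic = ffmpeg_microphone_live(
    sampling_rate=asr.feature_extractor.sampling_rate,
    chunk_length_s=5.0,   # audio window sent to the model
    stream_chunk_s=1.0,   # emit a (possibly partial) result every second
)
for item in asr(mic):
    # partial chunks may still be refined as more audio arrives
    print(item["text"])
```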
-