Commits · 97f9b8a27b80cdaf0eea9c18eba63960b1c34ed3 · chenpangpang / transformers

28 Feb, 2022 4 commits

Fixing the timestamps with chunking. (#15843) · 97f9b8a2

Nicolas Patry authored Feb 28, 2022



* Fixing the timestamps with chunking.

* The changes modified (and fixed) the striding tests.

* Adding a tokenizer test.

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Defense -> comment.

* Update src/transformers/models/wav2vec2/tokenization_wav2vec2.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

97f9b8a2
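
The commit message above does not describe the fix itself. As a rough illustration of why chunking affects timestamps, the sketch below drops the overlapping (strided) frames of each chunk before joining them, so that a frame index in the joined sequence maps directly to a time. The chunk layout, the helper names `trim_strides` and `frame_to_seconds`, and the numbers used are all hypothetical and are not taken from the pipeline code.

```python
def trim_strides(chunks):
    """Drop the strided frames of each chunk and join what is left.

    Each chunk is a tuple (frames, stride_left, stride_right), where
    `frames` is a list of per-frame token ids. This layout is an assumed
    simplification for illustration only.
    """
    joined = []
    for frames, stride_left, stride_right in chunks:
        end = len(frames) - stride_right
        joined.extend(frames[stride_left:end])
    return joined


def frame_to_seconds(frame_index, samples_per_frame, sampling_rate):
    """Convert a frame index in the joined sequence to seconds."""
    return frame_index * samples_per_frame / sampling_rate


# Two chunks of six frames that overlap by two frames on the shared side.
chunks = [
    ([1, 1, 0, 2, 2, 0], 0, 2),  # first chunk: drop 2 frames on the right
    ([2, 0, 3, 3, 0, 0], 2, 0),  # second chunk: drop 2 frames on the left
]
joined = trim_strides(chunks)

# Assumed values: 320 audio samples per frame, 16 kHz audio.
start_of_frame_4 = frame_to_seconds(4, samples_per_frame=320, sampling_rate=16000)
```

Once the strided frames are gone, the indices in `joined` are contiguous, so no per-chunk offset has to be added afterwards.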

Fix (deprecated) ONNX exporter to account for new tf2onnx API (#15856) · 410e26c7
lewtun authored Feb 28, 2022
```
* Fix (deprecated) ONNX exporter to account for new tf2onnx API
```
410e26c7

Flax Speech-Encoder-Decoder Model (#15613) · e3342edc

Sanchit Gandhi authored Feb 28, 2022



* rebase

* Delete shift tokens func

* downsample decoder input seq len for init

* correct attention mask

* add tests

* pt flax cross test

* make fixup

* init file for import

* change pt-flax cross test threshold

* pt-flax test logits only

* move tests

* make repo-consistency

* consistent indentation
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

e3342edc

[UniSpeechSat] correct unispeech sat (#15847) · 935a76d9
Patrick von Platen authored Feb 28, 2022

935a76d9

25 Feb, 2022 11 commits

Add TFConvNextModel (#15750) · 84eaa6ac

Sayak Paul authored Feb 25, 2022

* feat: initial implementation of convnext in tensorflow.

* fix: sample code for the classification model.

* chore: added checked for from the classification model.

* chore: set bias initializer in the classification head.

* chore: updated license terms.

* chore: removed unused imports

* feat: enabled argument during using drop_path.

* chore: replaced tf.identity with layers.Activation(linear).

* chore: edited default checkpoint.

* fix: minor bugs in the initializations.

* partial-fix: tf model errors for loading pretrained pt weights.

* partial-fix: call method updated

* partial-fix: cross loading of weights (4x3 variables to be matched)

* chore: removed unneeded comment.

* removed playground.py

* rebasing

* rebasing and removing playground.py.

* fix: renaming TFConvNextStage conv and layer norm layers

* chore: added initializers and other minor additions.

* add: tests for convnext.

* fix: integration tester class.

* fix: issues mentioned in pr feedback (round 1).

* fix: how output_hidden_states arg is propagated inside the network.

* feat: handling of arg for pure cnn models.

* chore: added a note on equal contribution in model docs.

* rebasing

* rebasing and removing playground.py.

* feat: encapsulation for the convnext trunk.

* Fix variable naming; Test-related corrections; Run make fixup

* chore: added Joao as a contributor to convnext.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: corrected copyright year and added comment on NHWC.

* chore: fixed the black version and ran formatting.

* chore: ran make style.

* chore: removed from_pt argument from test, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* fix: tests in the convnext subclass, ran make style.

* rebasing

* rebasing and removing playground.py.

* rebasing

* rebasing and removing playground.py.

* chore: moved convnext test to the correct location

* fix: locations for the test file of convnext.

* fix: convnext tests.

* chore: applied sgugger's suggestion for dealing w/ output_attentions.

* chore: added comments.

* chore: applied updated quality environment style.

* chore: applied formatting with quality environment.

* chore: revert to the previous tests/test_modeling_common.py.

* chore: revert to the original test_modeling_common.py

* chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py

* fix: tests for convnext.

* chore: removed output_attentions argument from convnext config.

* chore: revert to the earlier tf utils.

* fix: output shapes of the hidden states

* chore: removed unnecessary comment

* chore: reverting to the right test_modeling_tf_common.py.

* Styling nits
Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

84eaa6ac

Framework split model report (#15825) · 0b5bf6ab
Lysandre Debut authored Feb 25, 2022

0b5bf6ab

Re-enable doctests for the quicktour (#15828) · 0118c4f6

Sylvain Gugger authored Feb 25, 2022

* Re-enable doctests for the quicktour

* Re-enable doctests for task_summary (#15830)

* Remove &

0118c4f6

Add ONNX Runtime quantization for text classification notebook (#15817) · fd5b05eb
Ella Charlaix authored Feb 25, 2022

fd5b05eb
[examples/summarization and translation] fix readme (#15833) · bf1fe328
Suraj Patil authored Feb 25, 2022

bf1fe328

Fix tf.concatenate + test past_key_values for TF models (#15774) · 8635407b

Yih-Dar authored Feb 25, 2022

* fix wrong method name tf.concatenate

* add tests related to causal LM / decoder

* make style and quality

* clean-up

* Fix TFBertModel's extended_attention_mask when past_key_values is provided

* Fix tests

* fix copies

* More tf.int8 -> tf.int32 in TF test template

* clean-up

* Update TF test template

* revert the previous commit + update the TF test template

* Fix TF template extended_attention_mask when past_key_values is provided

* Fix some styles manually

* clean-up

* Fix ValueError: too many values to unpack in the test

* Fix more: too many values to unpack in the test

* Add a comment for extended_attention_mask when there is past_key_values

* Fix TFElectra extended_attention_mask when past_key_values is provided

* Add tests to other TF models

* Fix for TF Electra test: add prepare_config_and_inputs_for_decoder

* Fix not passing training arg to lm_head in TFRobertaForCausalLM...

8635407b

HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824) · 4818bf7a
Pavel Belevich authored Feb 25, 2022

4818bf7a

Adding the option to return_timestamps on pure CTC ASR models. (#15792) · ad0d7d17

Nicolas Patry authored Feb 25, 2022



* Adding the option to return_timestamps on pure CTC ASR models.

* Remove `math.prod` which was introduced in Python 3.8

* int are not floats.

* Reworking the PR to support "char" vs "word" output.

* Fixup!

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Quality.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

ad0d7d17
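
One bullet in the commit above removes `math.prod` because it only exists from Python 3.8 onwards. A minimal sketch of a replacement that also runs on older interpreters is shown below; the helper name `prod` is an assumption and this is not necessarily what the commit used.

```python
from functools import reduce
import operator


def prod(values, start=1):
    """Multiply all items together, like math.prod on Python 3.8+."""
    return reduce(operator.mul, values, start)


shape = (2, 3, 4)
total = prod(shape)  # 2 * 3 * 4
empty = prod([])     # falls back to the start value
```

`reduce` with an explicit start value keeps the empty-input behaviour of `math.prod`, which returns 1 for an empty iterable.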

Add model specific output classes to PoolFormer model docs (#15746) · 7566734d
Tanay Mehta authored Feb 25, 2022
```
* Added model specific output classes to poolformer docs

* Fixed Segformer typo in Poolformer docs
```
7566734d
Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776) · 7963578f
Pavel Belevich authored Feb 25, 2022

7963578f
Fix semantic segmentation pipeline test (#15826) · 074645e3
Sylvain Gugger authored Feb 25, 2022

074645e3

24 Feb, 2022 7 commits
- Fix the push run (#15807) · b7e292ae
  Lysandre Debut authored Feb 24, 2022
  
  b7e292ae
- [TFXLNet] Correct tf xlnet generate (#15822) · cbf43911
  Patrick von Platen authored Feb 24, 2022
```
* [TFXLNet] Correct tf xlnet

* adapt test comment
```
  cbf43911
- [Barthez Tokenizer] Fix saving (#15815) · 2f0f9038
  Patrick von Platen authored Feb 24, 2022
  
  2f0f9038
- [Unispeech] Fix slow tests (#15818) · ca57b450
  Patrick von Platen authored Feb 24, 2022
```
* remove soundfile old way of loading audio

* Adapt slow test
```
  ca57b450
- Revert changes in logit size for semantic segmentation models (#15722) · 35ecf99c
  Sylvain Gugger authored Feb 24, 2022
```
* Revert changes in logit size for semantic segmentation models

* Address review comments
```
  35ecf99c
- Fix from_pretrained with default base_model_prefix (#15814) · d1fcc90a
  Sylvain Gugger authored Feb 24, 2022
  
  d1fcc90a
- Fix add-new-model-like when old model checkpoint is not found (#15805) · 7f921bcf
  Sylvain Gugger authored Feb 24, 2022
```
* Fix add-new-model-like command when old checkpoint can't be recovered

* Style
```
  7f921bcf
23 Feb, 2022 18 commits

Fix model templates (#15806) · bb7949b3
Lysandre Debut authored Feb 23, 2022
```
* Fix model templates

* Update paths
```
bb7949b3
Docker images should only run on a daily basis · 309e87e2
Lysandre authored Feb 23, 2022

309e87e2
Scheduled tests should only run on a daily basis · c475f3ce
Lysandre authored Feb 23, 2022

c475f3ce
Fix build_documentation CI (#15803) · 6336017c
Eliott C authored Feb 23, 2022

6336017c
[Test refactor 5/5] Build docker images (#15729) · a0e34806
Lysandre Debut authored Feb 23, 2022

a0e34806
[Test refactor 4/5] Improve the scheduled tests (#15728) · 4c737f0e
Lysandre Debut authored Feb 23, 2022

4c737f0e

[Test refactor 3/5] Notification service improvement (#15727) · d3ae2bd3

Lysandre Debut authored Feb 23, 2022



* Per-folder tests reorganization

* Review comments
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>

d3ae2bd3

[Test refactor 2/5] Tests fetcher (#15726) · 0400b226

Lysandre Debut authored Feb 23, 2022



* Tests fetcher

* Review comments
Co-authored-by: sgugger <sylvain.gugger@gmail.com>

0400b226

[Test refactor 1/5] Per-folder tests reorganization (#15725) · 29c10a41

Lysandre Debut authored Feb 23, 2022



* Per-folder tests reorganization
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>

29c10a41

🧼 NLP task guides (#15731) · fecb08c2

Steven Liu authored Feb 23, 2022

* clean commit of changes to NLP tasks

* 🖍 apply feedback

* 📝 move tf data collator in multiple choice
Co-authored-by: Steven <stevhliu@gmail.com>

fecb08c2

Fix indent in doc-builder CI (#15798) · 86636f52
Eliott C authored Feb 23, 2022

86636f52

HTML dev docs (#15678) · a1efc823

Eliott C authored Feb 23, 2022


Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>

a1efc823

Align documentation with code defaults (#15468) · 3f76bf54
lsb authored Feb 23, 2022
```
In the code, `do_normalize` defaults to True
```
3f76bf54

[doc] custom_models: mention security features of the Hub (#15768) · 32f5de10

Julien Chaumond authored Feb 23, 2022



* custom_models: tiny doc addition

* mention security feature earlier in the section
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

32f5de10

Enable `image-segmentation` on `AutoModelForSemanticSegmentation` (#15647) · 9e71d464

Nicolas Patry authored Feb 23, 2022

* Enabling Beit SegFormer to `image-segmentation`.

* Fixing the score.

* Fix import ?

* Missing in type hint.

* Multiple test fixes:

- Add `raw_image` support. It should be the default IMHO since in Python
  world it doesn't make any sense to base64 encode the image (Sorry
  @mishig, didn't catch that in my review). I really think we should
  consider breaking BC here.
- Add support for Segformer tiny test (needed
  `SegformerModelTester.get_config` to enable TinyConfig
  @NielsRogge)
- Add the check that `batch_size` works correctly on that pipeline.
  Uncovered that it doesn't for Detr, which IMO is OK since images
  after `feature_extractor` don't have the same size. Comment should
  explain.

* Type hint as a string.

* Make fixup + update black.

* torch+vision protections.

* Don't use torchvision, use F.interpolate instead (no new dep).

* Last fixes for Segformer.

* Update test to reflect new image (which was broken)

* Update tests.

* Major BC modification:

- Removed the compressed PNG string; that's a job for users.
  `transformers` stays in Python land.
- Removed the `score` for semantic segmentation. It has hardly a meaning
  on its own in this context.
- Don't include the grayscale with logits for now (which could enable
  users to get a sense of confidence). Might be done later.
- Don't include the surface of the mask (could be used for sorting by
  users, to filter out small masks). It's already calculable, and
  it's easier to add later, than to add now and break later if we need.

* `make fixup`.

* Small changes.

* Rebase + doc fixup.

9e71d464

[ViLT] Fix checkpoint url in config (#15790) · 1b239797

Suraj Patil authored Feb 23, 2022



* [ViLT] Fix checkpoint url in config

* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

1b239797

[CLIP] fix grad ckpt (#15789) · de737866
Suraj Patil authored Feb 23, 2022

de737866
Supporting Merges.txt files that contain an endline. (#15782) · a3e607d1
Nicolas Patry authored Feb 23, 2022
```
(`hf-internal-testing/tiny-clip` for instance)
```
a3e607d1
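
The last commit above concerns merges files that end with a newline. The sketch below shows one way to read such a file so that a trailing newline makes no difference. It assumes a file with an optional `#version` header on the first line followed by one space-separated pair per line; the function name `parse_merges` is hypothetical and this is not the tokenizer's actual code.

```python
def parse_merges(text):
    """Parse BPE merge rules, skipping the header and any blank lines."""
    merges = []
    for index, line in enumerate(text.split("\n")):
        line = line.strip()
        if not line:
            # Blank lines, including the one after a final newline.
            continue
        if index == 0 and line.startswith("#version"):
            # Optional header line (assumed format).
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"Malformed merge rule on line {index + 1}: {line!r}")
        merges.append((parts[0], parts[1]))
    return merges


with_newline = parse_merges("#version: 0.2\nh e\nl l\n")
without_newline = parse_merges("#version: 0.2\nh e\nl l")
```

Skipping empty lines, rather than slicing off a fixed number of lines at the end, gives the same result for both inputs.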