Commits · 961732c276241620e80a81ec28e95374963a166e · chenpangpang / transformers

08 Dec, 2021 2 commits

[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339) · 961732c2

Patrick von Platen authored Dec 08, 2021



* up

* up

* up

* make it cleaner

* correct

* make styhahalal

* add more tests

* finish

* small fix

* make style

* up

* tryout to solve cicrle ci

* up

* fix more tests

* fix more tests

* apply sylvains suggestions

* fix import

* correct docs

* add pyctcdecode only to speech tests

* fix more tests

* add tf, flax and pt tests

* add pt

* fix last tests

* fix more tests

* Apply suggestions from code review

* change lines

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* correct tests

* correct tests

* add doc string
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

961732c2

Fixing Dataset for TQA + token-classification. (#14658) · 2e12d90b

Nicolas Patry authored Dec 08, 2021

* Fixing Dataset for TQA + token-classification.

* Fixing the tests.

* Making sure `offset_mappings` is a valid argument.

2e12d90b

07 Dec, 2021 2 commits

[deepspeed] fix --load_best_model_at_end (#14652) · b66c5ab2

Stas Bekman authored Dec 06, 2021

* [deepspeed] fix load_best_model_at_end

* try with pull_request_target

* revert: try with pull_request_target

* style

* add test

* cleanup

b66c5ab2

Add mLUKE (#14640) · 30646a0a

Ryokan RI authored Dec 07, 2021

* implement MLukeTokenizer and LukeForMaskedLM

* update tests

* update docs

* add LukeForMaskedLM to check_repo.py

* update README

* fix test and specify the entity pad id in tokenization_(m)luke

* fix EntityPredictionHeadTransform

30646a0a

06 Dec, 2021 3 commits

Use cross_attention_hidden_size in Encoder-Decoder models (#14378) · 4cdb67ca

Yih-Dar authored Dec 07, 2021



* add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax)

* for TFEncoderDecoderModel

* add equivalence test for TFEncoderDecoderModel

* fix

* fix failed equivalence tests

* remove unused import

* add detailed comment

* Fix check_equivalence_tf_to_pt by using encoder/decoder

* cleaning

* Use cross_attention_hidden_size in speech-to-text

* clean fast init logging msg in encoder decoder models

* increase tol from 1e-5 to 1e-3 for tf test

* style

* style

* make sure projection layer can run

* remove type conversion + add check

* fix conflict (config.output_hidden_size)

* Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

4cdb67ca

Auto processor fix (#14623) · e9688875

Lysandre Debut authored Dec 06, 2021



* Add AutoProcessor class
Init and tests
Add doc
Fix init
Update src/transformers/models/auto/processing_auto.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Reverts to tokenizer or feature extractor when available
Adapt test

* Revert "Adapt test"

This reverts commit bbdde5fab02465f24b54b227390073082cb32093.

* Revert "Reverts to tokenizer or feature extractor when available"

This reverts commit 77659ff5d21b6cc0baf6f443017e35e056a525bb.

* Don't revert everything Lysandre!
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

e9688875

Add GPTJForQuestionAnswering (#14503) · 0f3f045e

tucan9389 authored Dec 07, 2021



* Add GPTJForQuestionAnswering

* Reformat for GPTJForQuestionAnswering

* Fix isort error

* make style for GPTJForQA

* Add _keys_to_ignore_on_load_missing

* Change the sequence of qa and classification
Co-authored-by: Suraj Patil <surajp815@gmail.com>

0f3f045e

03 Dec, 2021 2 commits

[trainer] add tf32-mode control (#14606) · 71b1bf7e

Stas Bekman authored Dec 03, 2021



* [trainer] add --tf32 support

* it's pt>=.17

* it's pt>=.17

* flip the default to True

* add experimental note

* simplify logic

* style

* switch to 3-state logic

* doc

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* re-style code
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

71b1bf7e

Improve tokenizer tests (#13594) · 66ea7391

Li-Huai (Allan) Lin authored Dec 03, 2021

* Use new method to acquire tokenizers

* Resolve TODOs.

* Style

* Fix

* Enable do_lower_case in test_tokenize_special_tokens

* Apply suggestion from code review

* Fix mask token handling

* Revert "Fix mask token handling"

This reverts commit daaa3f5291b1f71e5bc3604ca281c000000c4648.

* Fix FNet mask token tokenization

* Complete everything

* Apply suggestions from code review

66ea7391

02 Dec, 2021 2 commits

fix #14524 (IndexError when mask prob is too low) (#14525) · 6645eb61

Nik authored Dec 02, 2021

* fix #14524 (IndexError when mask prob is too low)

* fix formatting

* correct documentation, add option for setting min_num_masks

* change the semantic meaning of `mask_prob` in _compute_mask_indices

With this commit the meaing of `mask_prob` actually adhered to the probability for each
vector to be the start of a masked span of length.

* fix check_copies test

* fix documentation to semantic meaning of `upper bound of overall masking percentage`, revert changes to _compute_mask_indices

* fix typo

6645eb61

[Flax] Add FlaxBlenderbotSmall (#14576) · 50d909be

Daniel Stancl authored Dec 02, 2021



* [WIP] Add FlaxBlenderbotSmall

* Revert some unintentionally changed files

Revert some unintentionally files changed by improperly filled cookiecutter instructions.

* Fix repo consistency

* Fix Flax-PT equivalence

* Apply suggestions from code review

* Update index.mdx

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

50d909be

01 Dec, 2021 2 commits

FlaxGPTJ (#14396) · 4c0dd199

Suraj Patil authored Dec 01, 2021

* add flax gptj

* no bias in attention dense

* no wpe

* fix rotary embeddings

* fix rotary embeds

* fix rotray embeds

* quality

* doc and quality

* fix equivalence tests

4c0dd199

WIP: Support for Training with BF16 (#13207) · 70996a54

Jamie DeAntonis authored Nov 30, 2021



* started bf16 integration

* minor changes

* code now runs

* style

* lay foundation for bf16 testing

* lay foundation for bf16 testing

* start the tests

* better bf16 check

* style

* 2 separate checkers - one for bf16 support, another for bf16+autocast

* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* a couple of comment resolutions

* more comment resolutions

* resolved a small bug

* just some print statemtns

* added todo marking

* added a todo

* adjust for API change s/fast_dtype/dtype/

* fix style

* merge 2 bf16 util functions

* bf16 now does scaling too

* Add support for bfloat16

* Revert T5 layernorm to float32

This is based on the comment at https://github.com/huggingface/transformers/pull/14448/files#r752660929 and the PyTorch PR https://github.com/pytorch/pytorch/pull/66920

 .

* Add comment about conversion to float32 before returning the numpy data

* Add comment about AMP-bfloat16 incompatibility

* Fix formatting

* typo

* reformer / bf16

* cleanup

* require at least pt-1.10

* fix

* will deal with deepspeed separately

* cleanup

* revert

* cleanup

* fp16_full_eval and bf16_full_eval are separate modes

* proper deprecation

* cleanup

* test and fixes

* spelling

* cleanup

* add a note that this API is experimental
Co-authored-by: jamie <jamie@cortx.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: suriya <suriya@cortx.com>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>

70996a54

30 Nov, 2021 3 commits

VisionTextDualEncoder (#13511) · fc1d97f2

Suraj Patil authored Nov 30, 2021



* init vision_text_dual_encoder

* fix merge

* remove extra heads

* fix tests

* remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP

* remove archive map

* fix imports

* fix more imports

* fix init

* delete tokenizers

* fix imports

* clean

* support clip's vision model

* handle None config

* begin tests

* more test and few fixes

* warn about newly init weights

* more tests

* add loss to model

* remove extra classes from doc

* add processor

* doc and small fixes

* add start docstr

* update flax model

* flax tests

* more flax tests

* doc

* quality

* doc and quality

* fix doc

* doc

* remove comments

* update warning

* quality

* fix docs

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* replace asserts, fix imports

* update imports

* fix import

* address some review comments

* fix check

* reduce tolerance

* fix test

* add flax integration test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address Sylvain's comments

* fix style

* add pt_flax_equivalence test in PT tests

* add pt integration test

* update test

* use pre-trained checkpoint in examples
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

fc1d97f2

[Flax] Add FlaxBlenderbot (#13633) · faacd747

Daniel Stancl authored Nov 30, 2021



* Init Flax implementation for Blenderbot

* Add a majority of stuff except for tests

* make style quality

* Add tests and fix some bugs

* Add tests

* Clean source code and fix some bugs

* Fix copies and docs

* Fix jax device condition for tests

* Fix layer norm in the encoder

* Fix a few typos in the test file

* make fix-copies

* make fix-copies

* fix layer norm

* Fix Flax params dtype (#13090)

* Fix PR reference (#13098)

* make fix-copies

* Update tests/test_modeling_flax_blenderbot.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

faacd747

Tapas tf (#13393) · c468a87a

Kamal Raj authored Nov 30, 2021

* TF Tapas first commit

* updated docs

* updated logger message

* updated pytorch weight conversion
script to support scalar array

* added use_cache to tapas model config to
work properly with tf input_processing

* 1. rm embeddings_sum
2. added # Copied
3. + TFTapasMLMHead
4. and lot other small fixes

* updated docs

* + test for tapas

* updated testing_utils to check
is_tensorflow_probability_available

* converted model logits post processing using
numpy to work with both PT and TF models

* + TFAutoModelForTableQuestionAnswering

* added TF support

* added test for
TFAutoModelForTableQuestionAnswering

* added test for
TFAutoModelForTableQuestionAnswering pipeline

* updated auto model docs

* fixed typo in import

* added tensorflow_probability to run tests

* updated MLM head

* updated tapas.rst with TF  model docs

* fixed optimizer import in docs

* updated convert to np
data from pt model is not
`transformers.tokenization_utils_base.BatchEncoding`
after pipeline upgrade

* updated pipeline:
1. with torch.no_gard removed, pipeline forward handles
2. token_type_ids converted to numpy

* updated docs.

* removed `use_cache` from config

* removed floats_tensor

* updated code comment

* updated Copyright Year and
logits_aggregation Optional

* updated docs and comments

* updated docstring

* fixed model weight loading

* make fixup

* fix indentation

* added tf slow pipeline test

* pip upgrade

* upgrade python to 3.7

* removed from_pt from tests

* revert commit f18cfa9

c468a87a

29 Nov, 2021 1 commit
- Rename ImageGPT (#14526) · 25156eb2
  NielsRogge authored Nov 29, 2021
```
* Rename

* Add MODEL_FOR_CAUSAL_IMAGE_MODELING_MAPPING
```
  25156eb2
25 Nov, 2021 1 commit
- Fix a slow test. (#14527) · 04683c06
  Nicolas Patry authored Nov 25, 2021
  
  04683c06
24 Nov, 2021 1 commit
- [Tests] Improve vision tests (#14458) · 3772af49
  NielsRogge authored Nov 24, 2021
```
* Improve tests

* Install vision for tf tests
```
  3772af49
23 Nov, 2021 1 commit

[deepspeed] zero inference (#14253) · 956a4831

Stas Bekman authored Nov 23, 2021



* [deepspeed] zero inference

* only z3 makes sense for inference

* fix and style

* docs

* rework

* fix test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* responding to suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

956a4831

22 Nov, 2021 2 commits

Auto processor (#14465) · 204d2513

Sylvain Gugger authored Nov 22, 2021



* Add AutoProcessor class

* Init and tests

* Add doc

* Fix init

* Update src/transformers/models/auto/processing_auto.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverts to tokenizer or feature extractor when available

* Adapt test
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

204d2513

Moving pipeline tests from `Narsil` to `hf-internal-testing`. (#14463) · a4553e6c
Nicolas Patry authored Nov 22, 2021
```
* Moving everything to `hf-internal-testing`.

* Fixing test values.

* Moving to other repo.

* Last touch?
```
a4553e6c

19 Nov, 2021 4 commits

Add QDQBert model and quantization examples of SQUAD task (#14066) · a59e7c1e

Shang Zhang authored Nov 19, 2021



* clean up branch for add-qdqbert-model

* README update for QAT example; update docstrings in modeling_qdqbert.py

* Update qdqbert.rst

* Update README.md

* Update README.md

* calibration data using traning set; QAT example runs in fp32

* re-use BERTtokenizer for qdqbert

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove qdqbert tokenizer

* Update qdqbert.rst

* update evaluate-hf-trt-qa.py

* update configuration_qdqbert.py

* update modeling_qdqbert.py: add copied statement; replace assert with ValueError

* update copied from statement

* add is_quantization_available; run make fix-copies

* unittest add require_quantization

* add backend dependency to qdqbert model

* update README; update evaluate script; make style

* lint

* docs qdqbert update

* circleci build_doc add pytorch-quantization for qdqbert

* update README

* update example readme with instructions to upgrade TensorRT to 8.2

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* change quantization to pytorch_quantization for backend requirement

* feed_forward_chunking not supported in QDQBert

* make style

* update model docstrings and comments in testing scripts

* rename example to quantization-qdqbert; rename example scripts from qat to quant

* Update src/transformers/models/qdqbert/modeling_qdqbert.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* rm experimental functions in quant_trainer

* qa cleanup

* make fix-copies for docs index.rst

* fix doctree; use post_init() for qdqbert

* fix early device assignment for qdqbert

* fix CI:Model templates runner
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

a59e7c1e

Adding support for `hidden_states` and `attentions` in unbatching (#14420) · 81fe8afa
Nicolas Patry authored Nov 19, 2021
```
support.
```
81fe8afa
[Generation] Allow `inputs_embeds` as an input (#14443) · f25a9332
Patrick von Platen authored Nov 19, 2021
```
* up

* finalize

* finalize

* finish

* Update src/transformers/generation_utils.py

* apply feedback
```
f25a9332
[ImageGPT] Small fixes (#14460) · 0490b988
NielsRogge authored Nov 19, 2021
```
* Add integration test

* Fix typo
```
0490b988

18 Nov, 2021 3 commits

Fix finite IterableDataset test on multiple GPUs (#14445) · 83ef8bca
Sylvain Gugger authored Nov 18, 2021

83ef8bca

Add ImageGPT (#14240) · da36c557

NielsRogge authored Nov 18, 2021

* First draft

* More improvements

* Improve conversion script

* Fix init weights for layer norm

* Fix correct model for conversion script

* Don't tie input and output embeddings

* Add print statements for debugging

* Add print statements for debugging

* Fix vocab size of model

* Improve documentation, remove fast tokenizer

* Add ImageGPTForImageClassification, improve docs

* Fix docs issue

* Set verbosity level back to info

* Improve tests

* Fix tests and add figure

* Delete tokenizer file

* Remove ImageGPTTokenizer from init files

* Remove ImageGPTLayer from init files

* Remove ImageGPT tokenizer from docs

* First draft of ImageGPTFeatureExtractor

* Fix typo

* Fix bug

* More improvements

* Apply suggestions from code review, add tests for feature extractor

* Fix layernorm

* Update save_pretrained method

* Fix issue

* Make all tests of ImageGPTFeatureExtractor pass

* Update code examples

* Rename model inputs to pixel_values

* Improve code examples

* Update init_weights to post_init

* Fix post_init

da36c557

Add a post init method to all models (#14431) · d83b0e0c

Sylvain Gugger authored Nov 18, 2021

* Add a post init method to all models

* Fix tests

* Fix last tests

* Fix templates

* Add comment

* Forgot to save

d83b0e0c

17 Nov, 2021 3 commits

[WIP] Ensure TF model configs can be converted to proper JSON (#14415) · 1991da07

N authored Nov 17, 2021



* test: make sure model configs are jsonifiable

* fix: return python dict instead of config object

* fix: accept pretrained config and use correct class

* Re-enabling slow tests and applying them to core models only

* Re-enabling slow tests and applying them to core models only

* Add new test file to fetcher

* Remove tooslow tests from test_modeling_tf_common.py

* make style

* Style fixes

* Style fixes

* Style fixes

* Style fixes

* Adding core tests to GPT2 and BART

* Removing unused imports
Co-authored-by: niklas.fruehauf <niklas.fruehauf@sovanta.com>
Co-authored-by: matt <rocketknight1@gmail.com>

1991da07

Improve semantic segmentation models (#14355) · a2864a50

NielsRogge authored Nov 17, 2021

* Improve tests

* Improve documentation

* Add ignore_index attribute

* Add semantic_ignore_index to BEiT model

* Add segmentation maps argument to BEiTFeatureExtractor

* Simplify SegformerFeatureExtractor and corresponding tests

* Improve tests

* Apply suggestions from code review

* Minor docs improvements

* Streamline segmentation map tests of SegFormer and BEiT

* Improve reduce_labels docs and test

* Fix code quality

* Fix code quality again

a2864a50

[Wav2Vec2] Add New Wav2Vec2 Translation (#14392) · 700a748f

Patrick von Platen authored Nov 17, 2021

* add new wav2vec2 translation

* correct

* up

* add tests

* correct end copy

* correct more

* up

* correct unispeech sat

* finish

* finalize

* finish

* up

700a748f

16 Nov, 2021 2 commits

Avoid looping when data exhausted (#14413) · a33168aa

Valentin authored Nov 16, 2021

* stop training when a finite IterableDataset is exhausted

when using an iterable dataset num_epochs is set to
sys.maxsize to make sure all data is consumed
likewise we want to set max_steps high enough
but still stop when all data is consumed

(cherry picked from commit 6f0e1d6363153da9051e93acffe1cbab3a3f3b12)

* fix typo flase -> false

* add test for stopping training on exhausted finite iterable dataset

* remove redundant gradient_accumulation_steps

* run make style

reformat training_args docstring

a33168aa

Fix gradient_checkpointing backward compatibility (#14408) · 040fd471

Sylvain Gugger authored Nov 16, 2021



* Fix gradient_checkpointing backward compatibility

* Remove needless line

* make sure mask prob is big enough and length small enough

* Fix tests
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

040fd471

15 Nov, 2021 4 commits

Allow per-version configurations (#14344) · 1cc453d3

Lysandre Debut authored Nov 15, 2021



* Allow per-version configurations

* Update tests/test_configuration_common.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_configuration_common.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

1cc453d3

Fix weight loading issue (#14016) · a67d47b4
Yih-Dar authored Nov 15, 2021
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
a67d47b4
Fix test and docs (#14399) · 74e6111b
NielsRogge authored Nov 15, 2021

74e6111b

[Speech2Text2] Enable tokenizers (#14390) · 4ce74edf

Patrick von Platen authored Nov 15, 2021



* [Speech2Text2] Enable tokenizers

* minor fix

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

4ce74edf

13 Nov, 2021 1 commit

[M2M100Tokenizer] fix _build_translation_inputs (#14382) · 2e60276b

Suraj Patil authored Nov 13, 2021



* add return_tensors paramter

* fix test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

2e60276b

12 Nov, 2021 1 commit

Adding support for raw python `generator` in addition to `Dataset` for pipelines (#14352) · ed5d1551

Nicolas Patry authored Nov 12, 2021

* Adding support for raw python `generator` in addition to `Dataset`

The main goal is to ease the create of streaming data to the pipe.

`Dataset` is more involved and pytorch specific.

This PR, provides a way to use a python iterator too.
This enabled #14250 but can be proposed as a standalone PR.

```python
from transformers import pipeline

def read_data(filename):
    with open(filename, 'r') as f:
        for line in f:
            yield f

pipe = pipeline("text-classification")
for classified in pipe(read_data("large_file.txt")):
    print("Success ! ", classified)
```

The main caveat of this, is the interaction with `DataLoader` with
`num_workers>1`. When you have multiple workers, each receive a copy
of the generator (like `IterableDataset`). That means the naive Iterator
will fail since all workers iterate on all items of the generator.

There are ways to do clever "skipping", but it could be bad still
because all workers still do have to pass through all items of the
generator (they just ignore items they don't handle), depending on
the case it might be bad.

Using `num_workers=1` is the simplest fix and if the cost of loading
your data is small enough should be good enough. In the above example
trying to do smart tricks to skip some lines is unlikely to be a net
positive for instance.

If there are better ways to do "jumps" on some data, then using
`Dataset` is more advised (since then differents workers can just jump
themselves).

* Adding iterator support for `tf` too.

ed5d1551