Commits · 65b20b739b69cc13a10a6b10351ba31f4ecbef8c · chenpangpang / transformers

08 Dec, 2021 3 commits

NielsRogge authored Dec 08, 2021

* First draft

* Style and remove mlm

* Make forward pass work

* More improvements

* More improvements

* Fix bug

* More improvements

* More improvements

* Add PerceiverTokenizer first draft

* Improve conversion script

* More improvements

* Make conversion script work for the encoder

* Make conversion script work with local pickle files

* Style & quality, fix-copies

* Add dummy input to conversion script

* Add absolute position embeddings to TextPreProcessor

* Make forward pass of encoder work

* More improvements

* Move text preprocessor to separate script

* More improvements

* More improvements

* Add post processor

* Make MLM model work

* Style

* Add PerceiverForMaskedLM

* Add PerceiverImagePreprocessor

* Make style

* Make PerceiverForImageClassification work

* More improvements

* More improvements

* Use tokenizer in conversion script

* Use PerceiverForMaskedLM in conversion script

* Define custom PerceiverModelOutput

* Improve PerceiverAttention to make it work for both MLM and image classification

* More improvements

* More improvements

* More improvements to the conversion script

* Make conversion script work for both MLM and image classification

* Add PerceiverFeatureExtractor

* More improvements

* Style and quality

* Add center cropping

* Fix bug

* Small fix

* Add print statement

* Fix bug in image preprocessor

* Fix bug with conversion script

* Make output position embeddings an nn.Parameter layer instead of nn.Embedding

* Comment out print statements

* Add position encoding classes

* More improvements

* Use position_encoding_kwargs

* Add PerceiverForImageClassificationFourier

* Make style & quality

* Add PerceiverForImageClassificationConvProcessing

* Style & quality

* Add flow model

* Move processors to modeling file

* Make position encodings modular

* Make basic decoder use modular position encodings

* Add PerceiverForOpticalFlow to conversion script

* Add AudioPreprocessor

* Make it possible for the basic decoder to use Fourier position embeddings

* Add PerceiverForMultimodalAutoencoding

* Improve model for optical flow

* Improve _build_network_inputs method

* Add print statement

* Fix device issue

* Fix device of Fourier embeddings

* Add print statements for debugging

* Add another print statement

* Add another print statement

* Add another print statement

* Add another print statement

* Improve PerceiverAudioPreprocessor

* Improve conversion script for multimodal modal

* More improvements

* More improvements

* Improve multimodal model

* Make forward pass multimodal model work

* More improvements

* Improve tests

* Fix some more tests

* Add output dataclasses

* Make more tests pass

* Add print statements for debuggin

* Add tests for image classification

* Add PerceiverClassifierOutput

* More improvements

* Make more tests pass for the optical flow model

* Make style & quality

* Small improvements

* Don't support training for optical flow model for now

* Fix _prepare_for_class for tests

* Make more tests pass, add some docs

* Add multimodal model to tests

* Minor fixes

* Fix tests

* Improve conversion script

* Make fixup

* Remove pos_dim argument

* Fix device issue

* Potential fix for OOM

* Revert previous commit

* Fix test_initialization

* Add print statements for debugging

* Fix print statement

* Add print statement

* Add print statement

* Add print statement

* Add print statement

* Add print statement

* Add print statement

* Remove need for output_shape

* Comment out output_shape

* Remove unnecessary code

* Improve docs

* Fix make fixup

* Remove PerceiverTextProcessor from init

* Improve docs

* Small improvement

* Apply first batch of suggestions from code review

* Apply more suggestions from code review

* Update docstrings

* Define dicts beforehand for readability

* Rename task to architecture in conversion script, include PerceiverModel in tests

* Add print statements for debugging

* Fix tests on GPU

* Remove preprocessors, postprocessors and decoders from main init

* Add integration test

* Fix docs

* Replace einops by torch

* Update for new docs frontend

* Rename PerceiverForImageClassification

* Improve docs

* Improve docs

* Improve docs of PerceiverModel

* Fix some more tests

* Improve center_crop

* Add PerceiverForSequenceClassification

* Small improvements

* Fix tests

* Add integration test for optical flow model

* Clean up

* Add tests for tokenizer

* Fix tokenizer by adding special tokens properly

* Fix CI

65b20b73

[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339) · 961732c2

Patrick von Platen authored Dec 08, 2021



* up

* up

* up

* make it cleaner

* correct

* make styhahalal

* add more tests

* finish

* small fix

* make style

* up

* tryout to solve cicrle ci

* up

* fix more tests

* fix more tests

* apply sylvains suggestions

* fix import

* correct docs

* add pyctcdecode only to speech tests

* fix more tests

* add tf, flax and pt tests

* add pt

* fix last tests

* fix more tests

* Apply suggestions from code review

* change lines

* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* correct tests

* correct tests

* add doc string
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

961732c2

Fixing Dataset for TQA + token-classification. (#14658) · 2e12d90b

Nicolas Patry authored Dec 08, 2021

* Fixing Dataset for TQA + token-classification.

* Fixing the tests.

* Making sure `offset_mappings` is a valid argument.

2e12d90b

07 Dec, 2021 2 commits

[deepspeed] fix --load_best_model_at_end (#14652) · b66c5ab2

Stas Bekman authored Dec 06, 2021

* [deepspeed] fix load_best_model_at_end

* try with pull_request_target

* revert: try with pull_request_target

* style

* add test

* cleanup

b66c5ab2

Add mLUKE (#14640) · 30646a0a

Ryokan RI authored Dec 07, 2021

* implement MLukeTokenizer and LukeForMaskedLM

* update tests

* update docs

* add LukeForMaskedLM to check_repo.py

* update README

* fix test and specify the entity pad id in tokenization_(m)luke

* fix EntityPredictionHeadTransform

30646a0a

06 Dec, 2021 3 commits

Use cross_attention_hidden_size in Encoder-Decoder models (#14378) · 4cdb67ca

Yih-Dar authored Dec 07, 2021



* add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax)

* for TFEncoderDecoderModel

* add equivalence test for TFEncoderDecoderModel

* fix

* fix failed equivalence tests

* remove unused import

* add detailed comment

* Fix check_equivalence_tf_to_pt by using encoder/decoder

* cleaning

* Use cross_attention_hidden_size in speech-to-text

* clean fast init logging msg in encoder decoder models

* increase tol from 1e-5 to 1e-3 for tf test

* style

* style

* make sure projection layer can run

* remove type conversion + add check

* fix conflict (config.output_hidden_size)

* Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

4cdb67ca

Auto processor fix (#14623) · e9688875

Lysandre Debut authored Dec 06, 2021



* Add AutoProcessor class
Init and tests
Add doc
Fix init
Update src/transformers/models/auto/processing_auto.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Reverts to tokenizer or feature extractor when available
Adapt test

* Revert "Adapt test"

This reverts commit bbdde5fab02465f24b54b227390073082cb32093.

* Revert "Reverts to tokenizer or feature extractor when available"

This reverts commit 77659ff5d21b6cc0baf6f443017e35e056a525bb.

* Don't revert everything Lysandre!
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

e9688875

Add GPTJForQuestionAnswering (#14503) · 0f3f045e

tucan9389 authored Dec 07, 2021



* Add GPTJForQuestionAnswering

* Reformat for GPTJForQuestionAnswering

* Fix isort error

* make style for GPTJForQA

* Add _keys_to_ignore_on_load_missing

* Change the sequence of qa and classification
Co-authored-by: Suraj Patil <surajp815@gmail.com>

0f3f045e

03 Dec, 2021 2 commits

[trainer] add tf32-mode control (#14606) · 71b1bf7e

Stas Bekman authored Dec 03, 2021



* [trainer] add --tf32 support

* it's pt>=.17

* it's pt>=.17

* flip the default to True

* add experimental note

* simplify logic

* style

* switch to 3-state logic

* doc

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* re-style code
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

71b1bf7e

Improve tokenizer tests (#13594) · 66ea7391

Li-Huai (Allan) Lin authored Dec 03, 2021

* Use new method to acquire tokenizers

* Resolve TODOs.

* Style

* Fix

* Enable do_lower_case in test_tokenize_special_tokens

* Apply suggestion from code review

* Fix mask token handling

* Revert "Fix mask token handling"

This reverts commit daaa3f5291b1f71e5bc3604ca281c000000c4648.

* Fix FNet mask token tokenization

* Complete everything

* Apply suggestions from code review

66ea7391

02 Dec, 2021 2 commits

fix #14524 (IndexError when mask prob is too low) (#14525) · 6645eb61

Nik authored Dec 02, 2021

* fix #14524 (IndexError when mask prob is too low)

* fix formatting

* correct documentation, add option for setting min_num_masks

* change the semantic meaning of `mask_prob` in _compute_mask_indices

With this commit the meaing of `mask_prob` actually adhered to the probability for each
vector to be the start of a masked span of length.

* fix check_copies test

* fix documentation to semantic meaning of `upper bound of overall masking percentage`, revert changes to _compute_mask_indices

* fix typo

6645eb61

[Flax] Add FlaxBlenderbotSmall (#14576) · 50d909be

Daniel Stancl authored Dec 02, 2021



* [WIP] Add FlaxBlenderbotSmall

* Revert some unintentionally changed files

Revert some unintentionally files changed by improperly filled cookiecutter instructions.

* Fix repo consistency

* Fix Flax-PT equivalence

* Apply suggestions from code review

* Update index.mdx

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

50d909be

01 Dec, 2021 2 commits

FlaxGPTJ (#14396) · 4c0dd199

Suraj Patil authored Dec 01, 2021

* add flax gptj

* no bias in attention dense

* no wpe

* fix rotary embeddings

* fix rotary embeds

* fix rotray embeds

* quality

* doc and quality

* fix equivalence tests

4c0dd199

WIP: Support for Training with BF16 (#13207) · 70996a54

Jamie DeAntonis authored Nov 30, 2021



* started bf16 integration

* minor changes

* code now runs

* style

* lay foundation for bf16 testing

* lay foundation for bf16 testing

* start the tests

* better bf16 check

* style

* 2 separate checkers - one for bf16 support, another for bf16+autocast

* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* a couple of comment resolutions

* more comment resolutions

* resolved a small bug

* just some print statemtns

* added todo marking

* added a todo

* adjust for API change s/fast_dtype/dtype/

* fix style

* merge 2 bf16 util functions

* bf16 now does scaling too

* Add support for bfloat16

* Revert T5 layernorm to float32

This is based on the comment at https://github.com/huggingface/transformers/pull/14448/files#r752660929 and the PyTorch PR https://github.com/pytorch/pytorch/pull/66920

 .

* Add comment about conversion to float32 before returning the numpy data

* Add comment about AMP-bfloat16 incompatibility

* Fix formatting

* typo

* reformer / bf16

* cleanup

* require at least pt-1.10

* fix

* will deal with deepspeed separately

* cleanup

* revert

* cleanup

* fp16_full_eval and bf16_full_eval are separate modes

* proper deprecation

* cleanup

* test and fixes

* spelling

* cleanup

* add a note that this API is experimental
Co-authored-by: jamie <jamie@cortx.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: suriya <suriya@cortx.com>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>

70996a54

30 Nov, 2021 3 commits

VisionTextDualEncoder (#13511) · fc1d97f2

Suraj Patil authored Nov 30, 2021



* init vision_text_dual_encoder

* fix merge

* remove extra heads

* fix tests

* remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP

* remove archive map

* fix imports

* fix more imports

* fix init

* delete tokenizers

* fix imports

* clean

* support clip's vision model

* handle None config

* begin tests

* more test and few fixes

* warn about newly init weights

* more tests

* add loss to model

* remove extra classes from doc

* add processor

* doc and small fixes

* add start docstr

* update flax model

* flax tests

* more flax tests

* doc

* quality

* doc and quality

* fix doc

* doc

* remove comments

* update warning

* quality

* fix docs

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* replace asserts, fix imports

* update imports

* fix import

* address some review comments

* fix check

* reduce tolerance

* fix test

* add flax integration test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address Sylvain's comments

* fix style

* add pt_flax_equivalence test in PT tests

* add pt integration test

* update test

* use pre-trained checkpoint in examples
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

fc1d97f2

[Flax] Add FlaxBlenderbot (#13633) · faacd747

Daniel Stancl authored Nov 30, 2021



* Init Flax implementation for Blenderbot

* Add a majority of stuff except for tests

* make style quality

* Add tests and fix some bugs

* Add tests

* Clean source code and fix some bugs

* Fix copies and docs

* Fix jax device condition for tests

* Fix layer norm in the encoder

* Fix a few typos in the test file

* make fix-copies

* make fix-copies

* fix layer norm

* Fix Flax params dtype (#13090)

* Fix PR reference (#13098)

* make fix-copies

* Update tests/test_modeling_flax_blenderbot.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

faacd747

Tapas tf (#13393) · c468a87a

Kamal Raj authored Nov 30, 2021

* TF Tapas first commit

* updated docs

* updated logger message

* updated pytorch weight conversion
script to support scalar array

* added use_cache to tapas model config to
work properly with tf input_processing

* 1. rm embeddings_sum
2. added # Copied
3. + TFTapasMLMHead
4. and lot other small fixes

* updated docs

* + test for tapas

* updated testing_utils to check
is_tensorflow_probability_available

* converted model logits post processing using
numpy to work with both PT and TF models

* + TFAutoModelForTableQuestionAnswering

* added TF support

* added test for
TFAutoModelForTableQuestionAnswering

* added test for
TFAutoModelForTableQuestionAnswering pipeline

* updated auto model docs

* fixed typo in import

* added tensorflow_probability to run tests

* updated MLM head

* updated tapas.rst with TF  model docs

* fixed optimizer import in docs

* updated convert to np
data from pt model is not
`transformers.tokenization_utils_base.BatchEncoding`
after pipeline upgrade

* updated pipeline:
1. with torch.no_gard removed, pipeline forward handles
2. token_type_ids converted to numpy

* updated docs.

* removed `use_cache` from config

* removed floats_tensor

* updated code comment

* updated Copyright Year and
logits_aggregation Optional

* updated docs and comments

* updated docstring

* fixed model weight loading

* make fixup

* fix indentation

* added tf slow pipeline test

* pip upgrade

* upgrade python to 3.7

* removed from_pt from tests

* revert commit f18cfa9

c468a87a

29 Nov, 2021 1 commit
- Rename ImageGPT (#14526) · 25156eb2
  NielsRogge authored Nov 29, 2021
```
* Rename

* Add MODEL_FOR_CAUSAL_IMAGE_MODELING_MAPPING
```
  25156eb2
25 Nov, 2021 1 commit
- Fix a slow test. (#14527) · 04683c06
  Nicolas Patry authored Nov 25, 2021
  
  04683c06
24 Nov, 2021 1 commit
- [Tests] Improve vision tests (#14458) · 3772af49
  NielsRogge authored Nov 24, 2021
```
* Improve tests

* Install vision for tf tests
```
  3772af49
23 Nov, 2021 1 commit

[deepspeed] zero inference (#14253) · 956a4831

Stas Bekman authored Nov 23, 2021



* [deepspeed] zero inference

* only z3 makes sense for inference

* fix and style

* docs

* rework

* fix test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* responding to suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

956a4831

22 Nov, 2021 2 commits

Auto processor (#14465) · 204d2513

Sylvain Gugger authored Nov 22, 2021



* Add AutoProcessor class

* Init and tests

* Add doc

* Fix init

* Update src/transformers/models/auto/processing_auto.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverts to tokenizer or feature extractor when available

* Adapt test
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

204d2513

Moving pipeline tests from `Narsil` to `hf-internal-testing`. (#14463) · a4553e6c
Nicolas Patry authored Nov 22, 2021
```
* Moving everything to `hf-internal-testing`.

* Fixing test values.

* Moving to other repo.

* Last touch?
```
a4553e6c

19 Nov, 2021 4 commits

Add QDQBert model and quantization examples of SQUAD task (#14066) · a59e7c1e

Shang Zhang authored Nov 19, 2021



* clean up branch for add-qdqbert-model

* README update for QAT example; update docstrings in modeling_qdqbert.py

* Update qdqbert.rst

* Update README.md

* Update README.md

* calibration data using traning set; QAT example runs in fp32

* re-use BERTtokenizer for qdqbert

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/qdqbert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove qdqbert tokenizer

* Update qdqbert.rst

* update evaluate-hf-trt-qa.py

* update configuration_qdqbert.py

* update modeling_qdqbert.py: add copied statement; replace assert with ValueError

* update copied from statement

* add is_quantization_available; run make fix-copies

* unittest add require_quantization

* add backend dependency to qdqbert model

* update README; update evaluate script; make style

* lint

* docs qdqbert update

* circleci build_doc add pytorch-quantization for qdqbert

* update README

* update example readme with instructions to upgrade TensorRT to 8.2

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/qdqbert/configuration_qdqbert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* change quantization to pytorch_quantization for backend requirement

* feed_forward_chunking not supported in QDQBert

* make style

* update model docstrings and comments in testing scripts

* rename example to quantization-qdqbert; rename example scripts from qat to quant

* Update src/transformers/models/qdqbert/modeling_qdqbert.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* rm experimental functions in quant_trainer

* qa cleanup

* make fix-copies for docs index.rst

* fix doctree; use post_init() for qdqbert

* fix early device assignment for qdqbert

* fix CI:Model templates runner
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

a59e7c1e

Adding support for `hidden_states` and `attentions` in unbatching (#14420) · 81fe8afa
Nicolas Patry authored Nov 19, 2021
```
support.
```
81fe8afa
[Generation] Allow `inputs_embeds` as an input (#14443) · f25a9332
Patrick von Platen authored Nov 19, 2021
```
* up

* finalize

* finalize

* finish

* Update src/transformers/generation_utils.py

* apply feedback
```
f25a9332
[ImageGPT] Small fixes (#14460) · 0490b988
NielsRogge authored Nov 19, 2021
```
* Add integration test

* Fix typo
```
0490b988

18 Nov, 2021 3 commits

Fix finite IterableDataset test on multiple GPUs (#14445) · 83ef8bca
Sylvain Gugger authored Nov 18, 2021

83ef8bca

Add ImageGPT (#14240) · da36c557

NielsRogge authored Nov 18, 2021

* First draft

* More improvements

* Improve conversion script

* Fix init weights for layer norm

* Fix correct model for conversion script

* Don't tie input and output embeddings

* Add print statements for debugging

* Add print statements for debugging

* Fix vocab size of model

* Improve documentation, remove fast tokenizer

* Add ImageGPTForImageClassification, improve docs

* Fix docs issue

* Set verbosity level back to info

* Improve tests

* Fix tests and add figure

* Delete tokenizer file

* Remove ImageGPTTokenizer from init files

* Remove ImageGPTLayer from init files

* Remove ImageGPT tokenizer from docs

* First draft of ImageGPTFeatureExtractor

* Fix typo

* Fix bug

* More improvements

* Apply suggestions from code review, add tests for feature extractor

* Fix layernorm

* Update save_pretrained method

* Fix issue

* Make all tests of ImageGPTFeatureExtractor pass

* Update code examples

* Rename model inputs to pixel_values

* Improve code examples

* Update init_weights to post_init

* Fix post_init

da36c557

Add a post init method to all models (#14431) · d83b0e0c

Sylvain Gugger authored Nov 18, 2021

* Add a post init method to all models

* Fix tests

* Fix last tests

* Fix templates

* Add comment

* Forgot to save

d83b0e0c

17 Nov, 2021 3 commits

[WIP] Ensure TF model configs can be converted to proper JSON (#14415) · 1991da07

N authored Nov 17, 2021



* test: make sure model configs are jsonifiable

* fix: return python dict instead of config object

* fix: accept pretrained config and use correct class

* Re-enabling slow tests and applying them to core models only

* Re-enabling slow tests and applying them to core models only

* Add new test file to fetcher

* Remove tooslow tests from test_modeling_tf_common.py

* make style

* Style fixes

* Style fixes

* Style fixes

* Style fixes

* Adding core tests to GPT2 and BART

* Removing unused imports
Co-authored-by: niklas.fruehauf <niklas.fruehauf@sovanta.com>
Co-authored-by: matt <rocketknight1@gmail.com>

1991da07

Improve semantic segmentation models (#14355) · a2864a50

NielsRogge authored Nov 17, 2021

* Improve tests

* Improve documentation

* Add ignore_index attribute

* Add semantic_ignore_index to BEiT model

* Add segmentation maps argument to BEiTFeatureExtractor

* Simplify SegformerFeatureExtractor and corresponding tests

* Improve tests

* Apply suggestions from code review

* Minor docs improvements

* Streamline segmentation map tests of SegFormer and BEiT

* Improve reduce_labels docs and test

* Fix code quality

* Fix code quality again

a2864a50

[Wav2Vec2] Add New Wav2Vec2 Translation (#14392) · 700a748f

Patrick von Platen authored Nov 17, 2021

* add new wav2vec2 translation

* correct

* up

* add tests

* correct end copy

* correct more

* up

* correct unispeech sat

* finish

* finalize

* finish

* up

700a748f

16 Nov, 2021 2 commits

Avoid looping when data exhausted (#14413) · a33168aa

Valentin authored Nov 16, 2021

* stop training when a finite IterableDataset is exhausted

when using an iterable dataset num_epochs is set to
sys.maxsize to make sure all data is consumed
likewise we want to set max_steps high enough
but still stop when all data is consumed

(cherry picked from commit 6f0e1d6363153da9051e93acffe1cbab3a3f3b12)

* fix typo flase -> false

* add test for stopping training on exhausted finite iterable dataset

* remove redundant gradient_accumulation_steps

* run make style

reformat training_args docstring

a33168aa

Fix gradient_checkpointing backward compatibility (#14408) · 040fd471

Sylvain Gugger authored Nov 16, 2021



* Fix gradient_checkpointing backward compatibility

* Remove needless line

* make sure mask prob is big enough and length small enough

* Fix tests
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

040fd471

15 Nov, 2021 4 commits

Allow per-version configurations (#14344) · 1cc453d3

Lysandre Debut authored Nov 15, 2021



* Allow per-version configurations

* Update tests/test_configuration_common.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_configuration_common.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

1cc453d3

Fix weight loading issue (#14016) · a67d47b4
Yih-Dar authored Nov 15, 2021
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
a67d47b4
Fix test and docs (#14399) · 74e6111b
NielsRogge authored Nov 15, 2021

74e6111b

[Speech2Text2] Enable tokenizers (#14390) · 4ce74edf

Patrick von Platen authored Nov 15, 2021



* [Speech2Text2] Enable tokenizers

* minor fix

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

4ce74edf

13 Nov, 2021 1 commit

[M2M100Tokenizer] fix _build_translation_inputs (#14382) · 2e60276b

Suraj Patil authored Nov 13, 2021



* add return_tensors paramter

* fix test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

2e60276b