1. 12 Oct, 2021 1 commit
    • Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06
      Yih-Dar authored
      
      
      * Add cross attentions to TFGPT2Model
      
      * Add TFEncoderDecoderModel
      
      * Add TFBaseModelOutputWithPoolingAndCrossAttentions
      
      * Add cross attentions to TFBertModel
      
      * Fix past or past_key_values argument issue
      
      * Fix generation
      
      * Fix save and load
      
      * Add some checks and comments
      
      * Clean the code that deals with past keys/values
      
      * Add kwargs to processing_inputs
      
      * Add serving_output to TFEncoderDecoderModel
      
      * Some cleaning + fix use_cache value issue
      
      * Fix tests + add bert2bert/bert2gpt2 tests
      
      * Fix more tests
      
      * Ignore crossattention.bias when loading GPT2 weights into TFGPT2
      
      * Fix return_dict_in_generate in tf generation
      
      * Fix is_token_logit_eos_token bug in tf generation
      
      * Finalize the tests after fixing some bugs
      
      * Fix another is_token_logit_eos_token bug in tf generation
      
      * Add/Update docs
      
      * Add TFBertEncoderDecoderModelTest
      
      * Clean test script
      
      * Add TFEncoderDecoderModel to the library
      
      * Add cross attentions to TFRobertaModel
      
      * Add TFRobertaEncoderDecoderModelTest
      
      * make style
      
      * Change the way position_ids are computed
      
      * bug fix
      
      * Fix copies in tf_albert
      
      * Remove some copied from and apply some fix-copies
      
      * Remove some copied
      
      * Add cross attentions to some other TF models
      
      * Remove encoder_hidden_states from TFLayoutLMModel.call for now
      
      * Make style
      
      * Fix TFRemBertForCausalLM
      
      * Revert the change to longformer + Remove copies
      
      * Revert the change to albert and convbert + Remove copies
      
      * make quality
      
      * make style
      
      * Add TFRembertEncoderDecoderModelTest
      
      * make quality and fix-copies
      
      * test TFRobertaForCausalLM
      
      * Fixes for failed tests
      
      * Fixes for failed tests
      
      * fix more tests
      
      * Fixes for failed tests
      
      * Fix Auto mapping order
      
      * Fix TFRemBertEncoder return value
      
      * fix tf_rembert
      
      * Check copies are OK
      
      * Fix "TFBaseModelOutputWithPastAndCrossAttentions is not defined" error
      
      * Add TFEncoderDecoderModelSaveLoadTests
      
      * fix tf weight loading
      
      * check the change of use_cache
      
      * Revert the change
      
      * Add missing test_for_causal_lm for TFRobertaModelTest
      
      * Try cleaning past
      
      * fix _reorder_cache
      
      * Revert some files to original versions
      
      * Keep as many copies as possible
      
      * Apply suggested changes - Use raise ValueError instead of assert
      
      * Move import to top
      
      * Fix wrong require_torch
      
      * Replace more assert by raise ValueError
      
      * Add test_pt_tf_model_equivalence (the test won't pass for now)
      
      * add test for loading/saving
      
      * finish
      
      * finish
      
      * Remove test_pt_tf_model_equivalence
      
      * Update tf modeling template
      
      * Remove pooling, added in the prev. commit, from MainLayer
      
      * Update tf modeling test template
      
      * Move inputs["use_cache"] = False to modeling_tf_utils.py
      
      * Fix torch.Tensor in the comment
      
      * fix use_cache
      
      * Fix missing use_cache in ElectraConfig
      
      * Add a note to from_pretrained
      
      * Fix style
      
      * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
      
      * Fix TFMLP (in TFGPT2) activation issue
      
      * Fix None past_key_values value in serving_output
      
      * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
      
      * Apply review suggestions - style for cross_attns in serving_output
      
      * Apply review suggestions - change assert + docstrings
      
      * break the error message to respect the char limit
      
      * deprecate the argument past
      
      * fix docstring style
      
      * Update the encoder-decoder rst file
      
      * fix Unknown interpreted text role "method"
      
      * fix typo
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
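      As a quick illustration of what this PR enables on the TF side, here is a minimal bert2gpt2 sketch (the checkpoint names and input text are just examples): `from_encoder_decoder_pretrained` loads the decoder with cross-attention enabled so it can attend to the encoder.

      ```python
      from transformers import BertTokenizer, TFEncoderDecoderModel

      # Minimal sketch: pair a BERT encoder with a GPT-2 decoder. The decoder is
      # loaded with is_decoder=True and add_cross_attention=True under the hood.
      tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
      model = TFEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")

      inputs = tokenizer("The tower is 324 metres tall.", return_tensors="tf")
      # Teacher-forced forward pass; the new cross-attention weights are returned
      # alongside the usual outputs when output_attentions=True.
      outputs = model(
          input_ids=inputs.input_ids,
          decoder_input_ids=inputs.input_ids,
          output_attentions=True,
      )
      print(outputs.logits.shape, len(outputs.cross_attentions))
      ```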
  2. 08 Oct, 2021 1 commit
  3. 29 Sep, 2021 1 commit
  4. 14 Sep, 2021 1 commit
    • Push to hub when saving checkpoints (#13503) · 3081d386
      Sylvain Gugger authored
      * Push to hub when saving checkpoints
      
      * Add model card
      
      * Revert partial model card
      
      * Small fix for checkpoint
      
      * Add tests
      
      * Add documentation
      
      * Fix tests
      
      * Bump huggingface_hub
      
      * Fix test
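      A hedged sketch of the resulting workflow (the output directory is a placeholder; `hub_strategy="checkpoint"` is the option for pushing the latest checkpoint so a run can be resumed from the Hub):

      ```python
      from transformers import TrainingArguments

      # Sketch only: with push_to_hub=True the Trainer uploads the model to the Hub
      # each time it saves; hub_strategy="checkpoint" also pushes the last checkpoint
      # folder so an interrupted training can be resumed from the Hub.
      args = TrainingArguments(
          output_dir="my-finetuned-model",  # also the default Hub repo name
          push_to_hub=True,
          hub_strategy="checkpoint",
          save_steps=500,
      )
      # trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
      # trainer.train()
      ```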
  5. 10 Sep, 2021 1 commit
    • [Large PR] Entire rework of pipelines. (#13308) · c63fcabf
      Nicolas Patry authored
      
      
      * Enabling dataset iteration on pipelines.
      
      
      Unifying parameters under `set_parameters` function.
      
      Small fix.
      
      Last fixes after rebase
      
      Remove print.
      
      Fixing text2text `generate_kwargs`
      
      No more `self.max_length`.
      
      Fixing tf only conversational.
      
      Consistency in start/stop index over TF/PT.
      
      Speeding up drastically on TF (nasty bug where max_length would
      increase a ton).
      
      Adding test for support for non-fast tokenizers.
      
      Fixing GPU usage on zero-shot.
      
      Fix working on TF.
      
      Update src/transformers/pipelines/base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      Update src/transformers/pipelines/base.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      Small cleanup.
      
      Remove all asserts + simple format.
      
      * Fixing audio-classification for large PR.
      
      * Overly explicit null checking.
      
      * Encapsulating GPU/CPU pytorch manipulation directly within `base.py`.
      
      * Removed internal state for parameters of the pipeline.

      Instead of implicitly overriding internal state, we moved to real
      named arguments on every `preprocess`, `_forward`, and `postprocess`
      function.

      Instead, `_sanitize_parameters` is used to split all kwargs of both
      `__init__` and `__call__` into the three kinds of named parameters.
      
      * Move import warnings.
      
      * Small fixes.
      
      * Quality.
      
      * Another small fix, using the CI to debug faster.
      
      * Last fixes.
      
      * Last fix.
      
      * Small cleanup of tensor moving.
      
      * is not None.
      
      * Adding a bunch of docs + a iteration test.
      
      * Fixing doc style.
      
      * KeyDataset = None guard.
      
      * Removing the CUDA test for pipelines (was for testing).
      
      * Even more simple iteration test.
      
      * Correct import.
      
      * Long day.
      
      * Fixes in docs.
      
      * [WIP] migrating object detection.
      
      * Fixed the target_size bug.
      
      * Fixup.
      
      * Bad variable name.
      
      * Fixing `ensure_on_device` to respect the original ModelOutput.
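      The central design change is easiest to see in code. A hypothetical pipeline under the new contract might look like this (`max_words` is an invented parameter for illustration):

      ```python
      from transformers import Pipeline

      class MyPipeline(Pipeline):
          # No hidden per-call state: _sanitize_parameters splits every kwarg from
          # __init__ and __call__ into three dicts, which are routed as named
          # arguments to preprocess, _forward and postprocess respectively.
          def _sanitize_parameters(self, max_words=None, **kwargs):
              postprocess_kwargs = {}
              if max_words is not None:
                  postprocess_kwargs["max_words"] = max_words
              return {}, {}, postprocess_kwargs

          def preprocess(self, inputs):
              return self.tokenizer(inputs, return_tensors=self.framework)

          def _forward(self, model_inputs):
              return self.model(**model_inputs)

          def postprocess(self, model_outputs, max_words=None):
              return model_outputs
      ```

      Because each stage is now a pure function of its inputs, the same pipeline object can also iterate lazily over a dataset instead of only accepting a list of inputs.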
  6. 08 Sep, 2021 3 commits
  7. 06 Sep, 2021 1 commit
    • Update model configs - Allow setters for common properties (#13026) · c8be8a9a
      Nils Reimers authored
      * refactor GPT config to allow dynamic properties
      
      * make attribute_map a class attribute
      
      * remove old code
      
      * update unit test to test config: Add test for common properties setter
      
      * update unit test to test config: Add test for common properties passed as parameters to __init__
      
      * update to black code format
      
      * Allow that setters are not defined for certain config classes
      
      * update config classes to implement attribute_map
      
      * bugfix lxmert config - id2labels was not defined when num_labels was set
      
      * update broken configs - add attribute_maps
      
      * update bart config
      
      * update black codestyle
      
      * update documentation on common config attributes
      
      * update GPTJ config to new attribute map
      
      * update docs on common attributes
      
      * gptj config: add max_position_embeddings
      
      * gptj config: format with black
      
      * update speech to text 2 config
      
      * format doc file to max_len 119
      
      * update config template
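      A short sketch of the resulting behaviour, using GPT-2 as the example: a config class declares an `attribute_map` from the common property names to its model-specific fields, and both names then stay in sync.

      ```python
      from transformers import GPT2Config

      # GPT2Config maps common names to its own fields via the attribute_map class
      # attribute, e.g. hidden_size -> n_embd and num_hidden_layers -> n_layer.
      config = GPT2Config()
      config.hidden_size = 1024      # setter added by this refactor
      assert config.n_embd == 1024   # the model-specific field follows

      # Common names can also be passed directly to __init__:
      config = GPT2Config(num_hidden_layers=6)
      assert config.n_layer == 6
      ```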
  8. 01 Sep, 2021 1 commit
  9. 31 Aug, 2021 2 commits
    • TF/Numpy variants for all DataCollator classes (#13105) · 854260ca
      Matt authored
      
      
      * Adding a TF variant of the DataCollatorForTokenClassification to get feedback
      
      * Added a Numpy variant and a post_init check to fail early if a missing import is found
      
      * Fixed call to Numpy variant
      
      * Added a couple more of the collators
      
      * Update src/transformers/data/data_collator.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Fixes, style pass, finished DataCollatorForSeqToSeq
      
      * Added all the LanguageModeling DataCollators, except SOP and PermutationLanguageModeling
      
      * Adding DataCollatorForPermutationLanguageModeling
      
      * Style pass
      
      * Add missing `__call__` for PLM
      
      * Remove `post_init` checks for frameworks because the imports inside them were making us fail code quality checks
      
      * Remove unused imports
      
      * First attempt at some TF tests
      
      * A second attempt to make any of those tests actually work
      
      * TF tests, round three
      
      * TF tests, round four
      
      * TF tests, round five
      
      * TF tests, all enabled!
      
      * Style pass
      
      * Merging tests into `test_data_collator.py`
      
      * Merging tests into `test_data_collator.py`
      
      * Fixing up test imports
      
      * Fixing up test imports
      
      * Trying shuffling the conditionals around
      
      * Commenting out non-functional old tests
      
      * Completed all tests for all three frameworks
      
      * Style pass
      
      * Fixed test typo
      
      * Style pass
      
      * Move standard `__call__` method to mixin
      
      * Rearranged imports for `test_data_collator`
      
      * Fix data collator typo "torch" -> "pt"
      
      * Fixed the most embarrassingly obvious bug
      
      * Update src/transformers/data/data_collator.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Renaming mixin
      
      * Updating docs
      * Updating docs
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Dalton Walker <dalton_walker@icloud.com>
      Co-authored-by: Andrew Romans <andrew.romans@hotmail.com>
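      In practice the change surfaces as a `return_tensors` argument on the collators, so one class serves PyTorch, TensorFlow and NumPy. A minimal sketch with toy features:

      ```python
      from transformers import AutoTokenizer, DataCollatorForTokenClassification

      tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
      features = [
          {"input_ids": [101, 1996, 4937, 102], "labels": [0, 1, 1, 0]},
          {"input_ids": [101, 7592, 102], "labels": [0, 2, 0]},
      ]

      # return_tensors takes "pt" (the default), "tf" or "np".
      collator = DataCollatorForTokenClassification(tokenizer=tokenizer, return_tensors="tf")
      batch = collator(features)  # dict of tf.Tensor, padded to the longest feature
      print({name: tensor.shape for name, tensor in batch.items()})
      ```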
  10. 30 Aug, 2021 1 commit
  11. 04 Aug, 2021 1 commit
    • [Flax] Correct flax docs (#12782) · fbf468b0
      Patrick von Platen authored
      * fix_torch_device_generate_test
      
      * remove @
      
      * fix flax docs
      
      * correct more docs in flax
      
      * another correction
      
      * fix flax docs
      
      * Apply suggestions from code review
  12. 21 Jul, 2021 1 commit
  13. 17 Jul, 2021 1 commit
  14. 14 Jul, 2021 1 commit
  15. 13 Jul, 2021 1 commit
  16. 29 Jun, 2021 1 commit
  17. 25 Jun, 2021 1 commit
  18. 23 Jun, 2021 1 commit
  19. 22 Jun, 2021 2 commits
  20. 14 Jun, 2021 1 commit
  21. 08 Jun, 2021 2 commits
  22. 04 Jun, 2021 1 commit
  23. 02 Jun, 2021 1 commit
  24. 01 Jun, 2021 2 commits
  25. 26 May, 2021 1 commit
    • Flax Generate (#11777) · 996a315e
      Patrick von Platen authored
      
      
      * fix_torch_device_generate_test
      
      * remove @
      
      * add
      
      * indexing
      
      * correct a couple of tests
      
      * fix tests
      
      * add logits processor
      
      * finish top_k, top_p, temp
      
      * add docs
      
      * correct flax prng key default
      
      * improve generate
      
      * add generation docs
      
      * add docs
      
      * make style
      
      * revert model outputs change
      
      * make style
      
      * correct typo
      
      * fix tests
      
      * fix slow test
      
      * add raise
      
      * finish generation
      Co-authored-by: Patrick von Platen <patrick@huggingface.co>
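      A hedged usage sketch: sampling with the top-k / top-p / temperature logits processors added here. Flax generation is stateless, so the PRNG key is passed in explicitly.

      ```python
      import jax
      from transformers import AutoTokenizer, FlaxGPT2LMHeadModel

      tokenizer = AutoTokenizer.from_pretrained("gpt2")
      model = FlaxGPT2LMHeadModel.from_pretrained("gpt2")

      input_ids = tokenizer("Hello, my name is", return_tensors="np").input_ids
      outputs = model.generate(
          input_ids,
          max_length=20,
          do_sample=True,
          top_k=50,
          top_p=0.95,
          temperature=0.7,
          prng_key=jax.random.PRNGKey(0),  # explicit randomness, no global seed
      )
      print(tokenizer.decode(outputs.sequences[0], skip_special_tokens=True))
      ```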
  26. 13 May, 2021 1 commit
  27. 07 May, 2021 1 commit
  28. 04 May, 2021 1 commit
  29. 30 Apr, 2021 2 commits
    • [DeepSpeed] fp32 support (#11499) · 4e7bf94e
      Stas Bekman authored
      * prep for deepspeed==0.3.16
      
      * new version
      
      * too soon
      
      * support and test fp32 mode
      
      * troubleshooting doc start
      
      * workaround no longer needed
      
      * add fp32 doc
      
      * style
      
      * cleanup, add tf32 note
      
      * clarify
      
      * release was made
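      The fp32 mode boils down to not enabling fp16 in the DeepSpeed config. A minimal sketch, assuming the dict form of the `deepspeed` argument (the other entries are placeholders):

      ```python
      from transformers import TrainingArguments

      # fp32 mode: DeepSpeed runs in full precision when fp16 is disabled. Useful
      # for checkpoints (e.g. ones pretrained in bf16) that misbehave under fp16.
      ds_config = {
          "fp16": {"enabled": False},
          "zero_optimization": {"stage": 2},
          "train_micro_batch_size_per_gpu": "auto",
      }
      args = TrainingArguments(output_dir="out", deepspeed=ds_config)
      # pass `args` to a Trainer as usual and launch with the deepspeed launcher
      ```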
    • Adding `AutomaticSpeechRecognitionPipeline`. (#11337) · db9dd09c
      Nicolas Patry authored
      
      
      * Adding `AutomaticSpeechRecognitionPipeline`.
      
      - Because we added everything to enable this pipeline, we probably
      should add it to `transformers`.
      - This PR tries to limit the scope and focuses only on the pipeline part
      (what should go in, and out).
      - The tests are very specific for S2T and Wav2vec2 to make sure both
      architectures are supported by the pipeline. We don't use the mixin for
      tests right now, because that requires more work in the `pipeline`
      function (will be done in a follow up PR).
      - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of
        sense from a user perspective, and it adds no hard dependency (users
        can always use their own loading mechanism). Still, it feels slightly
        clunky to have so much optional preprocessing.
      - The pipeline is not done to support streaming audio right now.
      
      Future work:
      
      - Add `automatic-speech-recognition` as a `task`. And add the
      FeatureExtractor.from_pretrained within `pipeline` function.
      - Add small models within tests
      - Add the Mixin to tests.
      - Improve the logic for choosing between ForCTC and ForConditionalGeneration.
      
      * Update tests/test_pipelines_automatic_speech_recognition.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Adding docs + main import + type checking + LICENSE.
      
      * Doc style !.
      
      * Fixing TYPE_HINT.
      
      * Specifying waveform shape in the docs.
      
      * Adding asserts + specify in the documentation the shape of the input
      np.ndarray.
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Adding require to tests + move the `feature_extractor` doc.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
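      A hedged sketch of driving the new pipeline. Since registering `automatic-speech-recognition` as a `pipeline()` task is listed as future work above, the class is instantiated directly; the checkpoint is an example and the waveform is a dummy.

      ```python
      import numpy as np
      from transformers import (
          AutomaticSpeechRecognitionPipeline,
          Wav2Vec2ForCTC,
          Wav2Vec2Processor,
      )

      processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
      asr = AutomaticSpeechRecognitionPipeline(
          model=Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h"),
          tokenizer=processor.tokenizer,
          feature_extractor=processor.feature_extractor,
      )

      # Input is a filename (decoded via the ffmpeg_read helper) or a raw 1-D
      # np.ndarray waveform at the feature extractor's sampling rate.
      waveform = np.zeros(16000, dtype=np.float32)  # one second of 16 kHz silence
      print(asr(waveform))  # -> {"text": "..."}
      ```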
  30. 27 Apr, 2021 1 commit
  31. 26 Apr, 2021 1 commit
  32. 23 Apr, 2021 1 commit
  33. 21 Apr, 2021 1 commit