Commits · b5e2b183af5e40e33a4dc7659e697d137259d56e · chenpangpang / transformers

28 Dec, 2021 1 commit

Sylvain Gugger authored Dec 27, 2021

* Fix bad examples

* Add black formatting to style_doc

* Use first nonempty line

* Put it at the right place

* Don't add spaces to empty lines

* Better templates

* Deal with triple quotes in docstrings

* Result of style_doc

* Enable mdx treatment and fix code examples in MDXs

* Result of doc styler on doc source files

* Last fixes

* Break copy from

b5e2b183

27 Dec, 2021 1 commit

ChunkPipeline (batch_size enabled on `zero-cls` and `qa` pipelines). (#14225) · b058490c

Nicolas Patry authored Dec 27, 2021



* Pipeline chunks.

* Batching for Chunking pipelines ?

* Batching for `question-answering` and `zero-shot-cls`.

* Fixing for FNet.

* Making ASR a chunk pipeline.

* Chunking ASR API.

* doc style.

* Fixing ASR test.

* Fixing QA error (p_mask, padding is 1, not 0).

* Enable both vad and simple chunking.

* Max length for vad.

* remove inference mode, crashing on s2t.

* Revert ChunkPipeline for ASRpipeline.

Too many knobs for simple integration within the pipeline. Better to stick
to external convenience functions instead: more control to be had,
a simpler pipeline, and easier to replace with other things later.

* Drop necessity for PT for these.

* Enabling generators.

* Add mic + cleanup.

* Typo.

* Typo2.

* Remove ASR work, it does not belong in this PR anymore.

* Update src/transformers/pipelines/pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/pipelines/zero_shot_classification.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Adding many comments.

* Doc quality.

* `hidden_states` handling.

* Adding doc.

* Bad rebase.

* Autofixing docs.

* Fixing CRITICAL bug in the new Zerocls pipeline.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

b058490c
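
The chunking idea in the commit above (one input produces several chunks, chunks go through the model, results are regrouped per input) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `chunk_inputs`, `regroup`, `fake_model`, and the use of an `is_last` flag to mark the final chunk of each input are all hypothetical names chosen for this example.

```python
from typing import Dict, Iterable, Iterator, List


def chunk_inputs(texts: Iterable[str], chunk_size: int) -> Iterator[Dict]:
    """Split each text into word chunks, marking the final chunk of each input."""
    for text in texts:
        words = text.split()
        # An empty text still yields one (empty) chunk so it gets a result.
        starts = list(range(0, len(words), chunk_size)) or [0]
        for start in starts:
            yield {
                "words": words[start : start + chunk_size],
                "is_last": start == starts[-1],
            }


def fake_model(chunk: Dict) -> Dict:
    # Stand-in for a forward pass: the "score" is the number of words in the chunk.
    return {"score": len(chunk["words"]), "is_last": chunk["is_last"]}


def regroup(chunk_outputs: Iterable[Dict]) -> Iterator[List[int]]:
    """Accumulate per-chunk scores until `is_last`, then emit one list per input."""
    pending: List[int] = []
    for out in chunk_outputs:
        pending.append(out["score"])
        if out["is_last"]:
            yield pending
            pending = []


texts = ["a b c d e", "f g"]
results = list(regroup(fake_model(c) for c in chunk_inputs(texts, chunk_size=2)))
```

Because both stages are generators, chunks from different inputs could be batched together between `chunk_inputs` and `regroup` without changing how results are reassembled.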

22 Dec, 2021 1 commit

Convert rst files (#14888) · 207594be

Sylvain Gugger authored Dec 22, 2021

* Convert all tutorials and guides

* Convert all remaining rst to mdx

* Track and fix bad links

207594be

21 Dec, 2021 2 commits

[doc porting] several docs (#14858) · 18587639

Stas Bekman authored Dec 21, 2021



* [doc porting] 2 docs

* [doc porting] 2 docs

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/main_classes/deepspeed.mdx

* cleanup
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

18587639

[logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS (#14669) · b6ec9569

Stas Bekman authored Dec 20, 2021

* [logging] implement warning_advice / TRANSFORMERS_NO_ADVISORY_WARNINGS

* reword

b6ec9569
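
The `warning_advice` commit above can be pictured as a warning helper that an environment variable switches off. The sketch below is a standalone illustration, assuming that any non-empty value of `TRANSFORMERS_NO_ADVISORY_WARNINGS` suppresses the message. The free function and the logger name are hypothetical and not taken from the library.

```python
import logging
import os

logger = logging.getLogger("advice_demo")  # hypothetical logger name


def warning_advice(log: logging.Logger, message: str) -> bool:
    """Emit an advisory warning unless the user opted out via the environment.

    Returns True when the message was logged, False when it was suppressed.
    """
    # Assumption for this sketch: any non-empty value means "suppress".
    if os.getenv("TRANSFORMERS_NO_ADVISORY_WARNINGS", ""):
        return False
    log.warning(message)
    return True
```

Setting `TRANSFORMERS_NO_ADVISORY_WARNINGS=1` before running a script would then silence advisory messages while leaving ordinary warnings untouched.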

20 Dec, 2021 1 commit

[doc] typo (#14849) · c1125dc2

Stas Bekman authored Dec 20, 2021

fix small typo

c1125dc2

16 Dec, 2021 1 commit

Removes images to put them in a dataset (#14781) · 8010fda9

Lysandre Debut authored Dec 16, 2021

* First try

* Update instructions

8010fda9

15 Dec, 2021 1 commit

PoC for conserving old links (#14754) · 459677ae

Sylvain Gugger authored Dec 15, 2021



* PoC for conserving old links

* Do the same for other links

* remap the redirects section

* add instructions on how to move sections

* improve
Co-authored-by: Stas Bekman <stas@stason.org>

459677ae

13 Dec, 2021 1 commit

Convert Trainer doc page to MarkDown (#14753) · 7533d30a

Sylvain Gugger authored Dec 13, 2021

* Convert Trainer doc page to MarkDown

* Fix repo consistency

* Fix the doc build test job

7533d30a

07 Dec, 2021 1 commit

Add mLUKE (#14640) · 30646a0a

Ryokan RI authored Dec 07, 2021

* implement MLukeTokenizer and LukeForMaskedLM

* update tests

* update docs

* add LukeForMaskedLM to check_repo.py

* update README

* fix test and specify the entity pad id in tokenization_(m)luke

* fix EntityPredictionHeadTransform

30646a0a

03 Dec, 2021 1 commit

Make DefaultDataCollator importable from root (#14588) · 73ec4340

Matt authored Dec 03, 2021

* Make DefaultDataCollator importable from root

* Add documentation for DefaultDataCollator and add return_tensors argument to all class docstrings

* make style

* Add DefaultDataCollator to data_collator.rst

* Add DefaultDataCollator to data_collator.rst

73ec4340

28 Nov, 2021 1 commit

Tokenizers docs: Specify which class contains `__call__` method (#14379) · ebbe8cc3

Xing Han Lu authored Nov 28, 2021

* Update tokenizer.rst

* Apply `make fixup`

ebbe8cc3

23 Nov, 2021 1 commit

[deepspeed] zero inference (#14253) · 956a4831

Stas Bekman authored Nov 23, 2021



* [deepspeed] zero inference

* only z3 makes sense for inference

* fix and style

* docs

* rework

* fix test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* responding to suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

956a4831

10 Nov, 2021 1 commit

Adding some quality of life for `pipeline` function. (#14322) · 5c153079

Nicolas Patry authored Nov 10, 2021



* Adding some quality of life for `pipeline` function.

* Update docs/source/main_classes/pipelines.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Improve the tests.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

5c153079

29 Oct, 2021 1 commit

Adding `batch_size` support for (almost) all pipelines (#13724) · be236361

Nicolas Patry authored Oct 29, 2021



* Tentative enabling of `batch_size` for pipelines.

* Add systematic test for pipeline batching.

* Enabling batch_size on almost all pipelines

- Not `zero-shot` (it's already passing stuff as batched so trickier)
- Not `QA` (preprocess uses squad features, we need to switch to real
tensors at this boundary).

* Adding `min_length_for_response` for conversational.

* Making CTC, speech mappings available regardless of framework.

* Attempt at fixing automatic tests (ffmpeg not enabled for fast tests)

* Removing ffmpeg dependency in tests.

* Small fixes.

* Slight cleanup.

* Adding docs and addressing comments.

* Quality.

* Update docs/source/main_classes/pipelines.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/question_answering.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/zero_shot_classification.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Improving docs.

* Update docs/source/main_classes/pipelines.rst
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>

* N -> observed_batch_size

softmax trick.

* Follow `padding_side`.

* Supporting image pipeline batching (and padding).

* Rename `unbatch` -> `loader_batch`.

* unbatch_size forgot.

* Custom padding for offset mappings.

* Attempt to remove librosa.

* Adding require_audio.

* torchaudio.

* Back to using datasets librosa.

* Adding help to set a pad_token on the tokenizer.

* Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Quality.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>

be236361
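
What `batch_size` implies for a pipeline can be sketched without any framework: inputs are grouped into fixed-size batches, and each batch is padded to a common length on the side given by `padding_side`, which is why a pad token is needed. This is a minimal sketch under those assumptions. The helper names `batched` and `pad_batch` are hypothetical and are not part of the library.

```python
from typing import Iterable, Iterator, List


def batched(items: Iterable[List[int]], batch_size: int) -> Iterator[List[List[int]]]:
    """Yield consecutive groups of at most `batch_size` items."""
    batch: List[List[int]] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # the last batch may be smaller than batch_size
        yield batch


def pad_batch(batch: List[List[int]], pad_id: int, padding_side: str = "right") -> List[List[int]]:
    """Pad every sequence in the batch to the length of the longest one."""
    if padding_side not in ("left", "right"):
        raise ValueError(f"Unknown padding_side: {padding_side!r}")
    width = max(len(seq) for seq in batch)
    padded = []
    for seq in batch:
        filler = [pad_id] * (width - len(seq))
        padded.append(seq + filler if padding_side == "right" else filler + seq)
    return padded


token_ids = [[5, 6, 7], [8], [9, 10]]
batches = [pad_batch(b, pad_id=0) for b in batched(token_ids, batch_size=2)]
```

Padding only to the longest sequence in each batch, rather than to a global maximum, keeps the wasted computation per batch small when input lengths vary.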

12 Oct, 2021 1 commit

Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222) · 8b240a06

Yih-Dar authored Oct 13, 2021



* Add cross attentions to TFGPT2Model

* Add TFEncoderDecoderModel

* Add TFBaseModelOutputWithPoolingAndCrossAttentions

* Add cross attentions to TFBertModel

* Fix past or past_key_values argument issue

* Fix generation

* Fix save and load

* Add some checks and comments

* Clean the code that deals with past keys/values

* Add kwargs to processing_inputs

* Add serving_output to TFEncoderDecoderModel

* Some cleaning + fix use_cache value issue

* Fix tests + add bert2bert/bert2gpt2 tests

* Fix more tests

* Ignore crossattention.bias when loading GPT2 weights into TFGPT2

* Fix return_dict_in_generate in tf generation

* Fix is_token_logit_eos_token bug in tf generation

* Finalize the tests after fixing some bugs

* Fix another is_token_logit_eos_token bug in tf generation

* Add/Update docs

* Add TFBertEncoderDecoderModelTest

* Clean test script

* Add TFEncoderDecoderModel to the library

* Add cross attentions to TFRobertaModel

* Add TFRobertaEncoderDecoderModelTest

* make style

* Change the way of position_ids computation

* bug fix

* Fix copies in tf_albert

* Remove some copied from and apply some fix-copies

* Remove some copied

* Add cross attentions to some other TF models

* Remove encoder_hidden_states from TFLayoutLMModel.call for now

* Make style

* Fix TFRemBertForCausalLM

* Revert the change to longformer + Remove copies

* Revert the change to albert and convbert + Remove copies

* make quality

* make style

* Add TFRembertEncoderDecoderModelTest

* make quality and fix-copies

* test TFRobertaForCausalLM

* Fixes for failed tests

* Fixes for failed tests

* fix more tests

* Fixes for failed tests

* Fix Auto mapping order

* Fix TFRemBertEncoder return value

* fix tf_rembert

* Check copies are OK

* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined

* Add TFEncoderDecoderModelSaveLoadTests

* fix tf weight loading

* check the change of use_cache

* Revert the change

* Add missing test_for_causal_lm for TFRobertaModelTest

* Try cleaning past

* fix _reorder_cache

* Revert some files to original versions

* Keep as many copies as possible

* Apply suggested changes - Use raise ValueError instead of assert

* Move import to top

* Fix wrong require_torch

* Replace more assert by raise ValueError

* Add test_pt_tf_model_equivalence (the test won't pass for now)

* add test for loading/saving

* finish

* finish

* Remove test_pt_tf_model_equivalence

* Update tf modeling template

* Remove pooling, added in the prev. commit, from MainLayer

* Update tf modeling test template

* Move inputs["use_cache"] = False to modeling_tf_utils.py

* Fix torch.Tensor in the comment

* fix use_cache

* Fix missing use_cache in ElectraConfig

* Add a note to from_pretrained

* Fix style

* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt

* Fix TFMLP (in TFGPT2) activation issue

* Fix None past_key_values value in serving_output

* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub

* Apply review suggestions - style for cross_attns in serving_output

* Apply review suggestions - change assert + docstrings

* break the error message to respect the char limit

* deprecate the argument past

* fix docstring style

* Update the encoder-decoder rst file

* fix Unknown interpreted text role "method"

* fix typo
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

8b240a06

08 Oct, 2021 1 commit

Image Segmentation pipeline (#13828) · 026866df

Mishig Davaadorj authored Oct 08, 2021



* Implement img seg pipeline

* Update src/transformers/pipelines/image_segmentation.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/image_segmentation.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update output shape with individual masks

* Rm dev change

* Remove loops in test
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

026866df

29 Sep, 2021 1 commit

Keras callback to push to hub each epoch, or after N steps (#13773) · 3a8a8013

Matt authored Sep 29, 2021



* Keras callback to push to hub each epoch, or after N steps

* Reworked the callback to use Repository

* Use an Enum for save_strategy

* Style pass

* Correct type for tokenizer

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding print message to the final upload

* Adding print message to the final upload

* Change how we wait for the last process to finish

* is_done is a property, not a method, derp

* Docstrings and documentation

* Style pass

* Style edit

* Docstring reformat

* Docstring rewrite

* Replacing print with internal logger
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

3a8a8013

14 Sep, 2021 1 commit

Push to hub when saving checkpoints (#13503) · 3081d386

Sylvain Gugger authored Sep 14, 2021

* Push to hub when saving checkpoints

* Add model card

* Revert partial model card

* Small fix for checkpoint

* Add tests

* Add documentation

* Fix tests

* Bump huggingface_hub

* Fix test

3081d386

10 Sep, 2021 1 commit

[Large PR] Entire rework of pipelines. (#13308) · c63fcabf

Nicolas Patry authored Sep 10, 2021



* Enabling dataset iteration on pipelines.

Enabling dataset iteration on pipelines.

Unifying parameters under `set_parameters` function.

Small fix.

Last fixes after rebase

Remove print.

Fixing text2text `generate_kwargs`

No more `self.max_length`.

Fixing tf only conversational.

Consistency in start/stop index over TF/PT.

Speeding up drastically on TF (nasty bug where max_length would increase
a ton.)

Adding test for support for non fast tokenizers.

Fixing GPU usage on zero-shot.

Fix working on Tf.

Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/pipelines/base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Small cleanup.

Remove all asserts + simple format.

* Fixing audio-classification for large PR.

* Overly explicit null checking.

* Encapsulating GPU/CPU pytorch manipulation directly within `base.py`.

* Removed internal state for parameters of the pipeline.

Instead of implicitly overriding internal state, we moved
to real named arguments on every `preprocess`, `_forward`,
`postprocess` function.

`_sanitize_parameters` is now used to split all kwargs
of both `__init__` and `__call__` into the 3 kinds of named parameters.

* Move import warnings.

* Small fixes.

* Quality.

* Another small fix, using the CI to debug faster.

* Last fixes.

* Last fix.

* Small cleanup of tensor moving.

* is not None.

* Adding a bunch of docs + an iteration test.

* Fixing doc style.

* KeyDataset = None guard.

* Removing the Cuda test for pipelines (was testing).

* Even more simple iteration test.

* Correct import.

* Long day.

* Fixes in docs.

* [WIP] migrating object detection.

* Fixed the target_size bug.

* Fixup.

* Bad variable name.

* Fixing `ensure_on_device` respects original ModelOutput.

c63fcabf
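
The parameter handling described in the commit above, where `_sanitize_parameters` splits the kwargs of `__init__` and `__call__` into arguments for `preprocess`, `_forward`, and `postprocess`, can be sketched with a toy class. Only the routing pattern follows the commit text. `ToyPipeline`, its `lowercase` and `top_k` options, and the length-based "model" are hypothetical stand-ins for illustration.

```python
from typing import Any, Dict, Tuple


class ToyPipeline:
    """Hypothetical pipeline; only the parameter routing mirrors the commit text."""

    def __init__(self, **kwargs: Any) -> None:
        # Parameters given at construction time become defaults for every call.
        self._pre, self._fwd, self._post = self._sanitize_parameters(**kwargs)

    def _sanitize_parameters(self, **kwargs: Any) -> Tuple[Dict, Dict, Dict]:
        # Route each known kwarg to the stage that consumes it.
        # Unknown kwargs are silently ignored in this sketch.
        preprocess, forward, postprocess = {}, {}, {}
        if "lowercase" in kwargs:
            preprocess["lowercase"] = kwargs["lowercase"]
        if "top_k" in kwargs:
            postprocess["top_k"] = kwargs["top_k"]
        return preprocess, forward, postprocess

    def preprocess(self, text: str, lowercase: bool = False) -> list:
        return (text.lower() if lowercase else text).split()

    def _forward(self, tokens: list) -> list:
        # Stand-in for a model: score each token by its length.
        return [(tok, len(tok)) for tok in tokens]

    def postprocess(self, scored: list, top_k: int = 1) -> list:
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

    def __call__(self, text: str, **kwargs: Any) -> list:
        pre, fwd, post = self._sanitize_parameters(**kwargs)
        # Call-time values override the defaults captured in __init__.
        pre = {**self._pre, **pre}
        fwd = {**self._fwd, **fwd}
        post = {**self._post, **post}
        model_inputs = self.preprocess(text, **pre)
        model_outputs = self._forward(model_inputs, **fwd)
        return self.postprocess(model_outputs, **post)


pipe = ToyPipeline(top_k=2)
result = pipe("Pipelines Split Kwargs", lowercase=True)
```

Because no stage reads mutable state set by an earlier call, two calls with different kwargs cannot interfere with each other, which is the benefit the commit message points to.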

08 Sep, 2021 3 commits

Fix typo in deepspeed documentation (#13482) · c3757380

Aleksander Smywiński-Pohl authored Sep 08, 2021

* Fix typo in deepspeed documentation

* Add missing import in deepspeed configuration

c3757380

fixed document (#13414) · 41cd52a7

Mohan Zhang authored Sep 08, 2021

41cd52a7

Object detection pipeline (#12886) · 2a15e8cc

Mishig Davaadorj authored Sep 08, 2021



* Implement object-detection pipeline

* Define threshold const

* Add `threshold` argument

* Refactor

* Uncomment test inputs

* `rm
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Chore better doc
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Rm unnecessary lines
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Chore better naming
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo

* Add `detr-tiny` for tests

* Add `ObjectDetectionPipeline` to `trnsfrmrs/init`

* Implement new bbox format

* Update detr post_process

* Update `load_img` method obj det pipeline

* make style

* Implement new testing format for obj det pipeln

* Add guard pytorch specific code in pipeline

* Add doc

* Make pipeline_obj_tet tests deterministic

* Revert some changes to `post_process` COCO api

* Chore

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Rm timm requirement

* make fixup

* Add timm requirement to test

* Make fixup

* Guard torch.Tensor

* Chore

* Delete unnecessary comment
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

2a15e8cc

06 Sep, 2021 1 commit

Update model configs - Allow setters for common properties (#13026) · c8be8a9a

Nils Reimers authored Sep 06, 2021

* refactor GPT Config to allow dyn. properties

* make attribute_map a class attribute

* remove old code

* update unit test to test config: Add test for common properties setter

* update unit test to test config: Add test for common properties passed as parameters to __init__

* update to black code format

* Allow that setters are not defined for certain config classes

* update config classes to implement attribute_map

* bugfix lxmert config - id2labels was not defined when num_labels was set

* update broken configs - add attribute_maps

* update bart config

* update black codestyle

* update documentation on common config attributes

* update GPTJ config to new attribute map

* update docs on common attributes

* gptj config: add max_position_embeddings

* gptj config: format with black

* update speech to text 2 config

* format doc file to max_len 119

* update config template

c8be8a9a
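
The `attribute_map` idea from the commit above, a class attribute that lets common property names be read and set even when a config stores them under model-specific names, can be sketched as follows. This is an illustration under stated assumptions: `ToyConfig` and its field names are hypothetical, and the `__getattr__`/`__setattr__` mechanism shown here is one way to get the behaviour, not necessarily how the library does it.

```python
class ToyConfig:
    """Hypothetical config; `attribute_map` maps common names to model-specific ones."""

    attribute_map = {"hidden_size": "n_embd", "num_hidden_layers": "n_layer"}

    def __init__(self, **kwargs):
        # Model-specific defaults.
        self.n_embd = 768
        self.n_layer = 12
        # Common names passed to __init__ go through __setattr__ as well.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __setattr__(self, key, value):
        # Redirect writes to a common name onto the model-specific attribute.
        key = type(self).attribute_map.get(key, key)
        super().__setattr__(key, value)

    def __getattr__(self, key):
        # Only called when normal lookup fails, i.e. for aliased names.
        target = type(self).attribute_map.get(key)
        if target is None:
            raise AttributeError(key)
        return getattr(self, target)


config = ToyConfig(hidden_size=1024)
config.num_hidden_layers = 6
```

Keeping the map on the class rather than on each instance means every instance of a config class shares one alias table, and subclasses can override it.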

01 Sep, 2021 1 commit

Add the `AudioClassificationPipeline` (#13342) · b9c6a976

Anton Lozhkov authored Sep 01, 2021

* Add the audio classification pipeline

* Remove autoconfig exception

* Mark ffmpeg test as slow

* Rearrange pipeline tests

* Add small test

* Replace asserts with ValueError

b9c6a976

31 Aug, 2021 2 commits

TF/Numpy variants for all DataCollator classes (#13105) · 854260ca

Matt authored Aug 31, 2021



* Adding a TF variant of the DataCollatorForTokenClassification to get feedback

* Added a Numpy variant and a post_init check to fail early if a missing import is found

* Fixed call to Numpy variant

* Added a couple more of the collators

* Update src/transformers/data/data_collator.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fixes, style pass, finished DataCollatorForSeqToSeq

* Added all the LanguageModeling DataCollators, except SOP and PermutationLanguageModeling

* Adding DataCollatorForPermutationLanguageModeling

* Style pass

* Add missing `__call__` for PLM

* Remove `post_init` checks for frameworks because the imports inside them were making us fail code quality checks

* Remove unused imports

* First attempt at some TF tests

* A second attempt to make any of those tests actually work

* TF tests, round three

* TF tests, round four

* TF tests, round five

* TF tests, all enabled!

* Style pass

* Merging tests into `test_data_collator.py`

* Merging tests into `test_data_collator.py`

* Fixing up test imports

* Fixing up test imports

* Trying shuffling the conditionals around

* Commenting out non-functional old tests

* Completed all tests for all three frameworks

* Style pass

* Fixed test typo

* Style pass

* Move standard `__call__` method to mixin

* Rearranged imports for `test_data_collator`

* Fix data collator typo "torch" -> "pt"

* Fixed the most embarrassingly obvious bug

* Update src/transformers/data/data_collator.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Renaming mixin

* Updating docs
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dalton Walker <dalton_walker@icloud.com>
Co-authored-by: Andrew Romans <andrew.romans@hotmail.com>

854260ca

Fixing a typo in the data_collator documentation (#13309) · 11fbc32e

Serhiy-Shekhovtsov authored Aug 31, 2021

11fbc32e

30 Aug, 2021 1 commit

fix: typo spelling grammar (#13212) · 01977466

arfy slowy authored Aug 30, 2021

* fix: typo spelling grammar

* fix: make fixup

01977466

04 Aug, 2021 1 commit

[Flax] Correct flax docs (#12782) · fbf468b0

Patrick von Platen authored Aug 04, 2021

* fix_torch_device_generate_test

* remove @

* fix flax docs

* correct more docs in flax

* another correction

* fix flax docs

* Apply suggestions from code review

fbf468b0

21 Jul, 2021 1 commit

[Deepspeed] warmup_ratio docs (#12830) · 807b6bd1

Stas Bekman authored Jul 21, 2021



* [Deepspeed] warmup_ratio docs

* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

807b6bd1

17 Jul, 2021 1 commit

Fix push_to_hub docstring and make it appear in doc (#12770) · da72ac6e

Sylvain Gugger authored Jul 17, 2021

da72ac6e

14 Jul, 2021 1 commit

non-native optimizers are mostly ok with zero-offload (#12690) · 5dd0c956

Stas Bekman authored Jul 13, 2021

5dd0c956

13 Jul, 2021 1 commit

[Deepspeed] adapt multiple models, add zero_to_fp32 tests (#12477) · 78f5fe14

Stas Bekman authored Jul 13, 2021



* zero_to_fp32 tests

* args change

* remove unnecessary work

* use transformers.trainer_utils.get_last_checkpoint

* document the new features

* cleanup

* wip

* fix fsmt

* add bert

* cleanup

* add xlm-roberta

* electra works

* cleanup

* sync

* split off the model zoo tests

* cleanup

* cleanup

* cleanup

* cleanup

* reformat

* cleanup

* casing

* deepspeed>=0.4.3

* adjust distilbert

* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

78f5fe14

29 Jun, 2021 1 commit

[models] respect dtype of the model when instantiating it (#12316) · 7682e977

Stas Bekman authored Jun 28, 2021



* [models] respect dtype of the model when instantiating it

* cleanup

* cleanup

* rework to handle non-float dtype

* fix

* switch to fp32 tiny model

* improve

* use dtype.is_floating_point

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix the doc

* recode to use explicit torch_dtype_auto_detect, torch_dtype args

* docs and tweaks

* docs and tweaks

* docs and tweaks

* merge 2 args, add docs

* fix

* fix

* better doc

* better doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7682e977

25 Jun, 2021 1 commit

remove extra white space from log format (#12360) · 4a872cae

Stas Bekman authored Jun 25, 2021

4a872cae

23 Jun, 2021 1 commit

[Deepspeed] new docs (#12077) · 07ae6103

Stas Bekman authored Jun 23, 2021



* document sub_group_size

* style

* install + issues reporting

* style

* style

* Update docs/source/main_classes/deepspeed.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* indent 4

* restore

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

07ae6103

22 Jun, 2021 2 commits

[trainer] 2 bug fixes and a rename (#12309) · ebe54135

Stas Bekman authored Jun 22, 2021

* bug fixes and a rename

* add extended DDP test

ebe54135

[trainer + examples] set log level from CLI (#12276) · dad414d5

Stas Bekman authored Jun 21, 2021



* set log level from CLI

* add log_level_replica + test + extended docs

* cleanup

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rename datasets objects to allow datasets module

* improve the doc

* style

* doc improve
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

dad414d5

14 Jun, 2021 1 commit

consistent nn. and nn.functional: part 5 docs (#12161) · 04028317

Stas Bekman authored Jun 14, 2021

04028317

08 Jun, 2021 1 commit

typo · 0e82f0cb

Stas Bekman authored Jun 08, 2021

0e82f0cb