Commits · d1fcc90abf34cc498c8a65a717ad0d9354ceca97 · chenpangpang / transformers

24 Feb, 2022 1 commit
- Fix from_pretrained with default base_model_prefix (#15814) · d1fcc90a
  Sylvain Gugger authored Feb 24, 2022
  
  d1fcc90a
17 Feb, 2022 1 commit

NielsRogge authored Feb 17, 2022



* Add first draft

* Make model importable

* Make SwinForMaskedImageModeling importable

* Fix imports

* Add missing inits

* Add support for Swin

* Fix bug

* Fix bug

* Fix another bug

* Fix Swin MIM implementation

* Fix default encoder stride

* Fix Swin

* Add print statements for debugging

* Add image_size data argument

* Fix Swin

* Fix image_size

* Add print statements for debugging

* Fix print statement

* Remove print statements

* Improve reshaping of bool_masked_pos

* Add support for DeiT, fix tests

* Improve docstrings

* Apply new black version

* Improve script

* Fix bug

* Improve README

* Apply suggestions from code review

* Remove DS_Store and add to gitignore

* Apply suggestions from code review + fix BEiT Flax

* Revert BEiT changes

* Improve README

* Fix code quality

* Improve README
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

57882177

15 Feb, 2022 1 commit

Fix model equivalence tests (#15670) · 943e2aa0

Lysandre Debut authored Feb 15, 2022



* Fix model equivalence tests

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

943e2aa0

09 Feb, 2022 1 commit
- Make sure custom configs work with Transformers (#15569) · 1f60bc46
  Sylvain Gugger authored Feb 09, 2022
```
* Make sure custom configs work with Transformers

* Apply code review suggestions
```
  1f60bc46
08 Feb, 2022 1 commit

Add TFSpeech2Text (#15113) · 8406fa6d

Joao Gante authored Feb 08, 2022

* Add wrapper classes

* convert inner layers to tf

* Add TF Encoder and Decoder layers

* TFSpeech2Text models

* Loadable model

* TF model with same outputs as PT model

* test skeleton

* correct tests and run the fixup

* correct attention expansion

* TFSpeech2Text past_key_values with TF format

8406fa6d

07 Feb, 2022 1 commit

FX tracing improvement (#14321) · 0fe17f37

Michael Benayoun authored Feb 07, 2022

* Change the way tracing happens, enabling dynamic axes out of the box

* Update the tests and modeling xlnet

* Add the non-recording of leaf modules to avoid recording more values for the methods to record than will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).

* Comments and making tracing work for gpt-j and xlnet

* Refactor things related to num_choices (and batch_size, sequence_length)

* Update fx to work on PyTorch 1.10

* Postpone autowrap_function feature usage for later

* Add copyrights

* Remove unnecessary file

* Fix issue with add_new_model_like

* Apply suggestions

0fe17f37

02 Feb, 2022 1 commit

Save code of registered custom models (#15379) · 44b21f11

Sylvain Gugger authored Feb 02, 2022



* Allow dynamic modules to use relative imports

* Work for configs

* Fix last merge conflict

* Save code of registered custom objects

* Map strings to strings

* Fix test

* Add tokenizer

* Rework tests

* Tests

* Ignore fixtures py files for tests

* Tokenizer test + fix collection

* With full path

* Rework integration

* Fix typo

* Remove changes in conftest

* Test for tokenizers

* Add documentation

* Update docs/source/custom_models.mdx
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add file structure and file content

* Add more doc

* Style

* Update docs/source/custom_models.mdx
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Address review comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

44b21f11

20 Dec, 2021 1 commit

Add a main_input_name attribute to all models (#14803) · 33f36c86

Sylvain Gugger authored Dec 20, 2021



* Add a main_input_name attribute to all models

* Fix tests

* Wtf Vs Code?

* Update src/transformers/models/imagegpt/modeling_imagegpt.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Style

* Fix copies
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

33f36c86

29 Nov, 2021 1 commit
- Rename ImageGPT (#14526) · 25156eb2
  NielsRogge authored Nov 29, 2021
```
* Rename

* Add MODEL_FOR_CAUSAL_IMAGE_MODELING_MAPPING
```
  25156eb2
18 Nov, 2021 1 commit

Add a post init method to all models (#14431) · d83b0e0c

Sylvain Gugger authored Nov 18, 2021

* Add a post init method to all models

* Fix tests

* Fix last tests

* Fix templates

* Add comment

* Forgot to save

d83b0e0c

16 Nov, 2021 1 commit

Fix gradient_checkpointing backward compatibility (#14408) · 040fd471

Sylvain Gugger authored Nov 16, 2021



* Fix gradient_checkpointing backward compatibility

* Remove needless line

* make sure mask prob is big enough and length small enough

* Fix tests
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

040fd471

09 Nov, 2021 1 commit

Add TFViTModel (#13778) · be4a6c64

Yih-Dar authored Nov 09, 2021



* Start the work for TFViTModel

* Convert to TF code - need to check in the follow up commits

* Clean up model code

* Expose TFViTModel

* make style

* make quality

* Add test

* make style & quality

* Fix some imports

* fix wrong usage - *kwargs => **kwargs

* Fix Conv2D weight loading (PT->TF) issue

* Add tests for images with different sizes + fix model

* Fix some common tests for TFViTModel

* Use inputs instead of input_ids in test_compile_tf_model

* Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name

* Avoid transpose in TFViT call

* Fix Conv2D issue in load_tf2_weights_in_pytorch_model

* Use tf.keras.layers.Conv2D instead of tf.nn.conv2d

* Using simpler heuristic to detect Conv2D layer

* Change convert_tf_weight_name_to_pt_weight_name to return TransposeType

* Check tf_weight_shape is not None before using it

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix missing comma

* fix input dtype
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

be4a6c64

08 Nov, 2021 1 commit
- Expand dynamic supported objects to configs and tokenizers (#14296) · dfb00bf6
  Sylvain Gugger authored Nov 08, 2021
```
* Dynamic configs

* Add config test

* Better tests

* Add tokenizer and test

* Add to from_config

* With save
```
  dfb00bf6
02 Nov, 2021 1 commit
- Update Transformers to huggingface_hub >= 0.1.0 (#14251) · 558f8543
  Sylvain Gugger authored Nov 02, 2021
```
* Update Transformers to huggingface_hub >= 0.1.0

* Forgot to save...

* Style

* Fix test
```
  558f8543
01 Nov, 2021 1 commit

Add BeitForSemanticSegmentation (#14096) · e20faa6f

NielsRogge authored Nov 01, 2021



* Add first draft

* Make forward pass work

* Improve conversion script

* Add notebook that checks if it works

* Add BeitForSemanticSegmentation to the tests

* More improvements

* Make BeitForSemanticSegmentation consistent with Segformer

* Small bug fix

* Add BeitForSemanticSegmentation to docs

* Make sure model doesn't output hidden states when the user doesn't want to

* Make it possible to convert the large model

* Fix issue

* Fix conversion script for large model

* Add auxiliary_head option to semantic segmentation model

* Apply suggestions from @sgugger's review

* Apply suggestions from code review

* Fix failing test
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

e20faa6f

29 Oct, 2021 1 commit

Generalize problem_type to all sequence classification models (#14180) · c28bc80b

Sylvain Gugger authored Oct 29, 2021

* Generalize problem_type to all classification models

* Missing import

* Deberta BC and fix tests

* Fix template

* Missing imports

* Revert change to reformer test

* Fix style

c28bc80b

25 Oct, 2021 1 commit

Add TF<>PT and Flax<>PT everywhere (#14047) · 0c3174c7

Patrick von Platen authored Oct 25, 2021

* up

* up

* up

* up

* up

* up

* up

* add clip

* fix clip PyTorch

* fix clip PyTorch

* up

* up

* up

* up

* up

* up

* up

0c3174c7

21 Oct, 2021 1 commit

Fix ignore_mismatched_sizes (#14085) · 234cfefb

Li-Huai (Allan) Lin authored Oct 22, 2021

* Fix

* Style

* Name

* Fix tests

* Style

* Remove embed sizes checking

* Disable some tests

* Fix

* Apply suggestion

234cfefb

11 Oct, 2021 1 commit

[Gradient checkpointing] Correct disabling `find_unused_parameters` in Trainer... · dca67968

Patrick von Platen authored Oct 11, 2021

[Gradient checkpointing] Correct disabling `find_unused_parameters` in Trainer when gradient checkpointing is enabled (#13961)

* up

* correct test

dca67968

05 Oct, 2021 1 commit

Initial support for symbolic tracing with torch.fx allowing dynamic axes (#13579) · d4e4efce

Michael Benayoun authored Oct 05, 2021



* Symbolic trace dynamic axes support for BERT like models (albert, bert, distilbert, mobilebert, electra, megatron-bert)
* Sanity checks before tracing that make sure the model to trace is supported
* Adapted to PyTorch 1.9
Co-authored-by: Michael Benayoun <michael@huggingface.co>

d4e4efce

22 Sep, 2021 1 commit

Make gradient_checkpointing a training argument (#13657) · 27d46397

Sylvain Gugger authored Sep 22, 2021



* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

27d46397

20 Sep, 2021 1 commit

Dynamically load model code from the Hub (#13467) · 002a078a

Sylvain Gugger authored Sep 20, 2021



* Dynamic model

* Use defensive flag

* Style

* Doc and arg rename

* Arg rename

* Add tests

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

002a078a

15 Sep, 2021 1 commit
- [Pretrained Model] Add resize_position_embeddings (#13559) · 95f933ea
  Patrick von Platen authored Sep 15, 2021
```
* finish

* delete bogus file

* correct some stuff

* finish

* finish
```
  95f933ea
31 Aug, 2021 2 commits
- Clean up test file · 74b3344f
  Sylvain Gugger authored Aug 31, 2021
  
  74b3344f
- Tests fetcher tests (#13340) · 8b2de0e4
  Sylvain Gugger authored Aug 31, 2021
```
* Incorporate tests dependencies in tests_fetcher

* Harder modif

* Debug

* Loop through all files

* Last modules

* Remove debug statement
```
  8b2de0e4
24 Aug, 2021 1 commit

fix `AutoModel.from_pretrained(..., torch_dtype=...)` (#13209) · 5c6eca71

Stas Bekman authored Aug 24, 2021

* fix AutoModel.from_pretrained(..., torch_dtype=...)

* fix to_diff_dict

* add better test

* torch is not always available when a model has self.torch_dtype

5c6eca71

15 Jul, 2021 1 commit
- Fix AutoModel tests (#12733) · 3290315a
  Lysandre Debut authored Jul 15, 2021
  
  3290315a
13 Jul, 2021 1 commit

Add option to load a pretrained model with mismatched shapes (#12664) · 90178b0c

Sylvain Gugger authored Jul 13, 2021



* Add option to load a pretrained model with mismatched shapes

* Fail at loading when mismatched shapes in Flax

* Fix tests

* Update src/transformers/modeling_flax_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

90178b0c

01 Jul, 2021 1 commit

[roberta] fix lm_head.decoder.weight ignore_key handling (#12446) · 2d1d9218

Stas Bekman authored Jul 01, 2021



* fix lm_head.decoder.weight ignore_key handling

* fix the mutable class variable

* Update src/transformers/models/roberta/modeling_roberta.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* replicate the comment

* make deterministic
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

2d1d9218

29 Jun, 2021 1 commit

[models] respect dtype of the model when instantiating it (#12316) · 7682e977

Stas Bekman authored Jun 28, 2021



* [models] respect dtype of the model when instantiating it

* cleanup

* cleanup

* rework to handle non-float dtype

* fix

* switch to fp32 tiny model

* improve

* use dtype.is_floating_point

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix the doc

* recode to use explicit torch_dtype_auto_detect, torch_dtype args

* docs and tweaks

* docs and tweaks

* docs and tweaks

* merge 2 args, add docs

* fix

* fix

* better doc

* better doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7682e977

24 Jun, 2021 1 commit
- Fix torchscript tests (#12336) · 8ef62ec9
  Lysandre Debut authored Jun 24, 2021
```
* Fix torchscript tests

* Better test

* Remove bogus print
```
  8ef62ec9
23 Jun, 2021 2 commits

changed modeling_fx_utils.py to utils/fx.py for clarity (#12326) · 986ac03e
Michael Benayoun authored Jun 23, 2021
```
Co-authored-by: Michael Benayoun <michael@huggingface.co>
```
986ac03e

Clean push to hub API (#12187) · 53c60bab

Sylvain Gugger authored Jun 23, 2021



* Clean push to hub API

* Create working dir if it does not exist

* Different tweak

* New API + all models + test Flax

* Adds the Trainer clean up

* Update src/transformers/file_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* (nit) output types

* No need to set clone_from when folder exists

* Update src/transformers/trainer.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Add generated_from_trainer tag

* Update to new version

* Fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

53c60bab

14 Jun, 2021 1 commit
- [style] consistent nn. and nn.functional: part 3 `tests` (#12155) · 372ab9cd
  Stas Bekman authored Jun 14, 2021
```
* consistent nn. and nn.functional: p3 templates

* restore
```
  372ab9cd
09 Jun, 2021 1 commit

Add DETR (#11653) · d3eacbb8

NielsRogge authored Jun 09, 2021



* Squash all commits of modeling_detr_v7 branch into one

* Improve docs

* Fix tests

* Style

* Improve docs some more and fix most tests

* Fix slow tests of ViT, DeiT and DETR

* Improve replacement of batch norm

* Restructure timm backbone forward

* Make DetrForSegmentation support any timm backbone

* Fix name of output

* Address most comments by @LysandreJik

* Give better names for variables

* Conditional imports + timm in setup.py

* Address additional comments by @sgugger

* Make style, add require_timm and require_vision to tests

* Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone

* Add png files to fixtures

* Fix type hint

* Add timm to workflows

* Add `BatchNorm2d` to the weight initialization

* Fix retain_grad test

* Replace model checkpoints by Facebook namespace

* Fix name of checkpoint in test

* Add user-friendly message when scipy is not available

* Address most comments by @patrickvonplaten

* Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner

* Better initialization

* Scipy is necessary to get sklearn metrics

* Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel

* Make style

* Improve docs and add 2 community notebooks
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

d3eacbb8

25 May, 2021 1 commit
- Add some tests to the slow suite #11860 · db0b2477
  Lysandre Debut authored May 25, 2021
  
  db0b2477
20 May, 2021 2 commits

A cleaner and more scalable implementation of symbolic tracing (#11763) · f4a0d6ff

Michael Benayoun authored May 20, 2021



Cleaner and more scalable implementation of symbolic tracing with torch.fx, and provides support for new architectures:
- ALBERT
- DistilBERT
- MobileBERT
- MegatronBERT
- GPT2
- GPT Neo
Co-authored-by: Michael Benayoun <michael@huggingface.co>

f4a0d6ff

Fix regression in regression (#11785) · 469384a7
Sylvain Gugger authored May 20, 2021
```
* Fix regression in regression

* Add test
```
469384a7

14 May, 2021 1 commit

Experimental symbolic tracing feature with torch.fx for BERT, ELECTRA and T5 (#11475) · 86d5fb0b

Michael Benayoun authored May 14, 2021



Symbolic tracing feature for BERT, ELECTRA and T5
Co-authored-by: Michael Benayoun <michael@huggingface.co>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

86d5fb0b

13 May, 2021 1 commit
- Fix loading the best model on the last stage of training (#11718) · 218d552f
  Volodymyr Byno authored May 13, 2021
  
  218d552f