Commits · ae7bae8fe768128f14b4224420bfb3aa7807c970 · chenpangpang / transformers

08 Jun, 2022 2 commits

fix `train_new_from_iterator` in the case of byte-level tokenizers (#17549) · ae7bae8f
SaulLu authored Jun 08, 2022

ae7bae8f

Add TFData2VecVision for semantic segmentation (#17271) · 9d99489f

Sayak Paul authored Jun 08, 2022



* feat: initial implementation of data2vec segmentation model in TF.

* chore: minor corrections to make the segmenter work.

* chore: removed unnecessary files.

* chore: add tests and other modifications.

* fix: loss computation for segmentation.

* chore: remove unused variable.

* chore: formatting.

* added a dummy adaptive pooling layer.

* removed unnecessary file.

* potentially add identifiers to layer names.

* fix: layer naming.

* chore: removed unnecessary print.

* Skipping unneeded test

* chore: add logging to debug tolerance.

* fix: segmentation tests for tfdata2vecvision

* chore: make style.

* fix: layer names, assertion to be resolved.

* Bumping test tolerance a bit

* chore: bump the tol in PT test.
Co-authored-by: matt <rocketknight1@gmail.com>

9d99489f

07 Jun, 2022 3 commits

M-CTC-T Model (#16402) · 119e3c0f

Chan Woo Kim authored Jun 08, 2022



* added cbs to notebooks, made copy-paste error fix in generation_utils

* initial push for mctc model

* mctc feature extractor done

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* passing attention, now struggling to figure out how attention masks make sense here

* works when excluding attention masks. ask later how one would integrate attention masks here

* bizarre configuration error (model prefix comes first in config dict json and messes up the order)

* all passing but bizarre config dict ordering issue when to_dict

* passing all major tests

* feature extraction, processor, tokenizer added & tests passing

* style & consistency & other logistical fixes

* copy paste fix

* model after feature extraction working

* committing final feature extraction results; need to fix normalization

* feature extraction passing tests; probably should add tests on the specific flashlight-copied functions?

* delete print ; format code a bit

* fixing tests

* passing major tests

* fixing styles

* completed tokenization test with real example; not sure if these values are entirely correct.

* last test fixes from local

* reverting accidentally included custom setup configs

* remove load tf weights; fix config error

* testing couldn't import featureextractor

* fix docs

* fix docs

* resolving comments

* style fixes

* style fixes

* Update to MCTCConv1dSubSampler
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* relposemb fixes

* conv1d name issue; expecting config fail with parentheses

* fix config issue

* fix config issue

* fix config issue

* change everything to MCTCT

* fixing naming change errors

* archive list

* copyrights and docs

* copyrights and docs

* copyrights and docs

* merge resolution

* move tests, fix to changed optionaldependency structure

* test directories changed

* fixing tests

* how to avoid tf tests?

* how to avoid tf tests?

* tests passing locally

* allow mctctprocessor imported any env

* allow mctctprocessor imported any env

* fixed second round of feedback, need to fix docs

* doc changes not being applied

* all fixed

* style fix

* feedback fixes

* fix copies and feature extraction style fix

* Update tests/models/visual_bert/test_modeling_visual_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* copy paste huggingface:main visual bert

* added eof newline to visual bert; all tests are passing otherwise

* fix slow tests by adding attention mask

* change model id to speechbrain

* make fix-copies

* fix readme unwanted deletes

* fixing readmes, make fix-copies

* consistent M-CTC-T naming

* Update src/transformers/models/mctct/__init__.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* all fixed but variable naming

* adjust double quotes

* fixed variable names

* copyright and mr quilter

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* correct slow tests

* make fix-copies

* Update src/transformers/models/mctct/configuration_mctct.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mctct/configuration_mctct.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* m-ctc-t not mctct
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

119e3c0f

Fx support for Deberta-v[1-2], Hubert and LXMERT (#17539) · 5c8f6010

Michael Benayoun authored Jun 07, 2022

* Support for deberta and deberta-v2

* Support for LXMert

* Support for Hubert

* Fix for pt1.11

* Trigger CI

5c8f6010

Skip disk offload test for T5 · 9e72eb44
Sylvain Gugger authored Jun 07, 2022

9e72eb44

06 Jun, 2022 3 commits

Add magic method to our TF models to convert datasets with column inference (#17160) · 19a8a303

Matt authored Jun 06, 2022



* Add method to call to_tf_dataset() with column inference

* Add test for dataset creation

* Add a default arg for data collator

* Fix test

* Fix call with non-dev version of datasets

* Test correct column removal too

* make fixup

* More tests to make sure we remove unwanted columns

* Fix test to avoid predicting on unbuilt models

* Fix test to avoid predicting on unbuilt models

* Fix test to remove unwanted head mask columns from inputs

* Stop pushing your debug breakpoints to the main repo of the $2bn company you work for

* Skip the test in convnext because no grouped conv support

* Drop bools from the dataset dict

* Make style

* Skip the training test for models whose input dicts don't give us labels

* Skip transformerXL in the test because it doesn't return a simple loss

* Skip TFTapas because of some odd NaN losses

* make style

* make fixup

* Add docstring

* fixup

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove breakpoint from tests

* Fix assert, add requires_backends

* Protect tokenizer import with if TYPE_CHECKING

* make fixup

* Add noqa, more fixup

* More rearranging for ~* aesthetics *~

* Adding defaults for shuffle and batch_size to match to_tf_dataset()

* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

19a8a303

[deepspeed / testing] reset global state (#17553) · d28b7aa8
Stas Bekman authored Jun 06, 2022

* [deepspeed] fix load_best_model test

* [deepspeed] add state reset on unittest tearDown

d28b7aa8
fix integration test levit (#17555) · da71df1a
Anugunj Naman authored Jun 06, 2022

da71df1a

03 Jun, 2022 4 commits

[deepspeed] fix load_best_model test (#17550) · 26e5e129
Stas Bekman authored Jun 03, 2022

26e5e129
Fix all offload and MP tests (#17533) · 83439012
Sylvain Gugger authored Jun 03, 2022

83439012

Add support for Perceiver ONNX export (#17213) · babeff55

Patrick Deutschmann authored Jun 03, 2022



* Start adding perceiver support for ONNX

* Fix pad token bug for fast tokenizers

* Fix formatting

* Make get_preprocesor more opinionated (processor priority, otherwise tokenizer/feature extractor)

* Clean docs format

* Minor cleanup following @sgugger's comments

* Fix typo in docs

* Fix another docs typo

* Fix one more typo in docs

* Update src/transformers/onnx/utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/onnx/utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/onnx/utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

babeff55

Add Gated-SiLU to T5 (#17420) · 607acd4f

DanielHesslow authored Jun 03, 2022



* Add gated-silu to t5 architecture to support UL2

* Fix error message

* formatting

* formatting again

* refactor

* fix classnames in _init_weights

* remove is_gated

* add test

* fix test

* Try without the test?

* Add back the test.

* Improve error message.
Co-authored-by: Daniel Hesslow <daniel@lighton.ai>

607acd4f
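
For context on the Gated-SiLU commit above: a gated feed-forward block multiplies a SiLU-activated projection of the input by a second, purely linear projection, then applies an output projection. The pure-Python sketch below only illustrates that arithmetic. The function and weight names (`gated_silu_ffn`, `wi_0`, `wi_1`, `wo`) are hypothetical and this is not the code from the PR.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(weights, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def gated_silu_ffn(x, wi_0, wi_1, wo):
    """Gated feed-forward: wo @ (silu(wi_0 @ x) * (wi_1 @ x))."""
    gate = [silu(h) for h in matvec(wi_0, x)]    # activated branch
    linear = matvec(wi_1, x)                     # linear branch
    hidden = [g * l for g, l in zip(gate, linear)]
    return matvec(wo, hidden)


# Toy weights chosen for illustration only (assumption, not from the PR).
x = [1.0, -2.0]
wi_0 = [[0.5, 0.0], [0.0, 0.5], [1.0, 1.0]]
wi_1 = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
wo = [[1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]
out = gated_silu_ffn(x, wi_0, wi_1, wo)
print(out)
```

The gate and linear branches share the same input and hidden width, so the elementwise product is well defined before the output projection.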

02 Jun, 2022 2 commits

fix OPT-Flax CI tests (#17512) · 013462c5
Arthur authored Jun 02, 2022

013462c5

[trainer/deepspeed] load_best_model (reimplement re-init) (#17151) · 2f59ad16

Stas Bekman authored Jun 02, 2022



* [trainer/deepspeed] load_best_model

* to sync with DS PR #1947

* simplify

* rework load_best_model test

* cleanup

* bump deepspeed>=0.6.5
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>

2f59ad16

01 Jun, 2022 8 commits

Fix Tapas tests (#17510) · 58fb3c9f
Yih-Dar authored Jun 01, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

58fb3c9f
CLI: tool to convert PT into TF weights and open hub PR (#17497) · ca1f1c86
Joao Gante authored Jun 01, 2022

ca1f1c86

Adding LeViT Model by Facebook (#17466) · 84aaadd8

Anugunj Naman authored Jun 01, 2022



* levit files

* levit tests

* weights script

* weights script

* update

* style fixes

* few minor corrections

* Added teacher model

* edit docs

* fix-copies

* style fixes

* pr error resolved

* Update README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/index.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/configuration_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/configuration_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* suggested pr changes

* style fixes

* minor bug

* update

* minor doc edit

* style

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/levit/test_modeling_levit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* residual layer readable

* style

* Update docs/source/en/model_doc/levit.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update tests/models/levit/test_feature_extraction_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* change checkpoints and style

* update

* minor changes

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/levit/modeling_levit.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

84aaadd8

Fix CTRL tests (#17508) · 1d2b57b8

Yih-Dar authored Jun 01, 2022



* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

1d2b57b8

Fix LayoutXLMProcessorTest (#17506) · 693720e5
Yih-Dar authored Jun 01, 2022

* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

693720e5

Debug LukeForMaskedLM (#17499) · 4d1ce396

Ryokan RI authored Jun 01, 2022

* add a test for a word only input

* make LukeForMaskedLM work without entity inputs

* update test

* add LukeForMaskedLM to MODEL_FOR_MASKED_LM_MAPPING_NAMES

* restore pyproject.toml

* empty line at the end of pyproject.toml

4d1ce396

Fix MP and CPU offload tests for Funnel and GPT-Neo (#17503) · 4390151b
Sylvain Gugger authored Jun 01, 2022

4390151b

Add OnnxConfig for SqueezeBert iss17314 (#17315) · 4f38808e

Ruihua Fang authored Jun 01, 2022



* add onnx config for SqueezeBert

* add test for onnx config for SqueezeBert

* add automatically updated doc for onnx config for SqueezeBert

* Update src/transformers/onnx/features.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update src/transformers/models/squeezebert/configuration_squeezebert.py
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

4f38808e

31 May, 2022 7 commits

Opt in flax and tf (#17388) · 7822a9b7

Arthur authored May 31, 2022



* initial commit

* add init file

* update global init

* update index and dummy objects

* style

* update modelling auto

* fix init typo in src/transformers

* fix typo in modeling tf auto, opt was in wrong mapping name

* fixed a slow test : saved_model

* style

* fix positional embedding if no position id is provided

* update tf test

* update test flax requirements

* fixed serialization

* update

* update tf name to allow smooth conversion

* update flax tests

* style

* fix test typo

* fix tf typo test

* add xla for generate support in causal LM

* fixed bug

* cleaned tf tests

* style

* removed from PT for slow tests

* fix typo

* opt test as slow

* trying to fix GPT2 undefined

* correct documentation and add to test doc

* update tf doc

* fix doc

* fake commit

* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* update test based on review

* merged main layer for functioning test

* fixup + quality

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update long comment

* make fix copies
Co-authored-by: Arthur <arthur@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7822a9b7

[Json configs] Make json prettier for all saved tokenizer files & ensure same... · f394a2a5

Patrick von Platen authored May 31, 2022

[Json configs] Make json prettier for all saved tokenizer files & ensure same json format for all processors (tok + feat_extract) (#17457)

* [Json dump] Make json prettier

* correct more tokenizers

* more patterns

* add aggressive test

* the aggressive test was actually useful :-)

* more tests

* Apply suggestions from code review

f394a2a5
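
As a rough illustration of what "prettier" JSON for saved config files can mean, the sketch below writes a dict with indentation, sorted keys, and a trailing newline using the standard `json` module. The helper name `dump_pretty_json`, the sample keys, and the exact formatting options are assumptions for illustration; the options chosen in the PR are not shown in this log.

```python
import json


def dump_pretty_json(data: dict) -> str:
    """Serialize `data` with 2-space indentation, sorted keys, and a trailing newline."""
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


# Hypothetical tokenizer config used only as sample input.
tokenizer_config = {"unk_token": "<unk>", "do_lower_case": True, "model_max_length": 512}
text = dump_pretty_json(tokenizer_config)
print(text, end="")
```

Sorting keys keeps the saved files stable across runs, which makes diffs between tokenizer and feature extractor configs easier to read.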

Added XLM onnx config (#17030) · 5af38953

Ritik Nandwal authored May 31, 2022

* Add onnx configuration for xlm

* Add supported features for xlm

* Add xlm to models exportable with onnx

* Add xlm architecture to test file

* Modify docs

* Make code quality fixes

5af38953

Disk offload fix (#17428) · 567d9c06
Sylvain Gugger authored May 31, 2022

* Fix offload to disk for big models

* Add test

* Fix test for other models

567d9c06

TF: GPT-2 generation supports left-padding (#17426) · 975dd2bb

Joao Gante authored May 31, 2022

* TF GPT-2 now properly works with left padding

* throw a warning when eos token == pad token and there is no attention mask

975dd2bb
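
To make the left-padding commit above concrete, the sketch below pads a batch on the left, builds the matching attention mask, and emits a warning in the situation the commit message describes (EOS token equal to the pad token with no attention mask supplied). The helper names `left_pad` and `warn_if_ambiguous_padding` are hypothetical, and the logic is a simplified stand-in, not the library's implementation.

```python
import warnings


def left_pad(sequences, pad_token_id):
    """Left-pad token id lists to a common length and build an attention mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append([pad_token_id] * n_pad + list(seq))
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask


def warn_if_ambiguous_padding(attention_mask, pad_token_id, eos_token_id):
    """Warn when padding cannot be distinguished from a genuine EOS token."""
    if attention_mask is None and pad_token_id == eos_token_id:
        warnings.warn(
            "pad_token_id equals eos_token_id and no attention mask was given; "
            "padding positions cannot be inferred reliably.",
            UserWarning,
        )
        return True
    return False


ids, mask = left_pad([[11, 12, 13], [21]], pad_token_id=0)
print(ids)
print(mask)
```

Passing the mask explicitly avoids the ambiguity, since the mask marks padded positions regardless of which token id was used to fill them.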

Fix ViTMAEModelTester (#17470) · c1a13861
Yih-Dar authored May 31, 2022

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

c1a13861

Fx support for multiple model architectures (#17393) · 28d00482

Michael Benayoun authored May 31, 2022

* Support for Bart and LayoutLM, and partial support for XLNet

* Support for mbart

* A lot of new models supported

* Support for other models

* LayoutLM fix

* Use strings instead of classes

28d00482

26 May, 2022 1 commit

Fix model parallelism test (#17439) · 98f6e1ee
Sylvain Gugger authored May 26, 2022

98f6e1ee

25 May, 2022 3 commits

Support compilation via Torchdynamo, AOT Autograd, NVFuser (#17308) · 897a8dd8

Animesh Jain authored May 25, 2022



* Support compilation via Torchdynamo, AOT Autograd, NVFuser

* Address comments

* Lint

* Stas comments - missing quality test

* Linter

* Quality test

* Doc lint

* Reset CUDA peak mem

* Add CustomTrainer

* require a single gpu
Co-authored-by: Stas Bekman <stas@stason.org>

897a8dd8

Add test for new model parallelism features (#17401) · 31484afb
Sylvain Gugger authored May 25, 2022

31484afb
Fix expected value for OPT test `test_inference_no_head` (#17395) · 4d727bd2
Yih-Dar authored May 25, 2022

* Fix expected value

* 5e-5
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

4d727bd2

24 May, 2022 4 commits

[WIP] Adding GPT-NeoX-20B (#16659) · 71e60272

Jason Phang authored May 24, 2022



* initial

* first try

* working 20B

* 20B tokenizers

* Docs

* Import fixes for missing classes

* Update docs, fixup

* black formatting

* isort

* flake

* dummy objects

* documentation

* Documentation yml

* more docs

* tweaks for tests

* tokenization auto

* fix neox tests

* test

* test

* einsum

* address PR feedback

* Documentation

* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gpt_neox/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gpt_neox/configuration_gpt_neox.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove undefined LaTeX syntax

* Update to full url to avoid confusion about if that's supposed to refer to the Hub

* fix auto

* move tests

* documentation fix

* more doc fixes

* test refactor

* fix import

* fix import

* fix import

* fix import

* fix import

* style fixes

* More modeling fixes
Co-authored-by: Jason Phang <zp489@gr057.hpc.nyu.edu>
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

71e60272

Clean up CLIP tests (#17380) · 374a2f69
NielsRogge authored May 24, 2022

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

374a2f69

Enabling `imageGPT` auto feature extractor. (#16871) · d9809298

Nicolas Patry authored May 24, 2022



* Enabling `imageGPT` auto feature extractor.
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

* Small updates.

* Update after rebase to use `input_ids` instead of `pixel_values`.
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

d9809298

Add LayoutLMv3 (#17060) · 31ee80d5

NielsRogge authored May 24, 2022



* Make forward pass work

* More improvements

* Remove unused imports

* Remove timm dependency

* Improve loss calculation of token classifier

* Fix most tests

* Add docs

* Add model integration test

* Make all tests pass

* Add LayoutLMv3FeatureExtractor

* Improve integration test + make fixup

* Add example script

* Fix style

* Add LayoutLMv3Processor

* Fix style

* Add option to add visual labels

* Make more tokenizer tests pass

* Fix more tests

* Make more tests pass

* Fix bug and improve docs

* Fix import of processors

* Improve docstrings

* Fix toctree and improve docs

* Fix auto tokenizer

* Move tests to model folder

* Move tests to model folder

* change default behavior add_prefix_space

* add prefix space for fast

* add_prefix_space set to True for Fast

* no space before `unique_no_split` token

* add test to highlight special treatment of added tokens

* fix `test_batch_encode_dynamic_overflowing` by building a long enough example

* fix `test_full_tokenizer` with add_prefix_token

* Fix tokenizer integration test

* Make the code more readable

* Add tests for LayoutLMv3Processor

* Fix style

* Add model to README and update init

* Apply suggestions from code review

* Replace asserts by value errors

* Add suggestion by @ducviet00

* Add model to doc tests

* Simplify script

* Improve README

* a step ahead to fix

* Update pair_input_test

* Make all tokenizer tests pass - phew

* Make style

* Add LayoutLMv3 to CI job

* Fix auto mapping

* Fix CI job name

* Make all processor tests pass

* Make tests of LayoutLMv2 and LayoutXLM consistent

* Add copied from statements to fast tokenizer

* Add copied from statements to slow tokenizer

* Remove add_visual_labels attribute

* Fix tests

* Add link to notebooks

* Improve docs of LayoutLMv3Processor

* Fix reference to section
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

31ee80d5

23 May, 2022 2 commits

Use Accelerate in `from_pretrained` for big model inference (#17341) · 56f50590

Sylvain Gugger authored May 23, 2022



* Initial work

* More or less finished with first draft

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix randomly initialized weights

* Update src/transformers/modeling_utils.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Address review comments

* Rename DeepSpeed folder to temporarily fix the test issue?

* Revert to try if Accelerate fix works

* Use latest Accelerate release

* Quality and fixes

* Style

* Quality

* Add doc

* Test + fix

* More blocks
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

56f50590

Traced models serialization and torchscripting fix (#17206) · 2e7e4280

Michael Benayoun authored May 23, 2022

* Fix torch.jit.script and pickling issues

* Fix get_attr issues

* Fix import in function

* Fix GPT-J and T5 tracing for torch=1.11

* Gate graph surgery on torch version

* Modeling minor changes to enable TorchScripting

* Model serialization / deserialization test

* Remove _assert_is_none users

2e7e4280

19 May, 2022 1 commit

[Test OPT] Add batch generation test opt (#17359) · 54192058
Patrick von Platen authored May 19, 2022

* up

* up

54192058