"vscode:/vscode.git/clone" did not exist on "f8a922e96630a213f49eb75d39635d646981cc8a"
- 21 Mar, 2023 8 commits
-
-
Ali Hassani authored
-
Yanming W authored
-
Yih-Dar authored
* time to say goodbye, torch 1.7 and 1.8
* clean up torch_int_div
* clean up is_torch_less_than_1_8-9
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
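For context, a minimal sketch of the kind of cleanup this enables, assuming `torch_int_div` was a compatibility shim for integer floor division on old torch versions (the variable names are illustrative):

```python
import torch

a = torch.tensor([7, 9])
b = torch.tensor([2, 3])

# With torch >= 1.9 guaranteed, the old torch_int_div shim can be replaced
# by the built-in floor-division rounding mode.
result = torch.div(a, b, rounding_mode="floor")
```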
-
Davide Gazzè authored
Add translation
-
Yih-Dar authored
* fix more doctests
* fix style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* all doctests
* Skip failed tests
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Gerald Cuder authored
* Make sure CvT can be trained using mixed precision
* Add test for keras-fit with mixed precision
* Update tests/models/cvt/test_modeling_tf_cvt.py
  Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
---------
Co-authored-by: gcuder <Gerald.Cuder@iacapps.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
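A minimal sketch of training CvT under Keras mixed precision, assuming the standard `tf.keras.mixed_precision` API (the checkpoint name and dataset are illustrative):

```python
import tensorflow as tf
from transformers import TFCvtForImageClassification

# Compute in float16 while keeping variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = TFCvtForImageClassification.from_pretrained("microsoft/cvt-13")
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5))
# model.fit(train_dataset, epochs=1)  # keras-fit now runs under mixed precision
```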
-
Andrei Panferov authored
* Fixed modules_to_not_convert default value
* Fixed modules_to_not_convert docstring
* Update src/transformers/utils/bitsandbytes.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/utils/bitsandbytes.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* ["lm_head"] if modules_to_not_convert is None
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
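The user-facing effect is that `lm_head` stays in full precision during 8-bit loading unless explicitly overridden. A hedged sketch using `BitsAndBytesConfig` (the model name is illustrative; `llm_int8_skip_modules` mirrors the internal `modules_to_not_convert` argument):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# With no explicit list, ["lm_head"] is skipped by default so the output
# projection keeps full precision; shown explicitly here for illustration.
quant_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_skip_modules=["lm_head"],
)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m", quantization_config=quant_config
)
```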
-
- 20 Mar, 2023 12 commits
-
-
amyeroberts authored
* Add bool_masked_pos to forward docstrings
* Add note about mask ratio - videomae
* Fix up
* Fix indenting
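A hedged sketch of passing `bool_masked_pos` to VideoMAE, following the pattern the docstrings describe (shapes assume the default 16-frame, 224x224 configuration):

```python
import torch
from transformers import VideoMAEForPreTraining

model = VideoMAEForPreTraining.from_pretrained("MCG-NJU/videomae-base")
num_frames = 16
seq_len = (num_frames // model.config.tubelet_size) * (
    model.config.image_size // model.config.patch_size
) ** 2

pixel_values = torch.randn(1, num_frames, 3, 224, 224)
# VideoMAE uses a much higher mask ratio (~0.9) than image MAE.
bool_masked_pos = torch.rand(1, seq_len) < 0.9
outputs = model(pixel_values, bool_masked_pos=bool_masked_pos)
```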
-
Maria Khalusova authored
* added an example of pad_to_multiple_of
* make style
* addressed feedback
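A quick sketch of the documented option, assuming a standard tokenizer checkpoint:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Pad up to the next multiple of 8 so sequence lengths line up with
# tensor-core-friendly shapes on fp16/bf16 hardware.
batch = tokenizer(
    ["short", "a slightly longer example sentence"],
    padding=True,
    pad_to_multiple_of=8,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # sequence dimension is a multiple of 8
```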
-
Antoni Viros authored
Move torch.compile() wrapping after DDP/FSDP wrapping to ensure correct graph breaks during training (#22279)
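A minimal sketch of the resulting wrapping order (assumes a process group is already initialized; the model is illustrative):

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# torch.distributed.init_process_group(...) is assumed to have run already.
model = nn.Linear(16, 16).cuda()
model = DDP(model)            # wrap for distributed training first...
model = torch.compile(model)  # ...then compile, so graph breaks fall around
                              # DDP's communication hooks rather than inside them
```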
-
amyeroberts authored
-
Sylvain Gugger authored
* Proper map location for optimizer load
* What happened to my code?
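A sketch of the fix's effect, assuming a plain PyTorch checkpoint (the path is illustrative):

```python
import torch

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters())

# Map the saved optimizer state onto the device the model actually lives on,
# rather than the device the checkpoint was written from.
device = next(model.parameters()).device
state = torch.load("checkpoint/optimizer.pt", map_location=device)
optimizer.load_state_dict(state)
```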
-
Sylvain Gugger authored
* Update LLaMA conversion script
* Doc
* Fix the weight size for the 13B checkpoint
* Update src/transformers/models/llama/convert_llama_weights_to_hf.py
  Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
-
Sylvain Gugger authored
-
yqy2001 authored
fix grad ckpt bug of llama
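A hedged sketch of the pattern this fix concerns: gradient checkpointing conflicts with the generation cache, so `use_cache` should be disabled during training (the checkpoint path is illustrative):

```python
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("path/to/llama-hf")
model.gradient_checkpointing_enable()
model.config.use_cache = False  # caching conflicts with re-computed activations
```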
-
heya5 authored
Update training_args.py
-
Nicola Procopio authored
* added translated files perf_train_cpu and perf_train_cpu_many
* updated toctree
* updated toctree
* added file perf_infer_cpu.mdx
* Italian translation of perf_infer_cpu.mdx
-
yesinkim authored
[Docs] fix typos
Co-authored-by: yesinkim <yesinkim@yesinkimui-MacBookAir.local>
-
Pasquale Minervini authored
Update training_args.py
A nightly install is no longer required for `torch.compile`.
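On stable PyTorch 2.0 this now works out of the box, e.g.:

```python
import torch

model = torch.nn.Linear(8, 8)
compiled = torch.compile(model)  # no nightly build required on PyTorch >= 2.0
```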
-
- 17 Mar, 2023 13 commits
-
-
Stas Bekman authored
[trainer] param count for zero3
-
Guangyuan Ma authored
push
-
Ali Hassani authored
* Add kernel size to NATTEN's QK arguments.
  The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional argument to the QK operation to allow optional RPBs. This ends up failing NATTEN tests. This commit adds NATTEN back to circleci and adds the arguments to get it working again.
* Force NATTEN >= 0.14.5
-
Seb0 authored
fix(docs): task guide links in model docs
-
Maria Khalusova authored
removed .mdx extension
-
lewtun authored
* Add LlamaForSequenceClassification
* Update src/transformers/models/llama/modeling_llama.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Add docstring
* Add test
* Add input embedding getter and setter
* Remove dead code
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
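A brief usage sketch for the new head (the checkpoint path is illustrative, since LLaMA weights must be converted locally):

```python
from transformers import LlamaForSequenceClassification, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-hf")
model = LlamaForSequenceClassification.from_pretrained(
    "path/to/llama-hf", num_labels=2
)

inputs = tokenizer("This movie was great!", return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, num_labels)
```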
-
Wang, Yi authored
* fix AutoTP in DeepSpeed not working for BLOOM
  Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add a method in BloomModel to build alibi
  Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
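A hedged sketch of DeepSpeed AutoTP inference with BLOOM (the arguments follow the classic `init_inference` API; run under a multi-GPU launcher such as `deepspeed`):

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m", torch_dtype=torch.half
)
# AutoTP: let DeepSpeed shard the model across GPUs without kernel injection.
model = deepspeed.init_inference(
    model, mp_size=2, dtype=torch.half, replace_with_kernel_inject=False
)
```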
-
Sylvain Gugger authored
* LLaMA house-keeping
* Doc links
-
Maria Khalusova authored
* added doc to toc, auto tip with supported models, mention of task guide in model docs
* make style
* removed "see also"
* minor fix
-
Yih-Dar authored
Use dash 2.8.1 for now
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
wangpeng authored
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
-
Kevin Turner authored
-
- 16 Mar, 2023 7 commits
-
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* py38 + torch 2
* increment cache versions
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Susnato Dhar authored
* fixes a typo
* .
-
Younes Belkada authored
* add `accelerate` support for XGLM
* fix order
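With `accelerate` support, XGLM checkpoints can be dispatched across available devices, e.g.:

```python
from transformers import AutoModelForCausalLM

# device_map="auto" relies on accelerate to place layers across GPUs/CPU.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/xglm-564M", device_map="auto"
)
```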
-
SatyaJandhyalaAtMS authored
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143
* Reduced column width
* Fix formatting.
* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143"
  This reverts commit 6e95a108042118d204da447729f3834affa354fc.
* Fix export error.
* Revert "Fix formatting."
  This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8.
* Propagated changes made in SwinV2 to Swin2SR
-
Yih-Dar authored
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES
* Deal with 2 model tester classes in single test file
* Deal with 2 model tester classes in single test file
* Deal with 2 model tester classes in single test file
* make style and quality
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Jason Phang authored
* LLaMA
* sharding and docs
* tweak
* black
* inits
* ruff
* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
* init
* no checkpoint
* docs
* ruff
* type_vocab_size
* tokenizer fixes
* tokenizer fixes
* Update tokenization_llama.py
* Update tokenization_llama.py
* Update configuration_llama.py
* Update modeling_llama.py
* tokenizer add_bos by default
* licenses
* remove decoder
* norms and mlp
* rope overhaul
* tweaks
* black
* mention OPT implementation
* off-by-one naming
* typo
* fix
* tokenization fix and slicing bug
* padding config
* cleanup
* black
* update tests
* undo typo
* fix vocab caching logic
* ruff
* docbuilder
* attn fix from BlackSamorez
* initial feedback
* typo
* docs
* llama case
* llama case
* load checkpoint docs
* comment about tokenizer
* tokenizer defaults
* clear past_key_values if use_cache=False
* last tweaks
* last tweaks
* last tweaks
* last tweaks
---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
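A short usage sketch for the newly added model classes (the paths are illustrative: the weights are not hosted and must first be converted with the provided conversion script):

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained("path/to/llama-7b-hf")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```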
-