Commits · cf0af9a31beb84e8feec77af51f72d063ba905aa · chenpangpang / transformers

20 Mar, 2023 4 commits
- [Trainer] Add optional communication backends for torch.distributed when using GPU (#22247) · cf0af9a3
  heya5 authored Mar 20, 2023
```
Update training_args.py
```
  cf0af9a3
- Italian translation perf_infer_cpu (#22243) · c4bf6f38
  Nicola Procopio authored Mar 20, 2023
```
* added translated files

added perf_train_cpu and perf_train_cpu_many

* updated toctree

* updated toctree

* added file

perf_infer_cpu.mdx

* italian translation perf_infer_cpu.mdx
```
  c4bf6f38
- [Docs] fix typos in some tokenizer docs (#22256) · 466144d4
  yesinkim authored Mar 20, 2023
```
[Docs] fix typos
Co-authored-by: yesinkim <yesinkim@yesinkimui-MacBookAir.local>
```
  466144d4
- Update training_args.py -- a nightly install is not required anymore for torch.compile (#22266) · a48310de
  Pasquale Minervini authored Mar 20, 2023
```
Update training_args.py

A nightly install is not required anymore for `torch.compile`.
```
  a48310de
17 Mar, 2023 13 commits

[trainer] param count for deepspeed zero3 (#22193) · 60d51ef5
Stas Bekman authored Mar 17, 2023
```
[trainer] param count for zero3
```
60d51ef5
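
The commit title above only says that the parameter count was adjusted for DeepSpeed ZeRO stage 3. As a minimal sketch of why such a fix is needed: under ZeRO-3, parameters are partitioned across ranks, so a parameter's local `numel()` can report 0. The sketch assumes the full size is exposed through a `ds_numel` attribute. `StandInParam` and `count_parameters` are hypothetical names used for illustration only, not the Trainer's actual code.

```python
class StandInParam:
    """Hypothetical stand-in for a model parameter (not a torch or DeepSpeed class)."""

    def __init__(self, local_numel, ds_numel=None):
        self._local_numel = local_numel
        if ds_numel is not None:
            # Assumption: only partitioned parameters carry this extra attribute.
            self.ds_numel = ds_numel

    def numel(self):
        # Size of the locally held tensor; 0 for a partitioned placeholder.
        return self._local_numel


def count_parameters(params):
    """Sum full parameter sizes, preferring `ds_numel` when it is present."""
    return sum(p.ds_numel if hasattr(p, "ds_numel") else p.numel() for p in params)


regular = StandInParam(local_numel=10)
partitioned = StandInParam(local_numel=0, ds_numel=20)
total = count_parameters([regular, partitioned])
```

Summing `numel()` alone would count only the 10 locally held elements here; the attribute check also picks up the 20 elements of the partitioned parameter.
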
Fix Unnecessary move of tensors from CPU to GPU in LlamaRotaryEmbedding (#22234) · cf601b90
Guangyuan Ma authored Mar 18, 2023
```
push
```
cf601b90
Revert "Use `dash==2.8.1` for now for daily CI" (#22233) · bec07561
Yih-Dar authored Mar 17, 2023
```
Revert "Use `dash==2.8.1` for now for daily CI (#22227)"

This reverts commit 53218671.
```
bec07561

Ali Hassani authored Mar 17, 2023

* Add kernel size to NATTEN's QK arguments.

The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional
argument to the QK operation to allow optional RPBs.

This ends up failing NATTEN tests.

This commit adds NATTEN back to circleci and adds the arguments to get
it working again.

* Force NATTEN >= 0.14.5

3028b20a

fix(docs): fix task guide links in model docs (#22226) · 074490b2
Seb0 authored Mar 17, 2023
```
fix(docs): task guide links in model docs
```
074490b2
Removed .mdx extension in two links (#22230) · 314cdf7c
Maria Khalusova authored Mar 17, 2023
```
removed .mdx extension
```
314cdf7c

Add LlamaForSequenceClassification (#22209) · f2514413

lewtun authored Mar 17, 2023



* Add LlamaForSequenceClassification

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Add docstring

* Add test

* Add input embedding getter and setter

* Remove dead code

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

f2514413

fix AutoTP in deepspeed could not work for bloom (#22196) · 675d2a5a

Wang, Yi authored Mar 17, 2023



* fix AutoTP in deepspeed could not work for bloom
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add a method in BloomModel to build alibi
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

675d2a5a

LLaMA house-keeping (#22216) · 00934026
Sylvain Gugger authored Mar 17, 2023
```
* LLaMA house-keeping

* Doc links
```
00934026

Depth estimation task guide (#22205) · 42f8f764

Maria Khalusova authored Mar 17, 2023

* added doc to toc, auto tip with supported models, mention of task guide in model docs

* make style

* removed "see also"

* minor fix

42f8f764

Use `dash==2.8.1` for now for daily CI (#22227) · 53218671
Yih-Dar authored Mar 17, 2023
```
Use dash 2.8.1 for now
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
53218671
fix code example in mgp-str doc (#22219) · af1c864c
wangpeng authored Mar 17, 2023
```
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
```
af1c864c
fix typos in llama.mdx (#22223) · 33d033d6
Kevin Turner authored Mar 17, 2023

33d033d6

16 Mar, 2023 12 commits

Hotfix for natten issue with torch 2.0.0 on CircleCI (#22218) · 97a3d16a
Yih-Dar authored Mar 16, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
97a3d16a

(#22204) · 5110e574

Yih-Dar authored Mar 16, 2023



* py38 + torch 2

* increment cache versions

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

5110e574

fixes a typo in WhisperFeatureExtractor docs. (#22208) · fb366b9a
Susnato Dhar authored Mar 16, 2023
```
* fixes a typo

* .
```
fb366b9a
[`XGLM`] Add `accelerate` support for XGLM (#22207) · da3ba3a1
Younes Belkada authored Mar 16, 2023
```
* add `accelerate` support for XGLM

* fix order
```
da3ba3a1

Temporarily fix ONNX model exporting error (#21830) · a88a4dae

SatyaJandhyalaAtMS authored Mar 16, 2023

* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143

* Reduced column width

* Fix formatting.

* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143"

This reverts commit 6e95a108042118d204da447729f3834affa354fc.

* Fix export error.

* Revert "Fix formatting."

This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8.

* Propagated changes made in SwinV2 to Swin2SR

a88a4dae

Update tiny model creation script (#22202) · 4c5c0af7

Yih-Dar authored Mar 16, 2023



* Update UNCONVERTIBLE_MODEL_ARCHITECTURES

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* make style and quality

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

4c5c0af7

LLaMA Implementation (#21955) · 464d4207

Jason Phang authored Mar 16, 2023



* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>

464d4207

LLaMA Implementation (#21955) · 0041be5b

Jason Phang authored Mar 16, 2023



* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>

0041be5b

Italian Translation of migration.mdx (#22183) · 09922da4

Baelish03 authored Mar 16, 2023

* Translation Italian: migration

* Update migration.mdx

minor fixes

* Update _toctree.yml

* Delete migration.mdx

* Add italian translation of migration.mdx

* Update of migration.mdx translation and toctree

09922da4

Update expected values in `MgpstrModelIntegrationTest` (#22195) · 52a57f7c
Yih-Dar authored Mar 16, 2023
```
Update values
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
52a57f7c
Fix typo in Align docs (#22199) · 1485bd9c
Alara Dirik authored Mar 16, 2023
```
Fix align docs typo
```
1485bd9c

Fix DeepSpeed CI (#22194) · 1c4a9acc

Yih-Dar authored Mar 16, 2023



* Deal with torch-tensorrt

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

1c4a9acc

15 Mar, 2023 5 commits

t5 remove data dependency (#22097) · 7c4999e4

Prathik Rao authored Mar 15, 2023



* t5 remove data dependency

* make style

* make fix-copies

---------
Co-authored-by: Prathik Rao <prathikrao@microsoft.com>

7c4999e4

Update BridgeTowerForContrastiveLearning (#22145) · 16121bae

Anahita Bhiwandiwalla authored Mar 15, 2023



* Use return_loss for BridgeTowerForContrastiveLearning, add example

* fix tests

* Update example in BridgeTowerForContrastiveLearning

* Update test_modeling_bridgetower.py

* update model output format

* minor update

* Update src/transformers/models/bridgetower/modeling_bridgetower.py

* make style

---------
Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

16121bae

Regression pipeline device (#22190) · 42ad693b
Sylvain Gugger authored Mar 15, 2023
```
* Fix regression in pipeline when device=-1 is passed

* Add regression test
```
42ad693b
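
The regression above concerns `device=-1`, which by pipeline convention selects the CPU, while a non-negative integer selects the CUDA device with that index. A minimal sketch of that mapping follows. `resolve_device` is a hypothetical helper that returns plain strings; it is not the library's implementation, and its handling of `None` and string inputs is an assumption.

```python
def resolve_device(device=None):
    """Map a pipeline-style `device` argument to a device string (hypothetical helper)."""
    if device is None:
        # Assumption: no argument means CPU.
        return "cpu"
    if isinstance(device, str):
        # Strings such as "cpu" or "cuda:1" are passed through unchanged.
        return device
    if isinstance(device, int):
        # Negative integers (conventionally -1) mean CPU.
        return "cpu" if device < 0 else f"cuda:{device}"
    raise TypeError(f"Unsupported device specification: {device!r}")


cpu_device = resolve_device(-1)
gpu_device = resolve_device(0)
```

A regression test in this spirit would assert that passing `-1` still ends up on the CPU rather than being treated as a CUDA index.
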
Revert 22152 MaskedImageCompletionOutput changes (#22187) · 73768147
amyeroberts authored Mar 15, 2023
```
Revert changes
```
73768147

Fix: unfinished_sequences with correct device (#22184) · 7b0e2cfd

浮躁的小螃蟹 authored Mar 16, 2023

Fix: unfinished_sequences with correct device 

The original code was causing errors when running torch.jit.trace due to the tensor options being incorrect. I fixed this by using torch.ones to create a tensor with the correct device and dtype. This should resolve the issue with running torch.jit.trace.

7b0e2cfd
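
The fix described above can be sketched as follows. The commit message states only that `torch.ones` is used with the correct device and dtype. The `torch.long` dtype, the function name `init_unfinished_sequences`, and the example shapes are assumptions made for illustration.

```python
import torch


def init_unfinished_sequences(input_ids: torch.Tensor) -> torch.Tensor:
    """Create one flag per batch row, on the same device as `input_ids`.

    Assumption: the flags are integer ones (torch.long); the commit message
    does not state the dtype explicitly.
    """
    return torch.ones(input_ids.shape[0], dtype=torch.long, device=input_ids.device)


# Hypothetical batch of 3 sequences with 5 token ids each.
input_ids = torch.zeros((3, 5), dtype=torch.long)
unfinished_sequences = init_unfinished_sequences(input_ids)
```

Passing `device` and `dtype` explicitly keeps the tensor options visible at the call site instead of inheriting them implicitly from another tensor.
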

14 Mar, 2023 6 commits
- Run all tests by default (#22162) · f7329751
  Sylvain Gugger authored Mar 14, 2023
  
  f7329751
- Load optimizer state on CPU to avoid CUDA OOM (#22159) · b7036f49
  Sylvain Gugger authored Mar 14, 2023
  
  b7036f49
- v4.28.0.dev0 · ebdb185b
  Sylvain Gugger authored Mar 14, 2023
  
  ebdb185b
- Revert "Enforce same behavior as PyTorch 2.0 for older versions" (#22163) · c52c5282
  Sylvain Gugger authored Mar 14, 2023
```
Revert "Enforce same behavior as PyTorch 2.0 for older versions (#22136)"

This reverts commit 1c801d65.
```
  c52c5282
- [trainer] add `--optim adamw_torch_fused` for pt-2.0+ (#22144) · 085bf5c1
  Stas Bekman authored Mar 14, 2023
```
* [trainer] add --optim adamw_torch_fused

* change optim default

* deal with non-torch

* revert default change; prep; add fp16/amp assert

* typo

* typo
```
  085bf5c1
- to_pil - don't rescale if int and in range 0-255 (#22158) · c6318c37
  amyeroberts authored Mar 14, 2023
```
* Don't rescale if int and in range 0-255

* Raise value error if int values too large

* Update tests/test_image_transforms.py

* Update tests/test_image_transforms.py
```
  c6318c37
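
The `to_pil` change above can be illustrated with a small decision function: integer pixel values already in the range 0-255 are left alone, and integers outside that range raise an error. This is a sketch over plain Python lists, not the library's array-based code. `needs_rescale` is a hypothetical name, and the float branch (values in [0, 1] are rescaled) is an assumption that the commit message does not spell out.

```python
def needs_rescale(values):
    """Return True if pixel values should be scaled by 255 before 8-bit conversion."""
    if all(isinstance(v, int) for v in values):
        if all(0 <= v <= 255 for v in values):
            # Already valid 8-bit values: do not rescale.
            return False
        raise ValueError("Integer pixel values must lie in the range 0-255.")
    # Assumption: float images are expected to lie in [0, 1] and get rescaled.
    if all(0.0 <= v <= 1.0 for v in values):
        return True
    raise ValueError("Float pixel values must lie in the range [0, 1].")


int_image = [0, 128, 255]
float_image = [0.0, 0.5, 1.0]
int_result = needs_rescale(int_image)
float_result = needs_rescale(float_image)
```

Raising on out-of-range integers, rather than silently rescaling, surfaces inputs that would otherwise overflow when cast to 8 bits.
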