Commits · f251441387f7b9ed5b7539720aa91c29fc630d19 · chenpangpang / transformers

17 Mar, 2023 7 commits

Add LlamaForSequenceClassification (#22209) · f2514413

lewtun authored Mar 17, 2023



* Add LlamaForSequenceClassification

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Add docstring

* Add test

* Add input embedding getter and setter

* Remove dead code

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

f2514413

fix AutoTP in deepspeed could not work for bloom (#22196) · 675d2a5a

Wang, Yi authored Mar 17, 2023



* fix AutoTP in deepspeed could not work for bloom
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add a method in BloomModel to build alibi
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

675d2a5a

LLaMA house-keeping (#22216) · 00934026
Sylvain Gugger authored Mar 17, 2023
```
* LLaMA house-keeping

* Doc links
```
00934026

Depth estimation task guide (#22205) · 42f8f764

Maria Khalusova authored Mar 17, 2023

* added doc to toc, auto tip with supported models, mention of task guide in model docs

* make style

* removed "see also"

* minor fix

42f8f764

Use `dash==2.8.1` for now for daily CI (#22227) · 53218671
Yih-Dar authored Mar 17, 2023
```
Use dash 2.8.1 for now
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
53218671
fix code example in mgp-str doc (#22219) · af1c864c
wangpeng authored Mar 17, 2023
```
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
```
af1c864c
fix typos in llama.mdx (#22223) · 33d033d6
Kevin Turner authored Mar 17, 2023

33d033d6

16 Mar, 2023 12 commits

Hotfix for natten issue with torch 2.0.0 on CircleCI (#22218) · 97a3d16a
Yih-Dar authored Mar 16, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
97a3d16a

(#22204) · 5110e574

Yih-Dar authored Mar 16, 2023



* py38 + torch 2

* increment cache versions

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

5110e574

fixes a typo in WhisperFeatureExtractor docs. (#22208) · fb366b9a
Susnato Dhar authored Mar 16, 2023
```
* fixes a typo

* .
```
fb366b9a
[`XGLM`] Add `accelerate` support for XGLM (#22207) · da3ba3a1
Younes Belkada authored Mar 16, 2023
```
* add `accelerate` support for XGLM

* fix order
```
da3ba3a1

Temporarily fix ONNX model exporting error (#21830) · a88a4dae

SatyaJandhyalaAtMS authored Mar 16, 2023

* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143

* Reduced column width

* Fix formatting.

* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143"

This reverts commit 6e95a108042118d204da447729f3834affa354fc.

* Fix export error.

* Revert "Fix formatting."

This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8.

* Propagated changes made in SwinV2 to Swin2SR

a88a4dae

Update tiny model creation script (#22202) · 4c5c0af7

Yih-Dar authored Mar 16, 2023



* Update UNCONVERTIBLE_MODEL_ARCHITECTURES

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* make style and quality

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

4c5c0af7

LLaMA Implementation (#21955) · 464d4207

Jason Phang authored Mar 16, 2023



* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>

464d4207

LLaMA Implementation (#21955) · 0041be5b

Jason Phang authored Mar 16, 2023



* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>

0041be5b

Italian Translation of migration.mdx (#22183) · 09922da4

Baelish03 authored Mar 16, 2023

* Translation Italian: migration

* Update migration.mdx

minor fixes

* Update _toctree.yml

* Delete migration.mdx

* Add italian translation of migration.mdx

* Update of migration.mdx translation and toctree

09922da4

Update expected values in `MgpstrModelIntegrationTest` (#22195) · 52a57f7c
Yih-Dar authored Mar 16, 2023
```
Update values
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
52a57f7c
Fix typo in Align docs (#22199) · 1485bd9c
Alara Dirik authored Mar 16, 2023
```
Fix align docs typo
```
1485bd9c

Fix DeepSpeed CI (#22194) · 1c4a9acc

Yih-Dar authored Mar 16, 2023



* Deal with torch-tensorrt

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

1c4a9acc

15 Mar, 2023 5 commits

t5 remove data dependency (#22097) · 7c4999e4

Prathik Rao authored Mar 15, 2023



* t5 remove data dependency

* make style

* make fix-copies

---------
Co-authored-by: Prathik Rao <prathikrao@microsoft.com>

7c4999e4

Update BridgeTowerForContrastiveLearning (#22145) · 16121bae

Anahita Bhiwandiwalla authored Mar 15, 2023



* Use return_loss for BridgeTowerForContrastiveLearning, add example

* fix tests

* Update example in BridgeTowerForContrastiveLearning

* Update test_modeling_bridgetower.py

* update model output format

* minor update

* Update src/transformers/models/bridgetower/modeling_bridgetower.py

* make style

---------
Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

16121bae

Regression pipeline device (#22190) · 42ad693b
Sylvain Gugger authored Mar 15, 2023
```
* Fix regression in pipeline when device=-1 is passed

* Add regression test
```
42ad693b
Revert 22152 MaskedImageCompletionOutput changes (#22187) · 73768147
amyeroberts authored Mar 15, 2023
```
Revert changes
```
73768147

Fix: unfinished_sequences with correct device (#22184) · 7b0e2cfd

浮躁的小螃蟹 authored Mar 16, 2023

Fix: unfinished_sequences with correct device 

The original code was causing errors when running torch.jit.trace due to the tensor options being incorrect. I fixed this by using torch.ones to create a tensor with the correct device and dtype. This should resolve the issue with running torch.jit.trace.

7b0e2cfd
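
A minimal sketch of the idea described in the commit message above, assuming PyTorch is installed. The helper name `init_unfinished_sequences` and the sample `input_ids` are hypothetical and only illustrate building the per-sequence flag tensor with `torch.ones` and an explicit device and dtype; this is not the exact code from the PR.

```python
import torch


def init_unfinished_sequences(input_ids: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: one flag per batch row (1 = still generating, 0 = finished).
    # dtype and device are passed explicitly so the flags match the inputs' device.
    return torch.ones(input_ids.shape[0], dtype=torch.long, device=input_ids.device)


# Illustrative batch of two token sequences (made-up values).
input_ids = torch.tensor([[1, 2, 3], [4, 5, 6]])
unfinished_sequences = init_unfinished_sequences(input_ids)
```

Passing `device=input_ids.device` keeps the flags on the same device as the inputs, which is the property the commit message says the original construction did not guarantee under `torch.jit.trace`.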

14 Mar, 2023 13 commits

Run all tests by default (#22162) · f7329751
Sylvain Gugger authored Mar 14, 2023

f7329751
Load optimizer state on CPU to avoid CUDA OOM (#22159) · b7036f49
Sylvain Gugger authored Mar 14, 2023

b7036f49
v4.28.0.dev0 · ebdb185b
Sylvain Gugger authored Mar 14, 2023

ebdb185b
Revert "Enforce same behavior as PyTorch 2.0 for older versions" (#22163) · c52c5282
Sylvain Gugger authored Mar 14, 2023
```
Revert "Enforce same behavior as PyTorch 2.0 for older versions (#22136)"

This reverts commit 1c801d65.
```
c52c5282

[trainer] add `--optim adamw_torch_fused` for pt-2.0+ (#22144) · 085bf5c1

Stas Bekman authored Mar 14, 2023

* [trainer] add --optim adamw_torch_fused

* change optim default

* deal with non-torch

* revert default change; prep; add fp16/amp assert

* typo

* typo

085bf5c1

to_pil - don't rescale if int and in range 0-255 (#22158) · c6318c37

amyeroberts authored Mar 14, 2023

* Don't rescale if int and in range 0-255

* Raise value error if int values too large

* Update tests/test_image_transforms.py

* Update tests/test_image_transforms.py

c6318c37
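
The rule named in the bullets above can be sketched in plain Python. This is a hedged illustration rather than the library's implementation: the function name `should_rescale_for_pil` is hypothetical, and the handling of float values (rescale only when they lie in [0, 1]) is an assumption not stated in the commit message.

```python
from typing import Iterable, Union

Number = Union[int, float]


def should_rescale_for_pil(pixels: Iterable[Number]) -> bool:
    """Return True if the values should be multiplied by 255 before building a PIL image."""
    values = list(pixels)
    if all(isinstance(v, int) for v in values):
        # Integer pixels already in 0-255 are used as-is (no rescaling).
        if all(0 <= v <= 255 for v in values):
            return False
        # Integer pixels outside 0-255 are rejected.
        raise ValueError("Integer pixel values must lie in the range 0-255.")
    # Assumption: non-integer pixels are expected in [0, 1] and get rescaled.
    if all(0 <= v <= 1 for v in values):
        return True
    raise ValueError("Float pixel values must lie in the range 0-1.")
```

For example, `should_rescale_for_pil([0, 128, 255])` returns `False`, while `should_rescale_for_pil([0, 300])` raises `ValueError`.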

Create MaskedImageCompletionOutput and fix ViT docs (#22152) · 3b22bfbc
Alara Dirik authored Mar 14, 2023
```
* create MaskedImageCompletionOutput

* fix bugs

* fix bugs
```
3b22bfbc

Fix big model inference for T5 models in float16 (#22095) · b45192ec

Sylvain Gugger authored Mar 14, 2023



* Fix big model inference for T5 models in float16

* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Style

* Trigger CI with latest release

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

b45192ec

Translation Italian: perf_train_cpu and perf_train_cpu_many (#22151) · 7f5ad6c3
Nicola Procopio authored Mar 14, 2023
```
* added translated files

added perf_train_cpu and perf_train_cpu_many

* updated toctree
```
7f5ad6c3
Update 2 doctest expected values for torch 2.0.0 (#22148) · ff887035
Yih-Dar authored Mar 14, 2023
```
update values
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
ff887035

Add ConvNeXT V2 (#21679) · cdddfbff

Alara Dirik authored Mar 14, 2023

* Add ConvNeXt V2 to transformers
* TF model is separated from the PR to fix issues

cdddfbff

Move `is_pipeline_test_to_skip` to specific model test classes (#21999) · 6c2ad00c

Yih-Dar authored Mar 14, 2023



* Move `is_pipeline_test_to_skip` to specific model test classes

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

6c2ad00c

[🛠️] Fix-whisper-breaking-changes (#21965) · 2beabd24

Arthur authored Mar 14, 2023



* temp fix

* temporary fix

* update

* fix tests

* fixup

* update based on review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* update to fix tests

* update docstring

---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

2beabd24

13 Mar, 2023 3 commits

docs: New terms and updates to glossary (#21982) · 101a6cd2

MichaelRipa authored Mar 13, 2023



* Updated glossary with new terms, added abbreviations for certain terms and merged autoencoding models, autoregressive models and causal language modeling into encoder and decoder models

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added link to 'Pipeline for inference' tutorial

* Trigger CI

* Update docs/source/en/glossary.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added entry for self supervised learning, added deleted entries + fixed broken links

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

101a6cd2

Prepare daily CI for torch 2.0.0 (#22135) · ba9e0191
Yih-Dar authored Mar 13, 2023
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
ba9e0191

[Safetensors] Add explicit flag to from pretrained (#22083) · f780557a

Patrick von Platen authored Mar 13, 2023



* [Safetensors] Add explicit flag to from pretrained

* add test

* remove @

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

f780557a