Commits · a5282ab4bcb0556be5bc9c82d3e17ed978419605 · chenpangpang / transformers

13 Jun, 2022 1 commit

Fix typo in adding_a_new_model README (#17679) · a5282ab4
Ayush Mangal authored Jun 13, 2022

a5282ab4
10 Jun, 2022 17 commits

Avoid GPU OOM for a TF Rag test (#17638) · 224bde91
Yih-Dar authored Jun 10, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
224bde91
fix typo from emtpy to empty (#17643) · 39e14614
Domenic Rosati authored Jun 10, 2022

39e14614
[Generation Test] Make fast test actually fast (#17661) · 13e875cc
Patrick von Platen authored Jun 10, 2022

13e875cc
[Data2Vec] Speed up test (#17660) · b4eef63a
Patrick von Platen authored Jun 10, 2022

b4eef63a
[BigBirdFlaxTests] Make tests slow (#17658) · 5e428b71
Patrick von Platen authored Jun 10, 2022
```
* [BigBirdFlaxTests] Make tests slow

* up

* correct black with new version
```
5e428b71
update README.md (#17657) · 3114df41
Loubna Ben Allal authored Jun 10, 2022
```
- use CodeParrot scores of v1.1
- change evaluation command to use accelerate
```
3114df41

Properly raise `RepoNotFoundError` when not authenticated (#17651) · c99ddcc4

Simon Brandeis authored Jun 10, 2022

* Raise RepoNotFoundError in case of 401

* Include changes from revert-17646-skip_repo_not_found

* Add a comment

* 💄 Code quality

* 💚 Update `get_from_cache` test

* 💚 Code quality & skip failing test
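
A minimal sketch of the idea in the first bullet, not the library's actual code. The `RepoNotFoundError` class and the `check_response_status` helper below are hypothetical stand-ins, and the sketch assumes that an unauthenticated request cannot tell a private repository from a missing one, so a 401 is reported the same way as a 404.

```python
class RepoNotFoundError(Exception):
    """Hypothetical stand-in for the error named in the commit title."""


def check_response_status(status_code: int, repo_id: str) -> None:
    """Raise RepoNotFoundError for 401 and 404 (hypothetical helper)."""
    # Assumption: without valid credentials, a 401 may simply mean the
    # repository does not exist, so it is surfaced like a 404.
    if status_code in (401, 404):
        raise RepoNotFoundError(
            f"Repository not found or not accessible: {repo_id}"
        )


try:
    check_response_status(401, "some-user/some-model")
except RepoNotFoundError as err:
    message = str(err)
```

Any other status code passes through this helper untouched; real code would still have to handle the remaining HTTP errors.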

c99ddcc4

Fixes #17128 . (#17356) · 35b16032

Balaji authored Jun 10, 2022



VisibleDeprecationWarning is addressed by specifying dtype=object when creating numpy array.
Update code based on review feedback.
Undo whitespace changes to tokenization_utils_base.py.
Co-authored-by: I like data <ilikedata@nym.hush.com>

35b16032

Fix dtype getters (#17656) · b8809091
Sylvain Gugger authored Jun 10, 2022

b8809091
Add skip logic for attentions test - Levit (#17633) · fd1e6703
amyeroberts authored Jun 10, 2022

fd1e6703
Fix style · cdaed367
Lysandre authored Jun 10, 2022

cdaed367
Fix style · 2bc30510
Lysandre authored Jun 10, 2022

2bc30510

Bump cookiecutter in /examples/research_projects/decision_transformer (#17645) · 1d463303

dependabot[bot] authored Jun 10, 2022

Bumps [cookiecutter](https://github.com/cookiecutter/cookiecutter) from 1.7.2 to 2.1.1.
- [Release notes](https://github.com/cookiecutter/cookiecutter/releases)
- [Changelog](https://github.com/cookiecutter/cookiecutter/blob/master/HISTORY.md)
- [Commits](https://github.com/cookiecutter/cookiecutter/compare/1.7.2...2.1.1)

---
updated-dependencies:
- dependency-name: cookiecutter
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

1d463303

Enable crop_center method to handle (W, H, C) images (#17626) · 49becbaa
Alara Dirik authored Jun 10, 2022
```
* enable crop_center method to handle (W, H, C) images

* minor style and comment edits
```
49becbaa

Move Clip image utils to image_utils.py (#17628) · 6e93d947

Alara Dirik authored Jun 10, 2022

* move clip image utils to image_utils.py

* dont default to square images

* fix typo, revert change to test file

* edit convert_rgb comments

6e93d947

Skip tests until bug is fixed. (#17646) · af4a1eca
Sylvain Gugger authored Jun 09, 2022

af4a1eca

Translation/autoclass (#17615) · e0b58fb5

Martina Fumanelli authored Jun 10, 2022



* Add Italian translation for autoclass_tutorial.mdx

* Fix synthesis
Co-authored-by: martina.fumanelli <martina.fumanelli@MBP-di-martinafumanelli.local>

e0b58fb5

09 Jun, 2022 14 commits

didn't exist in pt-1.9 (#17644) · df1ec6b1
Stas Bekman authored Jun 09, 2022

df1ec6b1

convert assertion to raised exception in debertav2 (#17619) · fba0b6a8

mrbean authored Jun 09, 2022

* convert assertion to raised exception in debertav2

* change assert to raise exception in deberta

* fix messages
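
The pattern named in these bullets can be sketched generically. The function and argument names below are hypothetical and not taken from the DeBERTa code; the point is only that an `assert` is removed when Python runs with `-O` and raises a bare `AssertionError`, while an explicit `raise` always runs and carries a specific exception type.

```python
def attention_head_size(hidden_size: int, num_heads: int) -> int:
    """Return the per-head size, validating the inputs explicitly."""
    # Before (skipped under `python -O`):
    #     assert hidden_size % num_heads == 0, "hidden size not divisible"
    # After: an explicit check that cannot be optimized away.
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) must be divisible by "
            f"num_heads ({num_heads})"
        )
    return hidden_size // num_heads
```

Callers can then catch `ValueError` specifically instead of a generic `AssertionError`.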

fba0b6a8

Pre-build DeepSpeed (#17607) · da0bed5f

Yih-Dar authored Jun 09, 2022



* pre-build deepspeed
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

da0bed5f

[modeling_utils] torch_dtype/auto floating dtype fixes (#17614) · 75343de9

Stas Bekman authored Jun 09, 2022



* [modeling_utils] torch_dtype/auto fixes

* add test

* apply suggestions

* add missing fallback

* Renaming things

* Use for else
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
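
As a rough illustration of the "add missing fallback" and "Use for else" bullets, here is a sketch of a `for`/`else` lookup with a fallback. It is an assumption about the shape of the change, not the code from the PR: the function name is hypothetical, and plain strings stand in for real tensor dtypes.

```python
def first_floating_dtype(param_dtypes: dict) -> str:
    """Return the first floating-point dtype name, else the first dtype.

    `param_dtypes` is a non-empty mapping of parameter name to dtype name.
    """
    for dtype in param_dtypes.values():
        if dtype.startswith(("float", "bfloat")):
            break  # found a floating dtype, so the else branch is skipped
    else:
        # The loop finished without `break`: fall back to the first dtype.
        dtype = next(iter(param_dtypes.values()))
    return dtype
```

The `else` clause of a `for` loop runs only when the loop was not exited through `break`, which makes it a compact way to express "nothing matched, use the fallback".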

75343de9

Running a pipeline of `float16`. (#17637) · c38f4e1f

Nicolas Patry authored Jun 09, 2022

When we're preparing the tensors for CPU for postprocessing, we need
to upgrade the `float16` to `float32` since CPUs don't have instructions
for `[b]float16`.

c38f4e1f

fix use_amp rename after pr 17138 (#17636) · 90ed9ae2
Stas Bekman authored Jun 09, 2022

90ed9ae2

Fix very long job failure text in Slack report (#17630) · c70dacde

Yih-Dar authored Jun 09, 2022



* Fix very long job failure text in Slack report
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

c70dacde

Adding `top_k` argument to `text-classification` pipeline. (#17606) · 2351729f

Nicolas Patry authored Jun 09, 2022

* Adding `top_k` and `sort` arguments to `text-classification` pipeline.

- Deprecate `return_all_scores` as `top_k` is more uniform with other
  pipelines, and a superset of what `return_all_scores` can do.
  BC is maintained though.
  `return_all_scores=True` -> `top_k=None`
  `return_all_scores=False` -> `top_k=1`

- Using `top_k` will imply sorting the results, but using no argument
  will keep the results unsorted for backward compatibility.

* Remove `sort`.

* Fixing the test.

* Remove bad doc.
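
The mapping described in the commit message can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the pipeline's implementation: `select_scores`, the `_UNSET` sentinel, and the list-of-dicts return shape are all hypothetical.

```python
_UNSET = object()  # sentinel: distinguishes "top_k not passed" from top_k=None


def select_scores(scores, top_k=_UNSET, return_all_scores=None):
    """Pick entries from `scores`, a list of {"label", "score"} dicts."""
    explicit_top_k = top_k is not _UNSET
    if not explicit_top_k:
        # Legacy mapping from the commit message:
        #   return_all_scores=True  -> top_k=None
        #   return_all_scores=False -> top_k=1 (also used when nothing is passed)
        top_k = None if return_all_scores else 1

    if explicit_top_k:
        # Passing top_k implies sorting by descending score.
        ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
        return ranked if top_k is None else ranked[:top_k]

    if top_k is None:
        # Legacy path keeps the original, unsorted order.
        return list(scores)
    return [max(scores, key=lambda s: s["score"])]


scores = [
    {"label": "NEG", "score": 0.1},
    {"label": "POS", "score": 0.7},
    {"label": "NEU", "score": 0.2},
]
legacy_all = select_scores(scores, return_all_scores=True)  # unsorted
sorted_all = select_scores(scores, top_k=None)              # sorted
```

The sentinel is what lets `top_k=None` mean "all scores, sorted" while omitting the argument keeps the older unsorted behavior.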

2351729f

Mention in the doc we drop support for fairscale (#17610) · 29080643
Sylvain Gugger authored Jun 09, 2022

29080643

Use shape_list to safely get shapes for Swin (#17591) · 9fc34235

amyeroberts authored Jun 09, 2022

* Use shape_list to safely get shapes

* Add relevant test

* Tidy and add metrics

* Resolve dynamic shaping issues and move test

* Tidy up and all samples in batch

* Formatting

9fc34235

Add ONNX support for ConvNeXT (#17627) · e0be053e
regisss authored Jun 09, 2022

e0be053e
Add ONNX support for ResNet (#17585) · 5323094a
regisss authored Jun 09, 2022
```
* Add ONNX support for ResNet

* Add ONNX test

* make fix-copies
```
5323094a

BLOOM (#17474) · ca2a55e9

Younes Belkada authored Jun 09, 2022



* adding template

* update model

* model update

* update conf for debug model

* update conversion

* update conversion script

* update conversion script

* fix missing keys check

* add tests to test the tokenizer in the local machine

* Change variable name

* add tests on xnli dataset

* add more description

* add descriptions + clearer code

* clearer code

* adding new tests + skipping few tests because of env problems

* change comment

* add dtype on the configuration

* add test embeddings

* add hardcoded test

* fix dtype issue

* adding torch.float16 to config

* adding more metrics (min, max, mean)

* add sum

* now the test passes with almost equal

* add files for conversion - test passes on cpu  gpu

* add final changes

* cleaning code

* add new args in the docstring

* fix one liner function

* remove macros

* remove forward attention

* clean up init funtion

* add comments on the issue

* rm scale mask softmax

* do make style

* fix dtype in init

* fixing for loop on att probs

* fix style with black

* fix style + doc error

* fix and debug CI errors (docs + style)

* some updates

- change new operations
- finally add scaled softmax
- added new args in the config

* make use cache working

* add changes

- save sharded models
- final changes on the modeling script

* add changes

- comment on alibi
- add TODO on seq length

* test commit

- added a text to test the commit
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* final changes

- attention mask change
- generation works on BS176b
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* changes - model + conversion

* move to correct dir

* put ,

* fex fixes

* fix tokenizer autodoc

* fix minor CI issues

* fix minor CI issues

* fix minor CI issues

* fix style issue

* fix minor import issues

* fix few issues

* remove def main on the test

* add require torch

* replace decorator with 'with'

* fix style

* change to bloom

* add quick fix tokenizer

* fix tokenizer file

* fix tokenizer

- merge tests
- small fixes

* fix import issue

* add bloom to readme

* fix consistency

* Update docs/source/en/model_doc/bloom.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

fix comment issues on file headers
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix doc issue

* small fix - modeling test

* some changes

- refactor some code
- taking into account reviews
- more tests should pass
- removed pruning tests

* remove useless division

* more tests should pass

* more tests should pass

* more tests should pass

* let's try this one

- add alibi offset
- remove all permutes to make the grad operations work
- finger crossed

* refactor

- refactor code
- style changes
- add new threshold for test

* major changes

- change BLOOM to Bloom
- add quick doc on bloom.mdx
- move embeddings test on modeling test

* modify readme

* small fixes

* small fix

- better threshold for a test

* remove old test file from fetcher

* fix small typo

* major change

- change BloomLMHead to BloomForCausalLM

* remove onnx config

* major changes

- refactor the code
- remove asserts
- change tol for test

* make style

* small change

* adding a slow test + commenting old ones for now

* make style

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

* fix duplicates

* cleaning comments on config

* clean a bit conversion file

* refacor a bit modeling file

* refactor tokenizer file

* fix tokenization test issue

* fix tokenization issue #2

* fix tokenization issue second try

* fix test issue

* make style + add suggestions

* change test fetcher

* try this one

- slow tests should pass
- finger crossed

* possible final changes

* make style

* try fix padding side issue

* fix side

* fix padding issue

* fix ko-readme

* fix config auto

* cleaning modeling file

* keep bloom in caps in ko

* update config docs

* remove pretraining_pp

* remove model parallel

* update config

- add correct config files

* fix duplicates

* fix fetcher

* fix refactor issue

- remove divide function

* try to remove alibi

* small fixes

- fix alibi
- remove seq length
- refactor a bit the code

* put correct values

- fix bos and eos token ids

* fix attention mask loop
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>

* small fixes:

- remove skip bias add

* small fixes

- fix typo in readme
- fix typos in config

* small changes

- remove a test
- add reconstruction test
- change config

* small changes

- change Scaled Softmax to BloomScaledSoftmax

* small fixes

- fix alibi dtype

* major changes

- removing explicit dtype when loading modules
- fixing test args (torch_dtype=auto)
- add dosctring

* fix readmes

* major changes

- now bloom supports alibi shifting
- refactor a bit the code
- better test tolerance now

* refactor a bit

* refactor a bit

* put correct name on test

* change docstring

* small changes

- fix docstring modeling
- fix test tolerance

* fix small nit

- take dtype from tensors in the conversion script

* minor fix

- fix mdx issue

* minor fix

- change config docstring

* forward contrib credits from PR14084

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply modifications
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* resolve softmax upcast

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

* final changes modeling
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Merge commit 'd156898f'

* merge commit

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply suggestions

Apply suggestions from Stas comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix gradient checkpointing
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add slow but exact

* add accelerate compatibility
Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com>

* forward contrib credits
Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>
Co-authored-by: sgugger <sgugger@users.noreply.github.com>
Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix torch device on tests

* make style

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix nits

Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com>

* remove final nits

* fix doc

- add more details on the doc
- add links to checkpoints

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions
Co-authored-by: sgugger <sgugger@users.noreply.github.com>

* put test torchscript to false

* Update src/transformers/models/bloom/modeling_bloom.py
Co-authored-by: justheuristic <justheuristic@gmail.com>

* fix alibi

- create alibi only once

* add small doc

* make quality

* replace torch.nn

* remove token type emb

* fix fused op + output bias

* add fused op

- now can control fused operation from config

* remove fused op

* make quality

* small changes

- remove unsed args on config
- removed bias gelu file
- make the model torchscriptable
- add torchscript slow tests

* Update src/transformers/models/bloom/modeling_bloom.py

* fix slow

* make style

* add accelerate support

* add bloom to deepspeed tests

* minor changes

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* minor change

* slow tests pass

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/model_doc/bloom.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* minor changes:

- change docstring
- add link to paper
Co-authored-by: Thomwolf <thomwolf@gmail.com>
Co-authored-by: Thomas Wolf <thomas@huggingface.co>
Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: sIncerass <sheng.s@berkeley.edu>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Nicolas Patry <Narsil@users.noreply.github.com>
Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>
Co-authored-by: sgugger <sgugger@users.noreply.github.com>
Co-authored-by: patrickvonplaten <patrickvonplaten@users.noreply.github.com>
Co-authored-by: LysandreJik <LysandreJik@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: justheuristic <justheuristic@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>

ca2a55e9

has_attentions - consistent test skipping logic and tf tests (#17495) · dfc76b25
amyeroberts authored Jun 09, 2022

dfc76b25

08 Jun, 2022 8 commits

CLI: Print all different tensors on exception (#17612) · 66e86567
Joao Gante authored Jun 08, 2022

66e86567
TF: Merge PT and TF behavior for Bart when no decoder_input_ids are passed (#17593) · e9d51387
Joao Gante authored Jun 08, 2022
```
* Merge PT and TF behavior
```
e9d51387
Fix telemetry URL (#17608) · e160a5dd
Sylvain Gugger authored Jun 08, 2022

e160a5dd
CLI: Properly detect encoder-decoder models (#17605) · 7d0b6fc3
Joao Gante authored Jun 08, 2022

7d0b6fc3

Fix link for community notebooks (#17602) · ee82c86b

Ngo Quang Huy authored Jun 08, 2022



* Fix link for community notebooks

This fixes the link for community notebooks due to reorganization.

* Replace old link with fully link to the doc page
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

ee82c86b

Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch (#17138) · 34097b33

jianan-gu authored Jun 08, 2022

* init PR

* fix import ipex

* minor fix on bf16

* refine optimizer

* refine args notes

* refine code

* refine ipex optimize args

* refine half_precision_backend

* black format

* isort format

* isort format files

* flake8 format

* doc builder format

* refine codes

* remove jit and optim bits

* black preview format

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refine code

* refine notes

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* code refine

* add ipex ut

* add performance cpu doc

* link to the cpu doc from main perf doc

* install ipex into CI's docker

* Update perf_train_cpu.mdx

* Update docs/source/en/perf_train_cpu.mdx
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update perf_train_cpu.mdx

* Update perf_train_cpu.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

34097b33

fix `train_new_from_iterator` in the case of byte-level tokenizers (#17549) · ae7bae8f
SaulLu authored Jun 08, 2022

ae7bae8f
Explicit versions in docker files (#17586) · 264128cb
Yih-Dar authored Jun 08, 2022
```
* Update docker file
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
264128cb