Commits · b76290f44ce432e2ee7678a76036e8509167bae6 · chenpangpang / transformers

15 Jun, 2022 1 commit
- Change push CI to run on workflow_run event (#17692) · b76290f4
  Yih-Dar authored Jun 15, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
14 Jun, 2022 10 commits

fix tolerance for a bloom slow test (#17634) · d453ea61
Younes Belkada authored Jun 14, 2022

[LongT5] disable model parallel test (#17702) · 120649bf
Suraj Patil authored Jun 14, 2022


FX function refactor (#17625) · 7ec9128e

Michael Benayoun authored Jun 14, 2022



* Function refactor

* Update src/transformers/utils/fx.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


Add `BloomForSequenceClassification` and `BloomForTokenClassification` classes (#17639) · edb672ac

Hailey Schoelkopf authored Jun 14, 2022



* add new bloom classes

* (feat) add bloom classification tests; make style

* style: change import in test

* add some typehints to bloom classes

* merge main into branch

* fix: input checking in bloom seq classification

* fix tests

* change model class tests

* fix few tests

- more tests should pass
- one test left

* make token classifier return hidden states

* style: make BLOOM typehints consistent
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>


Swin main layer (#17693) · bd43151a
amyeroberts authored Jun 14, 2022
```
* Swin models call TFSwinMainLayer

* Tidy up
```

Include a comment to reflect Amy's contributions (#17689) · 3960ce91

Sayak Paul authored Jun 14, 2022



* Add note on amy's contribution.
Co-authored-by: Amy Roberts <aeroberts4444@gmail.com>

* remove non-tech comment.

Co-authored-by: Amy Roberts <aeroberts4444@gmail.com>
Co-authored-by: Amy Roberts <aeroberts4444@gmail.com>


Rag end2end new (#17650) · 9068fa6c

Shamane Siri authored Jun 15, 2022

* check

* update the RAG-end2end with new PL and RAY

* removed unwanted comments


[LongT5] Rename checkpoints (#17700) · 53496ac5
Patrick von Platen authored Jun 14, 2022


Extend Transformers Trainer Class to Enable PyTorch Torchscript for Inference (#17153) · 3b29c9fd

jianan-gu authored Jun 14, 2022



* add jit mode option and model wrap

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refine code

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add ut and refine code

* code refine

* refine code

* add inference doc

* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/trainer.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add cpu inference performance doc

* Update perf_infer_cpu.mdx

* Update perf_infer_cpu.mdx

* Update performance.mdx

* Update _toctree.yml

* refine jit func naming

* Update _toctree.yml

* Delete perf_infer_gpu_one.mdx

* Update perf_infer_cpu.mdx

* Update docs/source/en/perf_infer_cpu.mdx
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add none check before jit

* Update docs/source/en/perf_infer_cpu.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/perf_infer_cpu.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>


Fix doc builder Dockerfile (#17435) · df15703b

Yih-Dar authored Jun 14, 2022



* Fix doc builder Dockerfile
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


13 Jun, 2022 10 commits

Add `LongT5` model (#16792) · a72f1c9f

Daniel Stancl authored Jun 13, 2022



* Initial commit

* Make some fixes

* Make PT model full forward pass

* Drop TF & Flax implementation, fix copies etc

* Add Flax model and update some corresponding stuff

* Drop some TF things

* Update config and flax local attn

* Add encoder_attention_type to config

* .

* Update docs

* Do some cleansing

* Fix some issues -> make style; add some docs

* Fix position_bias + mask addition + Update tests

* Fix repo consistency

* Fix model consistency by removing flax operation over attn_mask

* [WIP] Add PT TGlobal LongT5

* .

* [WIP] Add flax tglobal model

* [WIP] Update flax model to use the right attention type in the encoder

* Fix flax tglobal model forward pass

* Make the use of global_relative_attention_bias

* Add test suites for TGlobal model

* Fix minor bugs, clean code

* Fix pt-flax equivalence though not convinced with correctness

* Fix LocalAttn implementation to match the original impl. + update READMEs

* Few updates

* Update: [Flax] improve large model init and loading #16148

* Add ckpt conversion script according to #16853 + handle torch device placement

* Minor updates to conversion script.

* Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM

* gpu support + dtype fix

* Apply some suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Remove (de)parallelize stuff
* Edit shape comments
* Update README.md
* make fix-copies

* Remove caching logic for local & tglobal attention

* Apply another batch of suggestions from code review

* Add missing checkpoints
* Format converting scripts
* Drop (de)parallelize links from longT5 mdx

* Fix converting script + revert config file change

* Revert "Remove caching logic for local & tglobal attention"

This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46.

* Stash caching logic in Flax model

* Make side relative bias used always

* Drop caching logic in PT model

* Return side bias as it was

* Drop all remaining model parallel logic

* Remove clamp statements

* Move test files to the proper place

* Update docs with new version of hf-doc-builder

* Fix test imports

* Make some minor improvements

* Add missing checkpoints to docs
* Make TGlobal model compatible with torch.onnx.export
* Replace some np.ndarray with jnp.ndarray

* Fix TGlobal for ONNX conversion + update docs

* fix _make_global_fixed_block_ids and masked neg value

* update flax model

* style and quality

* fix imports

* remove load_tf_weights_in_longt5 from init and fix copies

* add slow test for TGlobal model

* typo fix

* Drop obsolete is_parallelizable and one warning

* Update __init__ files to fix repo-consistency

* fix pipeline test

* Fix some device placements

* [wip]: Update tests -- need to generate summaries to update expected_summary

* Fix quality

* Update LongT5 model card

* Update (slow) summarization tests

* make style

* rename checkpoints

* finish

* fix flax tests
Co-authored-by: phungvanduy <pvduy23@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patil-suraj <surajp815@gmail.com>


Add FP16 Support for SageMaker Model Parallel (#17386) · 1690094b

haohanchen-yagao authored Jun 13, 2022

* Add FP16 support for sagemaker model parallel

* minor fix

* fix indentation

* handle mixed precision exception for smmp

* minor fix

* remove amp implementation on SMMP

* remove redundant stuff

* reformat trainer

* restyling

* reformat


enable cpu distribution training using mpirun (#17570) · 4aabf9b5

Wang, Yi authored Jun 14, 2022



* enable cpu distribution training using mpirun

Example command:
    mpirun -n 2 python3 run_qa.py --no_cuda --xpu_backend ccl xxxx
MASTER_ADDR and MASTER_PORT should be set as environment variables:
    export MASTER_ADDR=127.0.0.1
    export MASTER_PORT=29500
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* fix according to the review comment
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* use accelerate logic for cpu distribution training to set "RANK","LOCAL_RANK","WORLD_SIZE" environment
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
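
The last bullet mentions filling in "RANK", "LOCAL_RANK" and "WORLD_SIZE" for CPU distributed runs. A minimal sketch of that idea follows. It is not the code from this commit or from accelerate: `set_distributed_env` is a hypothetical helper, and the launcher variable names it looks for (Open MPI's `OMPI_COMM_WORLD_*`, Intel MPI's `PMI_*` and `MPI_LOCALRANKID`) are assumptions about which launchers are in use.

```python
import os


def set_distributed_env(environ=None):
    """Copy MPI launcher rank variables into RANK / LOCAL_RANK / WORLD_SIZE.

    Hypothetical helper for illustration only. Pass a dict to experiment
    without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    # Assumed launcher variables, checked in order for each target name.
    candidates = {
        "RANK": ("PMI_RANK", "OMPI_COMM_WORLD_RANK"),
        "LOCAL_RANK": ("MPI_LOCALRANKID", "OMPI_COMM_WORLD_LOCAL_RANK"),
        "WORLD_SIZE": ("PMI_SIZE", "OMPI_COMM_WORLD_SIZE"),
    }
    for target, sources in candidates.items():
        if target in env:
            continue  # an explicit setting wins over the launcher's value
        for source in sources:
            if source in env:
                env[target] = env[source]
                break
    # The commit message says these two must be exported by the user.
    missing = [name for name in ("MASTER_ADDR", "MASTER_PORT") if name not in env]
    if missing:
        raise RuntimeError(f"Set {', '.join(missing)} before launching")
    return env


# Simulate what rank 1 of `mpirun -n 2` might see under Open MPI.
example_env = {
    "OMPI_COMM_WORLD_RANK": "1",
    "OMPI_COMM_WORLD_LOCAL_RANK": "1",
    "OMPI_COMM_WORLD_SIZE": "2",
    "MASTER_ADDR": "127.0.0.1",
    "MASTER_PORT": "29500",
}
resolved = set_distributed_env(example_env)
print(resolved["RANK"], resolved["LOCAL_RANK"], resolved["WORLD_SIZE"])
```

Passing a plain dict keeps the example free of side effects; calling the helper with no argument would modify `os.environ` instead.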


Add Ray's scope to training arguments (#17629) · 457d4a32

Bram Vanroy authored Jun 13, 2022



* allow scope from trainer arg

* add ray_scope to training args

* escape double quotes

* make style && quality

* attempt to solve doc style issues

* splitting up URLs for style

* make fixup

* Update src/transformers/training_args.py
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>

* make style
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>


Update modeling_gpt_neox.py (#17575) · 54833886

Will Frey authored Jun 13, 2022

I'm guessing that the intention was for the `_no_split_modules` class attribute of `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how it's set to `["GPTJBlock"]` for `GPTJPreTrainedModel`.

If this is incorrect, please feel free to just close the PR.

Thanks!
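
To illustrate the pattern the message describes, here is a toy sketch of a class-level `_no_split_modules` list that is overridden per model. The classes and the `can_split` helper are hypothetical stand-ins, not the library's implementation; only the attribute name and the `"GPTNeoXLayer"` / `"GPTJBlock"` values come from the commit message.

```python
class ToyPreTrainedModel:
    # Base default: no module class is protected from being split.
    _no_split_modules = None


class ToyGPTNeoXModel(ToyPreTrainedModel):
    # Each model family names the block that must stay on one device.
    _no_split_modules = ["GPTNeoXLayer"]


class ToyGPTJModel(ToyPreTrainedModel):
    _no_split_modules = ["GPTJBlock"]


def can_split(model_cls, module_class_name):
    """Hypothetical check a device-placement routine might perform."""
    blocked = model_cls._no_split_modules or []
    return module_class_name not in blocked


print(can_split(ToyGPTNeoXModel, "GPTNeoXLayer"))  # False
print(can_split(ToyGPTNeoXModel, "Linear"))        # True
```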


Fix dtype getter (#17668) · a1344dbf

Sylvain Gugger authored Jun 13, 2022

* Fix dtype getters

* Proper fix for dtype getter

* Style and comment

* Always use last for consistency

* Quality


explicitly set utf8 for Windows (#17664) · 73083581
Bram Vanroy authored Jun 13, 2022
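
For context, a short sketch of what setting UTF-8 explicitly usually means in Python. The commit has no body, so which files it touched is not shown here; the snippet only illustrates the general technique, and the sample text and file name are made up.

```python
import tempfile
from pathlib import Path

text = "café, naïve, 模型"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.txt"
    # Without encoding=..., open() falls back to the locale's preferred
    # encoding, which on Windows is often a legacy code page, not UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()

print(round_tripped == text)
```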

Fixed documentation typo, parameter name is evaluation_strategy, not eval_strategy (#17669) · c1daf724
Saint authored Jun 13, 2022
```
Co-authored-by: Saint <saint@st-mini.local>
```

Add Visual Question Answering (VQA) pipeline (#17286) · 66336dc1

Sijun He authored Jun 13, 2022



* wip

* rebase

* all tests pass

* rebase

* ready for PR

* address comments

* fix styles

* add require_torch to pipeline test

* remove remote image to improve CI consistency

* address comments; fix tf/flax tests

* address comments; fix tf/flax tests

* fix tests; add alias

* repo consistency tests

* Update src/transformers/pipelines/visual_question_answering.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* address comments

* Update src/transformers/pipelines/visual_question_answering.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* merge

* Update src/transformers/models/auto/modeling_auto.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* merge
Co-authored-by: Sijun He <sijunhe@Sijuns-MacBook-Pro.local>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


Fix typo in adding_a_new_model README (#17679) · a5282ab4
Ayush Mangal authored Jun 13, 2022


10 Jun, 2022 17 commits

Avoid GPU OOM for a TF Rag test (#17638) · 224bde91
Yih-Dar authored Jun 10, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
fix typo from emtpy to empty (#17643) · 39e14614
Domenic Rosati authored Jun 10, 2022

[Generation Test] Make fast test actually fast (#17661) · 13e875cc
Patrick von Platen authored Jun 10, 2022

[Data2Vec] Speed up test (#17660) · b4eef63a
Patrick von Platen authored Jun 10, 2022

[BigBirdFlaxTests] Make tests slow (#17658) · 5e428b71
Patrick von Platen authored Jun 10, 2022
```
* [BigBirdFlaxTests] Make tests slow

* up

* correct black with new version
```
update README.md (#17657) · 3114df41
Loubna Ben Allal authored Jun 10, 2022
```
- use CodeParrot scores of v1.1
- change evaluation command to use accelerate
```

🐛 Properly raise `RepoNotFoundError` when not authenticated (#17651) · c99ddcc4

Simon Brandeis authored Jun 10, 2022

* Raise RepoNotFoundError in case of 401

* Include changes from revert-17646-skip_repo_not_found

* Add a comment

* 💄 Code quality

* 💚 Update `get_from_cache` test

* 💚 Code quality & skip failing test
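
The first bullet can be pictured with a small sketch. Everything here is a stand-in: the `RepoNotFoundError` class is redefined locally, `raise_for_status` is a hypothetical function, and treating 401 like 404 is an assumption drawn only from the bullet text, not a copy of the library's logic.

```python
class RepoNotFoundError(Exception):
    """Local stand-in for the library exception of the same name."""


def raise_for_status(status_code: int, repo_id: str) -> None:
    """Map an HTTP status code to an exception (illustrative only)."""
    # Assumption: an unauthenticated request for a private or missing repo
    # can come back as 401, so it is reported the same way as a 404.
    if status_code in (401, 404):
        raise RepoNotFoundError(
            f"{repo_id} was not found, or it is private and you are not authenticated"
        )
    if status_code >= 400:
        raise RuntimeError(f"Request for {repo_id} failed with status {status_code}")


try:
    raise_for_status(401, "some-user/some-private-model")
except RepoNotFoundError as err:
    print(f"RepoNotFoundError: {err}")
```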


Fixes #17128 . (#17356) · 35b16032

Balaji authored Jun 10, 2022



VisibleDeprecationWarning is addressed by specifying dtype=object when creating numpy array.
Update code based on review feedback.
Undo whitespace changes to tokenization_utils_base.py.
Co-authored-by: I like data <ilikedata@nym.hush.com>
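
A minimal illustration of the `dtype=object` fix described above, using a made-up ragged batch rather than the code from this commit. Without `dtype=object`, NumPy either warns (`VisibleDeprecationWarning`) or, in recent releases, rejects ragged nested sequences outright, so the explicit dtype is the portable spelling.

```python
import warnings

import numpy as np

# Token id sequences of different lengths (a "ragged" batch).
ragged_batch = [[101, 2023, 102], [101, 102]]

# dtype=object states the intent explicitly: the result is a 1-D array
# whose elements are Python lists, not a rectangular 2-D array.
with warnings.catch_warnings():
    warnings.simplefilter("error")  # any warning would become an exception
    as_objects = np.array(ragged_batch, dtype=object)

print(as_objects.shape)  # (2,)
print(as_objects.dtype)  # object
```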


Fix dtype getters (#17656) · b8809091
Sylvain Gugger authored Jun 10, 2022

Add skip logic for attentions test - Levit (#17633) · fd1e6703
amyeroberts authored Jun 10, 2022

Fix style · cdaed367
Lysandre authored Jun 10, 2022

Fix style · 2bc30510
Lysandre authored Jun 10, 2022


Bump cookiecutter in /examples/research_projects/decision_transformer (#17645) · 1d463303

dependabot[bot] authored Jun 10, 2022

Bumps [cookiecutter](https://github.com/cookiecutter/cookiecutter) from 1.7.2 to 2.1.1.
- [Release notes](https://github.com/cookiecutter/cookiecutter/releases)
- [Changelog](https://github.com/cookiecutter/cookiecutter/blob/master/HISTORY.md)
- [Commits](https://github.com/cookiecutter/cookiecutter/compare/1.7.2...2.1.1)

---
updated-dependencies:
- dependency-name: cookiecutter
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Enable crop_center method to handle (W, H, C) images (#17626) · 49becbaa
Alara Dirik authored Jun 10, 2022
```
* enable crop_center method to handle (W, H, C) images

* minor style and comment edits
```

Move Clip image utils to image_utils.py (#17628) · 6e93d947

Alara Dirik authored Jun 10, 2022

* move clip image utils to image_utils.py

* dont default to square images

* fix typo, revert change to test file

* edit convert_rgb comments


Skip tests until bug is fixed. (#17646) · af4a1eca
Sylvain Gugger authored Jun 09, 2022


Translation/autoclass (#17615) · e0b58fb5

Martina Fumanelli authored Jun 10, 2022



* Add Italian translation for autoclass_tutorial.mdx

* Fix synthesis
Co-authored-by: martina.fumanelli <martina.fumanelli@MBP-di-martinafumanelli.local>


09 Jun, 2022 2 commits
- didn't exist in pt-1.9 (#17644) · df1ec6b1
  Stas Bekman authored Jun 09, 2022
  
- convert assertion to raised exception in debertav2 (#17619) · fba0b6a8
  mrbean authored Jun 09, 2022
```
* convert assertion to raised exception in debertav2

* change assert to raise exception in deberta

* fix messages
```
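
A generic before/after sketch of the change these bullets describe. `split_heads` is a hypothetical function used for illustration, not code from the DeBERTa models.

```python
def split_heads(hidden_size: int, num_heads: int) -> int:
    """Return the per-head size, validating the inputs first."""
    # Before: `assert hidden_size % num_heads == 0, "..."`.
    # Assertions are stripped when Python runs with -O, so the check would
    # silently disappear; an explicit exception always fires.
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) must be divisible by num_heads ({num_heads})"
        )
    return hidden_size // num_heads


print(split_heads(768, 12))  # 64
```

Raising `ValueError` also gives callers a specific exception type to catch, which a bare `AssertionError` does not communicate.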