Commits · c0f99b4d2ec73090595914dde4c16da207e21d73 · chenpangpang / transformers

03 Apr, 2023 4 commits

Arthur authored Apr 03, 2023

* draft

* update tokenization llama and conversion script

* more updates

* initial commit

* style

* default pad to None

* draft tokenization tests

* update test

* update tokenization tests

* nits

* update

* versioning test

* major fix

* fix more tests

* finish fixing special masks

* last nit

* more nits

* add encode decode tests

* add more

* fix token type ids

* style

c0f99b4d

[Time-Series] fix past_observed_mask type (#22076) · 9eae4aa5
Eli Simhayev authored Apr 03, 2023
```
added > 0.5 to `past_observed_mask`
```
9eae4aa5

Backbone add out indices (#22493) · 559a45d1

amyeroberts authored Apr 03, 2023

* Add out_indices to backbones, deprecate out_features

* Update - can specify either out_features or out_indices but not both

* Can specify both

* Fix copies

* Add out_indices to convnextv2 configuration

559a45d1
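
The relationship between `out_features` and `out_indices` in the entry above can be illustrated with a small sketch. It assumes a backbone exposes an ordered list of stage names; the helper name `resolve_outputs`, the stage names, and the "last stage by default" rule are hypothetical and are not taken from the library.

```python
def resolve_outputs(stage_names, out_features=None, out_indices=None):
    """Return a consistent (out_features, out_indices) pair.

    Hypothetical helper: either argument may be given, or both,
    as long as they describe the same stages.
    """
    if out_features is None and out_indices is None:
        # Assumption for this sketch: expose only the last stage by default.
        out_indices = [len(stage_names) - 1]
    if out_features is None:
        out_features = [stage_names[i] for i in out_indices]
    elif out_indices is None:
        out_indices = [stage_names.index(name) for name in out_features]
    elif [stage_names[i] for i in out_indices] != list(out_features):
        raise ValueError("out_features and out_indices do not refer to the same stages")
    return list(out_features), list(out_indices)


stages = ["stem", "stage1", "stage2", "stage3"]
print(resolve_outputs(stages, out_indices=[1, 3]))
```

Passing both arguments is accepted here only when they agree, which is one plausible reading of the "Can specify both" step.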

Update convert_llama_weights_to_hf.py (#22525) · db803b69
kevinpro authored Apr 03, 2023

db803b69

31 Mar, 2023 6 commits

Test fetch v2 (#22367) · c6126280

Sylvain Gugger authored Mar 31, 2023



* Test fetcher v2

* Fix regexes

* Remove sanity check

* Fake modification to OPT

* Fixes some .sep issues

* Remove fake OPT change

* Fake modif for BERT

* Fake modif for init

* Exclude SageMaker tests

* Fix test and remove fake modif

* Fake setup modif

* Fake pipeline modif

* Remove all fake modifs

* Adds options to skip/force tests

* [test-all-models] Fake modif for BERT

* Try this way

* Does the command actually work?

* [test-all-models] Try again!

* [skip circleci] Remove fake modif

* Remove debug statements

* Add the list of important models

* Quality

* Update utils/tests_fetcher.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Address review comments

* Address review comments

* Fix and add test

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Address review comments

---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

c6126280

Update Neptune callback docstring (#22497) · 3a9464bd

Sabine authored Mar 31, 2023



* update NeptuneCallback docstring

* formatting

* apply make style

---------
Co-authored-by: Aleksander Wojnarowicz <alwojnarowicz@gmail.com>

3a9464bd

Bump redis from 4.5.3 to 4.5.4 in /examples/research_projects/decision_transformer (#22494) · 6fc44656

dependabot[bot] authored Mar 31, 2023

Bump redis in /examples/research_projects/decision_transformer

Bumps [redis](https://github.com/redis/redis-py) from 4.5.3 to 4.5.4.
- [Release notes](https://github.com/redis/redis-py/releases)
- [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES)
- [Commits](https://github.com/redis/redis-py/compare/v4.5.3...v4.5.4)

---
updated-dependencies:
- dependency-name: redis
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

6fc44656

Making sure we can use safetensors to serialize all the time. (#22437) · d143087d

Nicolas Patry authored Mar 31, 2023



* Making sure we can use safetensors to serialize all the time.

* Expanding the tests for increased coverage.

* Update the test.

* Getting current state of affairs.

* Tentative fix.

* Fixing black version.

* Fixing the worst offenders.

* Try to modify less files.

* Fixing blip_2 (Weird solution right now).

* Fixing deta.

* Fix blip ?

* Missing extra newline.

* No deta modification.

* Adding some comments.

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Addressing comments.

* Addressing comments.

* creating warn_once.

* Warning_once !

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

d143087d

Update `Wav2Vec2ProcessorWithLM` doc example (#22474) · 516077b3
Yih-Dar authored Mar 31, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
516077b3
Relax `eos_token_id < 0` checks in `generate()` from `ValueError` to warning (#22472) · da68fd69
lewtun authored Mar 31, 2023
```
* Relax `eos_token_id < 0` checks from `ValueError` to warning

* Fix style

* Replace warnings with logger

* Use warning vs warn
```
da68fd69
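
The change from a hard error to a logged warning, as described in the entry above, might look roughly like the following sketch. The function name `validate_eos_token_id`, the logger name, and the message text are assumptions made for illustration; this is not the code from `generate()`.

```python
import logging

logger = logging.getLogger("generation_sketch")  # hypothetical logger name


def validate_eos_token_id(eos_token_id):
    """Normalise eos_token_id to a list and warn (not raise) on negative ids."""
    if eos_token_id is None:
        ids = []
    elif isinstance(eos_token_id, int):
        ids = [eos_token_id]
    else:
        ids = list(eos_token_id)
    if any(token_id < 0 for token_id in ids):
        # Previously this kind of check would raise ValueError; here it only warns.
        logger.warning("eos_token_id contains a negative value: %s", ids)
    return ids


print(validate_eos_token_id(-1))
```

Using `logger.warning` rather than `warnings.warn` mirrors the "Replace warnings with logger" and "Use warning vs warn" steps in the commit message.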

30 Mar, 2023 10 commits
- (Re-)Enable Nightly + Past CI (#22393) · 0fe6c6bd
  Yih-Dar authored Mar 30, 2023
```
* Enable Nightly + Past CI

* put schedule

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  0fe6c6bd
- Docs fix: Multinomial sampling decoding needs "num_beams=1", since by default... · d5de578c
  Manuel de Prada authored Mar 30, 2023
```
Docs fix: Multinomial sampling decoding needs "num_beams=1", since by default it is usually not 1. (#22473)

Fix: Multinomial sampling needs "num_beams=1", since by default is 5.
```
  d5de578c
- Llama: support for `max_position_embeddings` (#22471) · 165dd6dc
  Joao Gante authored Mar 30, 2023
```
* Llama now supports max_position_embeddings

* Save config; Cosmetic edits
```
  165dd6dc
- [NLLB-MoE] `model_type` update for auto mapping (#22470) · 349e1242
  Arthur authored Mar 30, 2023
```
edit default model type and testing path set to hf-internal-testing
```
  349e1242
- Guard imports of PreTrainedTokenizerFast on is_tokenizers_available (#22285) · 11426641
  Roy Hvaara authored Mar 30, 2023
```
Guard imports that use the tokenizers library
```
  11426641
- 🚨🚨🚨 Fix ordering of height, width for BLIP image processor (#22466) · 4d7a5b5b
  amyeroberts authored Mar 30, 2023
```
Fix ordering of height,width for BLIP
```
  4d7a5b5b
- Generate: basic token streaming (#22449) · 228792a9
  Joao Gante authored Mar 30, 2023
```
* haha tokens go brrrr
```
  228792a9
- Skip flaky NLLB Moe test for now (#22463) · f0aeb1be
  amyeroberts authored Mar 30, 2023
```
Skip flaky test for now
```
  f0aeb1be
- Rescale image back if it was scaled during PIL conversion (#22458) · 154c6bb7
  amyeroberts authored Mar 30, 2023
```
* Rescale image back if it was scaled during PIL conversion

* do_rescale is defined if PIL image passed in
```
  154c6bb7
- Move common properties to BackboneMixin (#21855) · c15f9375
  amyeroberts authored Mar 30, 2023
```
* Move common properties to BackboneMixin

* Fix failing tests

* Update ConvNextV2 backbone
```
  c15f9375
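
The idea behind the "Generate: basic token streaming" entry above, handing each token to a consumer as soon as it is produced, can be sketched in plain Python. `ListStreamer` and `toy_generate` are hypothetical names, and the token list stands in for a real model; this is not the library's streamer class.

```python
class ListStreamer:
    """Hypothetical streamer: receives tokens one at a time via put()."""

    def __init__(self):
        self.chunks = []
        self.finished = False

    def put(self, token):
        # Called by the generation loop as soon as a token is available.
        self.chunks.append(token)

    def end(self):
        # Called once when generation stops.
        self.finished = True


def toy_generate(tokens, streamer=None, max_new_tokens=3):
    """Stand-in for a decoding loop; `tokens` plays the role of the model."""
    produced = []
    for token in tokens[:max_new_tokens]:
        produced.append(token)
        if streamer is not None:
            streamer.put(token)
    if streamer is not None:
        streamer.end()
    return produced


streamer = ListStreamer()
toy_generate(["tokens", " go", " brrrr", "!"], streamer=streamer)
print("".join(streamer.chunks))
```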
29 Mar, 2023 14 commits
- Update: ignore padding support for TransfoXL training when n_clusters==0 (#22457) · cd73b9a8
  Stefan Heng authored Mar 29, 2023
```
* Update: ignore padding support for TransfoXL training when n_clusters==0

* Update: transformer XL always pad

* Update: drop doc
```
  cd73b9a8
- Pin ruff (#22455) · 2194943a
  Sylvain Gugger authored Mar 29, 2023
  
  2194943a
- Update release instructions (#22454) · 4c295a26
  Sylvain Gugger authored Mar 29, 2023
  
  4c295a26
- Avoid using personal HF token in CI (#22453) · 97440e9c
  Yih-Dar authored Mar 29, 2023
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  97440e9c
- Update Neptune docs (#22452) · 173193cc
  Sabine authored Mar 29, 2023
  
  173193cc
- Revert "Fix --bf16 option support for Neuron after PR #22300" (#22451) · 5e89a435
  jeffhataws authored Mar 29, 2023
```
This reverts commit fd81746dbec5f17c8285a0fdc72ca4b4c025cc33.
```
  5e89a435
- [`Pix2Struct`] Fix slow test (#22448) · b844f8a9
  Younes Belkada authored Mar 29, 2023
```
fix slow test
```
  b844f8a9
- Revert "Error (also in original) model, scaling only q matrix not qk.T dot... · 55dae94c
  Sylvain Gugger authored Mar 29, 2023
```
Revert "Error (also in original) model, scaling only q matrix not qk.T dot product (qk.T/sqrt(dim_per_head))" (#22444)

Revert "Error (also in original) model, scaling only q matrix not qk.T dot product (qk.T/sqrt(dim_per_head)) (#21627)"

This reverts commit bad83008.
```
  55dae94c
- Use real tokenizers if tiny version(s) creation has issue(s) (#22428) · 8894b817
  Yih-Dar authored Mar 29, 2023
```
Fix some tiny model creation issues
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  8894b817
- Don't hard error when cache version can't be converted to int (#22427) · 9b494a15
  Sylvain Gugger authored Mar 29, 2023
  
  9b494a15
- [`Generate`] Add conditional generation for multimodal models (#22424) · 8252e24a
  Younes Belkada authored Mar 29, 2023
```
* add conditional generation

* add comments
```
  8252e24a
- [`bnb`] fix bnb failing test (#22439) · 33f4cb10
  Younes Belkada authored Mar 29, 2023
```
* fix bnb failing test

* fix

* fix

* fixup
```
  33f4cb10
- Hyperparameter search reporting to W&B (#22440) · fab1de72
  Nolwenn Bernard authored Mar 29, 2023
```
Fixes #22429
```
  fab1de72
- Add clean_up_tokenization_spaces to config (#22341) · 8d9c3836
  Arthur authored Mar 29, 2023
```
* add draft changes

* fix failing wav2vec

* style

* make sure that the argument is saved + add tests

* style

* fixup

* update test

* default clean_up_tokenization_spaces to False for Bloom and Llama

* Update code based on review
Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com>

* style

* quality

---------
Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com>
```
  8d9c3836
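
What a `clean_up_tokenization_spaces` flag stored in a tokenizer's configuration might control is sketched below. `ToyTokenizer` and `clean_up_spaces` are hypothetical, and the replacement rules are only an illustrative subset, not the library's full list.

```python
def clean_up_spaces(text):
    """Remove the space that whitespace-joined tokens leave before punctuation."""
    for spaced, tight in ((" .", "."), (" ,", ","), (" !", "!"), (" ?", "?")):
        text = text.replace(spaced, tight)
    return text


class ToyTokenizer:
    """Hypothetical tokenizer that keeps the flag in its saved config."""

    def __init__(self, clean_up_tokenization_spaces=True):
        self.config = {"clean_up_tokenization_spaces": clean_up_tokenization_spaces}

    def decode(self, tokens, clean_up_tokenization_spaces=None):
        # An explicit argument wins; otherwise fall back to the saved config value.
        if clean_up_tokenization_spaces is None:
            clean_up_tokenization_spaces = self.config["clean_up_tokenization_spaces"]
        text = " ".join(tokens)
        return clean_up_spaces(text) if clean_up_tokenization_spaces else text


tokens = ["Hello", ",", "world", "!"]
print(ToyTokenizer(True).decode(tokens))
print(ToyTokenizer(False).decode(tokens))
```

Keeping the value in the config means a model that needs the raw spacing (the entry mentions Bloom and Llama defaulting to `False`) gets it without every caller passing the argument.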
28 Mar, 2023 4 commits

MBart: Fix docs and doctests (#22422) · b29fd697
Joao Gante authored Mar 28, 2023
```
Fix docs and doctests
```
b29fd697

[performance] ensure `causal_mask` is created directly on device (#22378) · ae5fc2db

Jeff Rasley authored Mar 28, 2023

* ensure causal_mask is created directly on device

* add copy tag to opt, update bart implementation

* add device to all _make_causal_mask copies

* formatting fixes

* more manual fixes due to unlinked versions of _prepare_decoder_attention_mask

ae5fc2db

Fix bug in perplexity guide calculations and update perplexity numbers. Fixes #22348 (#22411) · ed57c979
fpgaminer authored Mar 28, 2023
```
Fix bug in perplexity guide calculations and update perplexity numbers.
```
ed57c979
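
For context on the perplexity entry above: perplexity is the exponential of the average negative log-likelihood per token. The sketch below shows only that definition; the log does not say which part of the guide's calculation was wrong, and the function name is hypothetical.

```python
import math


def perplexity(token_nlls):
    """exp(mean negative log-likelihood), with one NLL value per predicted token."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


# A model that assigns probability 1/4 to every token has perplexity 4.
print(perplexity([math.log(4)] * 3))
```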

Bump redis from 4.1.4 to 4.5.3 in /examples/research_projects/decision_transformer (#22410) · 32ff0640

dependabot[bot] authored Mar 27, 2023

Bump redis in /examples/research_projects/decision_transformer

Bumps [redis](https://github.com/redis/redis-py) from 4.1.4 to 4.5.3.
- [Release notes](https://github.com/redis/redis-py/releases)
- [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES)
- [Commits](https://github.com/redis/redis-py/compare/v4.1.4...v4.5.3)

---
updated-dependencies:
- dependency-name: redis
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

32ff0640

27 Mar, 2023 2 commits

[neptune] fix checkpoint bug with relative out_dir (#22102) · 3ec7a476

Kshiteej K authored Mar 28, 2023



* [neptune] fix checkpoint bug with relative out_dir

* update imports

* reformat with black

* check neptune without imports

* fix typing-related issue

* run black on code

* use os.path.sep instead of raw \

* simplify imports and remove type annotation

* make ruff happy

* apply review suggestions

---------
Co-authored-by: Aleksander Wojnarowicz <alwojnarowicz@gmail.com>

3ec7a476
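
The "use os.path.sep instead of raw \" step in the entry above is about not hard-coding a platform-specific separator. A minimal sketch, with a hypothetical helper name and made-up paths:

```python
import os


def relative_checkpoint_parts(output_dir, checkpoint_path):
    """Split a checkpoint path, relative to output_dir, into its components."""
    relative = os.path.relpath(checkpoint_path, start=output_dir)
    # os.path.sep is "\\" on Windows and "/" on POSIX, so this works on both.
    return relative.split(os.path.sep)


out_dir = os.path.join("out", "run")
ckpt = os.path.join("out", "run", "checkpoint-500", "model.bin")
print(relative_checkpoint_parts(out_dir, ckpt))
```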

[WIP]`NLLB-MoE` Adds the moe model (#22024) · 19ade242

Arthur authored Mar 27, 2023

* Initial commit

* update modeling code

* update doc

* add functions necessary

* fix imports

* revert changes

* fixup

* more styling to get going

* remove standalone encoder

* update code

* styling

* fix config and model

* update code and some refactoring

* make more tests pass

* Adding NLLB-200 - MoE - 54.5B for no language left behind
Fixes #21300

* fix more common tests

* style

* update testing file

* update

* update

* Router2 doc

* update check config with sparse layer

* add dummy router

* update current conversion script

* create on the fly conversion script

* Fixup

* style

* style 2

* fix empty return

* fix return

* Update default config sparse layers

* easier to create sparse layers

* update

* update conversion script

* update modeling

* add to toctree

* styling

* make ruff happy

* update docstring

* update conversion script

* update, will break tests but implementing top2

* update

* ❗local groups are supported here

* ⚠️ Support for local groups is now removed ⚠️

This is because it has to work with model parallelism that we do not support

* finish simplification

* Fix forward

* style

* fixup

* Update modelling and test, refactoring

* update tests

* remove final layer_norm as it is done in the FF

* routing works! Logits test added

* nit in test

* remove top1router

* style

* make sure sparse are tested. Had to change route_tokens a little bit

* add support for unslip models when converting

* fixup

* style

* update tests

* update test

* REFACTOR

* encoder outputs match!

* style

* update testing

* 🎉encoder and decoder logits match 🎉



* styling

* update tests

* cleanup tests

* fix router test and CIs

* cleanup

* cleanup test styling

* fix tests

* Finally the generation tests match!

* cleanup

* update test

* style testing file

* remove script

* cleanup

* more cleanup

* nits

* update

* NLLB tokenizer is wrong and will be fixed soon

* use LongTensors

* update tests

* revert some small changes

* fix second expert sampling and batch prioritized routing

* update tests

* finish last tests

* make ruff happy

* update

* ruff again

* style

* Update docs/source/en/model_doc/nllb-moe.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updates based on review

* style and fix import issue

* nit

* more nits

* cleanup

* styling

* update test_seconde_expert_policy

* fix name

* last nit on the markdown examples

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

19ade242
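
Several steps in the entry above mention top-2 routing. The sketch below shows generic top-2 gating for a single token: softmax over expert logits, keep the two most probable experts, and renormalise their weights. It is not the NLLB-MoE implementation; the function name and the renormalisation step are assumptions, and second-expert sampling and batch prioritized routing are left out.

```python
import math


def top2_route(logits):
    """Return [(expert_index, weight), (expert_index, weight)] for one token."""
    if len(logits) < 2:
        raise ValueError("top-2 routing needs at least two experts")
    # Softmax over expert logits, shifted by the max for numerical stability.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the two most probable experts, best first.
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    # Renormalise so the two selected weights sum to 1.
    norm = probs[top2[0]] + probs[top2[1]]
    return [(i, probs[i] / norm) for i in top2]


routes = top2_route([1.0, 3.0, 2.0])
print(routes)
```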