- 24 Jul, 2023 6 commits
-
Iskren Ivov Chernev authored
* Better handling of missing SYS in the llama conversation tokenizer. The existing code failed to add SYS if the conversation had history without SYS, yet still modified the passed conversation object as if it had. Rearrange the code so modifications to the conversation object are taken into account for token id generation.
* Fix formatting with black
* Avoid one-liners
* Also fix fast tokenizer
* Drop List decl
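A minimal sketch of the Llama 2 prompt layout this change concerns (the helper and default system prompt below are illustrative, not the tokenizer's code): the `<<SYS>>` block is folded into the first user turn before token ids are built, which is why a conversation whose history lacks SYS has to be updated consistently.

```python
# Illustrative only: how a Llama-2 style conversation is rendered with the
# system block prepended to the first user turn.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant."  # placeholder, not the real default

def build_prompt(user_turns, assistant_turns, system=None):
    """Render a conversation, folding the system block into the first user turn."""
    turns = list(user_turns)
    turns[0] = B_SYS + (system or DEFAULT_SYSTEM_PROMPT) + E_SYS + turns[0]
    pieces = []
    for i, user in enumerate(turns):
        pieces.append(f"{B_INST} {user} {E_INST}")
        if i < len(assistant_turns):
            pieces.append(f" {assistant_turns[i]} ")
    return "".join(pieces)

print(build_prompt(["What is 2 + 2?"], []))
```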
-
Lucain authored
* Support GatedRepoError + use raise from
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Use token instead of use_auth_token in error messages
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
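A minimal sketch of the error-handling pattern described above (the wrapper function and message are my own, not the transformers code): catch huggingface_hub's `GatedRepoError`, re-raise with `raise ... from` so the original traceback is chained, and mention `token` rather than the deprecated `use_auth_token`.

```python
from typing import Optional

from huggingface_hub import hf_hub_download
from huggingface_hub.utils import GatedRepoError

def fetch_config(repo_id: str, token: Optional[str] = None) -> str:
    try:
        return hf_hub_download(repo_id, "config.json", token=token)
    except GatedRepoError as e:
        # Chain the original error so the Hub response stays visible.
        raise OSError(
            f"{repo_id} is a gated repository. Request access on the Hub and pass a "
            "valid `token` to load it."
        ) from e
```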
-
Maria Khalusova authored
* first pass at the single gpu doc
* overview: improved clarity and navigation
* WIP
* updated intro and deepspeed sections
* improved torch.compile section
* more improvements
* minor improvements
* make style
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feedback addressed
* mdx -> md
* link fix
* feedback addressed
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
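The torch.compile section mentioned above boils down to something like the following usage sketch (the checkpoint name is a placeholder and this is not the doc's exact snippet):

```python
import torch
from transformers import AutoModelForSequenceClassification, TrainingArguments

# Compile the model directly ...
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model = torch.compile(model)  # forward passes are JIT-compiled on first use

# ... or let the Trainer handle it.
args = TrainingArguments(output_dir="out", torch_compile=True)
```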
-
Bharat Ramanathan authored
fix: store training args to wandb config without sanitization. Allows resuming runs by reusing the wandb config.
Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>
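A hypothetical sketch of what un-sanitized config storage enables (project name and run id are placeholders, and the reconstruction below is an assumption, not the callback's code): resume a run and rebuild `TrainingArguments` straight from the stored wandb config.

```python
import wandb
from transformers import TrainingArguments

run = wandb.init(project="my-project", id="abc123", resume="must")  # placeholders
stored = run.config.as_dict()

# Keep only keys that are actual TrainingArguments fields before rebuilding.
fields = TrainingArguments.__dataclass_fields__
args = TrainingArguments(**{k: v for k, v in stored.items() if k in fields})
```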
-
Arthur authored
set default logger
-
Stas Bekman authored
* [check_config_docstrings.py] improve diagnostics
* style
* rephrase
* fix
-
- 21 Jul, 2023 16 commits
-
Wonhyeong Seo authored
fix: update ko/serialization.md
* chatgpt draft
-
Sylvain Gugger authored
-
Ivan Sorokin authored
* improve from_pretrained for zero3 multi-GPU mode
* Add check if torch.distributed.is_initialized
* Revert torch.distributed
---------
Co-authored-by: Stas Bekman <stas@stason.org>
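The `torch.distributed.is_initialized` check follows a common guard pattern; a generic sketch (not the `from_pretrained` code itself):

```python
import torch.distributed as dist

def is_main_process() -> bool:
    """Rank-aware check that degrades gracefully when no process group is up."""
    if dist.is_available() and dist.is_initialized():
        return dist.get_rank() == 0
    return True  # single-process or not-yet-initialized case
```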
-
Arthur authored
remove persistent tensor
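One common pattern behind a change like "remove persistent tensor" (this is an assumption about the intent, not the actual diff): register the tensor as a non-persistent buffer so it is rebuilt at load time instead of being stored in the checkpoint.

```python
import torch
from torch import nn

class Block(nn.Module):
    def __init__(self, max_positions: int = 2048):
        super().__init__()
        # persistent=False keeps the buffer out of the state_dict.
        self.register_buffer("position_ids", torch.arange(max_positions), persistent=False)
```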
-
Younes Belkada authored
add simple check for bnb
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Avoid importing all models when instantiating a pipeline
* Remove sums that don't work
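An illustrative lazy-import pattern in the spirit of this change (the helper and example names are mine, not the pipeline internals): resolve a model class only when it is actually requested instead of importing every model module up front.

```python
import importlib

def get_model_class(module_name: str, class_name: str):
    # Import transformers.models.<module_name> only on demand.
    module = importlib.import_module(f"transformers.models.{module_name}")
    return getattr(module, class_name)

# Example: get_model_class("bert", "BertForSequenceClassification")
```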
-
Sylvain Gugger authored
-
Arthur authored
* pad token should be None by default
* fix tests
* nits
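With no pad token set by default, callers that pad batches need to pick one explicitly; a usage sketch (the checkpoint name is a placeholder):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder checkpoint
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # one common, explicit choice
batch = tokenizer(["hello", "a longer example"], padding=True, return_tensors="pt")
```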
-
Joya Chen authored
* Update tokenization_llama.py
* Update tokenization_llama_fast.py
* Update src/transformers/models/llama/tokenization_llama_fast.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/tokenization_llama.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/tokenization_llama.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/tokenization_llama_fast.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Sourab Mangrulkar authored
* fix fsdp prepare to remove the warnings and fix excess memory usage
* Update training_args.py
* parity for FSDP+XLA
* Update trainer.py
-
Wonhyeong Seo authored
* fix: english/korean quicktour.md
* fix: resolve suggestions
  Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
  Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
  Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
* fix: follow glossary
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
-
Jim Allanson authored
* fix: cast input pixels to appropriate dtype for image_to_text tasks
* fix: add casting to pixel inputs of additional models after running copy checks
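A minimal sketch of the casting idea (model name and image path are placeholders, and this is not the pipeline's exact code): when the model runs in half precision, move the processed pixel values to the model's dtype before the forward pass.

```python
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

checkpoint = "Salesforce/blip-image-captioning-base"  # placeholder checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

model = AutoModelForVision2Seq.from_pretrained(checkpoint, torch_dtype=dtype).to(device)
processor = AutoProcessor.from_pretrained(checkpoint)

inputs = processor(images=Image.open("photo.jpg"), return_tensors="pt").to(device)
# The key cast: pixel values must match the model's dtype (e.g. float16).
inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)
caption_ids = model.generate(**inputs)
print(processor.batch_decode(caption_ids, skip_special_tokens=True))
```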
-
Sourab Mangrulkar authored
* fix fsdp load
* Update trainer.py
* remove saving duplicate state_dict
-
- 20 Jul, 2023 15 commits
-
Apoorv Khandelwal authored
* [trainer] fallback for deepspeed param count
* [trainer] more readable numel count
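A sketch of the fallback idea (illustrative, not the trainer's exact helper): under DeepSpeed ZeRO-3 parameters are partitioned and `p.numel()` can report 0, so prefer the `ds_numel` attribute DeepSpeed attaches when it is present.

```python
def count_parameters(model) -> int:
    """Total parameter count that stays correct under ZeRO-3 partitioning."""
    return sum(
        p.ds_numel if hasattr(p, "ds_numel") else p.numel()
        for p in model.parameters()
    )
```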
-
Benjamin Badger authored
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
Zach Mueller authored
Change logic
-
Younes Belkada authored
add GC support for RWKV
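Usage sketch for the feature described (the checkpoint name is a placeholder): gradient checkpointing trades extra compute for lower activation memory during training.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-4-169m-pile")  # placeholder
model.gradient_checkpointing_enable()  # recompute activations in the backward pass
```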
-
dependabot[bot] authored
Bump aiohttp in /examples/research_projects/decision_transformer

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.8.1 to 3.8.5.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/v3.8.5/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.8.1...v3.8.5)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Shauray Singh authored
* testing
* example script
* fix typehinting
* some tests
* make test
* optional update
* Union of arguments
* does this fix the issue
* remove reports
* set default to False
* documentation change
* None support
* does not need None
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments
* Change dict to Dict
* Revert "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments" (#24574)
  Reverts "Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)"; this reverts commit c5e29d43.
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments (#24549)
* Fix typing annotations for FSDP and DeepSpeed in TrainingArguments
* Change dict to Dict
* merge
* hacky fix
* fixup
---------
Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Shauray Singh authored
* make docs
* fixup
* resolved
* remove debugs
* Revert "fixup"
  This reverts commit 5e0f636aae0bf8707bc8bdaa6a9427fbf66834ed.
* prev (ignore)
* fixup broke some files
* remove files
* reverting modeling_reformer
* lang fix
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Premtim Sa authored
Fixing small typo: kwrags -> kwargs
-
dependabot[bot] authored
Bump pygments in /examples/research_projects/decision_transformer

Bumps [pygments](https://github.com/pygments/pygments) from 2.11.2 to 2.15.0.
- [Release notes](https://github.com/pygments/pygments/releases)
- [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES)
- [Commits](https://github.com/pygments/pygments/compare/2.11.2...2.15.0)

---
updated-dependencies:
- dependency-name: pygments
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Joao Gante authored
-
statelesshz authored
* replace no_cuda with use_cpu in test_pytorch_examples
* remove code that is never used
* fix style
-
Tom Aarsen authored
* Resolve typo in check_repo.py
* Specify encoding when opening modeling files
* Deprecate the OpenLlama architecture
* Add disclaimer pointing to Llama (I'm open to different wordings here)
* Match the capitalisation of LLaMA
-
ranchlai authored
* Add text classification example
* set the problem type and finetuning task
* ruff reformatted
* fix bug of unsetting label_to_id for regression
* update README.md
* fixed finetuning task
* update comment
* check if label exists in feature before removing
* add useful logging
-
Jungnerd authored
* docs: ko: `document_question_answering.md`
* fix: resolve suggestions
  Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
* fix: resolve suggestions
  Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
  Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
---------
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
-
- 19 Jul, 2023 3 commits
-
Stas Bekman authored
[doc] image_processing_vilt.py wrong default
-
Younes Belkada authored
* add possibility to disable TP
* fixup
* adapt from offline discussions
-
Travis Cline authored
Update llama2.md
Fix typos in the llama2 model doc
-