"docs/source/en/model_doc/efficientformer.md" did not exist on "78a53d59cb6fa444a95d6be4d15fb3a25e6a8a2e"
- 13 Sep, 2023 1 commit

Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

- 05 Sep, 2023 1 commit

Joao Gante authored

- 25 Aug, 2023 1 commit

Younes Belkada authored
* move deepspeed to `lib_integrations.deepspeed`
* more refactor
* oops
* fix slow tests
* Fix docs
* fix docs
* addess feedback
* address feedback
* final modifs for PEFT
* fixup
* ok now
* trigger CI
* trigger CI again
* Update docs/source/en/main_classes/deepspeed.md
* import from `integrations`
* address feedback
* revert removal of `deepspeed` module
* revert removal of `deepspeed` module
* fix conflicts
* ooops
* oops
* add deprecation warning
* place it on the top
* put `FutureWarning`
* fix conflicts with not_doctested.txt
* add back `bitsandbytes` module with a depr warning
* fix
* fix
* fixup
* oops
* fix doctests
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

- 11 Jul, 2023 1 commit

Gaurav Kumbhat authored
* 🐛 Handle empty gen_kwargs for seq2seq trainer prediction_step fn
* Update src/transformers/trainer_seq2seq.py
Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

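The gist of that fix can be sketched as a small stand-alone function. This is a minimal sketch, not the library's actual code: the name `resolve_gen_kwargs` and the `stored_defaults` parameter are hypothetical, standing in for the `self._gen_kwargs` attribute the trainer keeps.

```python
def resolve_gen_kwargs(gen_kwargs, stored_defaults):
    """If the caller passed no generation kwargs at all, fall back to the
    defaults previously stored on the trainer; otherwise use a copy of the
    caller's kwargs so the original dict is never mutated."""
    if len(gen_kwargs) > 0:
        return gen_kwargs.copy()
    return stored_defaults.copy()

# Empty gen_kwargs no longer breaks prediction_step: stored defaults apply.
defaults = {"max_length": 128, "num_beams": 4}
resolved = resolve_gen_kwargs({}, defaults)       # -> the stored defaults
override = resolve_gen_kwargs({"num_beams": 1}, defaults)  # -> caller's kwargs
```

The point of the copy is that callers can reuse their kwargs dict across calls without the trainer editing it in place.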
- 14 Apr, 2023 1 commit

Joao Gante authored

- 06 Apr, 2023 1 commit

Joao Gante authored

- 27 Mar, 2023 3 commits

Joao Gante authored
missing None check

Joao Gante authored

Nathan Fradet authored
* seq2seq trainer and training arguments accepting GenerationConfig arg
* seq2seq Trainer and training arguments docstring fixes
* Update training_args_seq2seq.py docstring
* Fixing trainer_seq2seq.py docstring
* seq2seq trainer: legacy gen args back & GenerationConfig created at init
* Seq2seq trainer: fix in case gen_config.max_new_tokens is None
* seq2seq trainer: adding legacy arg retrocompatibility
* seq2seq trainer: evaluate and predict untouched
* Apply suggestions from code review
* seq2seq trainer: adding init args, keeping IDEs hints
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

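A rough sketch of the init-time resolution this commit describes, with a stand-in `GenerationConfig` class and a hypothetical `load_generation_config` helper. The exact precedence between the config object and the legacy `generation_max_length`/`generation_num_beams` arguments is an assumption here, not the library's verified rule:

```python
class GenerationConfig:
    """Stand-in for transformers.GenerationConfig; only the two fields
    exercised below."""
    def __init__(self, max_length=20, num_beams=1):
        self.max_length = max_length
        self.num_beams = num_beams

def load_generation_config(generation_config, generation_max_length, generation_num_beams):
    """Resolve the generation config once at trainer init: start from the
    GenerationConfig passed via the training arguments (or a default one),
    then apply the legacy arguments when they were explicitly set, keeping
    old scripts working (the "legacy arg retrocompatibility" bullet)."""
    config = generation_config if generation_config is not None else GenerationConfig()
    if generation_max_length is not None:
        config.max_length = generation_max_length
    if generation_num_beams is not None:
        config.num_beams = generation_num_beams
    return config
```

Resolving once at init (rather than on every `evaluate`/`predict` call) matches the "GenerationConfig created at init" bullet above.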
- 13 Mar, 2023 1 commit

Joao Gante authored
* Let generate pick its inputs
* fix squad seq2seq example

- 06 Feb, 2023 1 commit

Sylvain Gugger authored
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies

- 31 Jan, 2023 1 commit

regisss authored
Do not log the generation config for each iteration

- 18 Oct, 2022 1 commit

Ivan Sedykh authored

- 01 Sep, 2022 1 commit

kumapo authored
* reflect max_new_tokens in gen_kwargs to `trainer.generate()`
* reflect max_new_tokens in `Seq2SeqTrainer`
* remove unnecessary variable
* Trigger CI
* fix style

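The behavior this commit adds can be sketched as follows; `build_length_kwargs` is a hypothetical helper, a minimal sketch of the idea that a caller-supplied `max_new_tokens` should take effect instead of a default `max_length` (passing both to `generate()` is ambiguous):

```python
def build_length_kwargs(gen_kwargs, default_max_length):
    """Respect max_new_tokens when the caller supplied it: drop any
    competing max_length, and only fall back to the default max_length
    when neither length control was given."""
    out = dict(gen_kwargs)
    if "max_new_tokens" in out:
        out.pop("max_length", None)  # max_new_tokens wins
    elif "max_length" not in out:
        out["max_length"] = default_max_length
    return out
```

So `trainer.evaluate(max_new_tokens=32)`-style calls control generation length instead of being silently overridden by a default `max_length`.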
- 06 Jul, 2022 1 commit

Sylvain Gugger authored
* Link to the Datasets doc
* Remove unwanted file

- 22 Jun, 2022 1 commit

Eran Hirsch authored
Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` (#17805)
* Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict`
* Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it
* Remove `self._num_beams` from trainer classes
* Run fixup; fix "Constraint" not exposed; fix synced_gpus to actually read from param
* Use kwargs
* Copy kwargs before making changes to it
* Fix style issues, unused imports

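The plumbing this PR introduces can be sketched with a toy class. This is not the real `Seq2SeqTrainer`, just a minimal sketch of the shape: `evaluate` accepts arbitrary generate kwargs (such as a `logits_processor`), copies them before any edits, and hands them to the prediction step instead of a fixed `max_length`/`num_beams` pair.

```python
class TinyTrainer:
    """Hypothetical miniature of the kwargs flow in Seq2SeqTrainer."""

    def evaluate(self, **gen_kwargs):
        # Copy before making changes, so the caller's dict is untouched.
        self._gen_kwargs = gen_kwargs.copy()
        return self.prediction_step()

    def prediction_step(self):
        # The real trainer would call model.generate(**inputs, **self._gen_kwargs);
        # here we just surface what would be forwarded.
        return self._gen_kwargs

trainer = TinyTrainer()
forwarded = trainer.evaluate(num_beams=4, logits_processor=["my_processor"])
```

Because everything rides on `**gen_kwargs`, any new `generate()` parameter works without touching the trainer again.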
- 25 May, 2022 1 commit

Animesh Jain authored
* Support compilation via Torchdynamo, AOT Autograd, NVFuser
* Address comments
* Lint
* Stas comments - missing quality test
* Lintere
* Quality test
* Doc lint
* Reset CUDA peak mem
* Add CustomTrainer
* require a single gpu
Co-authored-by: Stas Bekman <stas@stason.org>

- 05 Apr, 2022 1 commit

John Giorgi authored
If global_attention_mask is found in the model's inputs (used by certain models, like LED) in the prediction_step method of Seq2SeqTrainer, it is added to the gen_kwargs, which are passed to model.generate(). This allows us to properly set the global attention when decoding.

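That check amounts to a one-key forward, sketched here with plain dicts (the helper name `forward_global_attention` is hypothetical; in the trainer the inputs and masks are tensors):

```python
def forward_global_attention(inputs, gen_kwargs):
    """If the batch carries a global_attention_mask (used by models like
    LED), forward it into the kwargs passed to generate(), so global
    attention is applied during generation too."""
    out = dict(gen_kwargs)
    if "global_attention_mask" in inputs:
        out["global_attention_mask"] = inputs["global_attention_mask"]
    return out
```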
- 10 Feb, 2022 1 commit

NielsRogge authored
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>

- 11 Jan, 2022 1 commit

JejuWayfarer authored
Solves the problem that metric_key_prefix differed from the one used by Trainer.

- 27 Dec, 2021 2 commits

Stas Bekman authored
* redo sans examples
* style

Sylvain Gugger authored
* New doc styler
* Fix issue with args at the start
* Code sample fixes
* Style code examples in MDX
* Fix more patterns
* Typo
* Typo
* More patterns
* Do without black for now
* Get more info in error
* Docstring style
* Re-enable check
* Quality
* Fix add_end_docstring decorator
* Fix docstring

- 23 Dec, 2021 1 commit

Patrick von Platen authored
* up * save * correct * up * correct more * up * up * up * up * up * correct * fix tf * fix * remove tokenizer

- 21 Dec, 2021 1 commit

Sylvain Gugger authored
* Convert docstrings of all configurations and tokenizers
* Processors and fixes
* Last modeling files and fixes to models
* Pipeline modules
* Utils files
* Data submodule
* All the other files
* Style
* Missing examples
* Style again
* Fix copies
* Say bye bye to rst docstrings forever

- 20 Dec, 2021 1 commit

Patrick von Platen authored
* [Seq2SeqTrainer] Remove model input name hack
* Update src/transformers/trainer_seq2seq.py
* make style
* finish

- 07 Dec, 2021 2 commits

Stas Bekman authored
* [trainer] conditional ctx managers into one wrapper
* workaround for contextlib.nullcontext for py<3.7
* Update src/transformers/trainer.py
* one more autocast
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

TranSirius authored
Fix a bug in trainer_seq2seq.py: in the else branch at line 172, generation_inputs must be a dict before being fed into self.model.generate() (#14546)

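The fix is a normalization step before the generate call, sketched here with a list standing in for a tensor; `normalize_generation_inputs` and the `main_input_name` default are hypothetical names for illustration:

```python
def normalize_generation_inputs(generation_inputs, main_input_name="input_ids"):
    """generate() is invoked as model.generate(**generation_inputs), so a
    bare tensor must first be wrapped into a dict keyed by the model's
    main input name; a dict passes through unchanged."""
    if not isinstance(generation_inputs, dict):
        generation_inputs = {main_input_name: generation_inputs}
    return generation_inputs
```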
- 29 Oct, 2021 1 commit

Haram Lee authored

- 28 Oct, 2021 1 commit

NielsRogge authored
* First draft
* Make tuple output more readable
* Replace assertions by value errors
* Make it possible to predict_with_generate for vision and speech models
* Adapt Seq2SeqTrainer to work with VisionEncoderDecoder/SpeechEncoderDecoder
* Add deprecation warning
* Add copied from statements to vision and speech encoder decoders
* Fix failing test
* Apply @patrickvonplaten's suggestion
* Use reshape instead of view for consistency

- 31 Aug, 2021 1 commit

Sylvain Gugger authored
* Add generate kwargs to Seq2SeqTrainingArguments
* typo
* Address review comments + doc
* Style

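The effect of those training-argument defaults can be sketched as a merge step; `apply_training_args_defaults` is a hypothetical helper showing the intent (caller-supplied kwargs win, and a `None` argument means "no default"):

```python
def apply_training_args_defaults(gen_kwargs, generation_max_length, generation_num_beams):
    """Apply generation defaults from the training arguments only when
    the caller did not pass their own value and the argument was set."""
    out = dict(gen_kwargs)
    if "max_length" not in out and generation_max_length is not None:
        out["max_length"] = generation_max_length
    if "num_beams" not in out and generation_num_beams is not None:
        out["num_beams"] = generation_num_beams
    return out
```

This lets users fix generation settings once in the training arguments instead of threading them through every `evaluate`/`predict` call.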
- 19 Aug, 2021 1 commit

Allan Lin authored
* Update torch.utils.data namespaces to the latest.
* Format
* Update Dataloader.
* Style

- 27 Jul, 2021 1 commit

cchen-dialpad authored
* set max_length and num_beams only when non None
* fix instance variables
* fix code style

- 02 Jun, 2021 1 commit

Stas Bekman authored
* move code and docs
* style
* moved
* restore

- 26 Apr, 2021 1 commit

LSinev authored

- 08 Apr, 2021 1 commit

Stas Bekman authored
* synced gpus
* fix
* fix
* need to use t5-small for quality tests
* notes
* complete merge
* fix a disappearing std stream problem
* start zero3 tests
* wip
* tune params
* sorting out the pre-trained model loading
* reworking generate loop wip
* wip
* style
* fix tests
* split the tests
* refactor tests
* wip
* parameterized
* fix
* workout the resume from non-ds checkpoint pass + test
* cleanup
* remove no longer needed code
* split getter/setter functions
* complete the docs
* suggestions
* gpus and their compute capabilities link
* Apply suggestions from code review
* style
* remove invalid paramgd
* automatically configure zero3 params that rely on hidden size
* make _get_resized_embeddings zero3-aware
* add test exercising resize_token_embeddings()
* add docstring
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

- 01 Feb, 2021 1 commit

Sylvain Gugger authored
* Remove subclass for sortish sampler
* Use old Seq2SeqTrainer in script
* Styling

- 27 Dec, 2020 1 commit

Patrick von Platen authored

- 22 Dec, 2020 1 commit

Sylvain Gugger authored
* Add label smoothing in Trainer
* Add options for scheduler and Adafactor in Trainer
* Put Seq2SeqTrainer in the main lib
* Apply suggestions from code review
* Address review comments and adapt scripts
* Documentation
* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>