Commits · 2749e479f30ab13235b0b9b4a6bbcf4c3b29a081 · chenpangpang / transformers

08 Feb, 2024 1 commit

[Docs] Fix broken links and syntax issues (#28918) · 2749e479

Klaus Hipp authored Feb 08, 2024

* Fix model documentation links in attention.md

* Fix external link syntax

* Fix target anchor names of section links

* Fix copyright statement comments

* Fix documentation headings

2749e479

16 Jan, 2024 1 commit

Improving Training Performance and Scalability Documentation (#28497) · 002566f3

Hamza FILALI authored Jan 16, 2024



* Improving Training Performance and Scaling documentation by adding PEFT techniques to suggestions to reduce memory requirements for training

* Update docs/source/en/perf_train_gpu_one.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

002566f3

24 Nov, 2023 1 commit

Reflect RoCm support in the documentation (#27636) · c13a43aa

fxmarty authored Nov 24, 2023



* reflect RoCm support in the documentation

* Update docs/source/en/main_classes/trainer.md
Co-authored-by: Lysandre Debut <hi@lysand.re>

* fix review comments

* use ROCm instead of RoCm

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>

c13a43aa

06 Nov, 2023 1 commit

[docs] fixed links with 404 (#27327) · 9beb2737

Maria Khalusova authored Nov 06, 2023

* fixed links with 404

* make style

9beb2737

22 Sep, 2023 1 commit

[`core` ] Integrate Flash attention 2 in most used models (#25598) · 368a58e6

Younes Belkada authored Sep 22, 2023



* v1

* oops

* working v1

* fixup

* add some TODOs

* fixup

* padding support + try with module replacement

* nit

* alternative design

* oops

* add `use_cache` support for llama

* v1 falcon

* nit

* a bit of refactor

* nit

* nits nits

* add v1 padding support falcon (even though it seemed to work before)

* nit

* falcon works

* fixup

* v1 tests

* nit

* fix generation llama flash

* update tests

* fix tests + nits

* fix copies

* fix nit

* test- padding mask

* stype

* add more mem efficient support

* Update src/transformers/modeling_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fixup

* nit

* fixup

* remove it from config when saving

* fixup

* revert docstring

* add more checks

* use values

* oops

* new version

* fixup

* add same trick for falcon

* nit

* add another test

* change tests

* fix issues with GC and also falcon

* fixup

* oops

* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* add init_rope

* updates

* fix copies

* fixup

* fixup

* more clarification

* fixup

* right padding tests

* add docs

* add FA in docker image

* more clarifications

* add some figures

* add todo

* rectify comment

* Change to FA2

* Update docs/source/en/perf_infer_gpu_one.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* split in two lines

* change test name

* add more tests

* some clean up

* remove `rearrange` deps

* add more docs

* revert changes on dockerfile

* Revert "revert changes on dockerfile"

This reverts commit 8d72a66b4b9b771abc3f15a9b9506b4246d62d8e.

* revert changes on dockerfile

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>

* address some comments

* docs

* use inheritance

* Update src/transformers/testing_utils.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* fixup

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_utils.py

* final comments

* clean up

* style

* add cast + warning for PEFT models

* fixup

---------
Co-authored-by: Felix Marty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>

368a58e6

31 Aug, 2023 1 commit

Modify efficient GPU training doc with now-available adamw_bnb_8bit optimizer (#25807) · 99fc3ac8

Vibhor Kumar authored Aug 31, 2023



* Modify single-GPU efficient training doc with now-available adamw_bnb_8bit optimizer

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

99fc3ac8

18 Aug, 2023 1 commit

[`Docs` / `BetterTransformer` ] Added more details about flash attention + SDPA (#25265) · 940d1a76

Younes Belkada authored Aug 18, 2023



* added more details about flash attention

* correct and add more details

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* few modifs

* more details

* up

* Apply suggestions from code review
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* adapt from suggestion

* Apply suggestions from code review
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* trigger CI

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix nits and copies

* add new section

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

940d1a76

25 Jul, 2023 1 commit

Set `TF32` flag for PyTorch cuDNN backend (#25075) · 6bc61aa7

Xuehai Pan authored Jul 25, 2023

6bc61aa7

24 Jul, 2023 1 commit

[docs] Performance docs tidy up, part 1 (#23963) · 75317aef

Maria Khalusova authored Jul 24, 2023



* first pass at the single gpu doc

* overview: improved clarity and navigation

* WIP

* updated intro and deepspeed sections

* improved torch.compile section

* more improvements

* minor improvements

* make style

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feedback addressed

* mdx -> md

* link fix

* feedback addressed

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

75317aef

20 Jun, 2023 1 commit

Migrate doc files to Markdown. (#24376) · eb849f66

Sylvain Gugger authored Jun 20, 2023



* Rename index.mdx to index.md

* With saved modifs

* Address review comment

* Treat all files

* .mdx -> .md

* Remove special char

* Update utils/tests_fetcher.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

eb849f66

27 Apr, 2023 1 commit

Add methods to PreTrainedModel to use PyTorch's BetterTransformer (#21259) · 3042c63a

fxmarty authored Apr 27, 2023



* fix mess

* better documentation

* typo

* fix doc

* update

* add test

* fix test

* more tests

* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* move to utils

* Apply suggestions from code review
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* nit

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

3042c63a

08 Dec, 2022 1 commit

Migrate torchdynamo to torch.compile (#20634) · 9cc65f87

Sylvain Gugger authored Dec 08, 2022

* Migrate torchdynamo to torch.compile

* Add docstring and generic option

* Properly use the function...

* Reorg args

9cc65f87

30 Nov, 2022 1 commit

Repurpose torchdynamo training args towards torch._dynamo (#20498) · 08b46218

Sylvain Gugger authored Nov 30, 2022

* Repurpose torchdynamo training args towards torch._dynamo

* Add doc

08b46218

07 Nov, 2022 2 commits

docs: Resolve many typos in the English docs (#20088) · 3222fc64

Tom Aarsen authored Nov 07, 2022

* docs: Fix typo in ONNX parser help: 'tolerence' => 'tolerance'

* docs: Resolve many typos in the English docs

Typos found via 'codespell ./docs/source/en'

3222fc64

Replace unsupported facebookresearch/bitsandbytes (#20093) · b8112edd

Tom Aarsen authored Nov 07, 2022

With https://github.com/TimDettmers/bitsandbytes, which is by the same author and is still being updated

b8112edd

24 Oct, 2022 1 commit

fixed typo in fp16 training section for perf_train_gpu_one (#19736) · 5cbf1fa8

Dhruv Singal authored Oct 24, 2022

5cbf1fa8

18 Oct, 2022 1 commit

Fix typo in perf docs (#19705) · 71ca7944

Christopher Akiki authored Oct 18, 2022

71ca7944

17 Oct, 2022 1 commit

Update perf_train_gpu_one.mdx (#19676) · aa629e7a

Christopher Akiki authored Oct 17, 2022

aa629e7a

05 Sep, 2022 1 commit

Update perf_train_gpu_one.mdx (#18442) · 17c634fd

Surya Prakash Sahu authored Sep 05, 2022

17c634fd

18 Aug, 2022 1 commit

[bnb] Move documentation (#18671) · a123eee9

Younes Belkada authored Aug 18, 2022



* fix bnb documentation

- move bnb documentation to `infer_gpu_many`

* small refactoring

- added text on infer_gpu_one
- added a small note on infer_gpu_many
- added customized multi gpu example on infer_gpu_many

* Update docs/source/en/perf_infer_gpu_many.mdx
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* apply suggestions
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

a123eee9

16 Aug, 2022 1 commit

[bnb] Minor modifications (#18631) · 6d175c11

Younes Belkada authored Aug 17, 2022



* bnb minor modifications

- refactor documentation
- add troubleshooting README
- add PyPi library on DockerFile

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* put in one block

- put bash instructions in one block

* update readme

- refactor a bit hardware requirements

* change text a bit

* Apply suggestions from code review
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* apply suggestions
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* add link to paper

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update tests/mixed_int8/README.md

* Apply suggestions from code review

* refactor a bit

* add instructions Turing & Amperer
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* add A6000

* clarify a bit

* remove small part

* Update tests/mixed_int8/README.md
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

6d175c11

08 Aug, 2022 1 commit

Update perf_train_gpu_one.mdx (#18532) · f1f5de31

Mishig Davaadorj authored Aug 08, 2022

f1f5de31

06 Aug, 2022 1 commit

Just re-reading the whole doc every couple of months 😬 (#18489) · 8d1f9039

Julien Chaumond authored Aug 06, 2022

* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task

8d1f9039

13 Jul, 2022 1 commit

Enable torchdynamo with torch_tensorrt(fx path) (#17765) · 7ea6ccc2

Wei authored Jul 13, 2022



* enable fx2trt

* Update perf_train_gpu_one.mdx

* Update perf_train_gpu_one.mdx

* add lib check

* update

* format

* update

* fix import check

* fix isort

* improve doc

* refactor ctx manager

* fix isort

* black format

* isort fix

* fix format

* update args

* update black

* cleanups

* Update perf_train_gpu_one.mdx

* code refactor

* code refactor to init

* remove redundancy

* isort

* replace self.args with args
Co-authored-by: Stas Bekman <stas@stason.org>

7ea6ccc2

06 Jul, 2022 1 commit

Doc to dataset (#18037) · 2e90c3df

Sylvain Gugger authored Jul 06, 2022

* Link to the Datasets doc

* Remove unwanted file

2e90c3df

01 Jul, 2022 1 commit

Fix typo in perf_train_gpu_one.mdx (#17983) · cb425024

Billy Cao authored Jul 01, 2022

cb425024

16 May, 2022 1 commit

[WIP] [doc] performance/scalability revamp (#15723) · 71abd3ad

Stas Bekman authored May 16, 2022



* [doc] performance/scalability revamp

* link the new docs

* no :

* mixed precision

* work on the first doc

* expand the main doc

* Trigger CI

* style

* revamp single GPU training section

* work on training performance

* remove files not used anymore or will be added later

* final touches

* fix rebase

* Add hardware section to toctree

* fix toctree again

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove `fast_tokenizers` entry that was copied in rebase

* add warning about DP vs DDP

* remove todo

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix missing closure of codeblock

* Update docs/source/en/perf_train_gpu_many.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* sync with #16860

* update toc
Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

71abd3ad

20 Apr, 2022 1 commit

[docs] fix url (#16860) · 67ed0e43

Stas Bekman authored Apr 20, 2022

67ed0e43

04 Apr, 2022 1 commit

Enable doc in Spanish (#16518) · b9a768b3

Sylvain Gugger authored Apr 04, 2022

* Reorganize doc for multilingual support

* Fix style

* Style

* Toc trees

* Adapt templates

b9a768b3

25 Mar, 2022 1 commit

Big file_utils cleanup (#16396) · 088c1880

Sylvain Gugger authored Mar 25, 2022

* Big file_utils cleanup

* This one still needs to be treated separately

088c1880

09 Feb, 2022 1 commit

add model scaling section (#15119) · d923f762

Leandro von Werra authored Feb 09, 2022



* add model scaling section

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* integrate reviewer feedback

* initialize GPU properly

* add note about BnB optimizer

* move doc from `scaling.mdx` to `performance.mdx`

* integrate reviewer feedback

* revert section levels
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

d923f762

17 Jan, 2022 1 commit

[doc] new MoE paper (#15184) · edd3fce2

Stas Bekman authored Jan 17, 2022

add new paper

edd3fce2

15 Jan, 2022 1 commit

[doc] performance: Efficient Software Prebuilds (#15147) · 669e3c50

Stas Bekman authored Jan 14, 2022

* Efficient Software Prebuilds

* improve

669e3c50

10 Jan, 2022 1 commit

[performance doc] Power and Cooling (#14935) · 37bc0b4e

Stas Bekman authored Jan 10, 2022



* [performance doc] Power and Cooling

* more docs

* Update docs/source/performance.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* reword
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

37bc0b4e

22 Dec, 2021 1 commit

Convert rst files (#14888) · 207594be

Sylvain Gugger authored Dec 22, 2021

* Convert all tutorials and guides

* Convert all remaining rst to mdx

* Track and fix bad links

207594be

16 Dec, 2021 1 commit

Removes images to put them in a dataset (#14781) · 8010fda9

Lysandre Debut authored Dec 16, 2021

* First try

* Update instructions

8010fda9

15 Dec, 2021 1 commit

[doc] performance: groups of operations by compute-intensity (#14757) · fdf3ce28

Stas Bekman authored Dec 14, 2021

fdf3ce28

11 Dec, 2021 1 commit

[doc] document MoE model approach and current solutions (#14725) · 027074f4

Stas Bekman authored Dec 10, 2021

* document MoE model approach

* additional info from Samyam

* fix

027074f4

08 Dec, 2021 1 commit

[bf16 support] tweaks (#14580) · 12286612

Stas Bekman authored Dec 08, 2021



* [bf16 support] tweaks

* corrections
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>

12286612

03 Dec, 2021 1 commit

[trainer] add tf32-mode control (#14606) · 71b1bf7e

Stas Bekman authored Dec 03, 2021



* [trainer] add --tf32 support

* it's pt>=.17

* it's pt>=.17

* flip the default to True

* add experimental note

* simplify logic

* style

* switch to 3-state logic

* doc

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* re-style code
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

71b1bf7e