Commits · 25e5e3fa56d73d07d6a1c46306a5f3f1fd862463 · chenpangpang / transformers

16 Jul, 2024 1 commit

Speedup model init on CPU (by 10x+ for llama-3-8B as one example) (#31771) · e0dfd7bc

Zach Mueller authored Jul 16, 2024



* 1,100%!

* Clean

* Don't touch DS

* Experiment with dtype allocation

* skip test_load_save_without_tied_weights test

* A little faster

* Include proper upscaling?

* Fixup tests

* Potentially skip?

* Let's see if this fixes git history

* Maintain new dtype

* Fin

* Rm hook idea for now

* New approach, see what breaks

* stage

* Clean

* Stash

* Should be fin now, just need to mark failing models

* Clean up

* Simplify

* Deal with weird models

* Enc/Dec

* Skip w/ reason

* Adjust test

* Fix test

* one more test

* Keep experimenting

* Fix ref

* TO REMOVE: testing feedback CI

* Right push

* Update tests/utils/test_modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* disable

* Add new func

* Test nits from Amy

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Adjust comment

* Adjust comment on skip

* make private

* Fin

* Should be a not flag

* Clarify and rename test

---------
Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


15 Jul, 2024 1 commit

Avoid race condition (#31973) · a1a34657

Yih-Dar authored Jul 15, 2024



* [test_all] hub

* remove delete

* remove delete

* remove delete

* remove delete

* remove delete

* remove delete

* [test_all]

* [test_all]

* [test_all]

* [test_all]

* [test_all]

* [test_all]

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


11 Jul, 2024 1 commit

Add warning message for beta and gamma parameters (#31654) · 1499a550

Omar Salman authored Jul 11, 2024

* Add warning message for `beta` and `gamma` parameters

* Fix when the warning is raised

* Formatting changes

* Improve testing and remove duplicated warning from _fix_key


09 Jul, 2024 1 commit
- Test loading generation config with safetensor weights (#31550) · 4c2538b8
  Joao Gante authored Jul 09, 2024
```
fix test
```
05 Jul, 2024 1 commit
- Fix serialization for offloaded model (#31727) · 8c5c180d
  Marc Sun authored Jul 05, 2024
```
* Fix serialization

* style

* add test
```
02 Jul, 2024 1 commit

Move some test files (`tests/test_xxx_utils.py`) to `tests/utils` (#31730) · 93cd94b7

Yih-Dar authored Jul 02, 2024



* move

* move

* move

* move

* Update tests/utils/test_image_processing_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


27 Jun, 2024 1 commit
- [QoL] Allow dtype str for torch_dtype arg of from_pretrained (#31590) · 3a028101
  Billy Cao authored Jun 27, 2024
```
* Allow dtype str for torch_dtype in from_pretrained

* Update docstring

* Add tests for str torch_dtype
```
26 Jun, 2024 1 commit

Skip tests properly (#31308) · 1de7dc74

amyeroberts authored Jun 26, 2024

* Skip tests properly

* [test_all]

* Add 'reason' as kwarg for skipTest

* [test_all] Fix up

* [test_all]


12 Jun, 2024 1 commit
- Use huggingface_hub helper function to split state dict (#31091) · 254b25ab
  Marc Sun authored Jun 12, 2024
```
* shard saving from hf hub

* index = None

* fix tests

* indent
```
07 Jun, 2024 1 commit

Extend save_pretrained to offloaded models (#27412) · ff689f57

Benjamin Badger authored Jun 07, 2024



* added hidden subset

* debugged hidden subset contrastive search

* added contrastive search compression

* debugged compressed contrastive search

* memory reduction for contrastive search

* debugged mem red

* added low memory option feature

* debugged mem optimization output stack

* debugged mem optimization output stack

* debugged low mem

* added low mem cache

* fixed 2047 tensor view

* debugged 2042 past key val inputs

* reformatted tensors

* changed low mem output

* final clean

* removed subset hidden csearch

* fixed hidden device

* fixed hidden device

* changed compressor dtype

* removed hstate compression

* integrated csearch in generate

* test csearch integration into generation

exit()

* fixed csearch kwarg integration with generation

* final wrap and added doc

* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* added debug print

* direct hstate cat

* direct hstate cat

* direct hstate cat debug

* direct hstate cat debug

* expanded full hidden state stack

* expanded full hidden state stack

* matched dims for hstates

* matched dims for hstates

* logits fix

* equality test

* equality hidden debug

* debug

* added prints for debug

* added prints for debug

* equality check

* switched squeeze dim

* input format debug

* tracing top_k_ids

* removed trace

* added test context

* added jitter

* added jitter

* added jitter

* returned state

* rebuilt past key value reconstruction

* debugged

* cleaned traces

* added selection for pkv

* changed output to dict

* cleaned

* cleaned

* cleaned up contrastive search test

* moved low_memory kwarg

* debugged

* changed low mem test batch size to 1

* removed output

* debugged test input shape

* reformatted csearch test

* added trace

* removed unsqueeze on final forward pass

* replaced unsqueeze with view

* removed traces

* cleaned

* debugged model kwargs

* removed special models from test

* ran make quality

* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* refactored

* refactored

* refactored

* make fixup

* renamed flag sequential

* renamed flag sequential

* iterative onloading

* black style and test utils

* added traces for integrated test

* debugged

* added traces

* make style

* removed traces, make style

* included suggestions and added test

* debugged test

* added offload module check and make style

* is_accelerate_available and make style

* added test decorator

* changed test model and config spec

* added offload condition

* added lazy loading for each shard

* debugged

* modified sharding

* debugged

* added traces

* removed safe serialization

* no index overload;

* trace on safe save ptrs

* added ptr condition

* debugged

* debugged ptr

* moved module map init

* remake shard only for offloaded modules

* refactored

* debugged

* refactored

* debugged

* cleaned and make style

* cleaned and make style

* added trace

* sparse module map

* debugged

* removed module map conditional

* refactored

* debug

* debugged

* added traces

* added shard mem trace

* added shard mem trace

* removed underlying storage check

* refactored

* memory leak removal and make style

* cleaned

* swapped test decs and make style

* added mem checks and make style

* added free mem warning

* implemented some suggestions

* moved onloading to accelerate

* refactored for accelerate integration

* cleaned test

* make style

* debugged offload map name

* cleaned and make style

* replaced meta device check for sharding

* cleaned and make style

* implemented some suggestions

* more suggestions

* update warning
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* more suggestions

* make style

* new make style

* Update src/transformers/modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


28 May, 2024 1 commit

fix from_pretrained in offline mode when model is preloaded in cache (#31010) · 936ab7ba

oOraph authored May 28, 2024



* Unit test to verify fix
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>

* fix from_pretrained in offline mode when model is preloaded in cache
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>

* minor: fmt
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>

---------
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>
Co-authored-by: Raphael Glon <oOraph@users.noreply.github.com>


13 May, 2024 1 commit

Llama: fix custom 4D masks, v2 (#30348) · a0779b9e

Poedator authored May 13, 2024



* 4d mask fixes

* Update custom 4D mask logic

* test moved to mixin

* extra tests 4d mask

* upd 4d mask and StaticCache handling

* added Mask4DTestHard to mistral tests

* post-rebase fixes

* test fixes for StaticCache

* make fix-copies

* upd 1 after #30476

* fix common tests

* rm elif attention_mask.dim() == 4:

* tests combined, fixed, mixtral supported

* bigbird style chg reverted

* rm if attention_mask.dim() == 2

* modeling_llama formatting chg

---------
Co-authored-by: Joao Gante <joao@huggingface.co>


07 May, 2024 1 commit
- Add safetensors to model not found error msg for default use_safetensors value (#30602) · cf7bed98
  David Xue authored May 07, 2024
```
* add safetensors to model not found error for default use_safetensors=None case

* format code w/ ruff

* fix assert true typo
```
26 Apr, 2024 1 commit
- Update `dtype_byte_size` to handle torch.float8_e4m3fn/float8_e5m2 types (#30488) · 20081c74
  Michael Goin authored Apr 26, 2024
```
* Update modeling_utils/dtype_byte_size to handle float8 types

* Add a test for dtype_byte_size

* Format

* Fix bool
```
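
The float8 handling described in this entry can be illustrated with a stand-alone sketch. It is not the library's `dtype_byte_size`: the function name `dtype_byte_size_from_name` is hypothetical, it works on dtype name strings rather than `torch.dtype` objects, and counting a bool as one bit is an assumption.

```python
import re


def dtype_byte_size_from_name(dtype_name: str) -> float:
    """Bytes per element, parsed from a dtype name such as 'torch.float8_e4m3fn'."""
    if dtype_name.endswith("bool"):
        return 1 / 8  # assumption: a bool is counted as a single bit
    # Capture the digits that follow a non-digit (the 8 in 'float8_e4m3fn').
    # The optional '_suffix' group lets the float8 format tag trail the bit width.
    match = re.search(r"[^\d](\d+)(_.*)?$", dtype_name)
    if match is None:
        raise ValueError(f"`{dtype_name}` is not a recognised dtype name")
    return int(match.group(1)) // 8
```

Without the optional suffix group, a pattern anchored on trailing digits would read the final `2` of `float8_e5m2` as the bit width, which is the kind of mismatch a float8-aware parser has to avoid.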
24 Apr, 2024 1 commit
- [tests] make test device-agnostic (#30444) · 16c8e176
  Fanli Lin authored Apr 24, 2024
```
* make device-agnostic

* clean code
```
19 Apr, 2024 1 commit
- Fix config + attn_implementation in AutoModelForCausalLM.from_pretrained (#30299) · 21c912e7
  hoshi-hiyouga authored Apr 20, 2024
```
* Update modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py
```
10 Apr, 2024 1 commit
- [tests] make 2 tests device-agnostic (#30008) · 18546378
  Fanli Lin authored Apr 10, 2024
```
add torch device
```
02 Apr, 2024 1 commit

Hard error when ignoring tensors. (#27484) (#29906) · 9b0a8ea7

Nicolas Patry authored Apr 02, 2024



* Hard error when ignoring tensors. (#27484)

* [WIP] Hard error when ignoring tensors.

* Better selection/error when saving a checkpoint.

- Find all names we should normally drop (those are in the transformers
  config)
- Find all disjoint tensors (for those we can safely trigger a copy to
  get rid of the sharing before saving)
- Clone those disjoint tensors getting rid of the issue
- Find all identical names (those should be declared in the config
  but we try to find them all anyway.)
- For all identical names:
  - If they are in the config, just ignore them everything is fine
  - If they are not, warn about them.
- For all remainder tensors which are shared yet neither identical NOR
  disjoint. raise a hard error.

* Adding a failing test on `main` that passes here.

* We don't need to keep the subfolder logic in this test.

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add small tests.

* Dead variable.

* Fixup.

* Fixing tied_Weights_keys on generic models.

* Fixup + T5 encoder/decoder tying (with different layers)

* Code quality.

* Dynamic member.

* trigger

* Fixing encoder name for other types of encoder/decoder combos.

* Fix scoping.

* Update .github/workflows/self-scheduled.yml
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Fixing the tied_weights after the call.

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
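
The checkpoint-saving rules listed in the #27484 bullets of this entry (drop names declared in the config, clone disjoint tensors, warn on undeclared identical names, raise on everything else) can be illustrated with a small stand-alone sketch. It is not the code from `modeling_utils.py`: `View`, `plan_save`, and `declared_tied` are hypothetical names, tensors are modelled as byte ranges inside a shared buffer, and a group that mixes identical and disjoint views is conservatively treated as an error.

```python
import warnings
from collections import defaultdict
from typing import NamedTuple


class View(NamedTuple):
    """Hypothetical stand-in for a tensor: a byte range inside a storage buffer."""
    storage: int  # id of the underlying buffer
    start: int    # first byte (inclusive)
    stop: int     # end of the range (exclusive)


def plan_save(views, declared_tied):
    """Return (names_to_drop, names_to_clone); raise if sharing is ambiguous."""
    by_storage = defaultdict(list)
    for name, view in views.items():
        by_storage[view.storage].append(name)

    to_drop, to_clone = set(), set()
    for names in by_storage.values():
        if len(names) < 2:
            continue  # not shared, nothing to do
        names = sorted(names)
        ranges = sorted((views[n].start, views[n].stop) for n in names)
        if len(set(ranges)) == 1:
            # Identical views: keep one name (preferring an undeclared one), drop the rest.
            keep = next((n for n in names if n not in declared_tied), names[0])
            aliases = [n for n in names if n != keep]
            undeclared = [n for n in aliases if n not in declared_tied]
            if undeclared:
                warnings.warn(f"tied weights not declared in the config: {undeclared}")
            to_drop.update(aliases)
        elif all(a[1] <= b[0] for a, b in zip(ranges, ranges[1:])):
            # Disjoint slices of one buffer: cloning them removes the sharing.
            to_clone.update(names)
        else:
            # Shared, yet neither identical nor disjoint: refuse to save.
            raise RuntimeError(f"tensors {names} overlap partially; cannot save safely")
    return to_drop, to_clone
```

Under these assumptions, a declared alias such as a tied output head is dropped silently, two non-overlapping slices of one buffer are marked for cloning, and partially overlapping views abort the save.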


27 Mar, 2024 1 commit

Reimplement "Automatic safetensors conversion when lacking these files" (#29846) · 4d8427f7

Lysandre Debut authored Mar 27, 2024

* Automatic safetensors conversion when lacking these files (#29390)

* Automatic safetensors conversion when lacking these files

* Remove debug

* Thread name

* Typo

* Ensure that raises do not affect the main thread

* Catch all errors
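
The bullets above (a named thread, failures kept away from the main thread, all errors caught) describe a best-effort background task. A minimal sketch of that pattern follows, assuming nothing about the actual conversion code: `run_in_background` and the thread name are hypothetical, and the `convert` callable stands in for whatever work is attempted.

```python
import threading


def run_in_background(convert, *args, name="Thread-conversion-sketch"):
    """Run `convert(*args)` in a named daemon thread and record any failure."""
    errors = []

    def target():
        try:
            convert(*args)
        except Exception as exc:  # catch everything: the task is best-effort
            errors.append(exc)

    thread = threading.Thread(target=target, name=name, daemon=True)
    thread.start()
    return thread, errors
```

The caller can ignore the returned thread entirely, or join it and inspect `errors` when the outcome matters, for example in a test.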


25 Mar, 2024 1 commit

Remove static pretrained maps from the library's internals (#29112) · 39114c03

Lysandre Debut authored Mar 25, 2024



* [test_all] Remove static pretrained maps from the library's internals

* Deprecate archive maps instead of removing them

* Revert init changes

* [test_all] Deprecate instead of removing

* [test_all] PVT v2 support

* [test_all] Tests should all pass

* [test_all] Style

* Address review comments

* Update src/transformers/models/deprecated/_archive_maps.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/deprecated/_archive_maps.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* [test_all] trigger tests

* [test_all] LLAVA

* [test_all] Bad rebase

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


19 Mar, 2024 1 commit

Llama: partial 4d masks (#29731) · 4294f0c3

Joao Gante authored Mar 19, 2024



* partial 4d masks

* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


15 Mar, 2024 1 commit
- [tests] remove deprecated tests for model loading (#29450) · c1993e68
  Fanli Lin authored Mar 15, 2024
```
* gix

* fix style

* remove equivalent tests

* add back for image_processor

* remove again
```
13 Mar, 2024 1 commit
- Llama: allow custom 4d masks (#29618) · 1e21c4fb
  Joao Gante authored Mar 13, 2024
  
11 Mar, 2024 1 commit

Experimental loading of MLX files (#29511) · b382a09e

Pedro Cuenca authored Mar 11, 2024

* Experimental loading of MLX files

* Update exception message

* Add test

* Style

* Use model from hf-internal-testing


08 Mar, 2024 1 commit
- Make sliding window size inclusive in eager attention (#29519) · 608fa549
  Jonatan Kłosko authored Mar 08, 2024
```
* Make sliding window size inclusive in eager attention

* Fix tests
```
07 Mar, 2024 2 commits
- test_generation_config_is_loaded_with_model - fall back to pytorch model for now (#29521) · 4ed9ae62
  amyeroberts authored Mar 07, 2024
```
* Fall back to pytorch model for now

* Fix up
```
- Revert "Automatic safetensors conversion when lacking these files (#2… (#29507) · f6133d76
  Lysandre Debut authored Mar 07, 2024
```
Revert "Automatic safetensors conversion when lacking these files (#29390)"

This reverts commit a69cbf4e.
```
06 Mar, 2024 1 commit
- [FIX] `offload_weight()` takes from 3 to 4 positional arguments but 5 were given (#29457) · 00bf4427
  Fanli Lin authored Mar 06, 2024
```
* use require_torch_gpu

* enable on XPU

* fix
```
05 Mar, 2024 1 commit

Automatic safetensors conversion when lacking these files (#29390) · a69cbf4e

Lysandre Debut authored Mar 05, 2024

* Automatic safetensors conversion when lacking these files

* Remove debug

* Thread name

* Typo

* Ensure that raises do not affect the main thread


16 Feb, 2024 1 commit
- Update all references to canonical models (#29001) · f497f564
  Lysandre Debut authored Feb 16, 2024
```
* Script & Manual edition

* Update
```
06 Feb, 2024 1 commit
- Revert "[WIP] Hard error when ignoring tensors." (#28898) · 76b4f666
  Yih-Dar authored Feb 06, 2024
```
Revert "[WIP] Hard error when ignoring tensors. (#27484)"

This reverts commit 2da28c4b.
```
05 Feb, 2024 1 commit

[WIP] Hard error when ignoring tensors. (#27484) · 2da28c4b

Nicolas Patry authored Feb 05, 2024



* [WIP] Hard error when ignoring tensors.

* Better selection/error when saving a checkpoint.

- Find all names we should normally drop (those are in the transformers
  config)
- Find all disjoint tensors (for those we can safely trigger a copy to
  get rid of the sharing before saving)
- Clone those disjoint tensors getting rid of the issue
- Find all identical names (those should be declared in the config
  but we try to find them all anyway.)
- For all identical names:
  - If they are in the config, just ignore them everything is fine
  - If they are not, warn about them.
- For all remainder tensors which are shared yet neither identical NOR
  disjoint. raise a hard error.

* Adding a failing test on `main` that passes here.

* We don't need to keep the subfolder logic in this test.

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


02 Feb, 2024 1 commit

Add missing None check for hf_quantizer (#28804) · ec29d25d

Juri Ganitkevitch authored Feb 02, 2024



* Add missing None check for hf_quantizer

* Add test, fix logic.

* make style

* Switch test model to Mistral

* Comment

* Update tests/test_modeling_utils.py

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>


18 Jan, 2024 1 commit

Use `LoggingLevel` context manager in 3 tests (#28575) · 0754217c

Yih-Dar authored Jan 18, 2024



* inside with LoggingLevel

* remove is_flaky

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


16 Jan, 2024 2 commits
- Config: warning when saving generation kwargs in the model config (#28514) · f4f57f9d
  Joao Gante authored Jan 16, 2024
  
- Fix mismatching loading in from_pretrained with/without accelerate (#28414) · 66db33dd
  fxmarty authored Jan 16, 2024
```
* fix mismatching behavior in from_pretrained with/without accelerate

* meaningful refactor

* remove added space

* add test

* fix model on the hub

* comment

* use tiny model

* style
```
15 Jan, 2024 1 commit

[`core`/ FEAT] Add the possibility to push custom tags using `PreTrainedModel` itself (#28405) · 1b9a2e4c

Younes Belkada authored Jan 15, 2024



* v1 tags

* remove unneeded conversion

* v2

* rm unneeded warning

* add more utility methods

* Update src/transformers/utils/hub.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/utils/hub.py
Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/utils/hub.py
Co-authored-by: Lucain <lucainp@gmail.com>

* more enhancements

* oops

* merge tags

* clean up

* revert unneeded change

* add extensive docs

* more docs

* more kwargs

* add test

* oops

* fix test

* Update src/transformers/modeling_utils.py
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update src/transformers/utils/hub.py
Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/transformers/modeling_utils.py

* Update src/transformers/trainer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add more conditions

* more logic

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>


12 Jan, 2024 1 commit
- Mark two logger tests as flaky (#28458) · 4e36a6cd
  amyeroberts authored Jan 12, 2024
```
* Mark two logger tests as flaky

* Add description to is_flaky
```
17 Dec, 2023 1 commit

4D `attention_mask` support (#27539) · f85a1e82

Poedator authored Dec 17, 2023



* edits to _prepare_4d_causal_attention_mask()

* initial tests for 4d mask

* attention_mask_for_sdpa support

* added test for inner model hidden

* added autotest decorators

* test mask dtype to torch.int64

* torch.testing.assert_close
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* torch_device and @torch_gpu in tests

* upd tests

* +torch decorators

* torch decorators fixed

* more decorators!

* even more decorators

* fewer decorators

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


15 Dec, 2023 1 commit

[`FA-2`] Fix fa-2 issue when passing `config` to `from_pretrained` (#28043) · 1e209317

Younes Belkada authored Dec 15, 2023



* fix fa-2 issue

* fix test

* Update src/transformers/modeling_utils.py
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* cleaner fix

* up

* add more robust tests

* Update src/transformers/modeling_utils.py
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* fixup

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* pop

* add test

---------
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
