- 04 Mar, 2024 10 commits
-
Sven Schultze authored
* Fix grad_norm unserializable tensor log failure
* Fix origin of grad_norm logs to be in deepspeed get_global_grad_norm()
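The failure mode behind this commit is that DeepSpeed's `get_global_grad_norm()` can return a `torch.Tensor`, which cannot be serialized when the log entry is written out. A minimal sketch of the idea behind the fix (the helper name is illustrative, not the actual Trainer code):

```python
import torch

def serializable_grad_norm(grad_norm):
    # DeepSpeed's get_global_grad_norm() may return a 0-dim tensor; convert it
    # to a plain Python float so the log entry is JSON-serializable.
    if isinstance(grad_norm, torch.Tensor):
        return grad_norm.item()
    return grad_norm
```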
-
Zach Mueller authored
Fully revert atomic checkpointing
-
Nick DeGroot authored
* 🐛 Fix oneformer instance post processing when using panoptic task type
* ✅ Add unit test for oneformer instance post processing panoptic bug
Co-authored-by: Nick DeGroot <1966472+nickthegroot@users.noreply.github.com>
-
Sean (Seok-Won) Yi authored
Fix: Fixed the previous tracking URI setting logic to prevent clashes with original MLflow code. (#29096)
* Changed logic for setting the tracking URI. The previous code called `mlflow.set_tracking_uri` regardless of whether the environment variable `MLFLOW_TRACKING_URI` was set. This led to clashes with the original MLflow implementation, so the logic was changed to call the function only when the environment variable is explicitly set.
* Check if the tracking URI has already been set. The previous code did not consider that the tracking URI may already be set elsewhere, and therefore erroneously overrode previously set tracking URIs using the environment variable.
* Removed redundant parentheses.
* Fix docstring to reflect library convention properly. "Unset by default" is the correct expression rather than "Default to `None`."
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
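A condensed sketch of the logic described above, using the plain `mlflow` client API (illustrative, not the integration's exact code): the tracking URI is only touched when `MLFLOW_TRACKING_URI` is explicitly set, and an already-configured URI is not clobbered.

```python
import os
import mlflow

tracking_uri = os.getenv("MLFLOW_TRACKING_URI")
# Only override when the environment variable is explicitly set, and only if
# it differs from whatever URI is already configured.
if tracking_uri is not None and mlflow.get_tracking_uri() != tracking_uri:
    mlflow.set_tracking_uri(tracking_uri)
```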
-
NielsRogge authored
* First commit
* Improve conversion script
* Convert more checkpoints
* Update src/transformers/models/sam/convert_sam_original_to_hf_format.py
* Rename file
* More updates
* Update docstring
* Update script
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Traun Leyden authored
-
Y4hL authored
* Add mlx support
* Fix import order and use def instead of lambda
* Another fix for ruff format :)
* Add detecting mlx from repr, add is_mlx_array
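A sketch of the two detection paths named in the last bullet; `is_mlx_array` is the name from the commit, while `looks_like_mlx` is a hypothetical stand-in for the repr-based fallback:

```python
def is_mlx_array(obj):
    # Import lazily so the check is safe when mlx is not installed.
    try:
        import mlx.core as mx
    except ImportError:
        return False
    return isinstance(obj, mx.array)

def looks_like_mlx(obj):
    # Repr-based fallback, e.g. str(type(x)) == "<class 'mlx.core.array'>".
    return str(type(obj)).startswith("<class 'mlx.")
```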
-
Siming Dai authored
Fix mixtral load balancing loss
Co-authored-by: dingkunbo <dingkunbo@baidu.com>
-
Poedator authored
Update path to hub files: `tree/` needs to be added to the path for files on the HF hub. See example path: `https://huggingface.co/meta-llama/Llama-2-7b-hf/tree/main`
-
Fanli Lin authored
* use require_torch_gpu
* enable on XPU
-
- 01 Mar, 2024 10 commits
-
David Valente authored
* Correct zero division error in inverse sqrt scheduler
* Default timescale to 10_000
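To see where the division by zero came from: the inverse-sqrt decay divides by the timescale, which previously defaulted to `num_warmup_steps` and was therefore zero for warmup-free schedules. A sketch of the corrected lambda, assuming the usual LambdaLR convention of returning a multiplier on the base learning rate:

```python
import math

def inverse_sqrt_lr_lambda(current_step, num_warmup_steps, timescale=None):
    # Fall back to 10_000 when there is no warmup; the old default
    # (timescale = num_warmup_steps) divided by zero in that case.
    if timescale is None:
        timescale = num_warmup_steps if num_warmup_steps > 0 else 10_000
    if current_step < num_warmup_steps:
        # Linear warmup phase.
        return float(current_step) / float(max(1, num_warmup_steps))
    # Inverse-sqrt decay, continuous at current_step == num_warmup_steps.
    shift = timescale - num_warmup_steps
    return 1.0 / math.sqrt((current_step + shift) / timescale)
```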
-
Zach Mueller authored
* Fix deprecated arg issue
* Trainer check too
* Check for dict or dataclass
* Simplify, make config always AcceleratorConfig
* Upstream to Trainer
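A hypothetical sketch of the "dict or dataclass" normalization, with an illustrative subset of fields (names here are for shape only, not the real `AcceleratorConfig` definition):

```python
from dataclasses import dataclass, is_dataclass

@dataclass
class AcceleratorConfig:
    # Illustrative fields only.
    split_batches: bool = False
    even_batches: bool = True

def to_accelerator_config(value):
    # Accept None, a plain dict, or an AcceleratorConfig instance, but always
    # return the dataclass so downstream code sees a single type.
    if value is None:
        return AcceleratorConfig()
    if isinstance(value, dict):
        return AcceleratorConfig(**value)
    if is_dataclass(value):
        return value
    raise TypeError("accelerator_config must be a dict or an AcceleratorConfig")
```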
-
Marc Sun authored
-
Jingya HUANG authored
enable subfolder
-
amyeroberts authored
* Fix yolos processing
* Add back slow marker - protects for pycocotools in slow
* Slow decorator goes above copied from header
-
Sanchit Gandhi authored
* [Whisper Tok] Update integration test
* make style
-
Arthur authored
* use the generation config 🫠
* fixup
-
Younes Belkada authored
* fix ESM 8bit
* Apply suggestions from code review
* fixup
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Leon Engländer authored
* LlamaForQuestionAnswering self.transformer->self.model
* fix "Copied from" string
* Llama QA model: set base_model_prefix = "transformer"
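The gist of the final bullet, condensed into a sketch (class body abridged): the backbone attribute is named `transformer`, so `base_model_prefix` must match it for checkpoint weight names to resolve correctly.

```python
from transformers.models.llama.modeling_llama import LlamaModel, LlamaPreTrainedModel

class LlamaForQuestionAnswering(LlamaPreTrainedModel):
    # Must match the attribute name below, or loading a checkpoint will
    # mis-map the backbone weights.
    base_model_prefix = "transformer"

    def __init__(self, config):
        super().__init__(config)
        self.transformer = LlamaModel(config)
```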
-
Song Fuchang authored
Expose `offload_buffers` parameter of `accelerate` to `PreTrainedModel.from_pretrained` method (#28755)
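A usage sketch (the checkpoint name is just an example): with a `device_map`, the flag is forwarded to accelerate so that buffers, not only parameters, can be kept off the GPU.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # example checkpoint
    device_map="auto",
    offload_folder="offload",
    offload_buffers=True,  # newly exposed; passed through to accelerate
)
```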
-
- 29 Feb, 2024 6 commits
-
Lucain authored
-
NielsRogge authored
Fix issue
-
Yih-Dar authored
* more fixes
* more fixes
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Younes Belkada authored
Update test_modeling_llama.py
-
Younes Belkada authored
fix failing tests for peft integration
-
Younes Belkada authored
change starcoder2 path to correct one
-
- 28 Feb, 2024 14 commits
-
Michael authored
* [i18n-zh] Sync source/zh/index.md
* apply review comments
-
fxmarty authored
* Better unmask implementation
* comment
* typo
* bug report pytorch
* cleanup
* fix import
* add back example
* retrigger ci
* come on
-
Marc Sun authored
* [CI] Quantization workflow
* build dockerfile
* fix dockerfile
* update self-scheduled.yml
* test build dockerfile on push
* fix torch install
* update to python 3.10
* update aqlm version
* uncomment build dockerfile
* test if the scheduler works
* fix docker
* do not trigger on push again
* add additional runs
* test again
* all good
* style
* Update .github/workflows/self-scheduled.yml
* test build dockerfile with torch 2.2.0
* fix extra
* clean
* revert changes
* Revert "revert changes" (reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd)
* revert correct change
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
jiqing-feng authored
Co-authored-by: Joao Gante <joao@huggingface.co>
-
Daniel Han authored
* Update modeling_llama.py: Llama - force float32 since bfloat16 loses precision on long contexts
* Update modeling_llama.py
* Update modeling_gemma.py: fix RoPE and logits.float()
* @torch.no_grad()
* @torch.no_grad()
* Cos, Sin to float32
* cos, sin to float32
* Update src/transformers/models/gemma/modeling_gemma.py
* Update src/transformers/models/llama/modeling_llama.py
* Resolve PR conflicts
* Fix RoPE for llama
* Revert "Fix RoPE for llama" (reverts commit b860a22dab9bb01cd15cb9a3220abeaefad3e458)
* Fix RoPE for llama
* RoPE device
* Autocast device type
* RoPE
* RoPE isinstance
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
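The recurring theme in these bullets is numerical precision: the rotary embedding angles are computed in float32 with autocast disabled, and only cast back to the working dtype at the end. A condensed sketch of that pattern (not the exact modeling code):

```python
import torch

@torch.no_grad()
def rope_cos_sin(inv_freq, position_ids, dtype):
    # Expand to (batch, dim/2, 1) and (batch, 1, seq_len) in float32.
    inv_freq = inv_freq[None, :, None].float().expand(position_ids.shape[0], -1, 1)
    pos = position_ids[:, None, :].float()
    device_type = position_ids.device.type
    device_type = device_type if device_type != "mps" else "cpu"
    # bfloat16 loses precision on long contexts, so keep autocast off here.
    with torch.autocast(device_type=device_type, enabled=False):
        freqs = (inv_freq @ pos).transpose(1, 2)  # (batch, seq_len, dim/2)
        emb = torch.cat((freqs, freqs), dim=-1)
        cos, sin = emb.cos(), emb.sin()
    return cos.to(dtype=dtype), sin.to(dtype=dtype)
```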
-
Joao Gante authored
-
Leonardo Emili authored
* Set output_router_logits=False in prepare_inputs_for_generation for mixtral
* Add output_router_logits=False to prepare_inputs_for_generation for mixtral
* Fix style
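A condensed sketch of where the flag is injected (the real method signature carries more arguments): router logits only feed the auxiliary load-balancing loss at training time, so each decoding step disables them.

```python
def prepare_inputs_for_generation(self, input_ids, past_key_values=None, **kwargs):
    model_inputs = {
        "input_ids": input_ids if past_key_values is None else input_ids[:, -1:],
        "past_key_values": past_key_values,
        # Router logits are only needed for the load-balancing loss during
        # training; skip computing them while generating.
        "output_router_logits": False,
    }
    return model_inputs
```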
-
Arthur authored
* remove control flow
* update gptneox
* update ....
* nits
* Actually let's just break. Otherwise we are silently failing, which imo is not optimal
* version BC
* fix tests
* fix eager causal
* nit
* add a test
* style
* nits
* nits
* more nits for the test
* update and fix
* make sure cuda graphs are not skipped
* read token is needed for meta llama
* update!
* fixup
* compile test should be slow
* fix the fix copies
* style 🫠
-
Arthur authored
* remove warning
* add co-author
* update
Co-authored-by: hiaoxui <hiaoxui@users.noreply.github.com>
-
Arthur authored
fix wrapper
-
fxmarty authored
* remove numpy usage from owlvit
* fix init owlv2
* style
-
Younes Belkada authored
* put hf token in gemma tests
* update suggestion
* add to flax
* revert
* fix
* fixup
* forward contrib credits from discussion
Co-authored-by: ArthurZucker <ArthurZucker@users.noreply.github.com>
-
Jared Van Bortel authored
-
RaymondLi0 authored
* Copy model
* changes
* misc
* fixes
* add embed and residual dropout (#30)
* misc
* remove rms norm and gated MLP
* remove copied mentions where it's not a copy anymore
* remove unused _shape
* copied from mistral instead
* fix copies
* fix copies
* add not doctested
* fix
* fix copyright
* Update docs/source/en/model_doc/starcoder2.md
* Update src/transformers/models/starcoder2/configuration_starcoder2.py
* Update src/transformers/models/starcoder2/configuration_starcoder2.py
* fix doc
* revert some changes
* add fa2 tests
* fix styling nit
* fix
* push dummy docs
Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-