Commits · 59cd9de39da4c406e12c1785f70ee73806ebc6ba · chenpangpang / transformers

11 Jan, 2024 11 commits

Yih-Dar authored Jan 11, 2024



* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

59cd9de3

Fix load balancing loss func for mixtral (#28256) · e768616a

liangxuZhang authored Jan 11, 2024



* Correct the implementation of auxiliary loss of mixtral

* correct the implementation of auxiliary loss of mixtral

* Implement a simpler calculation method

---------
Co-authored-by: zhangliangxu3 <zhangliangxu3@jd.com>

e768616a

Correctly resolve trust_remote_code=None for AutoTokenizer (#28419) · 5d4d62d0
Matt authored Jan 11, 2024
```
* Correctly resolve trust_remote_code=None for AutoTokenizer

* Second attempt at a proper resolution
```
5d4d62d0

[Phi] Extend implementation to use GQA/MQA. (#28163) · 55090585

Gustavo de Rosa authored Jan 11, 2024

* chore(phi): Updates configuration_phi with missing keys.

* chore(phi): Adds first draft of combined modeling_phi.

* fix(phi): Fixes according to latest review.

* fix(phi): Removes pad_vocab_size_multiple to prevent inconsistencies.

* fix(phi): Fixes unit and integration tests.

* fix(phi): Ensures that everything works with microsoft/phi-1 for first integration.

* fix(phi): Fixes output of docstring generation.

* fix(phi): Fixes according to latest review.

* fix(phi): Fixes according to latest review.

* fix(tests): Re-enables Phi-1.5 test.

* fix(phi): Fixes attention overflow on PhiAttention (for Phi-2).

* fix(phi): Improves how queries and keys are upcast.

* fix(phi): Small updates on latest changes.

55090585

Optionally preprocess segmentation maps for MobileViT (#28420) · d5606378

Harisankar Babu authored Jan 11, 2024

* optionally preprocess segmentation maps for mobilevit

* changed pretrained model name to that of segmentation model

* removed voc-deeplabv3 from model archive list

* added preprocess_image and preprocess_mask methods for processing images and segmentation masks respectively

* added tests for segmentation masks based on segformer feature extractor

* use crop_size instead of size

* reverting to initial model

d5606378

Set `cache_dir` for `evaluate.load()` in example scripts (#28422) · 95091e15

Alex Hedges authored Jan 11, 2024

While using `run_clm.py`,[^1] I noticed that some files were being added
to my global cache, not the local cache. I set the `cache_dir` parameter
for the one call to `evaluate.load()`, which partially solved the
problem. I figured that while I was fixing the one script upstream, I
might as well fix the problem in all other example scripts that I could.

There are still some files being added to my global cache, but this
appears to be a bug in `evaluate` itself. This commit at least moves
some of the files into the local cache, which is better than before.

To create this PR, I made the following regex-based transformation:
`evaluate\.load\((.*?)\)` -> `evaluate\.load\($1,
cache_dir=model_args.cache_dir\)`. After using that, I manually fixed
all modified files with `ruff` serving as useful guidance. During the
process, I removed one existing usage of the `cache_dir` parameter in a
script that did not have a corresponding `--cache-dir` argument
declared.

[^1]: I specifically used `pytorch/language-modeling/run_clm.py` from
v4.34.1 of the library. For the original code, see the following URL:
https://github.com/huggingface/transformers/tree/acc394c4f5e1283c19783581790b3dc3105a3697/examples/pytorch/language-modeling/run_clm.py.
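
The regex-based transformation described in this commit message can be reproduced roughly as follows. This is a minimal sketch in Python, assuming `re.sub` stands in for whatever editor or tool the author actually used (the message does not say). The helper name `add_cache_dir` and the sample input line are hypothetical. Python's replacement syntax uses `\1` where the message writes `$1`.

```python
import re

# Pattern taken from the commit message: lazily capture the arguments
# of an evaluate.load(...) call.
PATTERN = re.compile(r"evaluate\.load\((.*?)\)")


def add_cache_dir(source: str) -> str:
    """Append cache_dir=model_args.cache_dir to each evaluate.load() call.

    Hypothetical helper for illustration; not part of the original PR.
    """
    return PATTERN.sub(r"evaluate.load(\1, cache_dir=model_args.cache_dir)", source)


before = 'metric = evaluate.load("accuracy")'
print(add_cache_dir(before))
# metric = evaluate.load("accuracy", cache_dir=model_args.cache_dir)
```

Because `(.*?)` is lazy, the match stops at the first closing parenthesis, so a call whose arguments themselves contain parentheses would be rewritten incorrectly. That is one plausible reason a manual pass over the modified files was still needed.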

95091e15

Fix docker file (#28452) · 5fd5ef76

Yih-Dar authored Jan 11, 2024



fix docker file
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

5fd5ef76

Use python 3.10 for docbuild (#28399) · d019acb8
Yih-Dar authored Jan 11, 2024
```
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
d019acb8

Optimize the speed of the truncate_sequences function. (#28263) · 2a85345a

ikkvix authored Jan 11, 2024



* change truncate_sequences

* Update tokenization_utils_base.py

* change format

* fix when ids_to_move=0

* fix

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

2a85345a

Enable multi-label image classification in pipeline (#28433) · 66964c00
amyeroberts authored Jan 11, 2024
```
Enable multi-label image classification
```
66964c00
Assistant model may be on a different device (#27995) · 8205b264
jiqing-feng authored Jan 11, 2024
```
* Assistant model may be on a different device

* fix tensor device
```
8205b264

10 Jan, 2024 13 commits

[Whisper] Fix slow test (#28407) · cbbe3074

Patrick von Platen authored Jan 10, 2024



* [Whisper] Fix slow test

* update

* update

* update

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

cbbe3074

[docstring] Fix docstring for ErnieConfig, ErnieMConfig (#27029) · 6c78bbcb

Sparty authored Jan 10, 2024



* Remove ErnieConfig, ErnieMConfig check_docstrings

* Run fix_and_overwrite for ErnieConfig, ErnieMConfig

* Replace <fill_type> and <fill_docstring> in configuration_ernie, configuration_ernie_m.py with type and docstring values

---------
Co-authored-by: vignesh-raghunathan <vignesh_raghunathan@intuit.com>

6c78bbcb

Fix load correct tokenizer in Mixtral model documentation (#28437) · 3724156b
Francisco Kurucz authored Jan 10, 2024

3724156b

Fix for checkpoint rename race condition (#28364) · cef2e40e

Timothy Blattner authored Jan 10, 2024



* Changed logic for renaming staging directory when saving checkpoint to only operate with the main process.
Added fsync functionality to attempt to flush the write changes in case os.rename is not atomic.

* Updated styling using make fixup

* Updated check for main process to use built-in versions from trainer
Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* Fixed incorrect usage of trainer main process checks
Added with open usage to ensure better file closing as suggested from PR
Added rotate_checkpoints into main process logic

* Removed "with open" due to not working with directory. os.open seems to work for directories.

---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
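
The approach described in the bullets above (rename the staging directory only on the main process, then fsync the directory through `os.open`) might look roughly like this. It is a sketch under stated assumptions, not the code from the PR: the function name `finalize_checkpoint` and its parameters are hypothetical, and opening a directory with `os.open` is assumed to be supported, which holds on POSIX systems but not on Windows.

```python
import os


def finalize_checkpoint(staging_dir: str, final_dir: str, is_main_process: bool) -> None:
    """Move a staged checkpoint into place (hypothetical helper, POSIX only)."""
    # Only one process performs the rename, so other ranks cannot race on it.
    if not is_main_process:
        return

    os.rename(staging_dir, final_dir)

    # The built-in open() cannot open a directory, but os.open can on POSIX.
    # fsync on the directory descriptor asks the OS to flush it to disk.
    fd = os.open(final_dir, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```

In a multi-process run the other ranks would typically wait at a barrier until the main process has finished the rename; that synchronization is left out of this sketch.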

cef2e40e

update docs to add the `phi-2` example (#28392) · fff8ca8e
Susnato Dhar authored Jan 10, 2024
```
* update docs

* added Tip
```
fff8ca8e
CI: limit natten version (#28432) · ee2482b6
Joao Gante authored Jan 10, 2024

ee2482b6
Fix number of models in README.md (#28430) · ffd37103
prasatee authored Jan 10, 2024

ffd37103
Support `DeepSpeed` when using auto find batch size (#28088) · 6015d0ad
Zach Mueller authored Jan 10, 2024
```
Fixup test
```
6015d0ad
Skip now failing test in the Trainer tests (#28421) · a777f525
Zach Mueller authored Jan 10, 2024
```
* Fix test

* Skip
```
a777f525

[BUG] BarkEosPrioritizerLogitsProcessor eos_token_id use list, tensor size mismatch (#28201) · 4df1d696

HanHui authored Jan 10, 2024



fix(generation/logits_process.py): BarkEosPrioritizerLogitsProcessor eos_token_id use list, tensor size mismatch
Co-authored-by: chenhanhui <chenhanhui@kanzhun.com>

4df1d696

Bump fonttools from 4.31.1 to 4.43.0 in /examples/research_projects/decision_transformer (#28417) · 932ad8af

dependabot[bot] authored Jan 10, 2024

Bump fonttools in /examples/research_projects/decision_transformer

Bumps [fonttools](https://github.com/fonttools/fonttools) from 4.31.1 to 4.43.0.
- [Release notes](https://github.com/fonttools/fonttools/releases)
- [Changelog](https://github.com/fonttools/fonttools/blob/main/NEWS.rst)
- [Commits](https://github.com/fonttools/fonttools/compare/4.31.1...4.43.0)

---
updated-dependencies:
- dependency-name: fonttools
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

932ad8af

Use mmap option to load_state_dict (#28331) · 701298d2
Weiming Zhao authored Jan 10, 2024
```
Use mmap option to load_state_dict (#28331)
```
701298d2

Fix `_merge_input_ids_with_image_features` for llava model (#28333) · 0f2f0c63

Victor SANH authored Jan 10, 2024



* fix `_merge_input_ids_with_image_features` for llava model

* Update src/transformers/models/llava/modeling_llava.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* address comments

* style and tests

* ooops

* test the backward too

* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update tests/models/vipllava/test_modeling_vipllava.py

* style and quality

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

0f2f0c63

09 Jan, 2024 4 commits

Fix initialization for missing parameters in `from_pretrained` under ZeRO-3 (#28245) · 976189a6

Xuehai Pan authored Jan 09, 2024

* Fix initialization for missing parameters in `from_pretrained` under ZeRO-3

* Test initialization for missing parameters under ZeRO-3

* Add more tests

* Only enable deepspeed context for per-module level parameters

* Enable deepspeed context only once

* Move class definition inside test case body

976189a6

fix auxiliary loss training in DetrSegmentation (#28354) · 357971ec
Sangbum Daniel Choi authored Jan 09, 2024
```
* fix auxiliary loss training in detrSegmentation

* add auxiliary_loss testing
```
357971ec
[SDPA] Make sure attn mask creation is always done on CPU (#28400) · 8604dd30
Patrick von Platen authored Jan 09, 2024
```
* [SDPA] Make sure attn mask creation is always done on CPU

* Update docker to 2.1.1

* revert test change
```
8604dd30

update warning for image processor loading (#28209) · 5c7e11e0

Yih-Dar authored Jan 09, 2024



* info

* update

* Update src/transformers/models/auto/image_processing_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* update

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

5c7e11e0

08 Jan, 2024 8 commits

Add SigLIP (#26522) · 3b742ea8

NielsRogge authored Jan 08, 2024



* Add first draft

* Use appropriate gelu function

* More improvements

* More improvements

* More improvements

* Convert checkpoint

* More improvements

* Improve docs, remove print statements

* More improvements

* Add link

* remove unused masking function

* begin tokenizer

* do_lower_case

* debug

* set split_special_tokens=True

* Remove script

* Fix style

* Fix rebase

* Use same design as CLIP

* Add fast tokenizer

* Add SiglipTokenizer to init, remove extra_ids

* Improve conversion script

* Use smaller inputs in conversion script

* Update conversion script

* More improvements

* Add processor to conversion script

* Add tests

* Remove print statements

* Add tokenizer tests

* Fix more tests

* More improvements related to weight initialization

* More improvements

* Make more tests pass

* More improvements

* More improvements

* Add copied from

* Add canonicalize_text

* Enable fast tokenizer tests

* More improvements

* Fix most slow tokenizer tests

* Address comments

* Fix style

* Remove script

* Address some comments

* Add copied from to tests

* Add more copied from

* Add more copied from

* Add more copied from

* Remove is_flax_available

* More updates

* Address comment

* Remove SiglipTokenizerFast for now

* Add caching

* Remove umt5 test

* Add canonicalize_text inside _tokenize, thanks Arthur

* Fix image processor tests

* Skip tests which are not applicable

* Skip test_initialization

* More improvements

* Compare pixel values

* Fix doc tests, add integration test

* Add do_normalize

* Remove causal mask and leverage ignore copy

* Fix attention_mask

* Fix remaining tests

* Fix dummies

* Rename temperature and bias

* Address comments

* Add copied from to tokenizer tests

* Add SiglipVisionModel to auto mapping

* Add copied from to image processor tests

* Improve doc

* Remove SiglipVisionModel from index

* Address comments

* Improve docs

* Simplify config

* Add first draft

* Make it like mistral

* More improvements

* Fix attention_mask

* Fix output_attentions

* Add note in docs

* Convert multilingual model

* Convert large checkpoint

* Convert more checkpoints

* Add pipeline support, correct image_mean and image_std

* Use padding=max_length by default

* Make processor like llava

* Add code snippet

* Convert more checkpoints

* Set keep_punctuation_string=None as in OpenCLIP

* Set normalized=False for special tokens

* Fix doc test

* Update integration test

* Add figure

* Update organization

* Happy new year

* Use AutoModel everywhere

---------
Co-authored-by: patil-suraj <surajp815@gmail.com>

3b742ea8

Add segmentation map processing to SAM Image Processor (#27463) · 73c88012

Rosie Wood authored Jan 08, 2024



* add segmentation map processing to sam image processor

* fixup

* add tests

* reshaped_input_size is shape before padding

* update tests for size/shape outputs

* fixup

* add code snippet to docs

* Update docs/source/en/model_doc/sam.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add missing backticks

* add `segmentation_maps` as arg for SamProcessor.__call__()

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

73c88012

Remove shell=True from subprocess.Popen to Mitigate Security Risk (#28299) · 2272ab57
Avimanyu Bandyopadhyay authored Jan 08, 2024
```
Remove shell=True from subprocess.Popen to mitigate security risk
```
2272ab57
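
As a general illustration of why dropping `shell=True` reduces risk, the sketch below passes arguments to `subprocess.Popen` as a list, so no shell ever parses them. The helper name `run_command` and the sample input are hypothetical, and this is not the code changed in the PR.

```python
import subprocess
import sys


def run_command(args: list[str]) -> str:
    """Run a command without a shell and return its stdout (hypothetical helper)."""
    # With a list of arguments and no shell=True, each element is passed to the
    # program as-is; characters such as ';' or '&&' have no special meaning.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate()
    return out.strip()


untrusted = "hello; echo injected"
# The child process simply echoes back its first argument.
print(run_command([sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]))
# hello; echo injected
```

Had the same string been interpolated into a single command line and run with `shell=True`, the shell would have treated `; echo injected` as a second command.
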
[AttentionMaskConverter] fix sdpa unmask unattended (#28369) · 87a6cf41
zspo authored Jan 08, 2024
```
fix tensor device
```
87a6cf41

Bugfix / ffmpeg input device (mic) not working on Windows (#27051) · 98dba52c

Ondrej Major authored Jan 08, 2024

* fix input audio device for windows.

* ffmpeg audio device Windows

* Fixes wrong input device assignment in Windows

* Fixed getting mic on Windows systems by adding _get_microphone_name() function.

98dba52c

remove two deprecated function (#28220) · 7d9d5cea
Hz, Ji authored Jan 08, 2024

7d9d5cea
Fix building alibi tensor when num_heads is not a power of 2 (#28380) · 0c2121f9
Mohamed Abu El-Nasr authored Jan 08, 2024
```
* Fix building alibi tensor when num_heads is not a power of 2

* Remove print function
```
0c2121f9

Enhancing Code Readability and Maintainability with Simplified Activation... · 53cffeb3

Chi authored Jan 08, 2024


Enhancing Code Readability and Maintainability with Simplified Activation Function Selection. (#28349)

* Little bit change code in get_activation()

* proper area to define gelu_activation() in these two files

* Fix github issue

* Mistake some typo

* My mistake to self using to call config

* Reformat my two file

* Update src/transformers/activations.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/electra/modeling_electra.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/convbert/modeling_convbert.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Rename gelu_act to activation

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

53cffeb3

07 Jan, 2024 1 commit

[Phi2] Add support for phi2 models (#28211) · 3eddda11
Susnato Dhar authored Jan 07, 2024
```
* modified script and added test for phi2

* changes
```
3eddda11

05 Jan, 2024 3 commits

chore: Fix typo s/exclusivelly/exclusively/ (#28361) · 4ab5fb89
hugo-syn authored Jan 05, 2024

4ab5fb89

Update VITS modeling to enable ONNX export (#28141) · 7226f3d2

Ella Charlaix authored Jan 05, 2024

* Update vits modeling for onnx export compatibility

* fix style

* Update src/transformers/models/vits/modeling_vits.py

7226f3d2

fix FA2 when using quantization for remaining models (#28341) · cadf93a6

Susnato Dhar authored Jan 05, 2024



* fix fa2 autocasting when using quantization

* Update src/transformers/models/distilbert/modeling_distilbert.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/distilbert/modeling_distilbert.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

cadf93a6