Commits · acd653164b6874e395ac9d46850f67599d8cdb58 · chenpangpang / transformers

05 Dec, 2023 13 commits

Update CUDA versions for DeepSpeed (#27853) · acd65316

Zach Mueller authored Dec 05, 2023

* Update CUDA versions

* For testing

* Allow for workflow dispatch

* Use newer image

* Revert workflow

* Revert workflow

* Push

* Other docker image


[`Docs`] Update broken image on fused modules (#27856) · ba52dec4
Younes Belkada authored Dec 05, 2023
```
Update quantization.md
```

Documentation: Spanish translation of perplexity.mdx (#27807) · da1d0d40

Aaron Jimenez authored Dec 05, 2023

* Copy perplexity.md file to es/ folder

* Adding perplexity to es/_toctree.yml

* Translate first section

* Calculating PPL section translate

* Example section translate

* fix translation of log-likelihood

* Fix title translate

* Fix \ in second paragraph

* Change verosimilitud for log-likelihood

* Run 'make style'


fix(whisper): mutable generation config (#27833) · 788730c6
Vedat Baday authored Dec 05, 2023

Update `VitDetModelTester.get_config` to use `pretrain_image_size` (#27831) · ac975074
Yih-Dar authored Dec 05, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
⚠️ [VitDet] Fix test (#27832) · 28e2887a
NielsRogge authored Dec 05, 2023
```
Address test
```

[Time series] Add PatchTSMixer (#26247) · b242d0f2

Arindam Jati authored Dec 05, 2023



* patchtsmixer initial commit

* x,y->context_values,target_values, unittest added

* cleanup code

* minor

* return hidden states

* model tests, partial integration tests

* ettm notebook temporary

* minor

* config mask bug fix, tests updated

* final ETT notebooks

* add selfattn

* init

* added docstrings

* PatchTSMixerForPretraining -> PatchTSMixerForMaskPretraining

* functionality tests added

* add start and input docstrings

* docstring edits

* testcase edits

* minor changes

* docstring error fixed

* ran make fixup

* finalize integration tests and docs

* minor

* cleaned gitignore

* added dataclass decorator, ran black formatter

* ran ruff

* formatting

* add slow decorator

* renamed in_Channel to input_size and default to 1

* shorten dataclass names

* use smaller model for testing

* moved the 3 heads to the modeling file

* use scalers instead of revin

* support forecast_channel_indices

* fix regression scaling

* undo reg. scaling

* removed unneeded classes

* forgot missing

* add more layers

* add copied positional_encoding

* use patchmask from patchtst

* removed dependency on layers directory

* formatting

* set seed

* removed unused imports

* fixed forward signature test

* adding distributional head for PatchTSMixerForecasting

* add generate to forecast

* testcases for generate

* add generate and distributional head for regression

* raise Exception for negative values for negative binomial distribution

* formatting changes

* remove copied from patchtst and add TODO for test passing

* make copies

* doc edits

* minor changes

* format issues

* minor changes

* minor changes

* format docstring

* change some class names to PatchTSMixer + class name

Transpose to PatchTSMixerTranspose
GatedAttention to PatchTSMixerGatedAttention

* change NormLayer to PatchTSMixerNormLayer

* change MLP to PatchTSMixerMLP

* change PatchMixer to PatchMixerBlock, FeatureMixer to FeatureMixerBlock

* change ChannelFeatureMixer to ChannelFeatureMixerBlock

* change PatchMasking to PatchTSMixerMasking

* change Patchify to PatchTSMixerPatchify

* list to `list`

* fix docstrings

* formatting

* change bs to batch_size, edit forecast_masking

* edit random_masking

* change variable name and update docstring in PatchTSMixerMasking

* change variable name and update docstring in InjectScalerStatistics4D

* update forward call in PatchTSMixerTranspose

* change variable name and update docstring in PatchTSMixerNormLayer

* change variable name and update docstring in PatchTSMixerMLP

* change variable name and update docstring in ChannelFeatureMixerBlock

* formatting

* formatting issues

* docstring issue

* fixed observed_mask type in docstrings

* use FloatTensor type

* formatting

* fix rescaling issue in forecasting, fixed integration tests

* add docstring from decorator

* fix docstring

* Update README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/patchtsmixer/configuration_patchtsmixer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/patchtsmixer/configuration_patchtsmixer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* PatchTSMixerChannelFeatureMixerBlock

* formatting

* ForPretraining

* use num_labels instead of n_classes

* remove commented out code

* docstring fixed

* nn.functional used instead of one letter F

* x_tmp renamed

* one letter variable x removed from forward calls

* one letter variable y removed

* remove commented code

* rename patch_size, in_channels, PatchTSMixerBackbone

* add config to heads

* add config to heads tests

* code refactoring to use config instead of passing individual params

* docstring fixes part 1

* docstring fixes part 2

* removed logger.debug

* context_values -> past_values

* formatting changes

* pe -> positional_encoding

* removed unused target variable

* self.mode logic fixed

* formatting change

* edit docstring and var name

* change n_targets to num_targets

* rename input_size to num_input_channels

* add head names with prefix PatchTSMixer

* edit docstring in PatchTSMixerForRegression

* fix var name change in testcases

* add PatchTSMixerAttention

* return dict for all exposed classes, test cases added

* format

* move loss function to forward call

* make style

* adding return dict/tuple

* make repo-consistency

* remove flatten mode

* code refactoring

* rename data

* remove PatchTSMixer and keep only PatchTSMixerEncoder

* docstring fixes

* removed unused code

* format

* format

* remove contiguous and formatting changes

* remove model description from config

* replace asserts with ValueError

* remove nn.Sequential from PatchTSMixerNormLayer

* replace if-else with map

* remove all nn.Sequential

* format

* formatting

* fix gradient_checkpointing error after merge, and formatting

* make fix-copies

* remove comments

* reshape

* doesnt support gradient checkpointing

* correct Patchify

* masking updates

* batchnorm copy from

* format checks

* scaler edits

* remove comments

* format changes

* remove self.config

* correct class PatchTSMixerMLP(nn.Module):

* makr fix

* doc updates

* fix-copies

* scaler class correction

* doc edits

* scaler edits

* update readme with links

* injectstatistics add

* fix-copies

* add norm_eps option to LayerNorm

* format changes

* fix copies

* correct make copies

* use parametrize

* fix doc string

* add docs to toctree

* make style

* doc segmenting

* docstring edit

* change forecast to prediction

* edit doc

* doc edits

* remove PatchTSMixerTranspose

* add PatchTSMixerPositionalEncoding and init position_enc

* remove positional_encoding

* edit forecast_masking, remove forecast_mask_ratios

* fix broken code

* var rename target_values -> future_values

* num_features -> d_model

* fix broken code after master merge

* repo consistency

* use positional embedding

* prediction_logits -> prediction_outputs, make fix-copies

* uncommented @slow

* minor changes

* loss first in tuple

* tuple and dict same ordering

* style edits

* minor changes

* dict/tuple consistent enablement

* Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix formatting

* formatting

* usage tip

* test on cpu only

* add sample usage

* change PatchTSMixerForClassification to PatchTSMixerForTimeSeriesClassification

* push changes

* fix copies

* std scaling set to default True case

* minor changes

* stylechanges

---------
Co-authored-by: Arindam Jati <arindam.jati@ibm.com>
Co-authored-by: vijaye12 <vijaye12@in.ibm.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: nnguyen <nnguyen@us.ibm.com>
Co-authored-by: vijaye12 <vijaykr.e@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Nam Nguyen <namctin@gmail.com>
Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
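
One item in the list above, "replace asserts with ValueError", reflects a common cleanup: `assert` statements are skipped when Python runs with `-O`, so input validation is usually written as an explicit `raise`. A minimal sketch of the pattern, assuming a hypothetical `validate_patching` helper and illustrative parameter names (this is not code from the PatchTSMixer implementation):

```python
def validate_patching(context_length: int, patch_length: int) -> int:
    """Hypothetical helper: validate sizes, then return the number of
    non-overlapping patches that fit in the context window."""
    # An assert-based version (e.g. `assert patch_length <= context_length`)
    # would silently stop checking under `python -O`; explicit raises do not.
    if patch_length <= 0:
        raise ValueError(f"patch_length must be positive, got {patch_length}")
    if patch_length > context_length:
        raise ValueError(
            f"patch_length ({patch_length}) cannot exceed "
            f"context_length ({context_length})"
        )
    return context_length // patch_length
```

Callers can then catch `ValueError` specifically instead of relying on `AssertionError`.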


Move tensors to same device to enable IDEFICS naive MP training (#27746) · e5c12c03
Bram Willemsen authored Dec 05, 2023

[`ClipVision`] `accelerate` support for clip-vision (#27851) · 3e68944c
Younes Belkada authored Dec 05, 2023
```
support accelerate for clip-vision
```
Generate: Update VisionEncoderDecoder test value (#27850) · b7e6d120
Joao Gante authored Dec 05, 2023
```
update test result, due to bug fix in decoder-only beam search
```

Faster generation using AWQ + Fused modules (#27411) · fdb85be4

Younes Belkada authored Dec 05, 2023



* v1 fusing modules

* add fused mlp support

* up

* fix CI

* block save_pretrained

* fixup

* small fix

* add new condition

* add v1 docs

* add some comments

* style

* fix nit

* adapt from suggestion

* add check

* change arg names

* change variables name

* Update src/transformers/integrations/awq.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* style

* split up into 3 different private methods

* more conditions

* more checks

* add fused tests for custom models

* fix

* fix tests

* final update docs

* final fixes

* fix importlib metadata

* Update src/transformers/utils/quantization_config.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* change it to `do_fuse`

* nit

* Update src/transformers/utils/quantization_config.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update src/transformers/utils/quantization_config.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* few fixes

* revert

* fix test

* fix copies

* raise error if model is not quantized

* add test

* use quantization_config.config when fusing

* Update src/transformers/modeling_utils.py

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>


Make image processors more general (#27690) · df40edfb

NielsRogge authored Dec 05, 2023

* Make image processors more general

* Add backwards compatibility for KOSMOS-2

* Remove use_square_size everywhere

* Remove script


pin `ruff==0.1.5` (#27849) · 96f9caa1
Yih-Dar authored Dec 05, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

04 Dec, 2023 19 commits

Translate `en/tasks` folder docs to Japanese 🇯🇵 (#27098) · 235e5d49

Rockerz authored Dec 05, 2023



* Create asr.md

* Create audio_classification.md

* Create document_question_answering.md

* Update document_question_answering.md

* add

* add

* ggg

* gg

* add masked_language_modeling.md

* add monocular_depth estimation

* new

* dd

* add

* add

* cl

* add

* Add Traslation.md

* hgf

* Added docs to Toctree file

* Update docs/source/ja/tasks/asr.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/asr.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/image_classification.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/idefics.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/image_captioning.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Fix docs and revert changes

* Update docs/source/en/tasks/idefics.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/language_modeling.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/language_modeling.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/language_modeling.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/prompting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/masked_language_modeling.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/masked_language_modeling.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/prompting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/object_detection.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/semantic_segmentation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/semantic_segmentation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/token_classification.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/translation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/visual_question_answering.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/summarization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* changes in review 1 and 2

* add

* Update docs/source/ja/tasks/asr.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/tasks/translation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* changes

* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update _toctree.yml

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>


translate internal folder files to chinese (#27638) · a502b0d4
jiaqiw09 authored Dec 05, 2023
```
* translate

* update

* update

---------
Co-authored-by: jiaqiw <wangjiaqi50@huawei.com>
```
[Seamless v2] Add FE to auto mapping (#27829) · 3c15fd19
Sanchit Gandhi authored Dec 04, 2023


Disallow `pickle.load` unless `TRUST_REMOTE_CODE=True` (#27776) · 1d63b0ec

Yih-Dar authored Dec 04, 2023



* fix

* fix

* Use TRUST_REMOTE_CODE

* fix doc

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
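
The guard named in this commit's title can be sketched roughly as follows. This is an illustration only: the helper names, the accepted truthy spellings, and the error message are assumptions, not the code merged in the commit.

```python
import os
import pickle


def _env_flag(name: str, default: str = "False") -> bool:
    """Interpret an environment variable as a boolean (assumed truthy spellings)."""
    return os.environ.get(name, default).strip().lower() in {"1", "true", "yes", "y", "on", "t"}


def guarded_pickle_load(path: str):
    """Hypothetical wrapper: refuse to unpickle unless the user opted in."""
    if not _env_flag("TRUST_REMOTE_CODE"):
        # Unpickling can execute arbitrary code, so the default is to refuse.
        raise ValueError(
            "Refusing to call pickle.load: set TRUST_REMOTE_CODE=True "
            "only if you trust the source of this file."
        )
    with open(path, "rb") as f:
        return pickle.load(f)
```

Reading the flag at call time, rather than at import time, lets a user set the variable after the module has been imported.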


restructure AMD scheduled CI (#27743) · e0d2e695
Yih-Dar authored Dec 04, 2023
```
* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
single word should be set to False (#27738) · e739a361
Arthur authored Dec 04, 2023

[Hot-Fix][XLA] Re-enable broken _tpu_save for XLATensors (#27799) · 2b5d5ead
Yeounoh Chung authored Dec 04, 2023
```
* [XLA] Re-enable broken _tpu_save for XLATensors, by explicitly moving to cpu

* linter-fix
```

Flash Attention 2 support for RoCm (#27611) · 1da1302e

fxmarty authored Dec 04, 2023



* support FA2

* fix typo

* fix broken tests

* fix more test errors

* left/right

* fix bug

* more test

* typo

* fix layout flash attention falcon

* do not support this case

* use allclose instead of equal

* fix various bugs with flash attention

* bump

* fix test

* fix mistral

* use skiptest instead of return that may be misleading

* add fix causal arg flash attention

* fix copies

* more explicit comment

* still use self.is_causal

* fix causal argument

* comment

* fixes

* update documentation

* add link

* wrong test

* simplify FA2 RoCm requirements

* update opt

* make flash_attn_uses_top_left_mask attribute private and precise comment

* better error handling

* fix copy & mistral

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/utils/import_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* use is_flash_attn_greater_or_equal_2_10 instead of is_flash_attn_greater_or_equal_210

* fix merge

* simplify

* inline args

---------
Co-authored-by: Felix Marty <felix@hf.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Added test cases for rembert refering to albert and reformer test_tok… (#27637) · 4d4febb7

Nilesh authored Dec 04, 2023



* Added test cases for rembert referring to albert and reformer test_tokenization

* removed CURL_CA_BUNDLE='

* Added flag test_sentencepiece_ignore_case and space_between_special_tokens to True

* Overrode test_added_tokens_serialization

* Slow->fast tokenization failed because of the different initialization of [MASK] in slow and fast, so the [MASK] token initialization was made uniform between the fast and slow tokenizers

* Added a few more test cases in test_encode_decode_round_trip and modified the slow tokenizer's mask_token to be an AddedToken instance with lstrip=True

* Added a few test cases in the test_encoder_decoder round trip and also modified the slow tokenizer of rembert to have mask_token as an AddedToken with lstrip=True

* Cleaned the code, added fmt: skip to avoid line breaks after make style, and added comments to indicate the copied test cases

* Corrected a few comments

* Fixed quality issue

* Ran fix-copies

* Fixed a few minor issues, as make fix-copies broke a few test cases while stripping the text

* Reverted the changes made by repo-consistency

---------
Co-authored-by: Kokane <kokanen@apac.corpdir.net>


[Whisper] Fix doctest in timestamp logits processor (#27795) · a0f7c4a4
Sanchit Gandhi authored Dec 04, 2023

[Seamless v1] Link to v2 docs (#27827) · ede09d67
Sanchit Gandhi authored Dec 04, 2023


Keypoints 0.0 are confusing... · facc6645

RogerWhisper authored Dec 04, 2023

Keypoints 0.0 are confusing ../transformers/models/detr/image_processing_detr.py which are fixed (#26250)

* Keypoints 0.0 is fixed

* fixed keypoints for image_processing_yolos

* fixed keypoints for image_processing_deta

* fixed keypoints for image_processing_deformable_detr

* fixed keypoints for image_processing_conditional_detr

* fixed styles

* Removed Comments

* Removed comment from conditional detr too

* Removed Extra code

* make fix-copies

* Fixed code quality

* keypoints changes


Fix `Owlv2ModelIntegrationTest::test_inference_object_detection` (#27793) · 73893df8
Yih-Dar authored Dec 04, 2023
```
* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```

Fix `TvpModelIntegrationTests` (#27792) · 5a551df9

Yih-Dar authored Dec 04, 2023



* fix

* fix

* fix

---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


[`ModelOnTheFlyConversionTester`] Mark as slow for now (#27823) · c0b9db09
Arthur authored Dec 04, 2023
```
* mark test as slow for now

* style
```
Add `persistent_workers` parameter to `TrainingArguments` (#27189) · 269078a7
Ilya authored Dec 04, 2023
```
added param
Co-authored-by: Ilya Fedorov <ilyaf@nvidia.com>
```
Fix typo in max_length deprecation warnings (#27788) · a2b1e1df
Noah Siegel authored Dec 04, 2023


Improve forward signature test (#27729) · 7edf8bfa

NielsRogge authored Dec 04, 2023



* First draft

* Extend test_forward_signature

* Update tests/test_modeling_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Revert suggestion

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


[JAX] Replace uses of jax.devices("cpu") with jax.local_devices(backend="cpu") (#27593) · bcd0a91a

Roy Hvaara authored Dec 03, 2023

An upcoming change to JAX will include non-local (addressable) CPU devices in jax.devices() when JAX is used multicontroller-style, where there are multiple Python processes.

This change preserves the current behavior by replacing uses of jax.devices("cpu"), which previously only returned local devices, with jax.local_devices(backend="cpu"), which will return local devices both now and in the future.

This change is always safe (i.e., it should always preserve the previous behavior), but it may sometimes be unnecessary if code is never used in a multicontroller setting.
Co-authored-by: Peter Hawkins <phawkins@google.com>
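
The replacement described in this commit message can be illustrated with a short sketch. The wrapper name `local_cpu_devices` is an assumption made for this example; only `jax.local_devices` is an actual JAX call.

```python
def local_cpu_devices():
    """Return the CPU devices addressable by the current process."""
    # Imported lazily so this snippet can be loaded on machines without JAX.
    import jax

    # `backend` is passed by keyword: the first positional parameter of
    # jax.local_devices is `process_index`, not the backend name.
    return jax.local_devices(backend="cpu")


if __name__ == "__main__":
    for device in local_cpu_devices():
        # Every device listed here belongs to this process.
        print(device.platform, device.process_index)
```

In a single-process run this lists the same devices as `jax.devices("cpu")` did before; the distinction only matters in a multicontroller setup.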


01 Dec, 2023 7 commits

[MusicGen] Fix audio channel attribute (#27440) · 2c658b5a
Sanchit Gandhi authored Dec 01, 2023
```
[MusicGen] Fix mono logit test
```
Better error message for bitsandbytes import (#27764) · abd4cbd7
Marc Sun authored Dec 01, 2023
```
* better error message

* fix logic

* fix log
```

Make using safetensors files automated. (#27571) · 7b6324e1

Nicolas Patry authored Dec 01, 2023



* [WIP] Make using safetensors files automated.

If `use_safetensors=True` is used, and it doesn't exist:

- Don't crash just yet
- Look for an open PR containing it.
- If yes, use that instead
- If not, touch the space to convert, wait for conversion to be finished
  and the PR to be opened
- Use that new PR
- Profit.

* Remove the token.

* [Auto Safetensors] Websocket -> SSE (#27656)

* Websocket -> SSE

* Support sharded + tests +cleanup

a

* env var

* Apply suggestions from code review

* Thanks Simon

* Thanks Wauplin
Co-authored-by: Wauplin <lucainp@gmail.com>

* Cleanup

* Update tests

* Tests should pass

* Apply to other tests

* Extend extension

* relax requirement on latest hfh

* Revert

* Correct private handling & debug statements

* Skip gated repos as of now

* Address review comments
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: Lysandre <lysandre@huggingface.co>
Co-authored-by: Wauplin <lucainp@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
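
The fallback order listed in the first item of this commit message (file present, else an open PR, else trigger a conversion and use the PR it opens) can be sketched as a small decision function. The three callables are hypothetical stand-ins for the Hub lookups and the conversion request, and the `refs/pr/N` strings in the usage lines are illustrative values; none of this is the code added by the commit.

```python
from typing import Callable, Optional


def resolve_safetensors_revision(
    has_safetensors: Callable[[], bool],
    find_open_pr: Callable[[], Optional[str]],
    request_conversion: Callable[[], str],
) -> Optional[str]:
    """Return None to load from the default revision, or the PR ref to load from."""
    if has_safetensors():
        # The repository already ships safetensors weights: nothing to do.
        return None
    pr_ref = find_open_pr()
    if pr_ref is not None:
        # An open PR already contains the converted weights: use it.
        return pr_ref
    # Otherwise ask for a conversion and use the PR it opens once finished.
    return request_conversion()


# Usage with stubbed lookups (hypothetical values):
print(resolve_safetensors_revision(lambda: False, lambda: "refs/pr/1", lambda: "refs/pr/2"))
print(resolve_safetensors_revision(lambda: False, lambda: None, lambda: "refs/pr/2"))
```

Passing the lookups in as callables keeps the ordering logic testable without any network access.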


Fixes for PatchTST Config (#27777) · 95900916

Wesley Gifford authored Dec 01, 2023



* Remove config reference and pass num_patches for PatchTSTforPrediction

* ensure return_dict is properly set

---------
Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com>


[i18n-fr] Translate installation to French (#27657) · cf62539a

Nolwenn Bernard authored Dec 01, 2023



* partial translation of installation

* Finish translation of installation

* Update installation.mdx

* Rename installation.mdx to installation.md

* Typos

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/fr/installation.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Address review comments

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>


[SeamlessM4Tv2] Fix links in README (#27782) · 0ad4e7e6
Joshua Lochner authored Dec 01, 2023
```
Fix typo in README
```
Fix unsupported setting of self._n_gpu in training_args on XPU devices (#27716) · 9ddbb696
Liangliang-Ma authored Dec 01, 2023
```
change xpu _n_gpu = 1
```

30 Nov, 2023 1 commit

Add SeamlessM4T v2 (#27779) · 29f1aee3

Yoach Lacombe authored Nov 30, 2023



* add working conversion script

* first non-working version of modeling code

* update modeling code (working)

* make style

* make fix-copies

* add config docstrings

* add config to ignore docstrings formatting due to unconventional markdown

* fix copies

* fix generation num_return_sequences

* enrich docs

* add and fix tests beside integration tests

* update integration tests

* update repo id

* add tie weights and make style

* correct naming in .md

* fix imports and so on

* correct docstrings

* fix fp16 speech forward

* fix speechencoder attention

* make style

* fix copied from

* rename SeamlessM4Tv2-v2 to SeamlessM4Tv2

* Apply suggestions on configuration
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove useless public models

* fix private models + better naming for T2U models

* clean speech encoder relative position embeddings

* refactor chunk attention

* add docstrings to chunk attention method

* improve naming and docstrings

* rename some attention variables + add temperature sampling in T2U model

* rename DOCSTRINGS variable names

* make style + remove 2 useless config parameters

* enrich model card

* remove any attention_head reference + fix temperature in T2U

* new fmt and make style

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* rename spkr_id->speaker_id and change docstrings of get_char_input_ids

* simplify v2attention

* make style

* Update seamless_m4t_v2.md

* update code and tests with last update

* update repo ids

* fill article name, abstract and authors

* update not_doctested and slow_doc tests

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
