1. 23 Feb, 2023 3 commits
  2. 22 Feb, 2023 12 commits
  3. 21 Feb, 2023 8 commits
  4. 20 Feb, 2023 9 commits
    • 8b3db33a
      Sylvain Gugger authored
    • Fix-rag-finetune-project-requirement (#21697) · 4194e5f4
      Arthur authored
      pin pytorch lightning requirement
    • Add EfficientNet (#21563) · 49ab1623
      Alara Dirik authored
      * Add EfficientNet to transformers
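      For context, a minimal usage sketch of the newly added model classes follows; the checkpoint name ("google/efficientnet-b0") and the input image file are assumptions for illustration, not taken from the commit.

      import torch
      from PIL import Image
      from transformers import AutoImageProcessor, EfficientNetForImageClassification

      # Sketch only: checkpoint name is an assumed example, not stated in this log.
      processor = AutoImageProcessor.from_pretrained("google/efficientnet-b0")
      model = EfficientNetForImageClassification.from_pretrained("google/efficientnet-b0")

      image = Image.open("cat.png")  # any local RGB image (hypothetical file)
      inputs = processor(images=image, return_tensors="pt")
      with torch.no_grad():
          logits = model(**inputs).logits
      print(model.config.id2label[logits.argmax(-1).item()])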
    • [`bnb`] fix `bnb` decoders bug (#21688) · c9a06714
      Younes Belkada authored
      * fix `bnb` decoders bug
      
      * make fixup
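      The 8-bit loading path this fix concerns is typically exercised as in the sketch below; the checkpoint name is an assumed example, and this is the surrounding usage, not the fix itself.

      # Requires bitsandbytes and a CUDA GPU; sketch of loading an
      # encoder-decoder model in 8-bit, the code path the fix touches.
      from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("t5-small")
      model = AutoModelForSeq2SeqLM.from_pretrained(
          "t5-small",
          load_in_8bit=True,
          device_map="auto",
      )
      inputs = tokenizer("translate English to German: Hello", return_tensors="pt").to(model.device)
      print(tokenizer.decode(model.generate(**inputs)[0], skip_special_tokens=True))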
    • add GPTSAN model (reopen) (#21291) · f56174ac
      tanreinama authored
      * add GPTSAN-Japanese
      
      * add GPTSAN (repeated across many intermediate commits, including one marked "update for review")
      
      * fix typo in comment text
      
      * add GPTSAN (further intermediate commits)
      
      * fix document and comments
      
      * fix class name GPTSAN->GPTSan
      
      * fix import and test for tokenizer
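      A minimal generation sketch with the new model class; the checkpoint name "Tanrei/GPTSAN-japanese" and the prompt are assumptions for illustration, not taken from this log.

      from transformers import AutoTokenizer, GPTSanJapaneseForConditionalGeneration

      # Sketch only: checkpoint name is an assumed example.
      tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
      model = GPTSanJapaneseForConditionalGeneration.from_pretrained("Tanrei/GPTSAN-japanese")

      inputs = tokenizer("織田信長は、", return_tensors="pt")
      outputs = model.generate(inputs.input_ids, max_new_tokens=20)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))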
    • Fix quality · c87bbe1f
      Sylvain Gugger authored
    • Fix for non-contiguous label tensors in VisionEncoderDecoder (#21582) · 011cc17a
      Morgan McGuire authored
      * add prints
      
      * add shape
      
      * add reshape
      
      * clean up
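      The "add reshape" step reflects standard PyTorch behaviour: `.view(-1)` fails on a non-contiguous tensor, while `.reshape(-1)` copies as needed. A standalone illustration (not the PR's code):

      import torch

      labels = torch.arange(12).reshape(3, 4).t()  # transpose makes the tensor non-contiguous
      assert not labels.is_contiguous()

      try:
          labels.view(-1)        # raises RuntimeError on non-contiguous tensors
      except RuntimeError as err:
          print("view failed:", err)

      flat = labels.reshape(-1)  # reshape copies when necessary, so it succeeds
      print(flat.shape)          # torch.Size([12])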
    • add flax whisper implementation (#20479) · 2840272c
      Andy Ehrenberg authored
      
      * add flax whisper implementation
      
      * revert change to setup
      
      * remove unused imports
      
      * revert generation changes
      
      * flax whisper docs
      
      * docs
      
      * import order
      
      * import sorting
      
      * isort
      
      * add dummy objects
      
      * doc formatting
      
      * formatting
      
      * remove trailing whitespaces
      
      * fix flax whisper docs
      
      * add generation logic to unlock flax whisper
      
      * remove scans
      
      * give credits to Flax Bart implementation
      
      * remove unused imports
      
      * add license
      
      * remove assert
      
      * more credits to Bart
      
      * fix style
      
      * formatting
      
      * support left padding
      
      * add flax whisper generation test
      
      * remove copied from comments whenever not a full copy
      
      * fix docstrings for logits processors
      
      * revert change to FlaxForceTokensLogitsProcessor
      
      * revert doc changes
      
      * improve generation docs
      
      * reorganize
      
      * formatting
      
      * cleanup docs
      
      * add tests
      
      * handle empty list case
      
      * fix forced decoder ids in flax tests
      
      * add flax whisper to inits
      
      * update dummy objects
      
      * docs for FlaxAutoModelForSpeechSeq2Seq
      
      * fix decoder_position_ids computation in pretrained model decode/__call__ fns
      
      * add Copied from statements as necessary
      
      * compute position_ids only in __call__ and decode methods of pretrained model subclasses
      
      * improve readability of compute positional embeddings
      
      * check dimensionality of input_features instead of hidden_states
      
      * copied from statement for init_cache
      
      * formatting
      
      * fix copies
      
      * fix copies
      
      * pass attention mask to encoder layers
      
      * fix decoder module outputs
      
      * set dtype
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * smaller flax model for whisper test
      
      * Update src/transformers/generation/flax_utils.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update tests/models/whisper/test_modeling_flax_whisper.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * cleanup
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * bias cleanup
      
      * doc fix
      
      * align style for force tokens processor
      
      * readability
      
      * fix input shape in tests
      
      * revert FlaxGenerationMixin docstring
      
      * formatting
      
      * fix tests
      
      * fix imports
      
      * consistent encoder hidden states
      
      * consistent hidden states
      
      * input shapes
      
      * typo
      
      * partial class trick
      
      * partial class for input shape
      
      * base_class with correct input shape
      
      * partial base classes
      
      * match by name
      
      * set main_input_name
      
      * compare on names
      
      * formatting
      
      * remove unused import
      
      * safer position ids computation
      
      * safer position id computation
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/modeling_flax_whisper.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * remove identical inherited tests
      
      * fix prompt ids in tests
      
      * use generation config
      
      * use jnp array
      
      * better var names
      
      * more explicit bias use
      
      * import transformers
      
      * formatting
      
      * test formatting
      
      * remove unused imports
      
      * remove unused imports
      
      * formatting
      
      * isort
      
      * docs
      
      * fix ln orders for encoder hidden states
      
      * whisper unique generation stuff
      
      * flake
      
      * use finfo for attention bias
      
      * docs
      
      * Update src/transformers/generation/flax_utils.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * docs
      
      * add timestamp flax test
      
      * jit for timestamps
      
      * formatting
      
      * clean up timestamps processor
      
      * formatting
      
      * remove if_true
      
      * cleanup
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
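      A short sketch of what the new Flax port enables; the checkpoint name and the dummy audio are assumptions for illustration, not taken from this log.

      import numpy as np
      from transformers import WhisperProcessor, FlaxWhisperForConditionalGeneration

      processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
      # from_pt=True converts PyTorch weights in case no Flax weights are published.
      model = FlaxWhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny", from_pt=True)

      audio = np.zeros(16000, dtype=np.float32)  # 1 second of silence at 16 kHz (placeholder input)
      inputs = processor(audio, sampling_rate=16000, return_tensors="np")

      generated = model.generate(inputs.input_features)
      print(processor.batch_decode(generated.sequences, skip_special_tokens=True))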
    • Enable PyTorch/XLA Fully Sharded Data Parallel (FSDP) (#21406) · 7735e040
      AlexWertheim authored
      
      * Reinserted import statement accidentally removed during rebasing.
      
      * Added auto_wrap functionality, restructured XLA FSDP logic to more closely match PyTorch FSDP logic.
      
      * Fixed flag descriptions; changed several instances of fsdp_ to xla_fsdp_; pass in auto_wrap_policy and auto_wrapper_callable directly to avoid lambda saving.
      
      * Moved XLA FSDP logic to be adjacent to Fairscale FSDP logic in trainer.
      
      * Formatted changes in accordance with HF style requirements.
      
      * Added back in warning which was accidentally removed.
      
      * - Merged XLA FSDP training arguments into `fsdp_config`
      - Added `xla` boolean flag to `fsdp_config` to specify XLA FSDP wrapping
      - Merged XLA FSDP wrapping logic into FSDP wrapping logic within trainer
        class
      
      * Cleaned up errors, moved argument to fsdp_config
      
      - Set `xla` and `xla_fsdp_grad_ckpt` flags by default in fsdp_config
      - Added missing colons following conditionals
      - Moved `fsdp_transformer_layer_cls_to_wrap` to `fsdp_config`
      - Modified `fsdp_transformer_layer_cls_to_wrap` to be list of strings,
        not just one string
      - Changed Fairscale FSDP logic to allow for set of layer classes to wrap
      - Removed unnecessary checks for `xla_fsdp`
      
      * Corrected small errors, improved layer class flag
      
      - Correctly set default values for `xla` and `xla_fsdp_grad_ckpt`
        arguments
      - Made `fsdp_transformer_layer_cls_to_wrap` a list of strings instead of
        a single string
      - Added processing to ensure that `fsdp_transformer_layer_cls_to_wrap`
        works as expected if passed as a single string
      - Updated PyTorch FSDP logic to accept a list of layers to wrap, as done
        with XLA FSDP
      - Replaced instances of `getattr()` with `.get()` for dictionary
        retrievals with default values, including when setting
        `fsdp_min_num_params`
      - Corrected `self.fsdp is not None` to `len(self.fsdp) > 0`
      - Removed extraneous `xla_fsdp` argument descriptions from outside
        `fsdp_config`
      
      * Changed xla-fsdp-settings to be dictionary
      
      - Modified xla-fsdp-settings to be entered directly as dictionary
        instead of loaded through JSON file
      - Made small style corrections
      
      * Reverted unintentional local_rank TPU check
      
      * Do not block XLA FSDP if local rank is -1
      
      * Rebased and applied automatic formatting
      
      - Rebased
      - Applied automatic formatting changes via `make style`
      
      * Applied automatic formatting with latest version of black
      
      * Replaced  expression with
      
      * Reran black examples tests src utils
      ruff examples tests src utils --fix
      make autogenerate_code
      make[1]: Entering directory '/usr/local/google/home/awertheim/HF-FSDP-PR/transformers'
      make[1]: Leaving directory '/usr/local/google/home/awertheim/HF-FSDP-PR/transformers' after additional formatting changes
      
      * Additional automatic formatting changes
      
      * Remove unnecessary whitespace characters from src/transformers/training_args.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      ---------
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
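      The end state described above corresponds roughly to the sketch below; the key names follow the commit's description of `fsdp_config`, while the concrete values (layer class, thresholds) are assumptions for illustration.

      from transformers import TrainingArguments

      # Sketch of enabling the XLA FSDP path via fsdp_config; values are illustrative only.
      training_args = TrainingArguments(
          output_dir="out",
          fsdp="full_shard auto_wrap",
          fsdp_config={
              "xla": True,                       # route FSDP wrapping through PyTorch/XLA
              "xla_fsdp_grad_ckpt": True,        # gradient checkpointing on the XLA FSDP path
              "xla_fsdp_settings": {},           # extra settings forwarded to torch_xla's FSDP wrapper
              "fsdp_transformer_layer_cls_to_wrap": ["T5Block"],  # now a list of class names
              "fsdp_min_num_params": 0,
          },
      )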
  5. 17 Feb, 2023 7 commits
  6. 16 Feb, 2023 1 commit