Commits · 89a0a9eaceaba3b6d25cbc109f4febd147e0aa43 · chenpangpang / transformers

22 Mar, 2023 1 commit
- [deepspeed] offload + non-cpuadam optimizer exception doc (#22044) · 89a0a9ea
  Stas Bekman authored Mar 21, 2023
```
* [deepspeed] offload + non-cpuadam optimizer exception doc

* deps
```
20 Mar, 2023 3 commits

Example of pad_to_multiple_of for padding and truncation guide & docstring update (#22278) · 7bd86505
Maria Khalusova authored Mar 20, 2023
```
* added an example of pad_to_multiple_of

* make style

* addressed feedback
```
Fix doc links (#22274) · 8ac29fe0
amyeroberts authored Mar 20, 2023


Rework a bit the LLaMA conversion script (#22236) · 786092a3

Sylvain Gugger authored Mar 20, 2023



* Update LLaMA conversion script

* Doc

* Fix the weight size for the 13B checkpoint

* Update src/transformers/models/llama/convert_llama_weights_to_hf.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>


17 Mar, 2023 7 commits

fix(docs): fix task guide links in model docs (#22226) · 074490b2
Seb0 authored Mar 17, 2023
```
fix(docs): task guide links in model docs
```
Removed .mdx extension in two links (#22230) · 314cdf7c
Maria Khalusova authored Mar 17, 2023
```
removed .mdx extension
```

Add LlamaForSequenceClassification (#22209) · f2514413

lewtun authored Mar 17, 2023



* Add LlamaForSequenceClassification

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Add docstring

* Add test

* Add input embedding getter and setter

* Remove dead code

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>


LLaMA house-keeping (#22216) · 00934026
Sylvain Gugger authored Mar 17, 2023
```
* LLaMA house-keeping

* Doc links
```

Depth estimation task guide (#22205) · 42f8f764

Maria Khalusova authored Mar 17, 2023

* added doc to toc, auto tip with supported models, mention of task guide in model docs

* make style

* removed "see also"

* minor fix


fix code example in mgp-str doc (#22219) · af1c864c
wangpeng authored Mar 17, 2023
```
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
```
fix typos in llama.mdx (#22223) · 33d033d6
Kevin Turner authored Mar 17, 2023


16 Mar, 2023 2 commits

LLaMA Implementation (#21955) · 0041be5b

Jason Phang authored Mar 16, 2023



* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>


Fix typo in Align docs (#22199) · 1485bd9c
Alara Dirik authored Mar 16, 2023
```
Fix align docs typo
```

14 Mar, 2023 2 commits
- Update 2 doctest expected values for torch 2.0.0 (#22148) · ff887035
  Yih-Dar authored Mar 14, 2023
```
update values
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
- Add ConvNeXT V2 (#21679) · cdddfbff
  Alara Dirik authored Mar 14, 2023
```
* Add ConvNeXt V2 to transformers
* TF model is separated from the PR to fix issues
```
13 Mar, 2023 5 commits

docs: New terms and updates to glossary (#21982) · 101a6cd2

MichaelRipa authored Mar 13, 2023



* Updated glossary with new terms, added abbreviations for certain terms and merged autoencoding models, autoregressive models and causal language modeling into encoder and decoder models

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added link to 'Pipeline for inference' tutorial

* Trigger CI

* Update docs/source/en/glossary.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Added entry for self supervised learning, added deleted entries + fixed broken links

* Update docs/source/en/glossary.mdx
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


[deepspeed docs] Activation Checkpointing (#22099) · 618697ef

Stas Bekman authored Mar 13, 2023



* [deepspeed docs] Activation Checkpointing

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update deepspeed.mdx

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


Zero-shot image classification task guide (#22132) · 8def252d

Maria Khalusova authored Mar 13, 2023



* WIP

* WIP

* manual inference example

* make style

* Apply suggestions from code review
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

---------
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>


add new model of MGP-STR (#21418) · 102b5ff4

wangpeng authored Mar 13, 2023



* add new model of MGP-STR

* fix the check failings

* remove torch and numpy from mgp_tokenization

* remove unused import from modeling_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str.py

* add test_processing_mgp_str

* add test_processing_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str and add softmax outs to model

* rm test_processing_mgp_str and add softmax outs to model

* rewrite the code of mgp-str according to PR suggestions

* rewrite the code of mgp-str according to PR suggestions

* add new model of MGP-STR

* fix the check failings

* remove torch and numpy from mgp_tokenization

* remove unused import from modeling_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str.py

* add test_processing_mgp_str

* add test_processing_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str and add softmax outs to model

* rewrite the code of mgp-str according to PR suggestions

* rewrite the code of mgp-str according to PR suggestions

* remove representation_size from MGPSTRConfig

* reformat configuration_mgp_str.py

* format test_processor_mgp_str.py

* add test for tokenizer and complete model/processor test and model file

* rm unnecessary tuple in modeling_mgp_str

* reduce hidden_size/layers/label_size in test_model

* add integration tests and change MGPSTR to Mgpstr

* add test for logit values

* reformat test model file

---------
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>


Add AutoModelForZeroShotImageClassification (#22087) · 32e3466d
Alara Dirik authored Mar 13, 2023
```
Adds AutoModelForZeroShotImageClassification to transformers
```

10 Mar, 2023 2 commits
- GPT-J specific half precision on CPU note (#22086) · bdec2768
  Maria Khalusova authored Mar 10, 2023
```
* re: #21989

* update re: #21989

* removed cpu option

* make style
```
- Fix small typo in flan-ul2.mdx (#22068) · ade26bf9
  Kevin Jiang authored Mar 10, 2023
```
* Update flan-ul2.mdx

* Update flan-ul2.mdx
```
09 Mar, 2023 3 commits

Can't install tf2 on M1 Chip by default (#22046) · 68477430
Shaun VanWeelden authored Mar 09, 2023


Docs Improvement - In ZSH, not using ' ' around pip install fails, fix it (#22045) · 81cd655c

Shaun VanWeelden authored Mar 09, 2023

In zsh, `pip install` fails unless the package specifier is wrapped in ' '

Running
```
pip install transformers[torch]
```
in the default zsh terminal fails with the error `zsh: no matches found: transformers[torch]`, because zsh treats the square brackets as a glob pattern.

The solution is to wrap the package specifier in ' ', like
```
pip install 'transformers[torch]'
```

Relevant StackOverflow: https://stackoverflow.com/questions/30539798/zsh-no-matches-found-requestssecurity
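
To illustrate why the quoting matters, here is a small Python sketch (not part of the original commit). It uses the standard-library `fnmatch` module as a stand-in for shell glob matching, which only approximates zsh's own rules, and `shlex.quote` to build the quoted command. The variable and function names are illustrative.

```python
import fnmatch
import shlex

# The requirement string from the commit message above.
requirement = "transformers[torch]"


def looks_like_glob_match(filename: str) -> bool:
    """Return True if `filename` matches `requirement` read as a glob pattern.

    Unquoted, "[torch]" is a character class: it stands for exactly one of
    the characters t, o, r, c, h, not for the literal text "[torch]".
    """
    return fnmatch.fnmatchcase(filename, requirement)


# shlex.quote wraps the string in single quotes because "[" and "]"
# are not among the characters it considers shell-safe.
safe_command = "pip install " + shlex.quote(requirement)
print(safe_command)
```

Read as a pattern, the specifier would match a file named `transformerst` but not the literal text `transformers[torch]`. When no such file exists, zsh reports `no matches found` instead of passing the argument on to pip. Bash, by default, passes an unmatched pattern through unchanged, which is likely why the unquoted command often works there.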


Update ALIGN docs (#22025) · 2055d737
Alara Dirik authored Mar 09, 2023
```
* Fix typos and add code examples, resources
```

08 Mar, 2023 2 commits

[WIP] Add BridgeTowerForContrastiveLearning (#21964) · de81adf9

Anahita Bhiwandiwalla authored Mar 08, 2023



* Add BridgeTower for ITC

* Fix review feedback

* Rename BridgeTowerForITC, cleanup

* Fix style and quality

* implement tests

---------
Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>


update: bertology paper (#22012) · bbd94997
Qiushi authored Mar 08, 2023


07 Mar, 2023 2 commits

[Time-Series] informer model (#21099) · 8abe4930

Eli Simhayev authored Mar 08, 2023

* added informer to gitignore

* added informer to gitignore

* WIP informer2020

* added checking that instantiate works

* added config using gluonTS by kashif

* WIP config

* adding InformerConfig; need to remove FeatureEmbedder

* done InformerConfig, but need to change the names

* Done informer model init. working on enc-dec

* added things to address, after reading again enc-dec in the paper

* done modeling - checking initialization work

* added informer to gitignore

* WIP informer2020

* added checking that instantiate works

* added config using gluonTS by kashif

* WIP config

* adding InformerConfig; need to remove FeatureEmbedder

* done InformerConfig, but need to change the names

* Done informer model init. working on enc-dec

* added things to address, after reading again enc-dec in the paper

* done modeling - checking initialization work

* moved enc-dec init to InformerEncoder/Decoder init

* added 'init_std' to config, now model init works!

* WIP conversion script, and added code sources

* WIP conversion script: loading original informer pth works

* WIP conversion script: change defaults in the config

* WIP conversion script: supporting Informer input embedding

* WIP conversion script: added parameters for the informer embed

* WIP conversion script: change dim_feedforward=2048

* WIP conversion script: remove unused args for loading checkpoint

* just cleaning up

* DataEmbedding removed, after thinking with Kashif

* working on forward pass

* WIP forward pass: trying to establish working batch for forward pass

* cleaning and finalizing

* adding HF names and docs

* init after cleaning works

* WIP in tests

* added docs for the informer specific args

* fix style

* undo change

* cleaning informer, now need to work only enc-dec

* initial enc-dec classes

* added encoder and decoder

* added todo

* add todos for conv_layers

* added decoder docs from vanilla

* added encoder docs from vanilla

* remove encoder decoder from the original informer

* removed AttentionLayer from the original paper

* removed TriangularCausalMask, same as decoder_attention_mask

* initial sparse attention

* use conv_layers

* fixed test_config test

* fix parenthesis when iterating zip(layers, conv_layers)

* error found in prob attention, added sizes as comments

* fix sizes

* added proposal for q_reduce indexing, and remove unused

* WIP ProbMask, and changed factor=2 for testing

* remove unused libs for this PR for creating the env

* fix checking the attn_weights.size() after bmm

* Q_reduce: changed from torch.gather to simple slicing

* WIP calculate final attn_output

* finish adding v_aggregated, attn_output ready

* changed tgt_len to u in attention_mask, need to fix the size error

* comment attention_mask for encoder, and fix if cond for v_agg

* added ProbMask support (wip), removed old original code

* finished ProbMask 😃



* Revert "remove unused libs for this PR for creating the env"

This reverts commit 11a081e09e92771e51a5d2758d53a9afb59547f0.

* fixes

* make style

* fix initial tests

* fix more tests

* dry

* make style

* remove unused files

* style

* added integration tests

* fix num_static_real_features

* fix header

* remove unused function

* fix example

* fix docs

* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/modeling_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixes for reviewer

* use prediction_length from model

* fix style

* fixed informer.mdx

* added to index

* updated readme

* undo

* make fix-copies

* typo

* fix copy

* added Informer to toctree

* in order

* fixed comments

* remove unneeded new lines in docs

* make static real and cat optional

* fix use of distil conv layers

* fixed integration test

* added checkpoint for convlayer

* make fix-copies

* updated from time series model

* make fix-copies

* copy decoder

* fix unit tests

* updated scaling config

* fix integration tests

* IGNORE_NON_TESTED

* IGNORE_NON_AUTO_CONFIGURED

* IGNORE_NON_AUTO_CONFIGURED

* updated check configs

* fix formatting

* undo change from time series

* prediction_length should not be None

* align with the blog: prettify ProbSparse and change attention_factor to sampling_factor

* make style

* make fix-copies

* niels CR: update contributed by

* niels CR: update configuration_informer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* niels CR: update kashif -> huggingface
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* niels CR: `sampling_factor` only relevant when `attention_type`=prob

* make style

* fixed U_part: added multiplication by `L_Q`

* fixed bug: remove `is not None` from `if config.distil`

* fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check

* fix integration tests

* updated model hub

* do not shift as in training

* undo

* fix make-copies

* make fix-copies

* added `if prediction_length is None`

* changed `ProbSparseAttention` to `InformerProbSparseAttention`

* changed `V_sum` -> `v_mean_dim_time`

* changed `ConvLayer` to `InformerConvLayer` and fixed `super()`

* TimeSeriesTransformer->Informer in decoder's Copied from

* more descriptive in ProbSparse

* make style

* fix copied from

* Revert "added `if prediction_length is None`"

This reverts commit b4cbddfa05e3bd739b79569cd3c3b89e316f2451.

* fixed indent

* use InformerSinusoidalPositionalEmbedding

* make fix-style

* fix from #21860

* fix name

* make fix-copies

* use time series utils

* fix dec num_heads

* docstring

* added time series util doc

* _import_structure

* formatting

* changes from review

* make style

* fix docs

* fix doc

* removed NegativeLogLikelihood

---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>


[Whisper] Add model for audio classification (#21754) · 7c393181

Sanchit Gandhi authored Mar 07, 2023

* [Whisper] Add model for audio classification

* make fix-copies

* add to docs

* add docstring

* empty returns

* add code example

* switch to fleurs

* stick everything on one line


06 Mar, 2023 1 commit

docs: improve clarity for language modeling (#21952) · 31e3c6c3

PD Hall authored Mar 06, 2023

* docs: improve clarity for clm/mlm

* docs: remove incorrect explanation

* docs: remove incorrect explanation

---------

Co-authored-by: pdhall99 <pdhall99>


03 Mar, 2023 1 commit

[Flan-UL2] Add-flan-ul2 (#21929) · 82aac00e

Arthur authored Mar 03, 2023



* add doc and readme

* add model docs

* update toctree and fix copies

* update

* update doc file

* fix

* add FLAN-UL2 to configuration mapping

* fixup

* Apply suggestions from code review

* more clarification

---------
Co-authored-by: younesbelakda <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>


01 Mar, 2023 5 commits

Add ALIGN to transformers (#21741) · 269b0549

Alara Dirik authored Mar 01, 2023

Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.


Add TFVisionTextDualEncoder (#21873) · f7c618e3

Matt authored Mar 01, 2023



* Temporary commit to stash everything so far

* Temporary commit to stash everything so far

* stash commit

* Refactor from_pretrained

* Fix final test, make fixup

* Update dummies

* Add model to TEST_FILES_WITH_NO_COMMON_TESTS

* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Add TFVisionTextDualEncoder to utils/documentation_tests.txt

* make fixup

---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>


[doc] deepspeed tests (#21859) · 3eba1dd2
Stas Bekman authored Mar 01, 2023

update FSDP and add XLA-FSDP documentation (#21812) · 571dd693
Sourab Mangrulkar authored Mar 01, 2023
```
* update FSDP and add XLA-FSDP documentation

* resolving comments

* minor update

* fix xla-fsdp docs
```
Removed BLIP mention from the troubleshooting guide (#21872) · 9c1d5988
Maria Khalusova authored Mar 01, 2023
```
removed BLIP mention from the troubleshooting guide
```

28 Feb, 2023 2 commits

Add: task guide for zero shot object detection (#21829) · 6ca84458

Maria Khalusova authored Feb 28, 2023



* zero shot object detection part 1

* added batch prediction section

* added image guided object detection section

* make style

* added the task guide to the TOC

* minor polishing

* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* added embedded owlvit demo

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* minor fix

* make style

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


[`Blip2`] Add `Blip2Model` (#21817) · b8de7e44

Younes Belkada authored Feb 28, 2023

* add v1

* add `Blip2Model`

- add relevant functions
- add tests
- add on automapping

* fix docs

* fix doctest


27 Feb, 2023 2 commits

[`tests`] add `accelerate` marker (#21743) · 831f3144
Younes Belkada authored Feb 27, 2023
```
* add `accelerate` marker

* add to docs

* Update docs/source/en/testing.mdx
```

[Pipeline] Add zero shot audio classification pipeline (#21600) · cc44e72d

Arthur authored Feb 27, 2023



* add pipeline

* update init

* add zero shot to init

* update inits and correct checkpoints

* update base to support input features

* add tests

* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* update pipeline code

* use tiny checkpoint

* nits and expected value with tiny model

* style

* last nit on tests values

* fix styling

* fix collate fn that was casting to float

* update

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
