Commits · 582d085bb2c54e20907bfdfae24d0e9e37070ca6 · chenpangpang / transformers

30 Sep, 2022 4 commits

Add expected output to the sample code for `ViTMSNForImageClassification` (#19183) · 582d085b

Sayak Paul authored Sep 30, 2022

* chore: add expected output to the sample code.

* add: imagenet-1k labels to the model config.

* chore: apply code formatting.

* chore: change the expected output.

582d085b

Rebase ESM PR and update all file formats (#19055) · 368b649a

Matt authored Sep 30, 2022



* Rebase ESM PR and update all file formats

* Fix test relative imports

* Add __init__.py to the test dir

* Disable gradient checkpointing

* Remove references to TFESM... FOR NOW >:|

* Remove completed TODOs from tests

* Convert docstrings to mdx, fix-copies from BERT

* fix-copies for the README and index

* Update ESM's __init__.py to the modern format

* Add to _toctree.yml

* Ensure we correctly copy the pad_token_id from the original ESM model

* Ensure we correctly copy the pad_token_id from the original ESM model

* Tiny grammar nitpicks

* Make the layer norm after embeddings an optional flag

* Make the layer norm after embeddings an optional flag

* Update the conversion script to handle other model classes

* Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py

* Break the copied from link from BertModel.forward to remove token_type_ids

* Remove debug array saves

* Begin ESM-2 porting

* Add a hacky workaround for the precision issue in original repo

* Code cleanup

* Remove unused checkpoint conversion code

* Remove unused checkpoint conversion code

* Fix copyright notices

* Get rid of all references to the TF weights conversion

* Remove token_type_ids from the tests

* Fix test code

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add credit

* Remove _ args and __ kwargs in rotary embedding

* Assertively remove asserts

* Replace einsum with torch.outer()

* Fix docstring formatting

* Remove assertions in tokenization

* Add paper citation to ESMModel docstring

* Move vocab list to single line

* Remove ESMLayer from init

* Add Facebook copyrights

* Clean up RotaryEmbedding docstring

* Fix docstring formatting

* Fix docstring for config object

* Add explanation for new config methods

* make fix-copies

* Rename all the ESM- classes to Esm-

* Update conversion script to allow pushing to hub

* Update tests to point at my repo for now

* Set config properly for tests

* Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted

* make fixup

* Update expected values for slow tests

* make fixup

* Remove EsmForCausalLM for now

* Remove EsmForCausalLM for now

* Fix padding idx test

* Updated README and docs with ESM-1b and ESM-2 separately (#19221)

* Updated README and docs with ESM-1b and ESM-2 separately

* Update READMEs, longer entry with 3 citations

* make fix-copies
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Tom Sercu <tsercu@fb.com>
Co-authored-by: Your Name <you@example.com>

368b649a
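
One bullet in the ESM commit above replaces an einsum with torch.outer(). As a rough illustration of why the two can be interchanged for a pair of 1-D inputs, the pure-Python sketch below computes the outer product that an `"i,j->ij"` einsum would produce. The einsum pattern, the helper name `outer`, and the sample values are assumptions made for this example; it is not the code from the PR.

```python
def outer(a, b):
    """Outer product of two 1-D sequences: result[i][j] = a[i] * b[j]."""
    return [[x * y for y in b] for x in a]


# Hypothetical inputs standing in for positions and inverse frequencies.
positions = [0.0, 1.0, 2.0]
inv_freq = [1.0, 0.5]

# Each row is one position scaled by every inverse frequency.
freqs = outer(positions, inv_freq)
print(freqs)
```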

Catch `HFValidationError` in `TrainingSummary` (#19252) · 4fd32a1f

Yih-Dar authored Sep 30, 2022



* Catch HFValidationError in TrainingSummary
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

4fd32a1f

Add MarkupLM (#19198) · f3d2f7a6

NielsRogge authored Sep 30, 2022



* First draft

* Make basic test work

* Fix most tokenizer tests

* More improvements

* Make more tests pass

* Fix more tests

* Fix some code quality

* Improve truncation

* Implement feature extractor

* Improve feature extractor and add tests

* Improve feature extractor tests

* Fix pair_input test partly

* Add fast tokenizer

* Improve implementation

* Fix rebase

* Fix rebase

* Fix most of the tokenizer tests.

* propose solution for fast

* add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer

* add: modify markuplmconverter

* add: some modify on converter and tokenizerfast

* Fix style, copies

* Make fixup

* Update tokenization_markuplm.py

* Update test_tokenization_markuplm.py

* Update markuplm related

* Improve processor, add integration test

* Add processor test file

* Improve processor

* Improve processor tests

* Fix more processor tests

* Fix processor tests

* Update docstrings

* Add Copied from statements

* Add more Copied from statements

* Add code examples

* Improve code examples

* Add model to doc tests

* Adding dependency check

* Add dummy file

* Add requires_backends

* Add model to toctree

* Fix more things, disable dependency check for now

* Apply more suggestions

* Add soft dependency

* Add annotators to tests

* Fix style

* Remove from_slow=True

* Remove print statements

* Add sanity check

* Fix processor test

* Fix processor tests, add more docs

* Add doc tests for mdx file

* Add more tips

* Apply suggestions
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: lockon-n <45759388+lockon-n@users.noreply.github.com>
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: lockon-n <dd098309@126.com>

f3d2f7a6

29 Sep, 2022 9 commits

[Wav2Vec2] Fix None loss in doc examples (#19218) · 49d62b01

rbsteinm authored Sep 29, 2022

* pass sampled_negative_indices parameter to the model to avoid getting a None loss
* concerns doc examples for Wav2Vec2ForPreTraining and Wav2Vec2ConformerForPreTraining

49d62b01

Cast TF generate() inputs (#19232) · cca6e6fe

Matt authored Sep 29, 2022



* Just stick a couple of casts into generate()

* Cast decoder_input_ids too

* Don't accidentally cast floats

* Move to _generate()

* Move to after input validation
Co-authored-by: Your Name <you@example.com>

cca6e6fe

Improve DETR post-processing methods (#19205) · 01eb34ab

Alara Dirik authored Sep 29, 2022

* Ensures consistent arguments and outputs with other post-processing methods
* Adds post_process_semantic_segmentation, post_process_instance_segmentation, post_process_panoptic_segmentation, post_process_object_detection methods to DetrFeatureExtractor
* Adds deprecation warnings to post_process, post_process_segmentation and post_process_panoptic

01eb34ab

Fix TrainingArgs argument serialization (#19239) · b79028f0
atturaioe authored Sep 29, 2022

b79028f0
Use `hf_raise_for_status` instead of deprecated `_raise_for_status` (#19244) · 902d30b3
Lucain authored Sep 29, 2022
```
* Use hf_raise_for_status instead of _raise_for_status from huggingface_hub

* bump huggingface_hub to 0.10.0 + make deps_table_update
```
902d30b3

Fix opt softmax small nit (#19243) · 3a27ba3d

Younes Belkada authored Sep 29, 2022

* fix opt softmax nit

- Use the same logic as 1eb09537550734a783c194e416029cb9bc4cb119 for consistency

* Update src/transformers/models/opt/modeling_opt.py

3a27ba3d

[TensorFlow] Adding GroupViT (#18020) · 0dc7b3a7

Aritra Roy Gosthipaty authored Sep 29, 2022



* chore: initial commit

* chore: adding util methods

yet to work on the nn.functional.interpolate port with align_corners=True

* chore: refactor the utils

* used tf.compat.v1.image.resize to align the F.interpolate function
* added type hints to the method signatures
* added references to the gists where one-to-one alignment of torch and tf has been shown

* chore: adding the layers

* chore: porting all the layers from torch to tf

This is the initial draft, nothing is tested yet.

* chore: aligning the layers with reference to tf clip

* chore: aligning the modules

* added demarcation comments
* added copied and adapted from comments

* chore: aligning with CLIP

* chore: wrangling the layers to keep it tf compatible

* chore: aligning the names of the layers for porting

* chore: style changes

* chore: adding docs and inits

* chore: adding tfp dependencies

the code is taken from TAPAS

* chore: initial commit for testing

* chore: aligning the vision embeddings with the vit implementation

* chore: changing model prefix

* chore: fixing the name of the model and the layer normalization test case

* chore: every test passes but the slow ones

* chore: fix style and integration test

* chore: moving comments below decorators

* chore: make fixup and fix-copies changes

* chore: adding the Vision and Text Model to check_repo

* chore: modifying the prefix name to align it with the torch implementation

* chore: fix typo in configuration

* chore: changing the name of the model variable

* chore: adding segmentation flag

* chore: gante's review

* chore: style refactor

* chore: amy review

* chore: adding shape_list to parts that have been copied from other snippets

* chore: init batchnorm with torch defaults

* chore: adding shape_list to pass the tests

* test fix: adding seed as 0

* set seed

* chore: changing the straight through trick to fix negative dimensions

* chore: adding a dimension to the loss

* chore: adding reviewers and contributors names to the docs

* chore: added changes after review

* chore: code quality fixup

* chore: fixing the segmentation snippet

* chore: adding  to the layer calls

* chore: changing int32 to int64 for inputs of serving

* chore: review changes

* chore: style changes

* chore: remove from_pt=True

* fix: repo consistency
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

0dc7b3a7

Add a getattr method, which replaces _module_getattr in torch.fx.Tracer from PyTorch 1.13+ (#19233) · bb6fa06f
Michael Benayoun authored Sep 29, 2022

bb6fa06f

XGLM - Fix Softmax NaNs when using FP16 (#18057) · 9d732fd2

Gabriele Sarti authored Sep 29, 2022



* fix fp16 for xglm

* Removed misleading comment

* Fix undefined variable
Co-authored-by: Gabriele Sarti <gsarti@amazon.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

9d732fd2

28 Sep, 2022 5 commits
- Document and validate typical_p in generation (#19128) · 9c6aeba3
  Nick Doiron authored Sep 28, 2022
```
* Document and validate typical_p in generation
```
  9c6aeba3
- Fix doctest for `TFDeiTForImageClassification` (#19173) · de359c45
  Yih-Dar authored Sep 28, 2022
```
* Fix doctest for TFDeiTForImageClassification

* Remove unnecessary tf.random.set_seed
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  de359c45
- Fix deprecation warning for return_all_scores (#19217) · 22d37a9d
  Gabriel Luiz Freitas Almeida authored Sep 28, 2022
```
* Improve deprecation warning for return_all_scores

* Fix formatting
```
  22d37a9d
- Generate: add warning when left padding should be used (#19067) · a357ed50
  Joao Gante authored Sep 28, 2022
```
* add warning when left padding should be used

* PT: check for pad token; FLAX: can only check while not tracing
```
  a357ed50
- Fix small use_cache typo in the docs (#19191) · 942fa8ce
  Ankur Goyal authored Sep 28, 2022
  
  942fa8ce
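
The "Generate: add warning when left padding should be used" entry above checks for a pad token before warning. A minimal sketch of that kind of check, assuming it amounts to looking for the pad token in the last position of each input sequence, could look like the following. The function name `warn_if_right_padded`, the warning text, and the plain-list inputs are hypothetical and are not taken from the PR.

```python
import warnings


def warn_if_right_padded(input_ids, pad_token_id):
    """Warn when any sequence in the batch ends with the pad token.

    Hypothetical helper: a decoder-only model continues from the last
    position, so trailing pad tokens suggest right padding was used.
    """
    if pad_token_id is None:
        # Without a pad token there is nothing to check.
        return False
    right_padded = any(bool(seq) and seq[-1] == pad_token_id for seq in input_ids)
    if right_padded:
        warnings.warn(
            "Right padding detected; decoder-only generation usually expects "
            "padding_side='left'."
        )
    return right_padded
```

Calling `warn_if_right_padded([[5, 6, 0], [7, 8, 9]], pad_token_id=0)` would emit the warning, while the left-padded batch `[[0, 5, 6], [7, 8, 9]]` would not.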
27 Sep, 2022 4 commits
- Added tests for yaml and json parser (#19219) · 2df60287
  IMvision12 authored Sep 28, 2022
```
* Added tests for yaml and json

* Added tests for yaml and json
```
  2df60287
- Use `math.pi` instead of `torch.pi` in `MaskFormer` (#19201) · 2d956958
  Yih-Dar authored Sep 27, 2022
```
* Use math.pi
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  2d956958
- Add a use_parallel_residual argument to control the residual computing way (#18695) · 226b0e46
  wangxu authored Sep 27, 2022
```
* Add a gpt_j_residual argument to control the residual computing way

* Put duplicate code outside of the if block

* Rename parameter "gpt_j_residual" to "use_parallel_residual" and set the default value to True
```
  226b0e46
- Remove unused `cur_len` in generation_utils.py (#18874) · 7132d55c
  Ekagra Ranjan authored Sep 27, 2022
```
* remove unused cur_len in generation_utils.py

* linting
```
  7132d55c
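
The use_parallel_residual entry above adds a flag that selects how the residual is computed. The toy sketch below shows the usual difference between a parallel and a sequential residual in a transformer block, using scalars and stand-in functions instead of tensors and real layers. The function `block` and its arguments are hypothetical; only the flag name and its default of True come from the commit message.

```python
def block(x, attn, mlp, ln1, ln2, use_parallel_residual=True):
    """Toy transformer block on a scalar, with stand-in sublayers."""
    if use_parallel_residual:
        # Both sublayers read the same input x; their outputs are summed.
        return x + attn(ln1(x)) + mlp(ln2(x))
    # Sequential: the MLP sees the result of the attention residual.
    h = x + attn(ln1(x))
    return h + mlp(ln2(h))


def identity(v):
    return v


def double(v):
    return 2 * v


def triple(v):
    return 3 * v


parallel = block(1.0, double, triple, identity, identity, use_parallel_residual=True)
sequential = block(1.0, double, triple, identity, identity, use_parallel_residual=False)
print(parallel, sequential)  # 6.0 12.0
```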
26 Sep, 2022 6 commits
- Fix cached_file in offline mode for cached non-existing files (#19206) · a32f97c3
  Sylvain Gugger authored Sep 26, 2022
```
* Fix cached_file in offline mode for cached non-existing files

* Add tests

* Test with offline mode
```
  a32f97c3
- Add warning for torchaudio <= 0.10 in MCTCTFeatureExtractor (#19203) · ca088639
  Yih-Dar authored Sep 26, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  ca088639
- Updated hf_argparser.py (#19188) · be4f2699
  IMvision12 authored Sep 27, 2022
```
* Changed json_file_parser function and added yaml parser function

* update hf_argparser

* Added allow_extra_keys argument
```
  be4f2699
- Use repo_type instead of deprecated datasets repo IDs (#19202) · c20b2c7e
  Sylvain Gugger authored Sep 26, 2022
```
* Use repo_type instead of deprecated datasets repo IDs

* Add missing one in doc
```
  c20b2c7e
- Move the model type check (#19027) · 216b2f9e
  Ankur Goyal authored Sep 26, 2022
```
Co-authored-by: Ankur Goyal <ankur@impira.com>
```
  216b2f9e
- Remove pos arg from Perceiver's Pre/Postprocessors (#18602) · 408b5e30
  Ahmad Elawady authored Sep 26, 2022
```
* Remove pos arg from Perceiver's Pre/Postprocessors

* Revert the removed pos args in public methods
```
  408b5e30
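
The "Updated hf_argparser.py" entry above mentions a JSON parser function and an allow_extra_keys argument. As a sketch of what such a flag typically does, the stdlib-only example below loads JSON into a dataclass and rejects unknown keys unless the flag is set. `ExampleArgs`, `parse_json_text`, and the error message are hypothetical names for illustration and do not reproduce the library's implementation.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ExampleArgs:
    learning_rate: float = 5e-5
    num_epochs: int = 3


def parse_json_text(text, dataclass_type, allow_extra_keys=False):
    """Build a dataclass instance from JSON text, optionally ignoring unknown keys."""
    data = json.loads(text)
    known = {f.name for f in fields(dataclass_type)}
    extra = sorted(set(data) - known)
    if extra and not allow_extra_keys:
        raise ValueError(f"Some keys are not used by the parser: {extra}")
    # Keep only the keys the dataclass declares.
    return dataclass_type(**{k: v for k, v in data.items() if k in known})


args = parse_json_text(
    '{"learning_rate": 0.001, "unused": 1}', ExampleArgs, allow_extra_keys=True
)
print(args)
```

With `allow_extra_keys=False` (the default in this sketch), the same input raises a `ValueError` naming the unused key.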
23 Sep, 2022 7 commits

Fixed type hint for pipelines/check_task (#19150) · 6395d122
Fei Wang authored Sep 24, 2022

6395d122

Fix incorrect comments about atten mask for pytorch backend (#18728) · ece76244

Tianqi Zhang (张天启) authored Sep 24, 2022



* fix incorrect comments about atten mask

* typo

* Update for CodeGen
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

ece76244

Add doctests to Perceiver examples (#19129) · 49bf5698

Steven Anton authored Sep 23, 2022



* Fix bug in example and add to tests

* Fix failing tests

* Check the size of logits

* Code style

* Try again...

* Add expected loss for PerceiverForMaskedLM doctest
Co-authored-by: Steven Anton <antonstv@amazon.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

49bf5698

Detr preprocessor fix (#19007) · fe01ec34
Alara Dirik authored Sep 23, 2022
```
* fix in-place preprocessing of inputs
```
fe01ec34

Add semantic segmentation post-processing method to MobileViT (#19105) · 7e84723f

Alara Dirik authored Sep 23, 2022



* add post-processing method for semantic segmentation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7e84723f

[WIP] Trainer supporting evaluation on multiple datasets (#19158) · 905635f5

Tim Baumgärtner authored Sep 23, 2022

* support for multiple eval datasets

* support multiple datasets in seq2seq trainer

* add documentation

* update documentation

* make fixup

* revert option for multiple compute_metrics

* revert option for multiple compute_metrics

* revert added empty line

905635f5
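
The commit above adds support for evaluating on multiple datasets. One way to picture this is a loop over a dictionary of named datasets whose metrics are merged under per-dataset prefixes. The sketch below assumes an `eval_<name>_<metric>` key scheme; `evaluate_all`, `mean_metric`, and the sample data are hypothetical and stand in for the Trainer's own evaluation logic.

```python
def evaluate_all(eval_datasets, evaluate_fn):
    """Run evaluate_fn on each named dataset and merge the prefixed metrics."""
    metrics = {}
    for name, dataset in eval_datasets.items():
        result = evaluate_fn(dataset)
        for key, value in result.items():
            # Prefix every metric so results from different datasets do not collide.
            metrics[f"eval_{name}_{key}"] = value
    return metrics


def mean_metric(dataset):
    """Stand-in metric: the mean of a list of numbers."""
    return {"mean": sum(dataset) / len(dataset)}


metrics = evaluate_all({"dev": [1, 2, 3], "test": [10, 20]}, mean_metric)
print(metrics)
```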

fix HPO DDP GPU problem (#19168) · 49629e7b

Wang, Yi authored Sep 23, 2022


Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

49629e7b

22 Sep, 2022 5 commits

Fix TrainingArguments documentation (#19162) · 8d59385f
Sylvain Gugger authored Sep 22, 2022
```
* Fix TrainingArguments documentation

* Fix TFTrainingArguments documentation
```
8d59385f
fix: ckpt paths. (#19159) · 3a396c59
Sayak Paul authored Sep 22, 2022

3a396c59
TF: check embeddings range (#19102) · 1b5ab39c
Joao Gante authored Sep 22, 2022

1b5ab39c
Improve conditional detr docs (#19154) · cf6308ef
NielsRogge authored Sep 22, 2022
```
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
```
cf6308ef

MSN (Masked Siamese Networks) for ViT (#18815) · 2d9853b2

Sayak Paul authored Sep 22, 2022



* feat: modeling and conversion scripts for msn.

* chore: change license year.

* chore: remove unneeded modules.

* feat: direct loading of state_dict from remote url.

* fix: import paths.

* add: rest of the files.

* add and fix rest of the files.
Co-authored-by: Niels <niels.rogge1@gmail.com>

* chore: formatting.

* code quality fix.

* chore: remove pooler.

* feat: add classification top.

* fix: configuration object.

* add: initial test cases (one failing).

* fix: basemodeloutput.

* add: caution on using the classification head.

* add: rest of the model related files.

* add: vit msn readme.

* fix: copied from statement.

* fix: dummy objects.

* add: ViTMSNPreTrainedModel to inits.

* fix: repo consistency.

* minor change in the model doc.

* fix: tests.

* Empty-Commit

* Update src/transformers/models/vit_msn/configuration_vit_msn.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address PR comments.

* Update src/transformers/models/vit_msn/modeling_vit_msn.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* chore: put model in no_grad() and formatting.
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

2d9853b2