Commits · b9a768b3ffa80c4c19d024f9f42d5917e7d8109e · chenpangpang / transformers

01 Apr, 2022 1 commit
- Fixed a typo in legacy seq2seq_trainer.py (#16531) · bfeff6cc
  Cathy authored Apr 01, 2022
  
  bfeff6cc
31 Mar, 2022 2 commits

[research] link to the XTREME-S paper (#16519) · 5807054b

Anton Lozhkov authored Mar 31, 2022



* [research] link to the XTREME-S paper

* Update examples/research_projects/xtreme-s/README.md
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

5807054b

fixed a typo (#16508) · 05b4c329
Bhadresh Savani authored Mar 31, 2022

05b4c329

30 Mar, 2022 1 commit
- [examples] max samples can't be bigger than the len of dataset (#16501) · a73281e3
  Stas Bekman authored Mar 30, 2022
```
* [examples] max samples can't be bigger than then len of dataset

* do tf and flax
```
  a73281e3
29 Mar, 2022 2 commits
- Fix example test and test_fetcher for examples (#16478) · b62ac4d2
  Sylvain Gugger authored Mar 29, 2022
  
  b62ac4d2
- [MNLI example] Prevent overwriting matched with mismatched metrics (#16475) · 5216607f
  Eldar Kurtic authored Mar 29, 2022
```
* Prevent overwriting matched with mismatched metrics

* Fix style
```
  5216607f
28 Mar, 2022 2 commits
- Update run_t5_mlm_flax.py (#16421) · 8049dfa4
  Yongrae Jo authored Mar 28, 2022
```
Fix typo in comment: proprocessed -> preprocessed
```
  8049dfa4
- QDQBert example update (#16395) · 7ecbb9c5
  Shang Zhang authored Mar 28, 2022
```
* update Dockerfile and utils_qa

* Update README.md
```
  7ecbb9c5
25 Mar, 2022 2 commits
- Rename master to main for notebooks links and leftovers (#16397) · 867f3950
  Sylvain Gugger authored Mar 25, 2022
  
  867f3950
- Big file_utils cleanup (#16396) · 088c1880
  Sylvain Gugger authored Mar 25, 2022
```
* Big file_utils cleanup

* This one still needs to be treated separately
```
  088c1880
24 Mar, 2022 1 commit

Update readme with how to train offline and fix BPE command (#15897) · f5e8c9bd

Nathan Cooper authored Mar 24, 2022



* Update readme with how to train offline and fix BPE command

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

f5e8c9bd

23 Mar, 2022 3 commits

Decision transformer gym (#15845) · aff9bc40

Edward Beeching authored Mar 23, 2022



* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Add type hints for Pegasus (#16324)

* Funnel type hints (#16323)

* add pt funnel type hints

* add tf funnel type hints

* Add type hints for ProphetNet PyTorch (#16272)

* [GLPN] Improve docs (#16331)

* Add link to notebook

* Add link

* Fix bug
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

* Added type hints for Pytorch Marian calls (#16200)

* Added type hinting for forward functions in pytorch marian

* typo correction

* Removed type hints on functions from BART per Suraj Patil request

* fix import pb

* fix typo

* corrected tuple call

* ran black

* after fix-copies
Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List

* Fixing copies to roformer and pegasus
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* done (#16340)

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add type annotations for Rembert/Splinter and copies (#16338)

* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`
Co-authored-by: matt <rocketknight1@gmail.com>

* [Bug template] Shift responsibilities for long-range (#16344)

* Fix code repetition in serialization guide (#16346)

* Adopt framework-specific blocks for content (#16342)

* ✨ refactor code samples with framework-specific blocks

* ✨ update training.mdx

* 🖍

 apply feedback

* Updates the default branch from master to main (#16326)

* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Created the Decision Transformer Modle

* updating tests, copy to other machine

* Added last hidden size to Decision Transformer modelling outputs

* Removed copy of original DT file

* made a temporary change to gpt2 to have it conform with the Decision Transformer version

* Updated tests

* Ignoring a file used to test the DT model

* added comments to config file

* added comments and argument descriptions to decision transformer file

* Updated doc

* Ran "make style"

* Remove old model imports

* Removed unused imports, cleaned up init file

* Update docs/source/model_doc/decision_transformer.mdx

added my username
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Reverted changes made to gpt2

* Removed datasets submodule

* Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states

* Added support for return of hidden states, attentions and return dict of gpt2 model.

* Updated tests to include many of the ModelTesterMixin tests. 

The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes

* Added missing line to the end of gpt2 file

* Added an integration test for the Decision Transformer

Test performs and autoregressive evaluation for two time steps

* Set done and info to _ to fix failing test

* Updated integration test to be deterministic and check expected outputs

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unnecessary config options

* Cleaned up commented code and old comments.

* Cleaned up commented code.

* Changed DecisionTransformer to Decision Transformer

* Added Decision Transformer to the main README file

* Added copy of GTP2 called DecisionTranformerGPT2Model

* isorted imports

* isorted imports

* Added model to non-English README files

* Ran make fix-copies and corrected some cases.

* Updated index file to include Decision Transformer

* Added gpt2 model as copy inside the Decision Transformer model file

* Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS

* Deleted redundant checkpoint files (I don't know how these got committed)

* Removed testing files. (These should have never been committed)

* Removed accidentally committed files

* Moved the Decision Transformer test to its own directory

* Moved DecisionTransformOutput to modeling_decision_transformer

* Moved the example usage to research project and cleaned comments

* Made tests ignore the copy of gpt2 in Decision Transformer

* Added module output to modelling decision transformer

* removed copied gpt2 model from list of transformers models

* Updated tests and created __init__ file for new test location

* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Removed unneeded summary type from config file

* Fixed copies

* Updated pretrained config map to refer to hopper-medium checkpoint

* Added Decision transformer to model docs

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updated model with custom docstring example

* Updated copies, config auto, and readme files.
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
Co-authored-by: Adam Montgomerie <adam@avanssion.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

aff9bc40

Reorganize file utils (#16264) · 4975002d

Sylvain Gugger authored Mar 23, 2022

* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wront move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit

4975002d

Updates the default branch from master to main (#16326) · eca77f47

Lysandre Debut authored Mar 23, 2022



* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

eca77f47

21 Mar, 2022 1 commit
- [xtreme-s] Update Minds14 results (#16241) · e226a24f
  Anton Lozhkov authored Mar 21, 2022
```
* update results

* per-language metrics

* Format the per-language metrics
```
  e226a24f
17 Mar, 2022 1 commit
- remove jax.ops.index (#16220) · 93d3fd86
  Suraj Patil authored Mar 17, 2022
  
  93d3fd86
16 Mar, 2022 3 commits

Minor fixes to XTREME-S (#16193) · d35e0c62

Anton Lozhkov authored Mar 16, 2022



* Minor fixes

* Fix vocab union

* Update examples/research_projects/xtreme-s/README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update README

* unused import
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

d35e0c62

Replace all deprecated `jax.ops` operations with jnp's `at` (#16078) · ee27b3d7
Sanchit Gandhi authored Mar 16, 2022
```
* Replace all deprecated `jax.ops` operations with jnp's `at`

* np to jnp scores

* suggested changes
```
ee27b3d7
[Xtreme-S] fix some namings (#16183) · c2dc89be
Patrick von Platen authored Mar 16, 2022

c2dc89be

15 Mar, 2022 1 commit

Add the XTREME-S fine-tuning example (#15985) · 99fd3eb4

Anton Lozhkov authored Mar 16, 2022

* CTC+classification draft

* CTC+classification draft

* style

* multilingual runs

* Fix race condition during processor.from_reatrained

* Merge covost experiments

* Add README

* Quality

* Switch to .all configs

* Fix typos

99fd3eb4

12 Mar, 2022 1 commit

[Deepspeed] add support for bf16 mode (#14569) · 580dd87c

Stas Bekman authored Mar 11, 2022



* [WIP] add support for bf16 mode

* prep for bf16

* prep for bf16

* fix; zero2/bf16 is ok

* check bf16 is available

* test fixes

* enable zero3_bf16

* config files

* docs

* split stage_dtype; merge back to non-dtype-specific config file

* fix doc

* cleanup

* cleanup

* bfloat16 => bf16 to match the PR changes

* s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/

* test fixes/skipping

* move

* fix

* Update docs/source/main_classes/deepspeed.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* backticks

* cleanup

* cleanup

* cleanup

* new version

* add note about grad accum in bf16
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

580dd87c

10 Mar, 2022 2 commits
- Don't compute metrics in LM examples on TPU (#16029) · 19597998
  Sylvain Gugger authored Mar 10, 2022
  
  19597998
- Update README.md · 6c9010ef
  Sanchit Gandhi authored Mar 10, 2022
  
  6c9010ef
09 Mar, 2022 2 commits
- Fix broken code blocks in README.md (#15967) · 8feede22
  Shotaro Ishihara authored Mar 10, 2022
```
at transformers/examples/pytorch/contrastive-image-text
```
  8feede22
- Swag example: Update doc format (#16014) · e7f34ccd
  Joao Gante authored Mar 09, 2022
  
  e7f34ccd
08 Mar, 2022 2 commits
- Update TF multiple choice example (#15868) · 62d84760
  Joao Gante authored Mar 08, 2022
  
  62d84760
- Speedup training by using numpy instead of jnp for batch shuffling (#15963) · 91fb62d0
  Yeb Havinga authored Mar 08, 2022
```
Speedup training by using numpy instead of jnp for batch shuffling
Co-authored-by: Yeb Havinga <y.t.havinga@mgrid.net>
```
  91fb62d0
04 Mar, 2022 2 commits
- [FlaxT5 Example] fix flax t5 example pretraining (#15835) · 10b76987
  Patrick von Platen authored Mar 04, 2022
  
  10b76987
- Update README.md · b7147489
  Sanchit Gandhi authored Mar 04, 2022
  
  b7147489
03 Mar, 2022 2 commits
- Fix #15898 (#15928) · c0281feb
  davidleonfdez authored Mar 03, 2022
  
  c0281feb
- v4.18.0.dev.0 · 79d28e80
  Sylvain Gugger authored Mar 03, 2022
  
  79d28e80
02 Mar, 2022 2 commits
- Fix tiny typo (#15884) · e535c389
  Ross Johnstone authored Mar 02, 2022
  
  e535c389
- Update TF QA example (#15870) · 05c237ea
  Joao Gante authored Mar 02, 2022
  
  05c237ea
01 Mar, 2022 1 commit
- Update TF LM examples (#15855) · 3f2e6368
  Joao Gante authored Mar 01, 2022
  
  3f2e6368
25 Feb, 2022 1 commit
- [examples/summarization and translation] fix readme (#15833) · bf1fe328
  Suraj Patil authored Feb 25, 2022
  
  bf1fe328
23 Feb, 2022 1 commit

[Test refactor 1/5] Per-folder tests reorganization (#15725) · 29c10a41

Lysandre Debut authored Feb 23, 2022



* Per-folder tests reorganization
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>

29c10a41

22 Feb, 2022 1 commit
- Fix typo on examples/pytorch/question-answering (#15644) · 3db2e8f9
  Yongrae Jo authored Feb 23, 2022
```
cna -> can
```
  3db2e8f9
21 Feb, 2022 3 commits

TF text classification examples (#15704) · 3956b133
Joao Gante authored Feb 21, 2022
```
* Working example with to_tf_dataset

* updated text_classification

* more comments
```
3956b133

add VisionTextDualEncoder and CLIP fine-tuning script (#15701) · 86119c11

Suraj Patil authored Feb 21, 2022



* begin script

* update script

* fix features and data args

* main

* add requirements

* add column name args

* fix captions

* don't jit transforms

* fix caption

* fix labels, handle attention mask

* convert pixel values to numpy

* labels => input_ids

* transform images on the fly

* use AutoModel class, create the hybird model outside of the script

* fix version message

* add readme

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* adderss review comments

* add more comments

* allow freezing vision and text models
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

86119c11

Fix minor comment typos (#15740) · 5444687f
Ivan Agarský authored Feb 21, 2022

5444687f