Commits · 9853c5dd587b9fb5eb3e39f9282c3c9e56188eb0 · chenpangpang / transformers

06 Apr, 2021 3 commits
- Development on v4.6.0dev0 · 9853c5dd
  Lysandre authored Apr 06, 2021
- Release v4.5.0 · 4906a29f
  Lysandre authored Apr 06, 2021
- added social thumbnail for docs (#11083) · b219d6b5
  Philipp Schmid authored Apr 06, 2021
05 Apr, 2021 1 commit
- Pin docutils (#11062) · ef62f038
  Lysandre Debut authored Apr 05, 2021
```
* Pin docutils

* Versions table
```
01 Apr, 2021 1 commit

Add Vision Transformer and ViTFeatureExtractor (#10950) · 30677dc7

NielsRogge authored Apr 01, 2021



* Squash all commits into one

* Update ViTFeatureExtractor to use image_utils instead of torchvision

* Remove torchvision and add Pillow

* Small docs improvement

* Address most comments by @sgugger

* Fix tests

* Clean up conversion script

* Pooler first draft

* Fix quality

* Improve conversion script

* Make style and quality

* Make fix-copies

* Minor docs improvements

* Should use fix-copies instead of manual handling

* Revert "Should use fix-copies instead of manual handling"

This reverts commit fd4e591bce4496d41406425c82606a8fdaf8a50b.

* Place ViT in alphabetical order
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

30 Mar, 2021 1 commit

Sagemaker test (#10925) · 604c0850

Philipp Schmid authored Mar 30, 2021

* init

* first working test

* added todo for setup.py

* working test for single node multi node ddp and smd

* added tensorflow single node test

* added directory for pytorch and tensorflow due to different requirements.txt

* added directory for pytorch and tensorflow

* added comment for run_glue until it is available

* added output_dir to it

* smaller dataset to make test running faster

* adjust HP and script

* adjusted parameter for tensorflow

* refactored test scripts

* adjusted make file

* init

* first working test

* added todo for setup.py

* working test for single node multi node ddp and smd

* added tensorflow single node test

* added directory for pytorch and tensorflow due to different requirements.txt

* added directory for pytorch and tensorflow

* added comment for run_glue until it is available

* added output_dir to it

* smaller dataset to make test running faster

* adjust HP and script

* adjusted parameter for tensorflow

* refactored test scripts

* adjusted make file

* updated dlc container

* commented in all tests

* added both ecr images

* added new master branches

* debug

* added new datasets version

* init

* strange rebase bug

* removed changes

* changed min version for tests to work

* updated DLC

* added model parallel test

* removed test files

* removed test files

* tested with new dlc

* added correct sagemaker sdk version

* adjust DLCs for official one

* reworked tests

* quality

* removed default profile added documentation to it

* added step in release for sagemaker tests

* reverted version for example script removed duplicated script and added install from master to requirements.txt

* removed mistaken .DS_Stores from mac

* fixed tests

* added Sylvains feedback

* make style

* added lysandre's feedback

17 Mar, 2021 1 commit

Check copies blackify (#10775) · 40b049c7

Sylvain Gugger authored Mar 17, 2021

* Apply black before checking copies

* Fix for class methods

* Deal with lonely brackets

* Remove debug and add forward changes

* Separate copies and fix test

* Add black as a test dependency

16 Mar, 2021 4 commits

- Development on v4.5.0dev0 · 1b5ce1e6
  Lysandre authored Mar 16, 2021
- Release v4.4.0 · c988db5a
  Lysandre authored Mar 16, 2021

Release utils (#10735) · 813d730c

Sylvain Gugger authored Mar 16, 2021

* Examples version update

* Refactor a bit

* All version updates

* Fixes

* README cleanup

* Post-release/patch

* Fixes

* More fixes

* Tests

* More fixes

* Moar fixes

* Make commands and update setup

* Replace spaces with weird tabs

* Fix test

* Style

Flax testing should not run the full torch test suite (#10725) · 9f8619c6

Patrick von Platen authored Mar 16, 2021

* make flax tests pytorch independent

* fix typo

* finish

* improve circle ci

* fix return tensors

* correct flax test

* re-add sentencepiece

* last tokenizer fixes

* finish maybe now

15 Mar, 2021 1 commit

Tests run on Docker (#10681) · 58f672e6

Lysandre Debut authored Mar 15, 2021



* Tests run on Docker
Co-authored-by: Morgan <funtowiczmo@gmail.com>

* Comments from code review

* Reply to itself

* Dependencies
Co-authored-by: Morgan <funtowiczmo@gmail.com>

10 Mar, 2021 1 commit

Speech2TextTransformer (#10175) · d26b37e7

Suraj Patil authored Mar 10, 2021



* s2t

* fix config

* conversion script

* fix import

* add tokenizer

* fix tok init

* fix tokenizer

* first version working

* fix embeds

* fix lm head

* remove extra heads

* fix convert script

* handle encoder attn mask

* style

* better enc attn mask

* override _prepare_attention_mask_for_generation

* handle attn_mask in encoder and decoder

* input_ids => input_features

* enable use_cache

* remove old code

* expand embeddings if needed

* remove logits bias

* masked_lm_loss => loss

* hack tokenizer to support feature processing

* fix model_input_names

* style

* fix error message

* doc

* remove inputs_embeds

* remove input_embeds

* remove unnecessary docstring

* quality

* SpeechToText => Speech2Text

* style

* remove shared_embeds

* subsample => conv

* remove Speech2TextTransformerDecoderWrapper

* update output_lengths formula

* fix table

* remove max_position_embeddings

* update conversion scripts

* add possibility to do upper case for now

* add FeatureExtractor and Processor

* add tests for extractor

* require_torch_audio => require_torchaudio

* add processor test

* update import

* remove classification head

* attention mask is now 1D

* update docstrings

* attention mask should be of type long

* handle attention mask from generate

* always return attention_mask

* fix test

* style

* doc

* Speech2TextTransformer => Speech2Text

* Speech2TextTransformerConfig => Speech2TextConfig

* remove dummy_inputs

* nit

* style

* multilingual tok

* fix tokenizer

* add tgt_lang setter

* save lang_codes

* fix tokenizer

* add forced_bos_token_id to tokenizer

* apply review suggestions

* add torchaudio to extra deps

* add speech deps to CI

* fix dep

* add libsndfile to ci

* libsndfile1

* add speech to extras all

* libsndfile1 -> libsndfile1

* libsndfile

* libsndfile1-dev

* apt update

* add sudo to install

* update deps table

* install libsndfile1-dev on CI

* tuple to list

* init conv layer

* add model tests

* quality

* add integration tests

* skip_special_tokens

* add speech_to_text_transformer in toctree

* fix tokenizer

* fix fp16 tests

* add tokenizer tests

* fix copyright

* input_values => input_features

* doc

* add model in readme

* doc

* change checkpoint names

* fix copyright

* fix code example

* add max_model_input_sizes in tokenizer

* fix integration tests

* add do_lower_case to tokenizer

* remove clamp trick

* fix "Add modeling imports here"

* fix copyrights

* fix tests

* SpeechToTextTransformer => SpeechToText

* fix naming

* fix table formatting

* fix typo

* style

* fix typos

* remove speech dep from extras[testing]

* fix copies

* rename doc file,

* put imports under is_torch_available

* run feat extract tests when torch is available

* dummy objects for processor and extractor

* fix imports in tests

* fix import in modeling test

* fix imports

* fix torch import

* fix imports again

* fix positional embeddings

* fix typo in import

* adapt new extractor refactor

* style

* fix torchscript test

* doc

* doc

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix docs, copied from, style

* fix docstring

* handle imports

* remove speech from all extra deps

* remove s2t from seq2seq lm mapping

* better names

* skip training tests

* add install instructions

* List => Tuple

* doc

* fix conversion script

* fix urls

* add instruction for libsndfile

* fix fp16 test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

08 Feb, 2021 1 commit
- Update tokenizers requirement (#10077) · f285e4c3
  Anthony MOI authored Feb 08, 2021
05 Feb, 2021 1 commit
- Bump minimum Jax requirement to 2.8.0 (#10027) · b9720dd6
  Patrick von Platen authored Feb 05, 2021
```
* Bump minimum Jax requirement to 2.8.0

* update table
```
04 Feb, 2021 3 commits

- Bump version · ba607db1
  Sylvain Gugger authored Feb 04, 2021
- Release: 4.3.0.rc1 · 4cd22512
  Sylvain Gugger authored Feb 04, 2021

Authorize last version of tokenizer (#9799) · 21b3922e

Sylvain Gugger authored Feb 04, 2021



* Authorize last version of tokenizer

* Update version table

* Fix conversion of spm tokenizers and fix some hub links

* Bump tokenizers version to 0.10.1rc1

* Add script to check tokenizers conversion with XNLI

* Add some more mask_token lstrip support

* Must modify mask_token in slow tokenizers too

* Keep using the old method for Pegasus

* add missing import
Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>

02 Feb, 2021 2 commits

Wav2Vec2 (#9659) · d6217fb3

Patrick von Platen authored Feb 02, 2021



* add raw scaffold

* implement feat extract layers

* make style

* remove +

* correctly convert weights

* make feat extractor work

* make feature extraction proj work

* run forward pass

* finish forward pass

* Successful decoding example

* remove unused files

* more changes

* add wav2vec tokenizer

* add new structure

* fix run forward

* add other layer norm architecture

* finish 2nd structure

* add model tests

* finish tests for tok and model

* clean-up

* make style

* finish docstring for model and config

* make style

* correct docstring

* correct tests

* change checkpoints to fairseq

* fix examples

* finish wav2vec2

* make style

* apply sylvains suggestions

* apply lysandres suggestions

* change print to log.info

* re-add assert statement

* add input_values as required input name

* finish wav2vec2 tokenizer

* Update tests/test_tokenization_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* apply sylvains suggestions
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

- Bump numpy (#9934) · 62024453
  Sylvain Gugger authored Feb 02, 2021

27 Jan, 2021 1 commit
- [Setup.py] update jaxlib (#9831) · d5b40d66
  Patrick von Platen authored Jan 27, 2021
```
* update jaxlib

* Update setup.py

* update table
```
18 Jan, 2021 1 commit
- Remove duplicated extra["retrieval"] (#9621) · 72fc9abf
  Anthony MOI authored Jan 18, 2021
14 Jan, 2021 1 commit

[setup.py] note on how to get to transformers exact dependencies from shell (#9553) · c99751dd

Stas Bekman authored Jan 14, 2021



* note on how to get to deps from shell

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix text
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
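
The note this commit adds is about reading pinned dependency specifiers from a shell. A minimal sketch of that idea, assuming a hypothetical `deps` mapping from package name to requirement string; the entries and the `requirements_for` helper are illustrative only and are not the repository's actual table or code:

```python
import sys

# Hypothetical pin table for illustration only; these package names and
# specifiers are made up and do not come from the project.
deps = {
    "examplepkg": "examplepkg>=1.0",
    "otherpkg": "otherpkg==2.3.4",
}


def requirements_for(names, table=deps):
    """Return the requirement strings for the given package names, space-joined."""
    missing = [name for name in names if name not in table]
    if missing:
        raise KeyError(f"no pinned requirement for: {', '.join(missing)}")
    return " ".join(table[name] for name in names)


if __name__ == "__main__":
    # Package names are taken from the command line, so the script can be
    # called from a shell and its output passed on to an installer.
    print(requirements_for(sys.argv[1:]))
```

Keeping the lookup in a function makes the same table usable both from a shell one-liner and from other Python code.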

13 Jan, 2021 3 commits
- v4.3.0.dev0 · e63cad79
  Lysandre authored Jan 13, 2021
- Release: v4.2.0 · 7d9a9d0c
  Lysandre authored Jan 13, 2021
- use correct deps for torchhub (#9552) · b2dfcc56
  Stas Bekman authored Jan 13, 2021
12 Jan, 2021 2 commits
- Revert, it was not the issue. · e6ecef71
  Sylvain Gugger authored Jan 12, 2021
- Fix tokenizers install for now · 250f27f2
  Sylvain Gugger authored Jan 12, 2021
06 Jan, 2021 1 commit

Fast transformers import part 1 (#9441) · 0c96262f

Sylvain Gugger authored Jan 06, 2021

* Don't import libs to check they are available

* Don't import integrations at init

* Add importlib_metadata to deps

* Remove old vars references

* Avoid syntax error

* Adapt testing utils

* Try to appease torchhub

* Add dependency

* Remove more private variables

* Fix typo

* Another typo

* Refine the tf availability test
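
The first bullets of this commit describe checking whether optional libraries are installed without importing them. One way that idea can be sketched with the standard library is shown below. The helper names `is_package_available` and `installed_version` are hypothetical, and this is not the repository's actual code:

```python
import importlib.metadata  # Python 3.8+; older versions need the importlib_metadata backport
import importlib.util
from typing import Optional


def is_package_available(name: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    # find_spec only locates the module; it does not execute the module's code,
    # so heavy libraries add nothing to start-up time.
    return importlib.util.find_spec(name) is not None


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return importlib.metadata.version(distribution)
    except importlib.metadata.PackageNotFoundError:
        return None
```

Note that `find_spec` takes a module name while `importlib.metadata.version` takes a distribution name; the two can differ for the same library, so the sketch keeps them as separate helpers.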

21 Dec, 2020 1 commit

Improve BERT-like models performance with better self attention (#9124) · 5a8a4eb1

Julien Plu authored Dec 21, 2020

* Improve BERT-like models attention layers

* Apply style

* Put back error raising instead of assert

* Update template

* Fix copies

* Apply raising valueerror in MPNet

* Restore the copy check for the Intermediate layer in Longformer

* Update longformer

18 Dec, 2020 1 commit

[setup] correct transformers version format (#9176) · 84d5879e

Stas Bekman authored Dec 18, 2020

setuptools has a pretty fixed expectation of version numbers.

This PR fixes the dev version number and adds a comment with the correct formats for future editors.

This fix removes this warning on `make fixup|style|etc` or any other time `setup.py` is being run.
```
setuptools/dist.py:452: UserWarning: Normalizing '4.2.0dev0' to '4.2.0.dev0'
  warnings.warn(tmpl.format(**locals()))
```
and the alternative:
```
/setuptools/dist.py:452: UserWarning: Normalizing '4.0.0-rc-1' to '4.0.0rc1'
```

Fixes: #8749

@LysandreJik, @sgugger
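
The two warnings quoted in this commit message show version strings being rewritten into a canonical form. A simplified sketch of that kind of normalization follows. It is an assumption-laden illustration that handles only a numeric release followed by optional `rc` and `dev` segments; it is not the setuptools implementation, and `normalize_version` is a hypothetical name:

```python
import re

# Simplified pattern: numeric release, then optional rc and dev segments,
# each allowed to be preceded or split by "-", "_" or ".".
_VERSION_RE = re.compile(
    r"""
    ^(?P<release>\d+(?:\.\d+)*)         # e.g. 4.2.0
    (?:[-_.]?rc[-_.]?(?P<rc>\d+))?      # optional release candidate, e.g. -rc-1
    (?:[-_.]?dev[-_.]?(?P<dev>\d+))?    # optional development segment, e.g. dev0
    $
    """,
    re.VERBOSE,
)


def normalize_version(raw: str) -> str:
    """Rewrite a version string into the canonical 'X.Y.ZrcN.devM' layout."""
    match = _VERSION_RE.match(raw.strip().lower())
    if match is None:
        raise ValueError(f"unsupported version string: {raw!r}")
    normalized = match["release"]
    if match["rc"] is not None:
        normalized += f"rc{int(match['rc'])}"      # no separator before rc
    if match["dev"] is not None:
        normalized += f".dev{int(match['dev'])}"   # dot before dev
    return normalized


print(normalize_version("4.2.0dev0"))   # 4.2.0.dev0
print(normalize_version("4.0.0-rc-1"))  # 4.0.0rc1
```

A string that is already canonical, such as `4.2.0.dev0`, passes through unchanged, which is why writing the version in that form avoids the warning.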

17 Dec, 2020 3 commits
- setup.py development version · bf713cde
  Lysandre authored Dec 17, 2020
- Release: v4.1.1 · bfa4ccf7
  Lysandre authored Dec 17, 2020
- Release: v4.1.0 · f5438ab8
  Lysandre authored Dec 17, 2020
16 Dec, 2020 1 commit

[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) · 640e6fe1

Patrick von Platen authored Dec 16, 2020



* save intermediate

* save intermediate

* save intermediate

* correct flax bert model file

* new module / model naming

* make style

* almost finish BERT

* finish roberta

* make fix-copies

* delete keys file

* last refactor

* fixes in run_mlm_flax.py

* remove pooled from run_mlm_flax.py

* fix gelu | gelu_new

* remove Module from inits

* splits

* dirty print

* preventing warmup_steps == 0

* smaller splits

* make fix-copies

* dirty print

* dirty print

* initial_evaluation argument

* declaration order fix

* proper model initialization/loading

* proper initialization

* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug

* removed tokenizers warning hack, fixed model re-initialization

* reverted training_args.py changes

* fix flax from pretrained

* improve test in flax

* apply sylvains tips

* update init

* make 0.3.0 compatible

* revert tevens changes

* revert tevens changes 2

* finalize revert

* fix bug

* add docs

* add pretrained to init

* Update src/transformers/modeling_flax_utils.py

* fix copies

* final improvements
Co-authored-by: TevenLeScao <teven.lescao@gmail.com>

15 Dec, 2020 1 commit

Fix tf2.4 (#9120) · ef2d4cd4

Julien Plu authored Dec 15, 2020



* Fix tests for TF 2.4

* Remove <2.4 limitation

* Add version condition

* Update tests/test_optimization_tf.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_optimization_tf.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_optimization_tf.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

14 Dec, 2020 2 commits
- Also pin TF CPU · 251eb70c
  Sylvain Gugger authored Dec 14, 2020
- Pin TF to < 2.4 · e4ef57a9
  Sylvain Gugger authored Dec 14, 2020
07 Dec, 2020 1 commit
- Copyright (#8970) · 00aa9dbc
  Sylvain Gugger authored Dec 07, 2020
```
* Add copyright everywhere missing

* Style
```
30 Nov, 2020 1 commit
- fix pypi complaint on version naming · 5fd3d81e
  LysandreJik authored Nov 30, 2020