Commits · 7e36deec7a406733f14aa567a624541aaee6bd40 · chenpangpang / transformers


18 Oct, 2020 1 commit

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a

Thomas Wolf authored Oct 18, 2020

* splitting fast and slow tokenizers [WIP]

* [WIP] splitting sentencepiece and tokenizers dependencies

* update dummy objects

* add name_or_path to models and tokenizers

* prefix added to file names

* prefix

* styling + quality

* splitting all the tokenizer files - sorting sentencepiece based ones

* update tokenizer version up to 0.9.0

* remove hard dependency on sentencepiece 🎉

* and removed hard dependency on tokenizers 🎉



* update conversion script

* update missing models

* fixing tests

* move test_tokenization_fast to main tokenization tests - fix bugs

* bump up tokenizers

* fix bert_generation

* update and fix several tokenizers

* keep sentencepiece in deps for now

* fix funnel and deberta tests

* fix fsmt

* fix marian tests

* fix layoutlm

* fix squeezebert and gpt2

* fix T5 tokenization

* fix xlnet tests

* style

* fix mbart

* bump up tokenizers to 0.9.2

* fix model tests

* fix tf models

* fix seq2seq examples

* fix tests without sentencepiece

* fix slow => fast conversion without sentencepiece

* update auto and bert generation tests

* fix mbart tests

* fix auto and common test without tokenizers

* fix tests without tokenizers

* clean up tests lighten up when tokenizers + sentencepiece are both off

* style quality and tests fixing

* add sentencepiece to doc/examples reqs

* leave sentencepiece on for now

* style quality split herbert and fix pegasus

* WIP Herbert fast

* add sample_text_no_unicode and fix herbert tokenization

* skip FSMT example test for now

* fix style

* fix fsmt in example tests

* update following Lysandre and Sylvain's comments

* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

ba8c4d0a
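
A minimal sketch of how an optional dependency can be handled, assuming the library only needs to know whether a backend is importable and wants to fail with a clear message when a missing backend is actually used. The names `is_available`, `requires_backend`, and `DummyFastTokenizer` are hypothetical and are not taken from the commit itself.

```python
import importlib.util


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def requires_backend(obj_name: str, module_name: str) -> None:
    """Raise a readable ImportError when an optional backend is missing."""
    if not is_available(module_name):
        raise ImportError(
            f"{obj_name} requires the `{module_name}` library, which is not installed."
        )


class DummyFastTokenizer:
    """Hypothetical placeholder exported when the optional backend is absent.

    Importing the package still works; the error only appears on use.
    """

    backend = "tokenizers"  # assumed name of the optional package

    def __init__(self, *args, **kwargs):
        requires_backend(type(self).__name__, self.backend)
```

With this shape, `import` of the package never fails because of a missing optional backend; only instantiating a class that needs it does.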

26 Aug, 2020 1 commit
- Black 20 release · a75c64d8
  Lysandre authored Aug 26, 2020
  
  a75c64d8
24 Aug, 2020 1 commit
- Update repo to isort v5 (#6686) · a5737779
  Sylvain Gugger authored Aug 24, 2020
```
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
```
  a5737779
20 Aug, 2020 1 commit
- [Tests] fix attention masks in Tests (#6621) · 505f2d74
  Patrick von Platen authored Aug 20, 2020
```
* fix distilbert

* fix typo
```
  505f2d74
12 Aug, 2020 1 commit
- [EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411) · 0735def8
  Patrick von Platen authored Aug 12, 2020
```
* add encoder-decoder for roberta

* fix headmask

* apply Sylvain's suggestions

* fix typo

* Apply suggestions from code review
```
  0735def8
04 Aug, 2020 1 commit

cleanup torch unittests (#6196) · 5deed37f

Stas Bekman authored Aug 03, 2020

* improve unit tests

this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973
before I apply it to the rest

* batch 1

* batch 2

* batch 3

* batch 4

* batch 5

* style

* non-tf template

* last deletion of check_loss_output

5deed37f

31 Jul, 2020 1 commit
- Model output test (#6155) · d951c14a
  Sylvain Gugger authored Jul 31, 2020
```
* Use return_dict=True in all tests

* Formatting
```
  d951c14a
01 Jul, 2020 1 commit
- Move tests/utils.py -> transformers/testing_utils.py (#5350) · 13deb95a
  Sam Shleifer authored Jul 01, 2020
  
  13deb95a
23 Jun, 2020 1 commit
- [bart] add config.extra_pos_embeddings to facilitate reuse (#5190) · 58918c76
  Sam Shleifer authored Jun 23, 2020
  
  58918c76
16 Jun, 2020 1 commit
- [cleanup] Hoist ModelTester objects to top level (#4939) · c852036b
  Amil Khare authored Jun 16, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
  c852036b
10 Jun, 2020 1 commit
- Add more models to common tests (#4910) · 4e10acb3
  Sylvain Gugger authored Jun 10, 2020
  
  4e10acb3
05 Jun, 2020 1 commit
- Use labels to remove deprecation warnings (#4807) · f1fe1846
  Sylvain Gugger authored Jun 05, 2020
  
  f1fe1846
02 Jun, 2020 1 commit

Kill model archive maps (#4636) · d4c2cb40

Julien Chaumond authored Jun 02, 2020

* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI

d4c2cb40

01 May, 2020 1 commit

[ci] Load pretrained models into the default (long-lived) cache · f54dc3f4

Julien Chaumond authored Apr 23, 2020

There's an inconsistency right now where:
- we load some models into CACHE_DIR
- and some models in the default cache
- and often, in both for the same models

When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth.

I'd rather always use the default cache
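
The cost described above can be illustrated with a toy file cache. This is only a sketch under stated assumptions: `fetch_model` and `count_downloads` are hypothetical helpers, and the "download" is a local file write standing in for a real network transfer.

```python
import os
import tempfile


def fetch_model(model_id, cache_dir, download_log):
    """Return the cached path for `model_id`, 'downloading' it on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, model_id.replace("/", "--"))
    if not os.path.exists(path):
        download_log.append(model_id)  # stands in for a real network download
        with open(path, "w") as handle:
            handle.write("weights")
    return path


def count_downloads(cache_dirs, model_id="some-org/some-model"):
    """Load the same model once per entry in `cache_dirs` and count downloads."""
    log = []
    for cache_dir in cache_dirs:
        fetch_model(model_id, cache_dir, log)
    return len(log)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        print("two caches:", count_downloads([a, b]))
    with tempfile.TemporaryDirectory() as c:
        print("one cache:", count_downloads([c, c]))
```

Loading the same model through two cache directories triggers two downloads and stores two copies, while a single shared cache downloads once.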

f54dc3f4

03 Mar, 2020 1 commit

[ci] Re-run integration ground truth from fairseq · f631e01d

Julien Chaumond authored Mar 03, 2020

Adopted best practice set by @patrickvonplaten of commenting lines run on fairseq, for easy comparison

also see #3020

f631e01d

20 Feb, 2020 1 commit

New BartModel (#2745) · 53ce3854

Sam Shleifer authored Feb 20, 2020

* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs

53ce3854

04 Feb, 2020 2 commits
- Style · 5f96ebc0
  Lysandre authored Feb 03, 2020
  
  5f96ebc0
- RoBERTa Pytorch tests · d28b81dc
  Lysandre authored Feb 03, 2020
  
  d28b81dc
06 Jan, 2020 2 commits
- GPU text generation: Moved the encoded_prompt to correct device · 81d6841b
  alberduris authored Dec 31, 2019
  
  81d6841b
- Moved the encoded_prompts to correct device · dd4df80f
  alberduris authored Dec 31, 2019
  
  dd4df80f
22 Dec, 2019 6 commits

Remove __future__ imports. · c824d15a

Aymeric Augustin authored Dec 22, 2019

c824d15a

Replace (TF)CommonTestCases for modeling with a mixin. · 345c23a6

Aymeric Augustin authored Dec 22, 2019

I suspect the wrapper classes were created in order to prevent the
abstract base class (TF)CommonModelTester from being included in test
discovery and running, because that would fail.

I solved this by replacing the abstract base class with a mixin.

Code changes are just de-indenting and automatic reformattings
performed by black to use the extra line space.
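
A minimal sketch of the mixin approach, assuming shared tests that rely on attributes supplied by each concrete test class. The class names `CommonModelTestMixin` and `RobertaModelTest` and the `model_name` attribute are illustrative, not the project's actual code.

```python
import unittest


class CommonModelTestMixin:
    """Shared tests. This is not a TestCase, so test discovery skips it
    when it stands alone and it never runs without a concrete model."""

    model_name = None  # concrete test classes are expected to override this

    def test_has_model_name(self):
        self.assertIsNotNone(self.model_name)


class RobertaModelTest(CommonModelTestMixin, unittest.TestCase):
    """Concrete test class: inherits the shared tests through the mixin."""

    model_name = "roberta"
```

Because only classes deriving from `unittest.TestCase` are collected, no wrapper class is needed to hide the shared tests from discovery.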

345c23a6

Remove unittest.main() in test modules. · 7e98e211

Aymeric Augustin authored Dec 22, 2019

This construct isn't used anymore these days.

Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.

Use python -m unittest tests/test_foo.py instead.

7e98e211

Switch test files to the standard test_*.py scheme. · ced0a942

Aymeric Augustin authored Dec 22, 2019

ced0a942

Move tests outside of library. · 067395d5

Aymeric Augustin authored Dec 22, 2019

067395d5

Sort imports with isort. · 158e82e0

Aymeric Augustin authored Dec 21, 2019

This is the result of:

    $ isort --recursive examples templates transformers utils hubconf.py setup.py

158e82e0

21 Dec, 2019 3 commits

Reformat source code with black. · fa84ae26

Aymeric Augustin authored Dec 21, 2019

This is the result of:

    $ black --line-length 119 examples templates transformers utils hubconf.py setup.py

There are a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.

This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.

fa84ae26

Take advantage of the cache when running tests. · b670c266

Aymeric Augustin authored Dec 20, 2019

Caching models across test cases and across runs of the test suite makes
slow tests somewhat more bearable.

Use gettempdir() instead of /tmp in tests. This makes it easier to
change the location of the cache with semi-standard TMPDIR/TEMP/TMP
environment variables.

Fix #2222.
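
A short sketch of the `gettempdir()` change, assuming a test helper that needs a cache location. The helper name `get_test_cache_dir` and the subdirectory name are hypothetical.

```python
import os
import tempfile


def get_test_cache_dir(name="transformers_test"):
    """Build a cache path under the platform temp directory.

    tempfile.gettempdir() consults TMPDIR, TEMP and TMP before falling
    back to platform defaults, so the location is not hard-coded to /tmp.
    """
    return os.path.join(tempfile.gettempdir(), name)


if __name__ == "__main__":
    print(get_test_cache_dir())
```

Note that `tempfile.gettempdir()` caches its result after the first call, so the environment variables need to be set before the process first asks for the temp directory.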

b670c266

[RoBERTa] Embeddings: fix dimensionality bug · 3e52915f

Aymeric Augustin authored Dec 20, 2019

3e52915f

20 Dec, 2019 1 commit
- Bug fix: 1764 · 228f5286
  Dom Hudson authored Nov 07, 2019
  
  228f5286
13 Dec, 2019 1 commit
- cleaning up configuration classes · 47f0e3cf
  thomwolf authored Dec 13, 2019
  
  47f0e3cf
06 Dec, 2019 1 commit

Remove dependency on pytest for running tests (#2055) · 35401fe5

Aymeric Augustin authored Dec 06, 2019

* Switch to plain unittest for skipping slow tests.

Add a RUN_SLOW environment variable for running them.

* Switch to plain unittest for PyTorch dependency.

* Switch to plain unittest for TensorFlow dependency.

* Avoid leaking open files in the test suite.

This prevents spurious warnings when running tests.

* Fix unicode warning on Python 2 when running tests.

The warning was:

    UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

* Support running PyTorch tests on a GPU.

Reverts 27e015bd.

* Tests no longer require pytest.

* Make tests pass on cuda
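
Skipping slow tests with plain `unittest` and a `RUN_SLOW` environment variable could look roughly like the sketch below. The helper names `parse_flag_from_env` and `slow`, and the set of accepted truthy values, are assumptions made for illustration rather than the code from this commit.

```python
import os
import unittest


def parse_flag_from_env(key, default=False):
    """Interpret an environment variable as a boolean flag."""
    value = os.environ.get(key)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "y", "on"}


def slow(test_case):
    """Decorator: skip the test unless RUN_SLOW is set to a truthy value.

    The flag is read when the decorator is applied, i.e. at import time
    of the test module, so it must be set before the tests are loaded.
    """
    return unittest.skipUnless(parse_flag_from_env("RUN_SLOW"), "test is slow")(test_case)


class ExampleTest(unittest.TestCase):
    @slow
    def test_expensive_path(self):
        self.assertEqual(sum(range(10)), 45)
```

A run such as `RUN_SLOW=1 python -m unittest tests/test_foo.py` would then include the decorated tests, with no dependency on pytest markers.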

35401fe5

24 Oct, 2019 1 commit
- RoBERTa token classification · 66085a13
  Matt Maybeno authored Oct 23, 2019
```
[WIP] copy paste bert token classification for roberta
```
  66085a13
26 Sep, 2019 1 commit
- [BIG] pytorch-transformers => transformers · 31c23bd5
  thomwolf authored Sep 26, 2019
  
  31c23bd5
09 Sep, 2019 1 commit
- fixed imports in tests and gpt2 config test · b7175a27
  thomwolf authored Sep 09, 2019
  
  b7175a27
08 Sep, 2019 2 commits
- test suite independent of framework · 518307df
  thomwolf authored Sep 05, 2019
  
  518307df
- split configuration and modeling files · 1efb1f16
  thomwolf authored Sep 05, 2019
  
  1efb1f16
05 Sep, 2019 1 commit
- test suite independent of framework · 7c0baf95
  thomwolf authored Sep 05, 2019
  
  7c0baf95
04 Sep, 2019 2 commits
- split configuration and modeling files · 2a667b1e
  thomwolf authored Sep 05, 2019
  
  2a667b1e
- WIP reordering · 7fba47b7
  thomwolf authored Sep 04, 2019
  
  7fba47b7