1. 18 Jul, 2020 1 commit
      Xlnet outputs (#5881) · 13be4872
      Teven authored
      Slightly breaking change: this changes the behavior of `use_cache` in XLNet. If `use_cache` is True and `mem_len` is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time, `use_cache` is overridden and always True.
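      A minimal sketch of the described behavior, using the argument and attribute names from this commit and the model-outputs PR below (`use_cache`, `.logits`, `.mems`); later releases may name these differently:

      ```python
      from transformers import XLNetLMHeadModel, XLNetTokenizer

      tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
      model = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")

      input_ids = tokenizer.encode("The quick brown fox", return_tensors="pt")

      # With use_cache=True and mem_len unset (the base config), the model now
      # returns mems, which play the role of GPT-2's `past` during generation.
      outputs = model(input_ids, use_cache=True)
      logits, mems = outputs.logits, outputs.mems

      # Next step: feed only the newly chosen token and reuse the cached mems.
      next_token = logits[:, -1:, :].argmax(dim=-1)
      outputs = model(next_token, mems=mems, use_cache=True)
      ```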
  2. 17 Jul, 2020 2 commits
      Revert "XLNet `use_cache` refactor (#5770)" (#5854) · 615be03f
      Teven authored
      This reverts commit 0b2da0e5.
      XLNet `use_cache` refactor (#5770) · 0b2da0e5
      Teven authored
      Slightly breaking change: this changes the behavior of `use_cache` in XLNet. If `use_cache` is True and `mem_len` is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time, `use_cache` is overridden and always True.
  3. 10 Jul, 2020 2 commits
      Change model outputs types to self-document outputs (#5438) · edfd82f5
      Sylvain Gugger authored
      * [WIP] Proposal for model outputs
      
      * All Bert models
      
      * Make CI green maybe?
      
      * Fix ONNX test
      
      * Isolate ModelOutput from pt and tf
      
      * Formatting
      
      * Add Electra models
      
      * Auto-generate docstrings from outputs
      
      * Add TF outputs
      
      * Add some BERT models
      
      * Revert TF side
      
      * Remove last traces of TF changes
      
      * Fail with a clear error message
      
      * Add Albert and work through Bart
      
      * Add CTRL and DistilBert
      
      * Formatting
      
      * Progress on Bart
      
      * Renames and finish Bart
      
      * Formatting
      
      * Fix last test
      
      * Add DPR
      
      * Finish Electra and add FlauBERT
      
      * Add GPT2
      
      * Add Longformer
      
      * Add MMBT
      
      * Add MobileBert
      
      * Add GPT
      
      * Formatting
      
      * Add Reformer
      
      * Add Roberta
      
      * Add T5
      
      * Add Transformer XL
      
      * Fix test
      
      * Add XLM + fix XLMForTokenClassification
      
      * Style + XLMRoberta
      
      * Add XLNet
      
      * Formatting
      
      * Add doc of return_tuple arg
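      A minimal sketch of what the self-documenting outputs look like in practice; attribute names follow the output classes added in this PR, and the `return_tuple` argument is the one documented in the last bullet:

      ```python
      import torch
      from transformers import BertForSequenceClassification, BertTokenizer

      tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
      model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

      input_ids = tokenizer.encode("Outputs are now self-documenting.", return_tensors="pt")
      labels = torch.tensor([1])

      # The forward pass now returns a typed output object whose fields can be
      # read by name instead of remembering tuple positions.
      outputs = model(input_ids, labels=labels)
      print(outputs.loss, outputs.logits.shape)

      # Positional access still works for code written against the old tuples;
      # the return_tuple argument documented in this PR gives back a plain tuple.
      loss, logits = outputs[0], outputs[1]
      ```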
      Improvements to PretrainedConfig documentation (#5642) · b2747af5
      Sylvain Gugger authored
      * Update PretrainedConfig doc
      
      * Formatting
      
      * Small fixes
      
      * Forgotten args and more cleanup
  4. 28 Jun, 2020 1 commit
  5. 10 Jun, 2020 1 commit
  6. 09 Jun, 2020 1 commit
      [All models] Extend config.output_attentions with output_attentions function arguments (#4538) · 6e603cb7
      Bharat Raghunathan authored
      
      
      * DOC: Replace instances of ``config.output_attentions`` with function argument ``output_attentions``
      
      * DOC: Apply Black Formatting
      
      * Fix errors where output_attentions was undefined
      
      * Remove output_attentions in classes per review
      
      * Fix regressions on tests having `output_attention`
      
      * Fix further regressions in tests relating to `output_attentions`
      
      Ensure proper propagation of `output_attentions` as a function parameter
      to all model subclasses
      
      * Fix more regressions in `test_output_attentions`
      
      * Fix issues with BertEncoder
      
      * Rename related variables to `output_attentions`
      
      * fix pytorch tests
      
      * fix bert and gpt2 tf
      
      * Fix most TF tests for `test_output_attentions`
      
      * Fix linter errors and more TF tests
      
      * fix conflicts
      
      * DOC: Apply Black Formatting
      
      * Fix errors where output_attentions was undefined
      
      * Remove output_attentions in classes per review
      
      * Fix regressions on tests having `output_attention`
      
      * fix conflicts
      
      * fix conflicts
      
      * fix conflicts
      
      * fix conflicts
      
      * fix pytorch tests
      
      * fix conflicts
      
      * fix conflicts
      
      * Fix linter errors and more TF tests
      
      * fix tf tests
      
      * make style
      
      * fix isort
      
      * improve output_attentions
      
      * improve tensorflow
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
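      A minimal sketch of the extended argument: `output_attentions` can now be passed per forward call instead of being fixed once in the config. The named `.attentions` field assumes the self-documenting outputs introduced in the PR above:

      ```python
      from transformers import BertModel, BertTokenizer

      tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
      model = BertModel.from_pretrained("bert-base-uncased")
      input_ids = tokenizer.encode("Attention weights on demand.", return_tensors="pt")

      # Previously this had to be set via config.output_attentions; now it can
      # be requested for a single call.
      outputs = model(input_ids, output_attentions=True)
      attentions = outputs.attentions  # one tensor per layer
      print(len(attentions), attentions[0].shape)  # num_layers, (batch, heads, seq, seq)
      ```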
  7. 02 Jun, 2020 1 commit
      Kill model archive maps (#4636) · d4c2cb40
      Julien Chaumond authored
      * Kill model archive maps
      
      * Fixup
      
      * Also kill model_archive_map for MaskedBertPreTrainedModel
      
      * Unhook config_archive_map
      
      * Tokenizers: align with model id changes
      
      * make style && make quality
      
      * Fix CI
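      With the archive maps gone, weights resolve from the model identifier rather than a hard-coded per-class URL map. A minimal sketch, assuming nothing beyond the standard `from_pretrained` calls:

      ```python
      from transformers import AutoConfig, AutoModel, AutoTokenizer

      # Files are now located from the model id instead of entries in the old
      # *_PRETRAINED_MODEL_ARCHIVE_MAP dictionaries.
      model_id = "bert-base-uncased"
      config = AutoConfig.from_pretrained(model_id)
      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModel.from_pretrained(model_id)
      ```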
  8. 01 Jun, 2020 2 commits
  9. 01 May, 2020 1 commit
  10. 29 Apr, 2020 1 commit
      CDN urls (#4030) · 455c6390
      Julien Chaumond authored
      * [file_utils] use_cdn + documentation
      
      * Move to cdn. urls for weights
      
      * [urls] Hotfix for bert-base-japanese
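      A rough sketch of the URL scheme this introduces; the helper and hostnames below are illustrative assumptions, not the library's actual `file_utils` code:

      ```python
      # Hypothetical prefixes standing in for the S3 bucket and CDN endpoints.
      S3_PREFIX = "https://s3.amazonaws.com/models.huggingface.co/bert"
      CDN_PREFIX = "https://cdn.huggingface.co"

      def weights_url(model_id: str, filename: str, use_cdn: bool = True) -> str:
          """Illustration only: build a download URL, preferring the CDN for
          large weight files as described in the commit above."""
          prefix = CDN_PREFIX if use_cdn else S3_PREFIX
          return f"{prefix}/{model_id}/{filename}"

      print(weights_url("bert-base-uncased", "pytorch_model.bin"))
      ```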
  11. 23 Apr, 2020 1 commit
  12. 22 Apr, 2020 1 commit
      Trainer (#3800) · dd9d483d
      Julien Chaumond authored
      * doc
      
      * [tests] Add sample files for a regression task
      
      * [HUGE] Trainer
      
      * Feedback from @sshleifer
      
      * Feedback from @thomwolf + logging tweak
      
      * [file_utils] when downloading concurrently, get_from_cache will use the cached file for subsequent processes
      
      * [glue] Use default max_seq_length of 128 like before
      
      * [glue] move DataTrainingArguments around
      
      * [ner] Change interface of InputExample, and align run_{tf,pl}
      
      * Re-align the pl scripts a little bit
      
      * ner
      
      * [ner] Add integration test
      
      * Fix language_modeling with API tweak
      
      * [ci] Tweak loss target
      
      * Don't break console output
      
      * amp.initialize: model must be on right device before
      
      * [multiple-choice] update for Trainer
      
      * Re-align to 827d6d6e
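      A minimal sketch of the new Trainer API this adds, with a toy dataset standing in for the GLUE data utilities; arguments are kept to `output_dir` and `num_train_epochs` since the full `TrainingArguments` surface has evolved since this commit:

      ```python
      import torch
      from torch.utils.data import Dataset
      from transformers import (
          AutoModelForSequenceClassification,
          AutoTokenizer,
          Trainer,
          TrainingArguments,
      )

      class ToyDataset(Dataset):
          """Tiny stand-in for the GLUE datasets used by the real examples."""
          def __init__(self, tokenizer):
              texts = ["a delightful film", "a complete mess"]
              self.encodings = tokenizer(texts, padding=True, truncation=True)
              self.labels = [1, 0]

          def __len__(self):
              return len(self.labels)

          def __getitem__(self, idx):
              item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
              item["labels"] = torch.tensor(self.labels[idx])
              return item

      tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
      model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

      # Minimal Trainer wiring; the real examples add eval datasets and metrics.
      args = TrainingArguments(output_dir="./results", num_train_epochs=1)
      trainer = Trainer(model=model, args=args, train_dataset=ToyDataset(tokenizer))
      trainer.train()
      ```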
  13. 18 Apr, 2020 1 commit
  14. 14 Apr, 2020 1 commit
  15. 10 Apr, 2020 1 commit
      Add `run_glue_tpu.py` that trains models on TPUs (#3702) · 551b4505
      Jin Young Sohn authored
      * Initial commit to get BERT + run_glue.py on TPU
      
      * Add README section for TPU and address comments.
      
      * Cleanup TPU bits from run_glue.py (#3)
      
      TPU runner is currently implemented in:
      https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.
      
      We plan to upstream this directly into `huggingface/transformers`
      (either `master` or `tpu`) branch once it's been more thoroughly tested.
      
      * No need to call `xm.mark_step()` explicitly (#4)
      
      Since, for gradient accumulation, we're accumulating on batches from a
      `ParallelLoader` instance, which marks the step itself on next().
      
      * Resolve R/W conflicts from multiprocessing (#5)
      
      * Add XLNet in list of models for `run_glue_tpu.py` (#6)
      
      * Add RoBERTa to list of models in TPU GLUE (#7)
      
      * Add RoBERTa and DistilBert to list of models in TPU GLUE (#8)
      
      * Use barriers to reduce duplicate work/resources (#9)
      
      * Shard eval dataset and aggregate eval metrics (#10)
      
      * Shard eval dataset and aggregate eval metrics
      
      Also, instead of calling `eval_loss.item()` every time, do the summation
      with tensors on the device.
      
      * Change defaultdict to float
      
      * Reduce the pred, label tensors instead of metrics
      
      As brought up during review, some metrics like F1 cannot be aggregated
      via averaging. GLUE task metrics depend largely on the dataset, so we
      instead sync the prediction and label tensors so that the metrics can
      be computed accurately on those.
      
      * Only use tb_writer from master (#11)
      
      * Apply huggingface black code formatting
      
      * Style
      
      * Remove `--do_lower_case` as example uses cased
      
      * Add option to specify tensorboard logdir
      
      This is needed for our testing framework, which checks regressions
      against key metrics written by the summary writer.
      
      * Using configuration for `xla_device`
      
      * Prefix TPU specific comments.
      
      * num_cores clarification and namespace eval metrics
      
      * Cache features file under `args.cache_dir`
      
      Instead of under `args.data_dir`. This is needed as our test infra uses
      data_dir with a read-only filesystem.
      
      * Rename `run_glue_tpu` to `run_tpu_glue`
      Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
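      A rough sketch of the torch_xla training pattern the bullets above refer to (`ParallelLoader` marking the step, gradients reduced in the optimizer step); this is hedged against the torch_xla API, not the exact example script:

      ```python
      import torch_xla.core.xla_model as xm
      import torch_xla.distributed.parallel_loader as pl

      def train_one_epoch(model, loader, optimizer):
          device = xm.xla_device()
          model.to(device)
          # ParallelLoader pre-loads batches onto the TPU device and marks the
          # XLA step on each next(), so no explicit xm.mark_step() is needed.
          para_loader = pl.ParallelLoader(loader, [device])
          for batch in para_loader.per_device_loader(device):
              outputs = model(**batch)      # batches include labels
              loss = outputs[0]             # first element is the loss
              loss.backward()
              xm.optimizer_step(optimizer)  # all-reduce gradients, then step
              optimizer.zero_grad()
      ```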
  16. 31 Mar, 2020 1 commit
  17. 26 Mar, 2020 1 commit
      Add t5 to pipeline(task='summarization') (#3413) · 9c683ef0
      Patrick von Platen authored
      * solve conflicts
      
      * move warnings below
      
      * incorporate changes
      
      * add pad_to_max_length to pipelines
      
      * add bug fix for T5 beam search
      
      * add prefix patterns
      
      * make style
      
      * fix conflicts
      
      * adapt pipelines for task specific parameters
      
      * improve docstring
      
      * remove unused patterns
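      A minimal usage sketch of the new capability; the task prefix ("summarize: ") and generation parameters are handled by the pipeline, per the prefix-pattern and task-specific-parameter bullets above:

      ```python
      from transformers import pipeline

      # T5 can now be used through the summarization pipeline directly.
      summarizer = pipeline("summarization", model="t5-small", tokenizer="t5-small")

      article = (
          "The tower is 324 metres tall, about the same height as an 81-storey "
          "building, and was the tallest man-made structure in the world for 41 years."
      )
      print(summarizer(article, max_length=40, min_length=10))
      ```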
  18. 25 Mar, 2020 1 commit
  19. 20 Mar, 2020 1 commit
  20. 19 Mar, 2020 1 commit
  21. 16 Mar, 2020 1 commit
  22. 11 Mar, 2020 3 commits
  23. 09 Mar, 2020 2 commits
  24. 05 Mar, 2020 1 commit
  25. 24 Feb, 2020 1 commit
  26. 21 Feb, 2020 1 commit
      Improve special_token_id logic in run_generation.py and add tests (#2885) · fc38d4c8
      Patrick von Platen authored
      
      
      * improving generation
      
      * finalized special token behaviour for no_beam_search generation
      
      * solved modeling_utils merge conflict
      
      * solve merge conflicts in modeling_utils.py
      
      * add run_generation improvements from PR #2749
      
      * adapted language generation to not use hardcoded -1 if no padding token is available
      
      * remove the -1 removal as hard-coded -1s are not necessary anymore
      
      * add lightweight language generation testing for randomly initialized models - just checking whether no errors are thrown
      
      * add slow language generation tests for pretrained models using hardcoded output with pytorch seed
      
      * delete ipdb
      
      * check that all generated tokens are valid
      
      * renaming
      
      * renaming Generation -> Generate
      
      * make style
      
      * updated so that generate_beam_search has the same token behavior as generate_no_beam_search
      
      * consistent return format for run_generation.py
      
      * deleted pretrain lm generate tests -> will be added in another PR
      
      * cleaning of unused if statements and renaming
      
      * run_generate will always return an iterable
      
      * make style
      
      * consistent renaming
      
      * improve naming, make sure generate function always returns the same tensor, add docstring
      
      * add slow tests for all lmhead models
      
      * make style and improve example comments modeling_utils
      
      * better naming and refactoring in modeling_utils
      
      * changed fast random lm generation testing design to more general one
      
      * delete in old testing design in gpt2
      
      * correct old variable name
      
      * temporary fix for encoder_decoder lm generation tests - has to be updated when t5 is fixed
      
      * adapted all fast random generate tests to new design
      
      * better warning description in modeling_utils
      
      * better comment
      
      * better comment and error message
      Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
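      A minimal sketch of the pad-token behavior described above: GPT-2 ships without a padding token, so generation is given the EOS token as `pad_token_id` instead of relying on a hard-coded -1:

      ```python
      from transformers import GPT2LMHeadModel, GPT2Tokenizer

      tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
      model = GPT2LMHeadModel.from_pretrained("gpt2")

      input_ids = tokenizer.encode("The special token handling", return_tensors="pt")

      # GPT-2 has no pad token; passing eos_token_id as pad_token_id keeps the
      # returned sequences valid without the old hard-coded -1 placeholder.
      output_ids = model.generate(
          input_ids,
          max_length=30,
          do_sample=True,
          pad_token_id=tokenizer.eos_token_id,
      )
      print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
      ```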
  27. 31 Jan, 2020 1 commit
  28. 24 Jan, 2020 2 commits
  29. 16 Jan, 2020 1 commit
  30. 13 Jan, 2020 3 commits
  31. 11 Jan, 2020 1 commit