- 30 Oct, 2019 3 commits
- 29 Oct, 2019 2 commits
- 28 Oct, 2019 7 commits
- 17 Oct, 2019 10 commits
- 16 Oct, 2019 7 commits
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
The introduction of a decoder requires two changes: we need to be able to specify a separate mask in the cross-attention, to mask the positions that correspond to padding tokens in the encoder state; and the self-attention in the decoder needs to be causal, on top of not attending to padding tokens.
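A minimal sketch of how those two constraints combine in the decoder's self-attention mask — this is an illustration in plain Python, not the actual transformers code:

```python
def decoder_self_attention_mask(pad_mask):
    """Combine a causal mask with a padding mask for decoder self-attention.

    pad_mask: list of lists with shape (batch, seq_len); 1 marks a real
    token, 0 marks padding. Returns mask[b][i][j] = True iff query position
    i may attend to key position j: j <= i (causal) AND j is not padding.
    """
    return [
        [[j <= i and row[j] == 1 for j in range(len(row))]
         for i in range(len(row))]
        for row in pad_mask
    ]

# The cross-attention, by contrast, only needs the encoder's padding mask,
# broadcast over the decoder positions (no causal constraint there).
```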
- Rémi Louf authored
The definition in `get_masks` would blow up with certain combinations of arguments, because a name was only bound inside a branch of a control structure. It was just a matter of moving the definition outside of that control structure.
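A hypothetical illustration of this bug pattern (the real `get_masks` is more involved; these function bodies are invented for the sketch): a name bound in only one branch raises for the other combination of arguments, and the fix is to bind it unconditionally before the branch.

```python
def get_masks_buggy(slen, causal):
    # Bug: `mask` is only defined when causal is False, so calling with
    # causal=True raises an error (UnboundLocalError) on the return line.
    if not causal:
        mask = [[1] * slen for _ in range(slen)]
    return mask

def get_masks_fixed(slen, causal):
    # Fix: define `mask` outside the control structure, then specialise it.
    mask = [[1] * slen for _ in range(slen)]
    if causal:
        mask = [[1 if j <= i else 0 for j in range(slen)]
                for i in range(slen)]
    return mask
```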
- 15 Oct, 2019 8 commits
- Rémi Louf authored
We currently instantiate encoders and decoders for the seq2seq model by passing the `is_decoder` keyword argument to the `from_pretrained` classmethod. The model class, on the other hand, looks up the value of the `is_decoder` attribute in its config. For the value to propagate from the kwarg to the configuration, we simply need to define `is_decoder` as an attribute of the base `PretrainedConfig`, with a default of `False`.
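A minimal sketch of why a default on the base config makes the kwarg propagate — `PretrainedConfig` is the real class name, but the constructor logic and the `BertConfig` subclass here are simplified stand-ins, not the actual transformers implementation:

```python
class PretrainedConfig:
    """Base config: `is_decoder` defaults to False, so every model's
    config has the attribute even when the kwarg was never passed."""
    def __init__(self, **kwargs):
        self.is_decoder = kwargs.pop("is_decoder", False)

class BertConfig(PretrainedConfig):
    # Hypothetical subclass: forwards unknown kwargs to the base class.
    def __init__(self, hidden_size=768, **kwargs):
        super().__init__(**kwargs)
        self.hidden_size = hidden_size

# Kwargs given at instantiation time propagate into the configuration,
# which is where the model class later reads `is_decoder` from:
encoder_config = BertConfig()                 # is_decoder stays False
decoder_config = BertConfig(is_decoder=True)  # is_decoder becomes True
```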
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- 14 Oct, 2019 3 commits