Commits · 3e1cd8241eddc7f3ec036c26f1cbbd3272088653 · chenpangpang / transformers

10 Oct, 2019 11 commits
- fix stupid (re)naming issue · 3e1cd824
  Rémi Louf authored Oct 10, 2019
  
  3e1cd824
- remove the staticmethod used to load the config · 81ee29ee
  Rémi Louf authored Oct 10, 2019
  
  81ee29ee
- rename the attributes in the Bert Layer · d7092d59
  Rémi Louf authored Oct 10, 2019
```
Since the preloading of weights relies on the names of the class's
attributes, changing the namespace breaks loading pretrained weights on
Bert and all related models. I reverted `self_attention` to `attention`
and use `crossattention` for the decoder instead.
```
  d7092d59
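
The name dependence described in this commit message can be illustrated with a toy model. This is only a sketch: `ToyModule`, `LayerOriginal` and `LayerRenamed` are hypothetical stand-ins written for this example, not classes from the repository, and the real loading code is more involved. The sketch assumes that checkpoint keys are dotted attribute paths, so renaming an attribute leaves the corresponding checkpoint entry unmatched.

```python
class ToyModule:
    """Hypothetical stand-in for a framework module: weights are keyed by attribute path."""

    def load_state_dict(self, state, prefix=""):
        """Copy matching entries from `state`; return the keys that were not found."""
        missing = []
        for name, value in list(vars(self).items()):
            key = prefix + name
            if isinstance(value, ToyModule):
                # Recurse into submodules, extending the dotted prefix.
                missing += value.load_state_dict(state, key + ".")
            elif key in state:
                setattr(self, name, state[key])
            else:
                missing.append(key)
        return missing


class Attention(ToyModule):
    def __init__(self):
        self.weight = 0.0  # placeholder for a randomly initialized weight


class LayerOriginal(ToyModule):
    def __init__(self):
        self.attention = Attention()  # name used when the checkpoint was saved


class LayerRenamed(ToyModule):
    def __init__(self):
        self.self_attention = Attention()  # renamed attribute


# A checkpoint saved with the original attribute name.
checkpoint = {"attention.weight": 1.5}

original = LayerOriginal()
missing_original = original.load_state_dict(checkpoint)

renamed = LayerRenamed()
missing_renamed = renamed.load_state_dict(checkpoint)
```

In this sketch `missing_original` is empty, while `missing_renamed` lists `self_attention.weight`, and the renamed layer keeps its initial value.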
- prune both attention and self-attention heads · 51261167
  Rémi Louf authored Oct 10, 2019
  
  51261167
- add is_decoder as an attribute to Config class · 17177e73
  Rémi Louf authored Oct 10, 2019
  
  17177e73
- replace double quotes with single quotes · df85a0ff
  Rémi Louf authored Oct 10, 2019
  
  df85a0ff
- merge the two Bert layers classes · 9ca788b2
  Rémi Louf authored Oct 10, 2019
  
  9ca788b2
- Remove and do the branching in · edfc8f82
  Rémi Louf authored Oct 10, 2019
  
  edfc8f82
- remove and do the branching in · 09cfd122
  Rémi Louf authored Oct 10, 2019
  
  09cfd122
- override `from_pretrained` in Bert2Rnd · 877ef2c6
  Rémi Louf authored Oct 10, 2019
```
In the seq2seq model we need to both load pretrained weights in the
encoder and initialize the decoder randomly. Because the
`from_pretrained` method defined in the base class relies on module
names to assign weights, it would also initialize the decoder with
pretrained weights. To avoid this we override the method to only
initialize the encoder with pretrained weights.
```
  877ef2c6
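
A minimal sketch of the idea in this commit message, assuming weights are stored in a flat dictionary keyed by dotted names. The helpers `init_random` and `load_encoder_only`, the `encoder.` prefix and the key names are all hypothetical and chosen for illustration; they are not the actual `Bert2Rnd.from_pretrained` override.

```python
import random


def init_random(names, seed=0):
    """Assign a small random value to every named weight (hypothetical helper)."""
    rng = random.Random(seed)
    return {name: rng.gauss(0.0, 0.02) for name in names}


def load_encoder_only(model_weights, pretrained, encoder_prefix="encoder."):
    """Return a copy of `model_weights` where only encoder entries are overwritten.

    Pretrained keys carry no encoder/decoder prefix, so matching on the bare
    module name would also hit the decoder. Restricting the lookup to
    `encoder_prefix` leaves decoder weights at their random initialization.
    """
    loaded = dict(model_weights)
    for name, value in pretrained.items():
        target = encoder_prefix + name
        if target in loaded:
            loaded[target] = value
    return loaded


weights = init_random([
    "encoder.layer.0.attention.weight",
    "decoder.layer.0.attention.weight",
    "decoder.layer.0.crossattention.weight",
])
pretrained = {"layer.0.attention.weight": 1.0}
loaded = load_encoder_only(weights, pretrained)
```

Here only `encoder.layer.0.attention.weight` takes the pretrained value; both decoder entries keep the values produced by `init_random`.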
- add comment on recursive weights loading · 851ef592
  Rémi Louf authored Oct 10, 2019
  
  851ef592
08 Oct, 2019 8 commits
- rename class in __init__ · 770b15b5
  Rémi Louf authored Oct 08, 2019
  
  770b15b5
- remove old seq2seq file · 61ed8890
  Rémi Louf authored Oct 08, 2019
  
  61ed8890
- rename Bert2Bert -> Bert2Rnd · 8abfee9e
  Rémi Louf authored Oct 08, 2019
  
  8abfee9e
- add a placeholder test · 82628b0f
  Rémi Louf authored Oct 08, 2019
  
  82628b0f
- Add BertDecoderModel and Bert2Bert classes · 07009830
  Rémi Louf authored Oct 08, 2019
```
I am not sure what happens when the class is initialized with the
pretrained weights.
```
  07009830
- add general structure for Bert2Bert class · 75feacf1
  Rémi Louf authored Oct 08, 2019
  
  75feacf1
- add General attention classes · 15a2fc88
  Rémi Louf authored Oct 08, 2019
```
The modifications that I introduced in a previous commit did break
Bert's internal API. I reverted these changes and added more general
classes to handle the encoder-decoder attention case.

There may be a more elegant way to deal with backward compatibility (I am
not comfortable with the current state of the code), but I cannot see it
right now.
```
  15a2fc88
- add a decoder layer for Bert · cd6a59d5
  Rémi Louf authored Oct 08, 2019
  
  cd6a59d5
07 Oct, 2019 8 commits
- generalize BertSelfAttention to take separate query, key, value · a0dcefa3
  Rémi Louf authored Oct 07, 2019
```
There is currently no way to specify the query, key and value separately
in the Attention module. However, the decoder's "encoder-decoder
attention" layers take the decoder's last output as the query and the
encoder's states as key and value. We thus modify the existing code so
query, key and value can be passed separately.

This obviously raises some naming issues; `BertSelfAttention` is not
a self-attention module anymore. The way the residual is forwarded is
now awkward, etc. We will need to do some refactoring once the decoder is
fully implemented.
```
  a0dcefa3
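
The distinction drawn in this commit message can be sketched with plain scaled dot-product attention. This is an assumption-laden illustration, not the repository's `BertSelfAttention`: it omits the learned projections, multiple heads, masking and dropout, and the function name `attention` is hypothetical. It only shows that one function covers both cases once query, key and value are separate arguments.

```python
import math


def attention(query, key, value):
    """Scaled dot-product attention over lists of vectors (single head, no projections)."""
    scale = math.sqrt(len(key[0]))
    output = []
    for q in query:
        # Similarity of this query position to every key position.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in key]
        # Numerically stable softmax over the key positions.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted average of the value vectors.
        output.append([
            sum(w * v[d] for w, v in zip(weights, value))
            for d in range(len(value[0]))
        ])
    return output


decoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoder_states = [[1.0, 0.0], [0.0, 1.0]]

# Self-attention: query, key and value all come from the same sequence.
self_attended = attention(decoder_states, decoder_states, decoder_states)

# Encoder-decoder attention: the decoder supplies the query,
# the encoder supplies key and value.
cross_attended = attention(decoder_states, encoder_states, encoder_states)
```

The output always has one vector per query position, so `cross_attended` has three rows even though the encoder sequence has only two positions.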
- add class wireframes for Bert decoder · 31adbb24
  Rémi Louf authored Oct 07, 2019
  
  31adbb24
- rename BertLayer to BertEncoderLayer · dda1adad
  Rémi Louf authored Oct 07, 2019
  
  dda1adad
- do some (light) housekeeping · 0053c0e0
  Rémi Louf authored Oct 07, 2019
```
Several packages were imported but never used; indentation and line
spacing did not follow PEP 8.
```
  0053c0e0
- raise exception when class initialized with __init__ · 386e86e2
  Rémi Louf authored Oct 07, 2019
  
  386e86e2
- add wireframe for seq2seq model · 4446c02b
  Rémi Louf authored Oct 07, 2019
  
  4446c02b
- Rephrase forward method to reduce ambiguity · 904158ac
  Christopher Goh authored Oct 07, 2019
  
  904158ac
- Fix some typos in README · 0f65d8cb
  Christopher Goh authored Oct 07, 2019
  
  0f65d8cb
06 Oct, 2019 1 commit
- Correct device assignment in run_generation · f3e0218f
  LysandreJik authored Oct 05, 2019
  
  f3e0218f
04 Oct, 2019 5 commits
- unnecessary carriage return · 0820bb05
  VictorSanh authored Oct 04, 2019
  
  0820bb05
- run_squad --> run_squad_w_distillation · f5891c38
  VictorSanh authored Oct 04, 2019
  
  f5891c38
- add distillation+finetuning option in run_squad · 764a7923
  VictorSanh authored Sep 27, 2019
  
  764a7923
- New model addition issue template · bb464289
  Lysandre Debut authored Oct 04, 2019
  
  bb464289
- Decode documentation · 7bddb45a
  LysandreJik authored Oct 04, 2019
  
  7bddb45a
03 Oct, 2019 7 commits
- Merge pull request #1373 from TimYagan/fix-css · b3cfd979
  Thomas Wolf authored Oct 03, 2019
```
Fixed critical css font-family issues
```
  b3cfd979
- Merge pull request #1313 from enzoampil/master · 81a1e124
  Lysandre Debut authored Oct 03, 2019
```
Add option to use a 'stop token'
```
  81a1e124
- Merge branch 'master' into master · d3f24dfa
  Lysandre Debut authored Oct 03, 2019
  
  d3f24dfa
- XLM use_lang_embedding flag in run_generation · ecc4f1bd
  LysandreJik authored Oct 03, 2019
  
  ecc4f1bd
- Added XLM to run_generation, with prompt language selection. · c2c2ca0f
  LysandreJik authored Oct 03, 2019
  
  c2c2ca0f
- Merge pull request #1296 from danai-antoniou/add-duplicate-tokens-error · 1569610f
  Thomas Wolf authored Oct 03, 2019
```
Added ValueError for duplicates in list of added tokens
```
  1569610f
- DistilBert Documentation Code Example fixes · e1b2949a
  drc10723 authored Oct 03, 2019
  
  e1b2949a