- 10 Oct, 2019 3 commits
  - Rémi Louf authored
  - Rémi Louf authored
  - Rémi Louf authored
    In the seq2seq model we need to load pretrained weights into the encoder while initializing the decoder randomly. Because the `from_pretrained` method defined in the base class relies on module names to assign weights, it would also initialize the decoder with pretrained weights. To avoid this, we override the method so that only the encoder is initialized with pretrained weights, as sketched below.
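A minimal sketch of that override, assuming a hypothetical `Seq2SeqModel` wrapper around two `BertModel` instances; everything here except `BertModel`, `BertConfig` and the `from_pretrained` entry point is illustrative, not the library's actual seq2seq implementation:

```python
import torch.nn as nn
from transformers import BertConfig, BertModel

class Seq2SeqModel(nn.Module):  # hypothetical wrapper, for illustration only
    def __init__(self, config):
        super().__init__()
        self.encoder = BertModel(config)  # will receive pretrained weights
        self.decoder = BertModel(config)  # stays randomly initialized

    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path):
        # Override the base-class loading: assign pretrained weights to the
        # encoder only, leaving the decoder's random initialization untouched.
        config = BertConfig.from_pretrained(pretrained_model_name_or_path)
        model = cls(config)
        model.encoder = BertModel.from_pretrained(pretrained_model_name_or_path)
        return model

model = Seq2SeqModel.from_pretrained("bert-base-uncased")
```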
- 08 Oct, 2019 6 commits
  - Rémi Louf authored
  - Rémi Louf authored
  - Rémi Louf authored
    I am not sure what happens when the class is initialized with the pretrained weights.
  - Rémi Louf authored
  - Rémi Louf authored
    The modifications I introduced in a previous commit broke Bert's internal API. I reverted these changes and added more general classes to handle the encoder-decoder attention case (see the sketch after this commit list). There may be a more elegant way to deal with backward compatibility (I am not comfortable with the current state of the code), but I cannot see it right now.
  - Rémi Louf authored
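One way to read "more general classes that keep the old API", sketched with hypothetical names: `BertGeneralAttention` is invented for illustration, and only the `BertSelfAttention` name exists in the library. The old self-attention entry point survives as the special case where query, key and value coincide.

```python
import torch
import torch.nn as nn

class BertGeneralAttention(nn.Module):
    """Attention over arbitrary query/key/value inputs (hypothetical name)."""
    def __init__(self, hidden_size, num_heads):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_size, num_heads)

    def forward(self, query, key, value):
        output, _ = self.attn(query, key, value)
        return output

class BertSelfAttention(BertGeneralAttention):
    # Backward compatibility: the original signature is preserved, and
    # self-attention is just query = key = value = hidden_states.
    def forward(self, hidden_states):
        return super().forward(hidden_states, hidden_states, hidden_states)
```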
- 07 Oct, 2019 4 commits
  - Rémi Louf authored
    There is currently no way to specify the query, key and value separately in the Attention module. However, the decoder's "encoder-decoder attention" layers take the decoder's last output as the query and the encoder's states as the key and value. We therefore modify the existing code so that the query, key and value can be passed separately. This obviously breaks some naming conventions; `BertSelfAttention` is no longer strictly a self-attention module, the way the residual is forwarded is now awkward, and so on. We will need to do some refactoring once the decoder is fully implemented. (A sketch of the pattern follows this commit list.)
  - Rémi Louf authored
  - Rémi Louf authored
  - Rémi Louf authored
    Several packages were imported but never used, and the indentation and line spacing did not follow PEP8.
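A sketch of the encoder-decoder attention pattern described above, in plain PyTorch; the function, names and shapes are illustrative, not the module's actual signature. The decoder's last output serves as the query, the encoder's states as the key and value:

```python
import torch

def scaled_dot_product_attention(query, key, value):
    # query: (batch, tgt_len, d); key/value: (batch, src_len, d)
    d = query.size(-1)
    scores = torch.matmul(query, key.transpose(-2, -1)) / d ** 0.5
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, value)

batch, src_len, tgt_len, d = 2, 7, 5, 64
encoder_states = torch.randn(batch, src_len, d)   # key and value
decoder_output = torch.randn(batch, tgt_len, d)   # query

# Self-attention: query, key and value are all the decoder output.
self_attn = scaled_dot_product_attention(decoder_output, decoder_output, decoder_output)

# Encoder-decoder attention: query from the decoder, key/value from the encoder.
cross_attn = scaled_dot_product_attention(decoder_output, encoder_states, encoder_states)
```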
- 02 Oct, 2019 1 commit
  - Santiago Castro authored
- 26 Sep, 2019 1 commit
  - thomwolf authored
- 23 Sep, 2019 1 commit
  - Santiago Castro authored
- 17 Sep, 2019 1 commit
  - Julien Chaumond authored
- 09 Sep, 2019 1 commit
  - thomwolf authored
- 08 Sep, 2019 1 commit
  - thomwolf authored
- 04 Sep, 2019 2 commits
- 31 Aug, 2019 4 commits
  - thomwolf authored
  - LysandreJik authored
  - LysandreJik authored
    Now raises a warning when a head marked for deletion has already been deleted. An integration test verifying the full pipeline (from config -> save model -> load model -> additional head pruning) has been added. (A sketch of the warning behaviour follows this commit list.)
  - Lysandre authored
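A sketch of the warning behaviour, assuming a simplified record of already-pruned heads; the `HeadPruner` class and its bookkeeping are illustrative, not the library's actual implementation:

```python
import warnings

class HeadPruner:
    """Toy tracker for pruned attention heads (illustrative only)."""
    def __init__(self):
        self.pruned_heads = set()

    def prune_heads(self, heads):
        already_pruned = set(heads) & self.pruned_heads
        if already_pruned:
            warnings.warn(
                f"Heads {sorted(already_pruned)} have already been pruned; "
                "skipping them."
            )
        self.pruned_heads |= set(heads)
        # ... actual slicing of the attention weights would happen here ...

pruner = HeadPruner()
pruner.prune_heads({0, 2})
pruner.prune_heads({2, 3})  # warns that head 2 was already pruned
```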
- 28 Aug, 2019 2 commits
  - thomwolf authored
  - VictorSanh authored
- 23 Aug, 2019 1 commit
  - David Pollack authored
- 20 Aug, 2019 1 commit
  - thomwolf authored
- 19 Aug, 2019 1 commit
  - Lysandre authored
- 09 Aug, 2019 1 commit
  - Kevin Trebing authored
    Signed-off-by: Kevin Trebing <Kevin.Trebing@gmx.net>
- 08 Aug, 2019 2 commits
  - LysandreJik authored
  - LysandreJik authored
- 06 Aug, 2019 2 commits
- 05 Aug, 2019 1 commit
  - 雷打不动! authored
- 03 Aug, 2019 1 commit
  - wangfei authored
- 27 Jul, 2019 1 commit
  - thomwolf authored
- 23 Jul, 2019 1 commit
  - thomwolf authored
- 16 Jul, 2019 1 commit
  - thomwolf authored