- 08 Oct, 2019 5 commits
-
-
Rémi Louf authored
-
Rémi Louf authored
I am not sure what happens when the class is initialized with the pretrained weights.
-
Rémi Louf authored
-
Rémi Louf authored
The modifications that I introduced in a previous commit broke Bert's internal API. I reverted these changes and added more general classes to handle the encoder-decoder attention case. There may be a more elegant way to deal with backward compatibility (I am not comfortable with the current state of the code), but I cannot see it right now.
-
Rémi Louf authored
-
- 07 Oct, 2019 8 commits
-
-
Rémi Louf authored
There is currently no way to specify the query, key and value separately in the Attention module. However, the decoder's "encoder-decoder attention" layers take the decoder's last output as the query and the encoder's states as key and value. We thus modify the existing code so that query, key and value can be passed separately. This obviously raises some naming issues: `BertSelfAttention` is not a self-attention module anymore, the way the residual is forwarded is now awkward, etc. We will need to do some refactoring once the decoder is fully implemented.
-
Rémi Louf authored
-
Rémi Louf authored
-
Rémi Louf authored
Several packages were imported but never used; indentation and line spacing did not follow PEP8.
-
Rémi Louf authored
-
Rémi Louf authored
-
Christopher Goh authored
-
Christopher Goh authored
-
- 06 Oct, 2019 1 commit
-
-
LysandreJik authored
-
- 04 Oct, 2019 5 commits
-
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
Lysandre Debut authored
-
LysandreJik authored
-
- 03 Oct, 2019 21 commits
-
-
Thomas Wolf authored
Fixed critical css font-family issues
-
Lysandre Debut authored
Add option to use a 'stop token'
-
Lysandre Debut authored
-
LysandreJik authored
-
LysandreJik authored
-
Thomas Wolf authored
Added ValueError for duplicates in list of added tokens
-
drc10723 authored
-
VictorSanh authored
-
Brian Ma authored
add DistilBert model shortcut into ALL_MODELS
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-