"test/vscode:/vscode.git/clone" did not exist on "f2ddea0a205a14c76063c876084942aa1d3e1b17"
11 Dec, 2019 (3 commits)

- Stefan Schweter authored
- Julien Chaumond authored
- LysandreJik authored

10 Dec, 2019 (27 commits)

- Thomas Wolf authored: Progress indicator improvements when downloading pre-trained models.
- Leo Dirac authored
- LysandreJik authored
- Lysandre authored
- Thomas Wolf authored: clean up PT <=> TF conversion
- Rémi Louf authored
- Thomas Wolf authored: [WIP] Squad refactor
- Thomas Wolf authored: create encoder attention mask from shape of hidden states
- Julien Chaumond authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored
- Rémi Louf authored: We currently save the pretrained weights of the encoder and decoder in two separate directories, `encoder` and `decoder`. However, for the `from_pretrained` function to work with the auto models, we need to specify the type of model in the path to the weights. The path to the encoder/decoder weights is handled by the `PreTrainedEncoderDecoder` class in its `save_pretrained` function. Since there is no easy way to infer the type of model that was initialized for the encoder and decoder, we add a `model_type` parameter to the function. This is not an ideal solution, as it is error-prone; the model type should somehow be carried by the model classes themselves. This is a temporary fix that should be changed before merging.
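  To make the described layout concrete, here is a minimal sketch of what such a `save_pretrained` could look like. The `encoder`/`decoder` split and the `model_type` parameter come from the message above; the exact directory naming, the attribute names, and the assumption that both sub-models expose their own `save_pretrained` are illustrative, not the actual implementation.

  ```python
  import os

  class PreTrainedEncoderDecoder:
      """Sketch of an encoder-decoder wrapper; attribute names are assumed."""

      def __init__(self, encoder, decoder):
          self.encoder = encoder
          self.decoder = decoder

      def save_pretrained(self, save_directory, model_type):
          # Encoder and decoder weights are written to two separate
          # sub-directories. `model_type` (e.g. "bert") is baked into the path
          # so that an AutoModel-style `from_pretrained` can later tell which
          # architecture to instantiate from the path alone; as noted in the
          # commit message, this is error-prone and meant as a temporary fix.
          encoder_dir = os.path.join(save_directory, model_type, "encoder")
          decoder_dir = os.path.join(save_directory, model_type, "decoder")
          os.makedirs(encoder_dir, exist_ok=True)
          os.makedirs(decoder_dir, exist_ok=True)
          self.encoder.save_pretrained(encoder_dir)
          self.decoder.save_pretrained(decoder_dir)
  ```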
- Rémi Louf authored: Since I started my PR, the `add_special_token_single_sequence` function has been deprecated in favour of another; I replaced it with the new function.

09 Dec, 2019 (10 commits)

- Pierric Cistac authored
- Bilal Khan authored
- Bilal Khan authored
- Bilal Khan authored
- Bilal Khan authored
- Bilal Khan authored
- Bilal Khan authored
- LysandreJik authored
- Lysandre Debut authored
- Rémi Louf authored: We currently create encoder attention masks (when they are not provided) based on the shape of the inputs to the encoder. This is wrong, since sequences in a batch can have different lengths. We now create the encoder attention mask from the batch size and sequence length of the encoder hidden states.
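  As a rough illustration of the fix this commit message describes, a default mask can be derived from the hidden-states tensor itself. The helper name below is hypothetical and not part of the library's API; it only shows the shape-handling idea.

  ```python
  import torch

  def default_encoder_attention_mask(encoder_hidden_states: torch.Tensor) -> torch.Tensor:
      # Hypothetical helper: build an all-ones attention mask whose shape
      # (batch_size, sequence_length) is taken from the encoder hidden states
      # rather than from the encoder inputs, as described in the commit message.
      batch_size, sequence_length = encoder_hidden_states.shape[:2]
      return torch.ones(batch_size, sequence_length, device=encoder_hidden_states.device)
  ```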