- 01 Apr, 2020 1 commit
-
-
Anirudh Srinivasan authored
-
- 25 Feb, 2020 2 commits
-
-
Lysandre Debut authored
* All Tokenizers
  BertTokenizer + few fixes
  RobertaTokenizer
  OpenAIGPTTokenizer + fixes
  GPT2Tokenizer + fixes
  TransfoXLTokenizer
  Correct rst for TransformerXL
  XLMTokenizer + fixes
  XLNetTokenizer + style
  DistilBERT + fix XLNet RST
  CTRLTokenizer
  CamemBERT Tokenizer
  FlaubertTokenizer
  XLMRobertaTokenizer
  cleanup
* cleanup
-
srush authored
* change masking to direct labeling
* fix black
* switch to ignore index
* fix black
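A minimal sketch of the "direct labeling" idea described above, under assumed names (`build_mlm_labels`, `masked_positions` are illustrative, not the library's API): instead of a separate binary mask, the label tensor carries target ids at masked positions and an ignore value everywhere else.

```python
import torch


def build_mlm_labels(input_ids, masked_positions, ignore_index=-100):
    # Illustrative sketch, not the library's exact code: copy the input ids and
    # overwrite every non-masked position with the ignore index (-100 is the
    # default ignore_index of PyTorch's CrossEntropyLoss), so the loss skips them.
    labels = input_ids.clone()
    labels[~masked_positions] = ignore_index
    return labels
```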
-
- 21 Feb, 2020 1 commit
-
-
Lysandre Debut authored
-
- 13 Feb, 2020 1 commit
-
-
Sam Shleifer authored
* activations.py contains a mapping from string to activation function
* resolves some `gelu` vs `gelu_new` ambiguity
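A minimal sketch of such a string-to-activation mapping, assuming the dictionary name `ACT2FN` and a small set of entries; the `gelu_new` shown is the common tanh approximation, given here for illustration rather than as the file's exact contents.

```python
import math

import torch
import torch.nn.functional as F


def gelu_new(x):
    # tanh approximation of GELU (the GPT-2 style variant), shown for illustration.
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))


# Models can look up their activation by the string in their config
# (e.g. "gelu" vs "gelu_new"), which removes the ambiguity mentioned above.
ACT2FN = {
    "relu": F.relu,
    "gelu": F.gelu,
    "gelu_new": gelu_new,
    "tanh": torch.tanh,
}
```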
-
- 11 Feb, 2020 1 commit
-
-
Oleksiy Syvokon authored
PyTorch < 1.3 requires multiplication operands to be of the same type. This was violated when using the default attention mask (i.e., attention_mask=None in the arguments) with BERT in decoder mode. In particular, this broke Model2Model and made the quickstart tutorial fail.
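A minimal sketch of the kind of fix this implies, with an assumed helper name (`extend_attention_mask` is illustrative): the point is casting the (possibly default) attention mask to the model dtype before it enters any arithmetic.

```python
import torch


def extend_attention_mask(attention_mask, input_shape, dtype=torch.float32):
    # Illustrative sketch, not the library's exact code.
    if attention_mask is None:
        # Default mask: attend to every token.
        attention_mask = torch.ones(input_shape, dtype=torch.long)
    # (batch, seq) -> (batch, 1, 1, seq) so it broadcasts over heads and queries.
    extended = attention_mask[:, None, None, :]
    # The cast is the point: PyTorch < 1.3 rejects multiplying mixed dtypes.
    extended = extended.to(dtype=dtype)
    return (1.0 - extended) * -10000.0
```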
-
- 07 Feb, 2020 1 commit
-
-
monologg authored
-
- 04 Feb, 2020 1 commit
-
-
Lysandre authored
-
- 03 Feb, 2020 1 commit
-
-
Lysandre authored
Masked indices should be -1 and not -100. Updated the documentation and scripts that had been forgotten.
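A small sketch of how that ignore value interacts with the loss, assuming the -1 convention stated above; the loss then has to be built with `ignore_index=-1` explicitly, since PyTorch's `CrossEntropyLoss` defaults to -100.

```python
import torch
from torch.nn import CrossEntropyLoss

loss_fct = CrossEntropyLoss(ignore_index=-1)       # -1 as the masked-label value

logits = torch.randn(2, 5, 30522)                  # (batch, seq_len, vocab_size)
labels = torch.full((2, 5), -1, dtype=torch.long)  # ignored everywhere...
labels[0, 2] = 42                                  # ...except the masked position
loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
```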
-
- 28 Jan, 2020 1 commit
-
-
Wietse de Vries authored
-
- 23 Jan, 2020 7 commits
- 15 Jan, 2020 1 commit
-
-
Julien Chaumond authored
-
- 14 Jan, 2020 1 commit
-
-
Lysandre authored
Created a link between the linear layer bias and the model attribute bias. This does not change anything for the user or for the conversion scripts, but allows the `resize_token_embeddings` method to resize the bias as well as the weights of the decoder. Added a test.
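A minimal sketch of the linking pattern described above, with illustrative class and attribute names rather than the library's exact head implementation: because `self.decoder.bias` and `self.bias` are the same `Parameter`, resizing one keeps the other consistent.

```python
import torch
from torch import nn


class LMHeadSketch(nn.Module):
    # Illustrative sketch, not the library's exact LM head.
    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.decoder = nn.Linear(hidden_size, vocab_size, bias=False)
        self.bias = nn.Parameter(torch.zeros(vocab_size))
        # Link the linear layer's bias to the module attribute: they are now
        # one and the same Parameter object.
        self.decoder.bias = self.bias

    def forward(self, hidden_states):
        return self.decoder(hidden_states)
```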
-
- 07 Jan, 2020 2 commits
-
-
Romain Keramitas authored
Signed-off-by: Romain Keramitas <r.keramitas@gmail.com>
-
Genta Indra Winata authored
-
- 06 Jan, 2020 3 commits
-
-
alberduris authored
-
alberduris authored
-
Lysandre authored
-
- 22 Dec, 2019 6 commits
-
-
Aymeric Augustin authored
-
Aymeric Augustin authored
-
Aymeric Augustin authored
This prevents transformers from being importable simply because the CWD is the root of the git repository, while not being importable from other directories. That led to inconsistent behavior, especially in examples. Once you fetch this commit, you must run the following in your dev environment:
$ pip uninstall transformers
$ pip install -e .
-
Aymeric Augustin authored
Ignore warnings related to Python 2, because it's going away soon.
-
Aymeric Augustin authored
-
Aymeric Augustin authored
This is the result of:
$ isort --recursive examples templates transformers utils hubconf.py setup.py
-
- 21 Dec, 2019 1 commit
-
-
Aymeric Augustin authored
This is the result of:
$ black --line-length 119 examples templates transformers utils hubconf.py setup.py
There are a lot of fairly long lines in the project. As a consequence, I'm picking the longest widely accepted line length, 119 characters. This is also Thomas' preference, because it allows for explicit variable names, which make the code easier to understand.
-
- 18 Dec, 2019 3 commits
-
-
Julien Chaumond authored
-
Antti Virtanen authored
-
Antti Virtanen authored
-
- 11 Dec, 2019 2 commits
-
-
Julien Chaumond authored
-
Masatoshi Suzuki authored
-
- 10 Dec, 2019 3 commits
-
-
LysandreJik authored
-
LysandreJik authored
-
Rémi Louf authored
-
- 09 Dec, 2019 1 commit
-
-
Rémi Louf authored
We currently create encoder attention masks (when they're not provided) based on the shape of the inputs to the encoder. This is obviously wrong; sequences can be of different lengths. We now create the encoder attention mask based on the batch_size and sequence_length of the encoder hidden states.
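A minimal sketch of the corrected behavior, with an assumed helper name (`default_encoder_attention_mask` is illustrative): the default mask is sized from the encoder hidden states themselves, not from the decoder inputs.

```python
import torch


def default_encoder_attention_mask(encoder_hidden_states):
    # Illustrative sketch: attend over every encoder position, with the mask
    # shaped from the encoder hidden states (batch_size, sequence_length).
    batch_size, seq_length, _ = encoder_hidden_states.size()
    return torch.ones(batch_size, seq_length, device=encoder_hidden_states.device)
```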
-