- 22 May, 2020 11 commits
-
-
Julien Chaumond authored
As discussed w/ @lysandrejik packaging is maintained by PyPA (the Python Packaging Authority), and should be lightweight and stable
-
Lysandre authored
-
Alexander Measure authored
changed from https://https://arxiv.org/abs/2001.04451.pdf to https://arxiv.org/abs/2001.04451.pdf
-
HUSEIN ZOLKEPLI authored
-
Anthony MOI authored
-
Julien Chaumond authored
cc @sshleifer
-
Patrick von Platen authored
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-
Frankie Liuzzi authored
* added functionality for electra classification head * unneeded dropout * Test ELECTRA for sequence classification * Style Co-authored-by: Frankie <frankie@frase.io> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
-
- 21 May, 2020 4 commits
-
-
Lysandre authored
-
Lysandre Debut authored
* TPU hangs when saving optimizer/scheduler * Style * ParallelLoader is not a DataLoader * Style * Addressing @julien-c's comments
-
Zhangyx authored
Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com (#4463) * Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website. * Use Split enum + always output the label name Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Tobias Lee authored
* fix no grad in second pruning and typo * fix prune heads attention mismatch problem * fix * fix * fix * run make style * run make style
-
- 20 May, 2020 13 commits
-
-
Julien Chaumond authored
-
Julien Chaumond authored
-
Cola authored
Remove warning of deprecated overload of addcdiv_ Fix #4451
-
Julien Plu authored
* Better None gradients handling * Apply Style * Apply Style
-
Oliver Åstrand authored
* Exclude LayerNorms from weight decay * Include both formats of layer norm
-
Rens authored
-
Nathan Cooper authored
-
Timo Moeller authored
-
Lysandre Debut authored
* There is one missing key in BERT * Correct device for CamemBERT model * RoBERTa tokenization adding prefix space * Style
-
Manuel Romero authored
-
Manuel Romero authored
-
Oleksandr Bushkovskyi authored
Create model card for "Tereveni-AI/gpt2-124M-uk-fiction" model
-
Hu Xu authored
* add model_cards for BERT trained on reviews. * add link to repository. * refine README.md for each review model
-
- 19 May, 2020 12 commits
-
-
Sam Shleifer authored
-
Sam Shleifer authored
-
Patrick von Platen authored
* add longformer docs * improve docs
-
Patrick von Platen authored
* fix gpu slow tests in pytorch * change model to device syntax
-
Suraj Patil authored
* add T5 fine-tuning notebook [Community notebooks] * Update README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Sam Shleifer authored
-
Iz Beltagy authored
* first commit * bug fixes * better examples * undo padding * remove wrong VOCAB_FILES_NAMES * License * make style * make isort happy * unit tests * integration test * make `black` happy by undoing `isort` changes!! * lint * no need for the padding value * batch_size not bsz * remove unused type casting * seqlen not seq_len * staticmethod * `bert` selfattention instead of `n2` * uint8 instead of bool + lints * pad inputs_embeds using embeddings not a constant * black * unit test with padding * fix unit tests * remove redundant unit test * upload model weights * resolve todo * simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_ * increase unittest coverage
-
Girishkumar authored
-
Shaoyen authored
* Map optimizer to correct device after loading from checkpoint. * Make style test pass Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
Julien Chaumond authored
-
Julien Chaumond authored
* Distributed eval: SequentialDistributedSampler + gather all results * For consistency only write to disk from world_master Close https://github.com/huggingface/transformers/issues/4272 * Working distributed eval * Hook into scripts * Fix #3721 again * TPU.mesh_reduce: stay in tensor space Thanks @jysohn23 * Just a small comment * whitespace * torch.hub: pip install packaging * Add test scenarii
-
Julien Chaumond authored
* Test case for #3936 * multigpu tests pass on pytorch 1.4.0 * Fixup * multigpu tests pass on pytorch 1.5.0 * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * rename multigpu to require_multigpu * mode doc
-