- 26 Sep, 2019 1 commit
thomwolf authored
- 18 Sep, 2019 2 commits
- 04 Aug, 2019 1 commit
Ethan Perez authored
Currently the L2 regularization is hard-coded to 0.01, even though there is a --weight_decay flag implemented (which is currently unused). I'm making this flag control the weight decay used for fine-tuning in this script.
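A minimal sketch of what such a change looks like, assuming the grouped-parameter convention used by the fine-tuning scripts in this repo (the names args.weight_decay and no_decay are illustrative, not the exact diff):

```python
# Hedged sketch: wire the --weight_decay flag into the optimizer instead of a
# hard-coded 0.01. Optimizer choice here (torch.optim.AdamW) is an assumption.
import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument("--weight_decay", type=float, default=0.01,
                    help="L2 weight decay applied during fine-tuning")
args = parser.parse_args([])  # empty list so the sketch runs stand-alone

model = torch.nn.Linear(16, 2)  # stand-in for the fine-tuned model

# Biases and LayerNorm weights are conventionally excluded from weight decay.
no_decay = ["bias", "LayerNorm.bias", "LayerNorm.weight"]
grouped_parameters = [
    {"params": [p for n, p in model.named_parameters()
                if not any(nd in n for nd in no_decay)],
     "weight_decay": args.weight_decay},  # previously the hard-coded 0.01
    {"params": [p for n, p in model.named_parameters()
                if any(nd in n for nd in no_decay)],
     "weight_decay": 0.0},
]
optimizer = torch.optim.AdamW(grouped_parameters, lr=5e-5)
```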
- 20 Jul, 2019 1 commit
Rabeeh KARIMI authored
- 14 Jul, 2019 1 commit
thomwolf authored
- 05 Jul, 2019 1 commit
thomwolf authored
- 11 May, 2019 1 commit
https://github.com/huggingface/pytorch-pretrained-BERT/issues/556
samuel.broscheit authored
The reason for the issue was that the optimization steps were computed from the number of examples, which differs from the actual size of the dataloader when an example is chunked into multiple instances. The solution in this pull request is to compute num_optimization_steps directly from len(data_loader).
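A minimal sketch of the fix described above, assuming conventional names such as num_train_epochs and gradient_accumulation_steps (this is illustrative, not the exact patch):

```python
# Derive the number of optimization steps from the dataloader length rather
# than the raw example count, since one example can be chunked into several
# training instances.
import torch
from torch.utils.data import DataLoader, TensorDataset

features = torch.randn(10, 4)            # stand-in for chunked training instances
labels = torch.zeros(10, dtype=torch.long)
train_dataloader = DataLoader(TensorDataset(features, labels), batch_size=4)

num_train_epochs = 3
gradient_accumulation_steps = 1

# The problematic version derived steps from len(train_examples); using
# len(train_dataloader) keeps the schedule consistent with what is iterated.
num_optimization_steps = (
    len(train_dataloader) // gradient_accumulation_steps * num_train_epochs
)
print(num_optimization_steps)  # 9 with the toy numbers above
```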
- 02 May, 2019 1 commit
MottoX authored
- 30 Apr, 2019 1 commit
Aneesh Pappu authored
Small fix to remove the shifting of LM labels during preprocessing of ROCStories, as this shifting happens internally in the model.
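A minimal sketch of why preprocessing should no longer shift the labels: in a standard causal-LM loss the logits and labels are shifted by one position inside the model, so the inputs should carry unshifted token ids (this is the generic pattern, not the exact OpenAI GPT head from this repo):

```python
import torch
import torch.nn.functional as F

def lm_loss(logits, labels):
    # logits: (batch, seq_len, vocab); labels: (batch, seq_len), *unshifted*
    shift_logits = logits[:, :-1, :].contiguous()  # position t predicts token t+1
    shift_labels = labels[:, 1:].contiguous()      # shift happens here, in the model
    return F.cross_entropy(shift_logits.view(-1, shift_logits.size(-1)),
                           shift_labels.view(-1))

logits = torch.randn(2, 5, 100)
labels = torch.randint(0, 100, (2, 5))  # pass raw token ids; do not pre-shift
print(lm_loss(logits, labels))
```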
- 15 Apr, 2019 2 commits
- 06 Mar, 2019 2 commits
- 28 Feb, 2019 1 commit
Catalin Voss authored
- 20 Feb, 2019 1 commit
Ben Johnson authored
- 09 Feb, 2019 1 commit
thomwolf authored
- 08 Feb, 2019 7 commits
- 07 Feb, 2019 1 commit
thomwolf authored
- 10 Jan, 2019 1 commit
thomwolf authored
- 08 Jan, 2019 1 commit
thomwolf authored