- 03 Nov, 2018 16 commits
-
VictorSanh authored
-
VictorSanh authored
Create DataParallel model if several GPUs
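This commit wraps the model in `torch.nn.DataParallel` when more than one GPU is visible. A minimal sketch of that pattern (the `Linear` stand-in model and sizes are hypothetical, not the repo's actual model):

```python
import torch

# Stand-in for the real BERT model built by the training script.
model = torch.nn.Linear(768, 2)

n_gpu = torch.cuda.device_count()
if n_gpu > 1:
    # Replicates the module on every visible GPU and scatters each
    # input batch across them along dim 0.
    model = torch.nn.DataParallel(model)
```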
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
I seriously don't understand why they defined num_train_epochs as a float in the original tf code. I will change it at the end to avoid merge conflicts for now.
-
VictorSanh authored
-
Tim Rault authored
-
thomwolf authored
-
Tim Rault authored
-
thomwolf authored
-
thomwolf authored
-
thomwolf authored
-
thomwolf authored
-
- 02 Nov, 2018 24 commits
-
VictorSanh authored
Please review @thomwolf, but I think this is equivalent (and it mimics the loss computation of the original implementation).
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
Tim Rault authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
thomwolf authored
-
Tim Rault authored
-
thomwolf authored
-
VictorSanh authored
-
Tim Rault authored
-
VictorSanh authored
Error was coming from "modeling_pytorch.py", line 484, in forward: start_loss = loss_fct(start_logits, start_positions) --> ValueError: Expected target size (12, 1), got torch.Size([12])
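The reported `ValueError` arises because `CrossEntropyLoss` given a 3-D input of shape `(N, C, d1)` expects a target of shape `(N, d1)`. A sketch reproducing the mismatch and one way to resolve it by squeezing the trailing dimension of the logits (the dimensions are illustrative, and this is not necessarily the exact fix applied in the commit):

```python
import torch
import torch.nn as nn

batch, seq_len = 12, 384
# The QA head produces (batch, seq_len, 2); splitting gives two
# (batch, seq_len, 1) tensors.
logits = torch.randn(batch, seq_len, 2)
start_logits, end_logits = logits.split(1, dim=-1)

# Without squeezing, start_logits is (12, 384, 1), so CrossEntropyLoss
# expects a (12, 1) target -> the ValueError quoted above.
start_logits = start_logits.squeeze(-1)  # (batch, seq_len)
end_logits = end_logits.squeeze(-1)

start_positions = torch.randint(0, seq_len, (batch,))
loss_fct = nn.CrossEntropyLoss()
start_loss = loss_fct(start_logits, start_positions)  # shapes now agree
```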
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-
VictorSanh authored
-