"parser/parser_test.go" did not exist on "63bc884e2503ebefc580c499a82460affe50628b"
  1. 03 Sep, 2018 1 commit
  2. 25 Jul, 2018 1 commit
      Transformer lm · d2e2a1d4
      Alexei Baevski authored
      This implements transformer based language model. It already obtains better perplexity on wikitext103 without any tuning. I will also train it on gbw where I also expect to get better ppl
      
      Example training command:
      
      python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 80 --save-interval 1 --arch transformer_lm --task language_modeling --optimizer nag --lr 0.008 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.6 --dropout 0.2 --criterion adaptive_loss --adaptive-softmax-cutoff 10000,50000,200000 --max-tokens 512 --tokens-per-sample 512 --seed 1 --sample-break-mode none --log-format json --log-interval 50 --save-interval-updates 2500 --keep-interval-updates 25
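      The --adaptive-softmax-cutoff 10000,50000,200000 flag partitions the vocabulary by frequency rank so that the adaptive loss spends most of its capacity on frequent words and handles rare words in cheaper tail clusters. A minimal sketch of how such cutoffs map a vocabulary id to a cluster (an illustration of the idea, not fairseq's implementation):

      ```python
      def cluster_for(token_id: int, cutoffs=(10000, 50000, 200000)) -> int:
          """Return the cluster index a vocabulary id falls into.

          Ids below the first cutoff form the head cluster; each subsequent
          cutoff bounds a progressively larger (and less frequent) tail cluster.
          """
          for i, cutoff in enumerate(cutoffs):
              if token_id < cutoff:
                  return i
          return len(cutoffs)  # ids beyond the last cutoff go to the final tail

      print(cluster_for(5))       # head cluster: most frequent words
      print(cluster_for(120000))  # a tail cluster
      ```

      Since ids are assigned by descending frequency, the head cluster covers the bulk of the tokens actually seen in training while the tails stay cheap.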
      A small transformer got to 31.3 ppl on WikiText-103 (compared to 35 with fconv), while @myleott got a big transformer LM to around 27 ppl on WikiText-103.
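      For reference on how the reported numbers relate to the training loss: perplexity is the exponential of the average per-token negative log-likelihood (in nats). A minimal sketch of the conversion, with an illustrative loss value chosen to land near the 31.3 ppl figure:

      ```python
      import math

      def perplexity(total_nll: float, num_tokens: int) -> float:
          """Convert a summed token-level NLL (in nats) to perplexity."""
          return math.exp(total_nll / num_tokens)

      # An average loss of roughly 3.44 nats/token corresponds to ~31.3 ppl.
      print(round(perplexity(3.444 * 1000, 1000), 1))
      ```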
  3. 21 Jun, 2018 3 commits
  4. 15 Jun, 2018 16 commits
  5. 27 Feb, 2018 4 commits
  6. 22 Jan, 2018 5 commits
  7. 06 Dec, 2017 1 commit
  8. 02 Dec, 2017 1 commit
  9. 13 Nov, 2017 1 commit
  10. 12 Nov, 2017 3 commits
  11. 08 Nov, 2017 4 commits