1. 25 Apr, 2019 5 commits
  2. 24 Apr, 2019 1 commit
  3. 22 Apr, 2019 2 commits
    • 
      Fix generation with --no-early-stop (#627) · fa52d202
      Max Ryabinin authored
      Summary:
      Because the size of `unfinalized_scores` is equal to the current `bsz` and not the initial batch size, we need to index it by `unfin_idx` instead of `sent` in `is_finished`.
      Fixes #588.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/627
      
      Differential Revision: D15034641
      
      Pulled By: myleott
      
      fbshipit-source-id: 2638e68e877ae01256cac7d8e69b5b7fec8f7017
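      A minimal stand-alone illustration of the indexing issue described above (plain Python lists standing in for the tensors; not fairseq's actual code):

```python
init_scores = [0.1, 0.4, 0.2, 0.9]   # one score per sentence in the initial batch
unfinished = [1, 3]                  # original ids of sentences still generating

# unfinalized_scores is sized by the *current* bsz (2), not the initial one (4)
unfinalized_scores = [init_scores[i] for i in unfinished]

sent = 3                             # original sentence id
unfin_idx = unfinished.index(sent)   # index within the shrunken batch -> 1

# Correct: index by position within the current (shrunken) batch
assert unfinalized_scores[unfin_idx] == init_scores[sent]
# Indexing by `sent` (3) would be out of range: len(unfinalized_scores) == 2
```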
    • 
      reduce memory footprint for average_checkpoints (#647) · d63477e1
      Yongqiang Wang authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/647
      
      the current implementation of average_checkpoints requires loading all
      the model parameters into memory and then averaging them. To average large
      models (e.g., transformer) over a large number of checkpoints (e.g., >50),
      this can require over 100GB of memory.
      
      Loading all the parameters is not necessary, as we know the number of models in advance.
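      A hedged sketch of the low-memory approach (a running sum over dicts of floats standing in for tensors; function names are illustrative, not fairseq's actual implementation):

```python
def average_checkpoints_lowmem(load_fn, paths):
    """Average parameters with a running sum, keeping at most one
    checkpoint in memory at a time. `load_fn(path)` returns a dict of
    parameters (floats here stand in for tensors)."""
    totals = {}
    for p in paths:
        state = load_fn(p)              # load one checkpoint...
        for k, v in state.items():
            totals[k] = totals.get(k, 0.0) + v
        # ...which can be freed before the next one is loaded
    n = len(paths)                      # number of models known in advance
    return {k: v / n for k, v in totals.items()}

# Toy usage: three "checkpoints" with a single parameter each
ckpts = {"a": {"w": 1.0}, "b": {"w": 2.0}, "c": {"w": 3.0}}
averaged = average_checkpoints_lowmem(lambda p: ckpts[p], ["a", "b", "c"])
```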
      
      Reviewed By: skritika
      
      Differential Revision: D15027513
      
      fbshipit-source-id: 0afe37c9a031a9ab0f1e78844a37be49ec5f76f1
  4. 17 Apr, 2019 3 commits
  5. 16 Apr, 2019 1 commit
  6. 15 Apr, 2019 3 commits
  7. 12 Apr, 2019 1 commit
  8. 10 Apr, 2019 4 commits
  9. 09 Apr, 2019 2 commits
  10. 07 Apr, 2019 1 commit
    • 
      move distributed_init after get_batch_iterator · 34028c63
      Haoran Li authored
      Summary: There are constant wait timeout issues when using multiple nodes; even setting copylocallytempdir:/ doesn't help (e.g., f105637629). It seems to work after moving distributed_init after get_batch_iterator (e.g., f106520580).
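      A self-contained sketch of the reordering this commit describes (function bodies are illustrative stubs, not fairseq's actual code): the slow, per-node data loading now happens before the distributed rendezvous, so no node burns rendezvous-timeout budget waiting on another node's I/O.

```python
call_order = []

def get_batch_iterator():
    # stand-in for the slow, per-node dataset loading
    call_order.append("get_batch_iterator")
    return iter([])

def distributed_init():
    # stand-in for the rendezvous with the other nodes
    call_order.append("distributed_init")

def main():
    epoch_itr = get_batch_iterator()  # moved up: load data first
    distributed_init()                # then synchronize across nodes
    return epoch_itr

main()
```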
      
      Reviewed By: myleott
      
      Differential Revision: D14817769
      
      fbshipit-source-id: edbb101a28d8082241c7bdd8c5500c9dad27647c
  11. 05 Apr, 2019 3 commits
  12. 04 Apr, 2019 1 commit
    • 
      aligned training task and CE related changes · 3658fa32
      Jay Mahadeokar authored
      Summary:
      This diff adds:
      
      1. An aligned training task, specifically for cross-entropy criterion training using prod data and prod-like models
      2. A few changes to correctly register the task and criterions.
      3. Changes to trainer code for propagating the accuracy metrics we care about during training.
      
      Couple of things are hacky right now:
      - The reporting is not modular (this needs to be thought about more generally for fairseq).
      
      - The get_dummy_batch logic could be specific to the task instead of the dataset.
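      The task registration mentioned in point 2 follows a name-keyed registry pattern; a minimal stand-in version (not fairseq's actual implementation, and the task name below is hypothetical) looks like:

```python
TASK_REGISTRY = {}

def register_task(name):
    """Decorator that records a task class in the registry under `name`."""
    def wrapper(cls):
        if name in TASK_REGISTRY:
            raise ValueError(f"task {name!r} already registered")
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

@register_task("aligned_training")  # hypothetical task name
class AlignedTrainingTask:
    pass
```

      Criterions are registered the same way against their own registry, which is how the trainer can look both up by the names given on the command line.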
      
      Reviewed By: myleott
      
      Differential Revision: D14670482
      
      fbshipit-source-id: dc077247b2ae9d26a8e842a386ec5faa5771e836
  13. 03 Apr, 2019 2 commits
  14. 02 Apr, 2019 3 commits
  15. 29 Mar, 2019 5 commits
  16. 26 Mar, 2019 1 commit
  17. 19 Mar, 2019 2 commits