1. 02 May, 2019 1 commit
      Make CTC work with more encoder-only models · ffc9c8cc
      Kritika Singh authored
      Summary:
      Changes include:
      1. Added get_normalized_probabilities to the encoder-only base class FairseqEncoderModel
      2. Made CTCCriterion work for both batch_first (LSTMSubsampleEncoderModel) and batch_second (LSTMEncoderOnly) encoder types
      3. Added tests for different encoder and CTC combinations.
      
      TODO:
      CTC still doesn't work for VGGLSTMEncoderModel, so I have disabled it for now. I will debug and send out a fix in another diff.
      
      Reviewed By: jay-mahadeokar
      
      Differential Revision: D15158818
      
      fbshipit-source-id: acb484bad705c937d676d2c3dcde3e3562d68ed9
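      The batch_first/batch_second change above can be sketched as follows. This is a hedged illustration, not fairseq's actual code: CTC loss conventionally expects time-major inputs of shape (T, B, C), while a batch_first encoder emits (B, T, C), so a criterion that supports both layouts transposes when needed. The function name and the nested-list representation are illustrative assumptions.

      ```python
      # Hypothetical helper: normalize encoder output to the time-major
      # (T, B, C) layout that CTC loss expects. batch_first encoders
      # produce (B, T, C); batch_second encoders already produce (T, B, C).
      def to_time_major(logits, batch_first):
          """Return logits as a (T, B, C) nested list."""
          if not batch_first:
              return logits  # already time-major
          # transpose the leading two dimensions: (B, T, C) -> (T, B, C)
          B, T = len(logits), len(logits[0])
          return [[logits[b][t] for b in range(B)] for t in range(T)]

      # toy batch: B=2 sentences, T=3 frames, C=2 classes, batch-major
      batch_first_out = [[[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]],
                         [[0.4, 0.6], [0.5, 0.5], [0.6, 0.4]]]
      time_major = to_time_major(batch_first_out, batch_first=True)
      print(len(time_major), len(time_major[0]))  # prints: 3 2  (T, B)
      ```

      With such a helper, a single criterion can serve both encoder families by branching only on the layout flag rather than duplicating the loss computation.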
  2. 01 May, 2019 5 commits
  3. 30 Apr, 2019 6 commits
  4. 29 Apr, 2019 2 commits
  5. 27 Apr, 2019 2 commits
  6. 26 Apr, 2019 1 commit
  7. 25 Apr, 2019 6 commits
  8. 24 Apr, 2019 1 commit
  9. 22 Apr, 2019 2 commits
      Fix generation with --no-early-stop (#627) · fa52d202
      Max Ryabinin authored
      Summary:
      Because the size of `unfinalized_scores` is equal to current `bsz` and not initial batch size, we need to index it by `unfin_idx` instead of `sent` in `is_finished`.
      Fixes #588.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/627
      
      Differential Revision: D15034641
      
      Pulled By: myleott
      
      fbshipit-source-id: 2638e68e877ae01256cac7d8e69b5b7fec8f7017
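      The indexing bug above can be illustrated with a simplified sketch (the names `unfin_idx`, `sent`, and `unfinalized_scores` follow the commit message; the mapping function is an illustrative assumption, not fairseq's code). Once some sentences finish, the working batch shrinks, so buffers like `unfinalized_scores` are sized by the current `bsz`; a row in that buffer is addressed by `unfin_idx`, while `sent` is the sentence's index in the original batch, so indexing the buffer by `sent` reads the wrong row once the two diverge.

      ```python
      # Hypothetical helper: map each current-batch index (unfin_idx)
      # back to its original-batch index (sent).
      def remaining_to_original(finished):
          """finished[i] is True if original sentence i was finalized."""
          return [sent for sent, done in enumerate(finished) if not done]

      # original batch of 4; sentences 0 and 2 were already finalized
      finished = [True, False, True, False]
      mapping = remaining_to_original(finished)
      print(mapping)  # prints: [1, 3] -- unfin_idx 0 is original sentence 1

      # a buffer sized by the *current* bsz (2), one row per unfinished sentence
      unfinalized_scores = [0.7, 0.3]
      unfin_idx = 1                 # second row of the current batch
      sent = mapping[unfin_idx]     # original sentence id 3, out of range here
      assert unfinalized_scores[unfin_idx] == 0.3  # correct row: use unfin_idx
      ```

      Indexing by `sent` (3) would fall outside the 2-row buffer entirely, which is why the fix switches `is_finished` to `unfin_idx`.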
      reduce memory footprint for average_checkpoints (#647) · d63477e1
      Yongqiang Wang authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/647
      
      The current implementation of average_checkpoints requires loading all
      the model parameters into memory and then averaging them. To average large
      models (e.g., transformer) over a large number of checkpoints (e.g., >50),
      it may require over 100GB of memory.
      
      Loading all the parameters at once is not necessary, since we know the number of models in advance.
      
      Reviewed By: skritika
      
      Differential Revision: D15027513
      
      fbshipit-source-id: 0afe37c9a031a9ab0f1e78844a37be49ec5f76f1
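      The memory reduction described above can be sketched as a running sum: since the number of checkpoints N is known up front, only one checkpoint's parameters need to be resident at a time, and the accumulator is divided by N at the end. This is a minimal sketch with plain dicts standing in for torch state_dicts; the function signature is an assumption for illustration, not fairseq's actual API.

      ```python
      # Hypothetical sketch: average N checkpoints with ~one model's worth
      # of peak memory, instead of holding all N parameter sets at once.
      def average_checkpoints(load_fns):
          """load_fns: callables, each returning one checkpoint's param dict."""
          n = len(load_fns)  # number of models is known in advance
          avg = None
          for load in load_fns:
              state = load()  # only one checkpoint resident at a time
              if avg is None:
                  avg = {k: float(v) for k, v in state.items()}
              else:
                  for k, v in state.items():
                      avg[k] += v  # accumulate into the running sum
          return {k: v / n for k, v in avg.items()}

      # two toy "checkpoints" loaded lazily
      ckpts = [lambda: {"w": 1.0, "b": 0.0},
               lambda: {"w": 3.0, "b": 2.0}]
      print(average_checkpoints(ckpts))  # prints: {'w': 2.0, 'b': 1.0}
      ```

      In the real setting each `load` call would deserialize one checkpoint from disk, so peak memory stays roughly constant regardless of how many checkpoints are averaged.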
  10. 17 Apr, 2019 3 commits
  11. 16 Apr, 2019 1 commit
  12. 15 Apr, 2019 3 commits
  13. 12 Apr, 2019 1 commit
  14. 10 Apr, 2019 4 commits
  15. 09 Apr, 2019 2 commits