- 25 Sep, 2018 8 commits
  - Myle Ott authored
  - Stephen Roller authored
  - Myle Ott authored
  - Myle Ott authored
  - Sergey Edunov authored:
    - no more FP16Trainer; we just have an FP16Optimizer wrapper
    - most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
    - Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker; we hide the gradients from these dummy batches by multiplying the loss by 0 (see the sketch after this list)
    - Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq handling
  - Myle Ott authored
  - Stephen Roller authored
  - Stephen Roller authored
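The dummy-batch handling described in the commit above can be sketched in a few lines of PyTorch. This is a hypothetical illustration, not fairseq's actual Trainer API: the `train_step` signature and the `net_input`/`target` keys are assumptions.

```python
def train_step(model, criterion, optimizer, sample, is_dummy=False):
    """One fwd/bwd/update step; `sample` is a hypothetical dict of tensors.

    When a worker runs out of real batches, it still calls this with a dummy
    batch (is_dummy=True) so the distributed gradient all-reduce does not hang,
    but the loss is multiplied by 0 so the dummy batch contributes no gradient.
    """
    output = model(**sample["net_input"])
    loss = criterion(output, sample["target"])
    if is_dummy:
        loss = loss * 0.0  # keeps the graph (and the all-reduce) alive, zeroes the grads
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.detach()
```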
- 24 Sep, 2018 2 commits
  - Sergey Edunov authored: Update readme with WMT'18 model (#433)
  - Sergey Edunov authored
- 18 Sep, 2018 4 commits
  - Sergey Edunov authored: Oss master
  - Sergey Edunov authored
  - Sergey Edunov authored
  - Sergey Edunov authored
- 07 Sep, 2018 1 commit
  - Angela Fan authored
- 04 Sep, 2018 1 commit
  - Myle Ott authored
- 03 Sep, 2018 24 commits
  - Myle Ott authored
  - Myle Ott authored
  - Myle Ott authored
  - alexeib authored
  - Myle Ott authored
  - Myle Ott authored
  - alexeib authored
  - Myle Ott authored
  - Myle Ott authored
  - Li Zhao authored
  - Alexei Baevski authored: also don't crash if a param does not receive grads
  - Myle Ott authored
  - Myle Ott authored
  - Alexei Baevski authored
  - Sergey Edunov authored
  - Myle Ott authored
  - Alexei Baevski authored
  - Louis Martin authored
  - Myle Ott authored
  - Myle Ott authored
  - alexeib authored
  - Myle Ott authored
  - Myle Ott authored
  - Alexei Baevski authored