Commits · 4cd2bb702b2715d3cd38e7106c9d52c3b229eba2 · OpenDAS / Fairseq

15 Jun, 2018 25 commits
- Revert "Make dictionary size a multiple of 8" · 4cd2bb70
  Myle Ott authored Apr 12, 2018
```
This reverts commit b2e119c209363e6ff6d2878a69c7d1a507a2e9be.
```
- Make dictionary size a multiple of 8 · 26f87c7d
  Myle Ott authored Apr 11, 2018
- Add FP16 support · 7ee1d284
  Myle Ott authored Apr 10, 2018
- Fix batching during generation · 73a87327
  Myle Ott authored Apr 07, 2018
- Allow schedule for update-freq · 47b3b81c
  Myle Ott authored Apr 07, 2018
- Improve dataloader speed and deprecate concept of batch_offset (use... · 4fa8760e
  Myle Ott authored Apr 07, 2018
```
Improve dataloader speed and deprecate concept of batch_offset (use --sample-without-replacement instead)
```
- better batching · c52f6ea4
  Sergey Edunov authored Apr 07, 2018
- Use FP32 for multi-head attention softmax · d6be0c7e
  Myle Ott authored Apr 07, 2018
- Simulated big batches · 2d27ae08
  Sergey Edunov authored Apr 07, 2018
- More improvements to weight init and FP16 support · 60c4081b
  Myle Ott authored Apr 06, 2018
- Use PyTorch LayerNorm and improve weight init · 36e360d9
  Myle Ott authored Apr 05, 2018
- smarter way to avoid applying encoder key mask · fc830685
  alexeib authored Apr 05, 2018
- caching v3 (cache keys, values, process only last time step) (#241) · b2374e52
  Alexei Baevski authored Apr 05, 2018
```
- process only last time step during generation
- cache keys and values
- dont apply masking during generation
```
- Fix buffers in sinusoidal positional embeddings · 81b47e7e
  Myle Ott authored Apr 03, 2018
- Fix flake8 · 5935fe2f
  Myle Ott authored Apr 02, 2018
- Bug fixes · f68a4435
  Myle Ott authored Apr 01, 2018
- Pass args around to cleanup parameter lists · 1235aa08
  Myle Ott authored Mar 13, 2018
- Remove Google batching stategy (it's not needed) · 559eca81
  Myle Ott authored Mar 06, 2018
- Add Transformer model · 97b58b46
  Myle Ott authored Mar 05, 2018
- address comments · 6a7c8d0d
  alexeib authored Apr 09, 2018
- fix optim history · fe54ea54
  alexeib authored Apr 08, 2018
- Fix LSTM · b84070b7
  Myle Ott authored Apr 06, 2018
- Faster fconv generation · 871be389
  Myle Ott authored Apr 04, 2018
- Remove sweep_log prefix from json progress bar · 0e8414f9
  Myle Ott authored Apr 02, 2018
- 0.4.0 -> 0.5.0 · d62a8651
  Myle Ott authored Jun 14, 2018
24 May, 2018 1 commit
- Merge internal changes (#163) · ec0031df
  Myle Ott authored May 24, 2018
22 May, 2018 1 commit
- Update dataset code for use by https://github.com/pytorch/translate/pull/62 (#161) · 29153e27
  theweiho authored May 22, 2018
21 May, 2018 1 commit
- Fix old model checkpoints after #151 (fixes #156) (#157) · 3ae97589
  Myle Ott authored May 21, 2018
09 May, 2018 3 commits
- Flake8 · 4973d05a
  Myle Ott authored May 09, 2018
- Add pretrained embedding support (#151) · e40363d7
  Sai authored May 09, 2018
- use implicit padding when possible (#152) · 48c4c6d3
  ngimel authored May 09, 2018
01 May, 2018 3 commits
- Update README.md · 66ee3df9
  Myle Ott authored May 01, 2018
- Disallow --batch-size in interactive.py · 56099c74
  Myle Ott authored May 01, 2018
- make interactive mode print out alignment nicely · 6532e32b
  alexeib authored Apr 11, 2018
02 Apr, 2018 1 commit
- Merge internal changes (#136) · d3795d6c
  Myle Ott authored Apr 02, 2018

  Changes:
  - 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
  - c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model
  - 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
  - small bugfixes for distributed training, LSTM, inverse square root LR scheduler
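
The checkpoint averaging mentioned in c777340 can be illustrated with a minimal sketch. This is not the contents of `scripts/average_checkpoints.py`; it only shows the general idea of element-wise averaging, and it assumes each checkpoint is a plain dict mapping parameter names to flat lists of floats (a stand-in for real tensors). The function name `average_checkpoints` and the parameter names in the example are hypothetical.

```python
from collections import OrderedDict


def average_checkpoints(states):
    """Average parameter values element-wise across several checkpoints.

    `states` is a list of dicts mapping parameter names to flat lists of
    floats. Real checkpoints hold tensors; lists are used here only to
    keep the sketch self-contained.
    """
    if not states:
        raise ValueError("need at least one checkpoint")

    keys = list(states[0].keys())
    for state in states[1:]:
        # All checkpoints must describe the same parameters in the same order.
        if list(state.keys()) != keys:
            raise KeyError("checkpoints have different parameter names")

    averaged = OrderedDict()
    for key in keys:
        # Group the i-th value of this parameter from every checkpoint.
        columns = zip(*(state[key] for state in states))
        averaged[key] = [sum(col) / len(states) for col in columns]
    return averaged


# Hypothetical parameter names and values, for illustration only.
ckpt_a = {"embed.weight": [1.0, 2.0], "out.bias": [0.0]}
ckpt_b = {"embed.weight": [3.0, 6.0], "out.bias": [1.0]}
combined = average_checkpoints([ckpt_a, ckpt_b])
```

Each parameter in `combined` is the arithmetic mean of the corresponding values in the input checkpoints.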

28 Mar, 2018 4 commits
- Merge pull request #134 from hitvoice/master · 48836525
  Sergey Edunov authored Mar 28, 2018
```
Update training commands
```
- Update training command for IWSLT14 · 0a141e3f
  Runqi Yang authored Mar 29, 2018
```
specify a single GPU setup for IWSLT14
```

- Update training commands · 435ed351
  Runqi Yang authored Mar 28, 2018

  Update training commands in data/README to match the latest version of this project according to #132.

  Continue from 3c072958: add omitted "\".

- Update training commands · 3c072958
  Runqi Yang authored Mar 28, 2018

  Update training commands in data/README to match the latest version of this project according to #132.

  - Motivation: in the previous data/README, the commands are obsolete and will cause the error "unrecognized arguments: --label-smoothing 0.1 --force-anneal 50".
  - What's changed: add arguments "--criterion label_smoothed_cross_entropy" and "--lr-scheduler fixed" to the training commands of all 3 datasets.
  - Result: the new commands run without error on all 3 datasets.
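
One plausible reading of the error in 3c072958 is that `--label-smoothing` and `--force-anneal` are only registered once the matching criterion and LR scheduler have been selected. The sketch below reproduces that behaviour with plain `argparse`. It is an assumption-laden illustration, not the project's actual option parsing: the function `parse_training_args` is hypothetical, and the default values chosen for `--criterion` and `--lr-scheduler` are assumptions.

```python
import argparse


def parse_training_args(argv):
    """Two-pass parse: component-specific flags exist only if selected."""
    parser = argparse.ArgumentParser(prog="train.py", add_help=False)
    # Defaults below are assumptions made for this sketch.
    parser.add_argument("--criterion", default="cross_entropy",
                        choices=["cross_entropy", "label_smoothed_cross_entropy"])
    parser.add_argument("--lr-scheduler", default="reduce_lr_on_plateau",
                        choices=["reduce_lr_on_plateau", "fixed"])

    # First pass: find out which components were requested.
    known, _ = parser.parse_known_args(argv)

    # Register flags that belong to the selected components only.
    if known.criterion == "label_smoothed_cross_entropy":
        parser.add_argument("--label-smoothing", type=float, default=0.0)
    if known.lr_scheduler == "fixed":
        parser.add_argument("--force-anneal", type=int)

    # Second pass: anything still unknown would be reported as
    # "unrecognized arguments" by a strict parse_args() call.
    return parser.parse_known_args(argv)


old_cmd = ["--label-smoothing", "0.1", "--force-anneal", "50"]
new_cmd = ["--criterion", "label_smoothed_cross_entropy",
           "--lr-scheduler", "fixed"] + old_cmd

_, unknown_old = parse_training_args(old_cmd)
args_new, unknown_new = parse_training_args(new_cmd)
```

With the old command, both flags end up in `unknown_old`; with the two selector arguments added, `unknown_new` is empty and the values are parsed normally.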

27 Mar, 2018 1 commit
- Merge remote-tracking branch 'upstream/master' · 4972056e
  杨润琦 authored Mar 28, 2018