Commits · 0a7f9e64bb7a306efb0ad25d6f03dc05a92dbfb9 · OpenDAS / Fairseq

03 Sep, 2018 8 commits
- Further generalize EpochBatchIterator and move iterators into new file · 0a7f9e64
  Myle Ott authored Aug 31, 2018
  
  0a7f9e64
- Clean up FairseqTask so that it's easier to extend/add new tasks · 2e507d3c
  Myle Ott authored Aug 30, 2018
  
  2e507d3c
- Diverse Beam Search · 8c0ca1a0
  Myle Ott authored Aug 10, 2018
  
  8c0ca1a0
- fix tests · f1d81db8
  alexeib authored Aug 09, 2018
  
  f1d81db8
- Factor out search logic in SequenceGenerator · ef43da72
  Myle Ott authored Aug 09, 2018
  
  ef43da72
- fix tests · 0b5166db
  alexeib authored Jul 31, 2018
  
  0b5166db
- add flag that allows keeping optimizer config · 2dc074d8
  alexeib authored Jul 28, 2018
```
adds --reset-optimizer, --reset-lr-scheduler, and --optimizer-overrides flags
```
  2dc074d8
- character token embeddings for word level predictions · 885e7ec9
  Alexei Baevski authored Jul 28, 2018
  
  885e7ec9
25 Jul, 2018 1 commit
- Iterate on need_attn and fix tests · bb5f15d1
  Myle Ott authored Jul 12, 2018
  
  bb5f15d1
25 Jun, 2018 2 commits
- Remove more Variable() calls (#198) · 6edf81dd
  Myle Ott authored Jun 25, 2018
  
  6edf81dd
- Fix attention order in unit tests (fixes #195) (#197) · 74efc214
  Myle Ott authored Jun 25, 2018
  
  74efc214
24 Jun, 2018 1 commit
- Fix for Dictionary.finalize · c6fe9fc5
  Myle Ott authored Jun 24, 2018
  
  c6fe9fc5
21 Jun, 2018 2 commits
- Move reorder_encoder_out to FairseqEncoder and fix non-incremental decoding · 6ec5022e
  Myle Ott authored Jun 21, 2018
  
  6ec5022e
- Fix `--output-format raw` option to preprocess.py (Fixes #188) (#190) · 572a1d55
  Myle Ott authored Jun 21, 2018
  
  572a1d55
15 Jun, 2018 11 commits

Fix bidirectional lstm · bfcc6ec7
Myle Ott authored Jun 12, 2018

bfcc6ec7
Updates for latest PyTorch · e89329d6
Myle Ott authored Jun 12, 2018

e89329d6

Myle Ott authored Jun 12, 2018

A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task
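
The registration pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `TASK_REGISTRY`, `BaseTask`, `setup_task`, and the `left_pad_source` attribute are assumed names chosen for the example, and only `register_task`, `TranslationTask`, and `LanguageModelingTask` come from the commit message.

```python
# Hypothetical sketch: names other than register_task, TranslationTask and
# LanguageModelingTask are assumptions made for illustration.
TASK_REGISTRY = {}


def register_task(name):
    """Class decorator that records a task class under `name`."""
    def wrapper(cls):
        if name in TASK_REGISTRY:
            raise ValueError(f"Cannot register duplicate task: {name}")
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper


class BaseTask:
    """Stores shared state (here, a dictionary) and per-task settings."""

    # A per-task attribute stands in for a former global LEFT_PAD_* constant.
    left_pad_source = True

    def __init__(self, dictionary):
        self.dictionary = dictionary


@register_task("translation")
class TranslationTask(BaseTask):
    left_pad_source = True


@register_task("language_modeling")
class LanguageModelingTask(BaseTask):
    left_pad_source = False


def setup_task(name, dictionary):
    """Look up a registered task by name and instantiate it."""
    return TASK_REGISTRY[name](dictionary)
```

With this layout, a new task only needs a decorated class; calling `setup_task("language_modeling", {"<pad>": 0})` would return a `LanguageModelingTask` whose padding side is set on the class rather than by a module-level constant.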

ff68a9ef

Add more integration tests (LM, stories, transformer, lstm) · 16a72b4d
Myle Ott authored Jun 04, 2018

16a72b4d
Suppress stdout in test_train · 736fbee2
Myle Ott authored Jun 04, 2018

736fbee2
Nits · cf1c64a5
Myle Ott authored May 30, 2018

cf1c64a5
record end_of_epoch in checkpoint · 7d560402
alexeib authored May 28, 2018

7d560402
fix restoring from middle of epoch; fix defaulting transformer dropout params · 978c125a
alexeib authored May 27, 2018

978c125a

Conv lm implementation · 4c2ef2de

alexeib authored May 25, 2018

This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf

There are 3 modes for constructing batches:

- token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper
- complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if the next sentence would exceed the token block limit, move it to the next sample) - this was used for evaluation in the paper
- eos: one sentence per sample (skip blank lines)
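
The three modes above can be illustrated with a small sketch. It is a simplified reading of the descriptions in this message, not the implementation in this commit: the function name `build_samples`, the mode strings, and the choice to let a single over-long sentence form its own sample in `complete` mode are all assumptions.

```python
def build_samples(sentences, block_size, mode):
    """Group tokenized sentences into samples.

    sentences: list of token lists; an empty list represents a blank line.
    mode: "token_block", "complete", or "eos" (hypothetical mode names).
    """
    if mode == "eos":
        # One sentence per sample; blank lines are skipped.
        return [list(s) for s in sentences if s]

    if mode == "token_block":
        # Ignore sentence boundaries: cut the flat token stream into blocks.
        stream = [tok for s in sentences for tok in s]
        return [stream[i:i + block_size]
                for i in range(0, len(stream), block_size)]

    if mode == "complete":
        # Only whole sentences per sample; a sentence that would overflow
        # the current block starts the next sample instead.
        # Assumption: a sentence longer than block_size gets its own sample.
        samples, current = [], []
        for s in sentences:
            if not s:
                continue
            if current and len(current) + len(s) > block_size:
                samples.append(current)
                current = []
            current.extend(s)
        if current:
            samples.append(current)
        return samples

    raise ValueError(f"unknown mode: {mode}")
```

The sketch only covers how sentences are grouped; padding, batching by `--max-tokens`, and target construction are left out.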

some results:

GCNN-13 - GBW - 37.46
GCNN-14B - GBW - 33.88
GCNN-8 - Wiki103 - 43.76
GCNN-14 - Wiki103 - 35.66

train:

python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500

eval:

python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'

4c2ef2de

Fix tests · ae2585d9
Myle Ott authored May 24, 2018

ae2585d9
Fix tests · 8afb7761
Myle Ott authored Apr 24, 2018

8afb7761

24 May, 2018 1 commit
- Merge internal changes (#163) · ec0031df
  Myle Ott authored May 24, 2018
  
  ec0031df
02 Apr, 2018 1 commit

Merge internal changes (#136) · d3795d6c

Myle Ott authored Apr 02, 2018

Changes:
- 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
- c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model
- 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
- small bugfixes for distributed training, LSTM, inverse square root LR scheduler
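
As an illustration of what averaging multiple checkpoints into a combined model means, here is a minimal sketch. It is not the contents of `scripts/average_checkpoints.py`: the function below is hypothetical, and plain Python lists of floats stand in for parameter tensors so the example runs without any dependencies.

```python
def average_checkpoints(state_dicts):
    """Average parameters elementwise across checkpoints.

    state_dicts: list of {parameter_name: list of floats}. Lists are an
    assumption standing in for real parameter tensors.
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint")

    keys = set(state_dicts[0])
    for sd in state_dicts[1:]:
        if set(sd) != keys:
            raise KeyError("checkpoints do not share the same parameters")

    n = len(state_dicts)
    averaged = {}
    for name in state_dicts[0]:
        # zip(*...) walks the same position of this parameter in every checkpoint.
        columns = zip(*(sd[name] for sd in state_dicts))
        averaged[name] = [sum(col) / n for col in columns]
    return averaged
```

For example, averaging `{"w": [1.0, 2.0]}` and `{"w": [3.0, 4.0]}` gives `{"w": [2.0, 3.0]}`.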

d3795d6c

05 Mar, 2018 1 commit
- Filter padding properly in LabelSmoothedCrossEntropyCriterion (#229) · e73fddf4
  Myle Ott authored Mar 04, 2018
  
  e73fddf4
01 Mar, 2018 1 commit
- More updates for PyTorch (#114) · 6e4d370a
  Myle Ott authored Mar 01, 2018
  
  6e4d370a
27 Feb, 2018 4 commits

Refactor incremental generation to be more explicit and less magical (#222) · 9438019f
Myle Ott authored Feb 24, 2018

9438019f
More unit test fixes · 0d90e35f
Myle Ott authored Feb 15, 2018

0d90e35f
Fix tests and flake8 · 29c82741
Myle Ott authored Feb 15, 2018

29c82741

fairseq-py goes distributed (#106) · 66415206

Myle Ott authored Feb 27, 2018

This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage

66415206

08 Nov, 2017 2 commits

Rename LabelSmoothedCrossEntropy to LabelSmoothedNLLLoss · e1f49695
Myle Ott authored Nov 07, 2017

e1f49695

Refactor model definitions · 6e4b7e22

Myle Ott authored Oct 25, 2017

* Move some functionality out of FConvModel into FairseqModel base class
* Move incremental decoding functionality into FairseqIncrementalDecoder module
* Refactor positional embeddings to be more specific to FConvModel
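
The idea behind incremental decoding for a convolutional decoder can be sketched as follows: rather than re-running the whole prefix at every generation step, the decoder caches the most recent inputs and processes only the new token. This is a toy illustration under stated assumptions, not the `FairseqIncrementalDecoder` code: the class `CausalConv`, its method names, and the `state` dictionary are invented for the example, and scalars stand in for embedding vectors.

```python
class CausalConv:
    """Toy 1-D causal convolution over scalars (hypothetical stand-in)."""

    def __init__(self, kernel):
        self.kernel = list(kernel)

    def forward(self, inputs):
        """Full pass: output[t] depends on inputs[t-k+1 .. t], zero-padded on the left."""
        k = len(self.kernel)
        padded = [0.0] * (k - 1) + list(inputs)
        return [
            sum(w * x for w, x in zip(self.kernel, padded[t:t + k]))
            for t in range(len(inputs))
        ]

    def incremental_forward(self, new_input, state):
        """One step: `state` caches the last k inputs between calls."""
        k = len(self.kernel)
        buffer = state.setdefault("buffer", [0.0] * k)
        buffer.pop(0)             # drop the oldest cached input
        buffer.append(new_input)  # add the newest token's input
        return sum(w * x for w, x in zip(self.kernel, buffer))
```

Feeding tokens one at a time through `incremental_forward` with a shared `state` dictionary yields the same outputs as a single `forward` call over the full sequence, while each step touches only `k` cached values.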

6e4b7e22

11 Oct, 2017 1 commit
- Fix call ordering to ATen addmm and sum (#22) · ae0c05d9
  Sam Gross authored Oct 11, 2017
  
  ae0c05d9
15 Sep, 2017 1 commit
- Initial commit · e734b0fa
  Sergey Edunov authored Sep 14, 2017
  
  e734b0fa