- 21 Jun, 2018 1 commit
Myle Ott authored

- 15 Jun, 2018 10 commits
Myle Ott authored

Myle Ott authored
A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with the @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position.
- Remove LEFT_PAD_* constants and make them configurable per task.
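The per-task padding-side option can be illustrated with a minimal collate sketch (a pure-Python stand-in; pad_batch and its arguments are hypothetical names, not fairseq's API):

```python
def pad_batch(seqs, pad_idx, left_pad=False):
    """Pad a list of token-id lists to equal length.

    left_pad=True puts padding before the tokens (as the old
    LEFT_PAD_* constants did globally), left_pad=False pads after.
    """
    max_len = max(len(s) for s in seqs)
    padded = []
    for s in seqs:
        padding = [pad_idx] * (max_len - len(s))
        padded.append(padding + s if left_pad else s + padding)
    return padded

batch = [[4, 5, 6], [7, 8]]
print(pad_batch(batch, pad_idx=1, left_pad=True))   # [[4, 5, 6], [1, 7, 8]]
print(pad_batch(batch, pad_idx=1, left_pad=False))  # [[4, 5, 6], [7, 8, 1]]
```

Making the padding side a per-task setting lets, e.g., a translation task left-pad sources while a language-modeling task right-pads everything.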
Myle Ott authored

Myle Ott authored

Myle Ott authored

Angela Fan authored

alexeib authored
This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf

There are 3 modes for constructing batches:
- token block: fill each sample with a specified number of tokens without regard for sentence delimiters; this is what was used for training in the paper
- complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if the next sentence goes over the token block limit, move it to the next sample); this was used for evaluation in the paper
- eos: one sentence per sample (skip blank lines)

Some results:
GCNN-13 - GBW - 37.46
GCNN-14B - GBW - 33.88
GCNN-8 - Wiki103 - 43.76
GCNN-14 - Wiki103 - 35.66

train:
python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500

eval:
python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
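The first two batching modes can be sketched in plain Python (a simplified illustration with hypothetical function names, not fairseq's implementation):

```python
def token_block(stream, block_size):
    """'token block' mode: chop a flat token stream into fixed-size
    samples, ignoring sentence boundaries."""
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]

def complete(sentences, block_size):
    """'complete' mode: pack whole sentences into a sample until the
    next sentence would overflow the limit, then start a new sample."""
    samples, cur = [], []
    for sent in sentences:
        if cur and len(cur) + len(sent) > block_size:
            samples.append(cur)
            cur = []
        cur = cur + sent
    if cur:
        samples.append(cur)
    return samples

sents = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
stream = [t for s in sents for t in s]
print(token_block(stream, 4))  # [[1, 2, 3, 4], [5, 6, 7, 8], [9]]
print(complete(sents, 5))      # [[1, 2, 3, 4, 5], [6, 7, 8, 9]]
```

The 'eos' mode is simply one (non-blank) sentence per sample, so it needs no packing logic.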
Alexei Baevski authored

Alexei Baevski authored

Alexei Baevski authored
Remove completed sentences from the batch and allow batching uneven lengths (with fixes to make padded sequences work correctly in all models)
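Pruning finished sentences from the active batch during decoding can be sketched like this (a simplified stand-in with hypothetical names, not the actual fairseq generator code):

```python
EOS = 2  # assumed end-of-sentence token id

def prune_finished(batch, outputs, last_tokens):
    """Move sequences whose last emitted token is EOS out of the
    active batch, so later decode steps run on fewer rows."""
    active = []
    for i, seq in enumerate(batch):
        if last_tokens[i] == EOS:
            outputs.append(seq + [EOS])  # finished: collect the result
        else:
            active.append(seq)           # still decoding
    return active

done = []
batch = [[5, 6], [7, 8], [9, 4]]
batch = prune_finished(batch, done, last_tokens=[2, 3, 2])
print(batch)  # [[7, 8]]
print(done)   # [[5, 6, 2], [9, 4, 2]]
```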
- 02 Apr, 2018 1 commit
Myle Ott authored
Changes:
- 7d19e36: Add --sampling flag to generate.py to sample instead of doing beam search
- c777340: Add scripts/average_checkpoints.py to average multiple checkpoints into a combined model
- 3ea882c: Add --max-update option to train.py to stop training after a given number of updates
- Small bugfixes for distributed training, LSTM, and the inverse square root LR scheduler
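Checkpoint averaging as in scripts/average_checkpoints.py amounts to an element-wise mean over matching parameters; a minimal pure-Python sketch (real checkpoints hold torch tensors, and average_params is a hypothetical name):

```python
def average_params(state_dicts):
    """Element-wise mean of matching parameters across checkpoints.
    Parameters are modeled here as flat lists of floats."""
    n = len(state_dicts)
    return {
        key: [sum(vals) / n for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }

ckpt_a = {"w": [1.0, 2.0], "b": [0.0]}
ckpt_b = {"w": [3.0, 4.0], "b": [2.0]}
print(average_params([ckpt_a, ckpt_b]))  # {'w': [2.0, 3.0], 'b': [1.0]}
```

Averaging the last few checkpoints often gives a small quality boost over using the final checkpoint alone.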
- 05 Mar, 2018 1 commit
Sergey Edunov authored
* Allow more flexible pre-processing and generation
* Addressing CR comments
* Small fix
- 01 Mar, 2018 1 commit
Myle Ott authored
- 27 Feb, 2018 3 commits
Dario Pavllo authored
* Add prefix
* Fixes
* Keep original scores with prefix
* Improve prefix code
* Replace 'repeat' with 'expand'
Myle Ott authored

Myle Ott authored
This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See the updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update the LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage
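The register_* extension points follow the standard decorator-registry pattern; a minimal sketch of the idea (names and structure simplified, not fairseq's actual code):

```python
MODEL_REGISTRY = {}

def register_model(name):
    """Decorator that records a model class under a string name,
    so it can later be looked up from a command-line flag like --arch."""
    def wrapper(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"duplicate model name: {name}")
        MODEL_REGISTRY[name] = cls
        return cls
    return wrapper

@register_model("toy_lstm")
class ToyLSTMModel:
    pass

print(MODEL_REGISTRY["toy_lstm"].__name__)  # ToyLSTMModel
```

register_criterion and register_optimizer work the same way against their own registries, which is what lets new components plug in without touching the trainer.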
- 22 Jan, 2018 2 commits

- 13 Nov, 2017 1 commit
Myle Ott authored
- 12 Nov, 2017 1 commit
Myle Ott authored
- 08 Nov, 2017 8 commits
Louis Martin authored
* Add <eos> for unk replacement
* Add IndexedRawTextDataset to load raw text files
* Replace unk with original string
* Add load_raw_text_dataset() and --output-format
* Move has_binary_files to data.py
Myle Ott authored

Louis Martin authored

Myle Ott authored

Myle Ott authored

Louis Martin authored
* Split generate.py into generate.py and interactive.py and refactor code

The main motivation behind these changes is to try to decorrelate use cases in order to implement future improvements, such as unk replacement with the original string during evaluation on test and writing predictions to an output file. The previous implementation worked well, but I found it difficult to integrate these future improvements.

* Add --replace-unk arg to be used without an align dict

Replacing <unk> tokens can be beneficial even without an alignment dictionary.
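Unknown-token replacement substitutes each <unk> in the hypothesis with the source word it aligns to, optionally translated through an alignment dictionary (a simplified sketch; replace_unk and its signature are illustrative, not fairseq's API):

```python
def replace_unk(hypo_tokens, src_tokens, alignment, align_dict=None):
    """For each <unk> in the hypothesis, copy the aligned source word;
    if an alignment dictionary is given, translate that word first.
    alignment[i] is the source position aligned to hypothesis position i."""
    out = []
    for i, tok in enumerate(hypo_tokens):
        if tok == "<unk>":
            src_word = src_tokens[alignment[i]]
            if align_dict:
                src_word = align_dict.get(src_word, src_word)
            out.append(src_word)
        else:
            out.append(tok)
    return out

hypo = ["the", "<unk>", "is", "red"]
src = ["la", "voiture", "est", "rouge"]
align = [0, 1, 2, 3]
print(replace_unk(hypo, src, align))                      # ['the', 'voiture', 'is', 'red']
print(replace_unk(hypo, src, align, {"voiture": "car"}))  # ['the', 'car', 'is', 'red']
```

Without an align dict, copying the source word verbatim is still useful for names, numbers, and other pass-through tokens.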
Michael Auli authored

Myle Ott authored
* Move some functionality out of FConvModel into FairseqModel base class
* Move incremental decoding functionality into FairseqIncrementalDecoder module
* Refactor positional embeddings to be more specific to FConvModel
- 19 Oct, 2017 5 commits
Myle Ott authored

Myle Ott authored

Louis Martin authored

Myle Ott authored

Myle Ott authored
- 11 Oct, 2017 3 commits
Sergey Edunov authored

Sergey Edunov authored

Myle Ott authored
- 26 Sep, 2017 1 commit
Myle Ott authored

- 18 Sep, 2017 1 commit
Sergey Edunov authored

- 15 Sep, 2017 1 commit
Sergey Edunov authored