Commits · d17fa8513547cbbaee206f56c9c8e80f9bdb9196 · OpenDAS / Fairseq

"docs/vscode:/vscode.git/clone" did not exist on "bc80dc4ce0ae5a97b8a58faa1d8b8cfbb56e21f5"

12 Mar, 2019 1 commit

Dmytro Okhonko authored Mar 12, 2019

Summary: Adding Adadelta optimizer to fairseq as wrapper around torch.optim.Adadelta

Reviewed By: myleott

Differential Revision: D14418635

fbshipit-source-id: 6bf5ec008e905a4a2cbf7415e9492f5eea3ff07f

d17fa851

28 Feb, 2019 1 commit

Add test for mixture of experts · bc919276

Myle Ott authored Feb 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/543

Differential Revision: D14259481

Pulled By: myleott

fbshipit-source-id: fcb0a150b8e851cf86ea5ed1f083f56e1600588e

bc919276

01 Feb, 2019 1 commit

Support custom Dictionary implementations in 'preprocess.py' (#448) · bbb4120b

Davide Caroselli authored Feb 01, 2019

Summary:
The `preprocess.py` script has been refactored in order to:

1. Use the `options` module for command line arguments parsing. This will give to `preprocess.py` the ability to load custom modules with `--user-dir` flag (already implemented to all other binaries)
2. Dictionary loading and building code has moved to Task implementation. This allows custom Dictionary classes to be used during the data generation step.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/448

Differential Revision: D13674819

Pulled By: myleott

fbshipit-source-id: b40648a98ed6c08284577e5ec25876e018d8c822

bbb4120b

30 Jan, 2019 1 commit

Add --input option to interactive.py to support reading from file · 3dce7c9f

Myle Ott authored Jan 30, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/484

Differential Revision: D13880636

Pulled By: myleott

fbshipit-source-id: 984b2e1c3b281c28243102eb971ea45ec891d94e

3dce7c9f

25 Jan, 2019 1 commit

Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473) · b41c74dc

Myle Ott authored Jan 25, 2019

Summary:
Changelog:
- `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper
- `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu
- update READMEs
- misc fixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/473

Differential Revision: D13819717

Pulled By: myleott

fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9

b41c74dc

05 Jan, 2019 1 commit

Merge internal changes (#283) · 7633129b

Myle Ott authored Jan 04, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283

Pull Request resolved: https://github.com/pytorch/fairseq/pull/428

Differential Revision: D13564190

Pulled By: myleott

fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5

7633129b

03 Oct, 2018 1 commit

Fix proxying in DistributedFairseqModel · fc677c94

Myle Ott authored Oct 03, 2018

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/302

Differential Revision: D10174608

Pulled By: myleott

fbshipit-source-id: 4e2dfc76eae97afc5488f29b47e74f9897a643ff

fc677c94

25 Sep, 2018 2 commits
- Better support for various c10d API changes · fbe8ce65
  Myle Ott authored Sep 17, 2018
  
  fbe8ce65
- Update LM test with --no-c10d · 8bd8ec8f
  Myle Ott authored Sep 07, 2018
  
  8bd8ec8f
03 Sep, 2018 2 commits
- Test max_positions · d473620e
  Myle Ott authored Sep 02, 2018
  
  d473620e
- Factor out search logic in SequenceGenerator · ef43da72
  Myle Ott authored Aug 09, 2018
  
  ef43da72
25 Jul, 2018 1 commit
- Iterate on need_attn and fix tests · bb5f15d1
  Myle Ott authored Jul 12, 2018
  
  bb5f15d1
21 Jun, 2018 1 commit
- Fix `--output-format raw` option to preprocess.py (Fixes #188) (#190) · 572a1d55
  Myle Ott authored Jun 21, 2018
  
  572a1d55
15 Jun, 2018 5 commits

Fix bidirectional lstm · bfcc6ec7
Myle Ott authored Jun 12, 2018

bfcc6ec7

Add FairseqTask · ff68a9ef

Myle Ott authored Jun 12, 2018

A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task

ff68a9ef

Add more integration tests (LM, stories, transformer, lstm) · 16a72b4d
Myle Ott authored Jun 04, 2018

16a72b4d

Conv lm implementation · 4c2ef2de

alexeib authored May 25, 2018

This implements convolutional language model from https://arxiv.org/pdf/1612.08083.pdf

There are 3 modes for constructing batches:

- token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper
- complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper
- eos: one sentence per sample (skip blank lines)

some results:

GCNN-13 - GBW - 37.46
GCNN-14B - GBW - 33.88
GCNN-8 - Wiki103 - 43.76
GCNN-14 - Wiki103 - 35.66

train:

python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500

eval:

python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'

4c2ef2de

Fix tests · ae2585d9
Myle Ott authored May 24, 2018

ae2585d9

24 May, 2018 1 commit
- Merge internal changes (#163) · ec0031df
  Myle Ott authored May 24, 2018
  
  ec0031df
02 Apr, 2018 1 commit

Merge internal changes (#136) · d3795d6c

Myle Ott authored Apr 02, 2018

Changes:
- 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
- c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model
- 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
- small bugfixes for distributed training, LSTM, inverse square root LR scheduler

d3795d6c

27 Feb, 2018 3 commits

More unit test fixes · 0d90e35f
Myle Ott authored Feb 15, 2018

0d90e35f
Fix tests and flake8 · 29c82741
Myle Ott authored Feb 15, 2018

29c82741

fairseq-py goes distributed (#106) · 66415206

Myle Ott authored Feb 27, 2018

This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage

66415206