Commits · e75cff5f2c1d62f12dc911e0bf420025eb1a4e33 · OpenDAS / Fairseq

30 Jul, 2019 1 commit

Relicense fairseq under MIT license (#786) · e75cff5f

Myle Ott authored Jul 30, 2019

Summary:
The previous BSD+PATENTS license was controversial. We have been
approved to relicense fairseq under the MIT license.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786

Differential Revision: D16560654

Pulled By: myleott

fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034

e75cff5f

30 May, 2019 1 commit

Add --reset-dataloader · ffc3bb58

Myle Ott authored May 30, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/613

Differential Revision: D15541384

Pulled By: myleott

fbshipit-source-id: ef2c0b0a51cdf37af2ccff0546f524d49f87e65d

ffc3bb58

17 May, 2019 1 commit

Clean up sharded train iterator · 3bfbb49b

Myle Ott authored May 16, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/586

Differential Revision: D15372949

Pulled By: myleott

fbshipit-source-id: c1cf1c645e8d55fc8568f23a47c45677ac9ab1da

3bfbb49b

14 May, 2019 1 commit

Move save/load checkpoint functions to utils · cd1e5c09

Dmytro Okhonko authored May 14, 2019

Summary:
Move `load_checkpoint`, `save_checkpoint` and `reload_train` from train.py to checkpoint_utils.py
Move `get_perplexity` from train.py to utils.py.
This will make train.py lighter and allow us to reuse all this utils functionality when fairseq is used as external library.

Reviewed By: myleott

Differential Revision: D15289607

fbshipit-source-id: 4b7c95225ac22e402bcda3497811361809110df1

cd1e5c09

07 May, 2019 1 commit

Improve init speed of TokenBlockDataset and EpochBatchIterator · e4edf27a

Myle Ott authored May 07, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/704

Differential Revision: D15221549

Pulled By: myleott

fbshipit-source-id: b0021acdc2d7792ce51421f1432e1f2bd8218f7b

e4edf27a

06 May, 2019 1 commit

allowing sharded dataset (#696) · 0add50c2

Naman Goyal authored May 06, 2019

Summary:
Co-authored-by: myleott <myleott@fb.com>

Changing `data` to be `str` with colon separated list for loading sharded datasets. This change is useful for loading large datasets that cannot fit into, memory. The large dataset can be sharded and then each shard is loaded in one epoch in roudrobin manner.

For example, if there are `5` shards of data and `10` epochs then the shards will be iterated upon `[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]`.

myleott We need to look into `translation.py` as it currently already expects a list and then concats the datasets.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/696

Differential Revision: D15214049

fbshipit-source-id: 03e43a7b69c7aefada2ca668abf1eac1969fe013

0add50c2

25 Jan, 2019 1 commit

Add code for "Pay Less Attention with Lightweight and Dynamic Convolutions" (#473) · b41c74dc

Myle Ott authored Jan 25, 2019

Summary:
Changelog:
- `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper
- `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu
- update READMEs
- misc fixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/473

Differential Revision: D13819717

Pulled By: myleott

fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9

b41c74dc

05 Jan, 2019 1 commit

Merge internal changes (#283) · 7633129b

Myle Ott authored Jan 04, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283

Pull Request resolved: https://github.com/pytorch/fairseq/pull/428

Differential Revision: D13564190

Pulled By: myleott

fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5

7633129b

25 Sep, 2018 1 commit
- core changes to support latte collab · cfd2a3a0
  Alexei Baevski authored Sep 20, 2018
  
  cfd2a3a0
03 Sep, 2018 3 commits
- Further generalize EpochBatchIterator and move iterators into new file · 0a7f9e64
  Myle Ott authored Aug 31, 2018
  
  0a7f9e64
- Clean up FairseqTask so that it's easier to extend/add new tasks · 2e507d3c
  Myle Ott authored Aug 30, 2018
  
  2e507d3c
- fix tests · 0b5166db
  alexeib authored Jul 31, 2018
  
  0b5166db
15 Jun, 2018 6 commits

Add FairseqTask · ff68a9ef

Myle Ott authored Jun 12, 2018

A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task

ff68a9ef

Suppress stdout in test_train · 736fbee2
Myle Ott authored Jun 04, 2018

736fbee2
Nits · cf1c64a5
Myle Ott authored May 30, 2018

cf1c64a5
record end_of_epoch in checkpoint · 7d560402
alexeib authored May 28, 2018

7d560402
fix restoring from middle of epoch; fix defaulting transformer dropout params · 978c125a
alexeib authored May 27, 2018

978c125a

Conv lm implementation · 4c2ef2de

alexeib authored May 25, 2018

This implements convolutional language model from https://arxiv.org/pdf/1612.08083.pdf

There are 3 modes for constructing batches:

- token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper
- complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper
- eos: one sentence per sample (skip blank lines)

some results:

GCNN-13 - GBW - 37.46
GCNN-14B - GBW - 33.88
GCNN-8 - Wiki103 - 43.76
GCNN-14 - Wiki103 - 35.66

train:

python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500

eval:

python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'

4c2ef2de