- 30 Jul, 2019 1 commit
-
-
Myle Ott authored
Summary: The previous BSD+PATENTS license was controversial. We have been approved to relicense fairseq under the MIT license. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786 Differential Revision: D16560654 Pulled By: myleott fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034
-
- 30 May, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/613 Differential Revision: D15541384 Pulled By: myleott fbshipit-source-id: ef2c0b0a51cdf37af2ccff0546f524d49f87e65d
-
- 17 May, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/586 Differential Revision: D15372949 Pulled By: myleott fbshipit-source-id: c1cf1c645e8d55fc8568f23a47c45677ac9ab1da
-
- 14 May, 2019 1 commit
-
-
Dmytro Okhonko authored
Summary: Move `load_checkpoint`, `save_checkpoint` and `reload_train` from train.py to checkpoint_utils.py Move `get_perplexity` from train.py to utils.py. This will make train.py lighter and allow us to reuse all this utils functionality when fairseq is used as external library. Reviewed By: myleott Differential Revision: D15289607 fbshipit-source-id: 4b7c95225ac22e402bcda3497811361809110df1
-
- 07 May, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/704 Differential Revision: D15221549 Pulled By: myleott fbshipit-source-id: b0021acdc2d7792ce51421f1432e1f2bd8218f7b
-
- 06 May, 2019 1 commit
-
-
Naman Goyal authored
Summary: Co-authored-by:
myleott <myleott@fb.com> Changing `data` to be `str` with colon separated list for loading sharded datasets. This change is useful for loading large datasets that cannot fit into, memory. The large dataset can be sharded and then each shard is loaded in one epoch in roudrobin manner. For example, if there are `5` shards of data and `10` epochs then the shards will be iterated upon `[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]`. myleott We need to look into `translation.py` as it currently already expects a list and then concats the datasets. Pull Request resolved: https://github.com/pytorch/fairseq/pull/696 Differential Revision: D15214049 fbshipit-source-id: 03e43a7b69c7aefada2ca668abf1eac1969fe013
-
- 25 Jan, 2019 1 commit
-
-
Myle Ott authored
Summary: Changelog: - `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper - `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu - update READMEs - misc fixes Pull Request resolved: https://github.com/pytorch/fairseq/pull/473 Differential Revision: D13819717 Pulled By: myleott fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9
-
- 05 Jan, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/283 Pull Request resolved: https://github.com/pytorch/fairseq/pull/428 Differential Revision: D13564190 Pulled By: myleott fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
-
- 25 Sep, 2018 1 commit
-
-
Alexei Baevski authored
-
- 03 Sep, 2018 3 commits
- 15 Jun, 2018 6 commits
-
-
Myle Ott authored
A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss. Changes: - Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator. - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position - Remove LEFT_PAD_* constants and make them configurable per task
-
Myle Ott authored
-
Myle Ott authored
-
alexeib authored
-
alexeib authored
-
alexeib authored
This implements convolutional language model from https://arxiv.org/pdf/1612.08083.pdf There are 3 modes for constructing batches: - token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper - complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper - eos: one sentence per sample (skip blank lines) some results: GCNN-13 - GBW - 37.46 GCNN-14B - GBW - 33.88 GCNN-8 - Wiki103 - 43.76 GCNN-14 - Wiki103 - 35.66 train: python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500 eval: python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
-