- 03 Aug, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/801 Differential Revision: D16628318 Pulled By: myleott fbshipit-source-id: 50e93bb9108afd2ba90f1edd4f34306a7c9964a4
-
- 02 Aug, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/797 Differential Revision: D16617067 Pulled By: myleott fbshipit-source-id: 52e3aeb98d6e3b55ff9154b784028bf13eabfe38
-
- 01 Aug, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/792 Differential Revision: D16591987 Pulled By: myleott fbshipit-source-id: d27c490ae75f80ded19226b8384f4776485dd694
-
- 30 Jul, 2019 1 commit
-
-
Myle Ott authored
Summary: The previous BSD+PATENTS license was controversial. We have been approved to relicense fairseq under the MIT license. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786 Differential Revision: D16560654 Pulled By: myleott fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034
-
- 19 Jul, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/734 Differential Revision: D16377044 Pulled By: myleott fbshipit-source-id: 37d5553d76aa7c653113fec089f59710281c31d7
-
- 17 Jul, 2019 1 commit
-
-
Xing Zhou authored
Summary: Implement Nucleus (top-P) sampling: sample among the smallest set of elements whose cumulative probability mass exceeds p. To test it: python generate.py ~myleott/data/data-bin/wmt17_zh_en_full/ --path ~myleott/zh_en/model.pt --remove-bpe --nbest 5 --beam 5 --sampling --sampling-topp 0.3 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/710 Test Plan: python generate.py ~myleott/data/data-bin/wmt17_zh_en_full/ --path ~myleott/zh_en/model.pt --remove-bpe --nbest 5 --beam 5 --sampling --sampling-topp 0.3 python tests/test_sequence_generator.py python tests/test_binaries.py Reviewed By: myleott Differential Revision: D16286688 Pulled By: xingz9 fbshipit-source-id: 1776d21e17c4532a3d24ac75bb7e75da9acad58f
-
- 11 Jun, 2019 2 commits
-
-
Myle Ott authored
Summary: This is a temporary workaround to support sampling after https://github.com/pytorch/fairseq/issues/713. We'll need to revisit this to support sampling and beam more generally. Pull Request resolved: https://github.com/pytorch/fairseq/pull/796 Differential Revision: D15760808 Pulled By: myleott fbshipit-source-id: ecaf4f161b0c30de037f32007e4610a559a49230
-
yilinyang7 authored
when given prefix_tokens, sequence generator would generate (exactly) same finished candidates (#713) Summary: https://github.com/pytorch/fairseq/issues/712 Pull Request resolved: https://github.com/pytorch/fairseq/pull/713 Differential Revision: D15242432 Pulled By: myleott fbshipit-source-id: a230ee48f4bf891c805609c428d7233a0ad21179
-
- 02 Jun, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/625 Differential Revision: D15595787 Pulled By: myleott fbshipit-source-id: ba6edf305ed41be392194f492e034dd66d1743fe
-
- 04 May, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/508 The previous version applied the temperature after the softmax. Fix that, and also generalize so it works with other search approaches. Pull Request resolved: https://github.com/pytorch/fairseq/pull/694 Differential Revision: D15175160 Pulled By: myleott fbshipit-source-id: cc87ff0e97a8a1dd37f9983163f58a8641155ab0
-
- 22 Apr, 2019 1 commit
-
-
Max Ryabinin authored
Summary: Because the size of `unfinalized_scores` is equal to current `bsz` and not initial batch size, we need to index it by `unfin_idx` instead of `sent` in `is_finished`. Fixes #588. Pull Request resolved: https://github.com/pytorch/fairseq/pull/627 Differential Revision: D15034641 Pulled By: myleott fbshipit-source-id: 2638e68e877ae01256cac7d8e69b5b7fec8f7017
-
- 12 Mar, 2019 1 commit
-
-
Dmytro Okhonko authored
Summary: sequence_generator assumes that model input is 2d tensor of longs. But it can be something like 3d tensor of floats and we should be able to handle this as long as first dimension is batch size followed by source lengths. Reviewed By: myleott Differential Revision: D14420044 fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015
-
- 26 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/528 Differential Revision: D14218377 Pulled By: myleott fbshipit-source-id: facb0a32f6aebf56a4fea7259080394ad2d2d846
-
- 22 Feb, 2019 2 commits
-
-
Myle Ott authored
Summary: Code for the paper: [Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)](https://arxiv.org/abs/1902.07816). Pull Request resolved: https://github.com/pytorch/fairseq/pull/521 Differential Revision: D14188021 Pulled By: myleott fbshipit-source-id: ed5b1ed5ad9a582359bd5215fa2ea26dc76c673e
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/351 This makes it easier for tasks to plugin to generate.py/interactive.py Pull Request resolved: https://github.com/pytorch/fairseq/pull/520 Differential Revision: D14183881 Pulled By: myleott fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33
-
- 16 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/505 Differential Revision: D14110201 Pulled By: myleott fbshipit-source-id: 099ce61fa386c016f3a1d1815c6fe1a9a6c9005d
-
- 05 Jan, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/283 Pull Request resolved: https://github.com/pytorch/fairseq/pull/428 Differential Revision: D13564190 Pulled By: myleott fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
-
- 26 Dec, 2018 1 commit
-
-
Myle Ott authored
Summary: - 04cc608: Add `--match-source-len` option to generate.py to for sequence-tagging tasks - 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking Pull Request resolved: https://github.com/pytorch/fairseq/pull/422 Differential Revision: D13548445 Pulled By: myleott fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9
-
- 30 Nov, 2018 1 commit
-
-
linkerr authored
Summary: ….LongTensor but found type torch.cuda.FloatTensor for argument #3 'index' " error in the torch.__version__ == 0.4.0 , new_order = torch.arange(bsz).view(-1, 1).repeat(1, beam_size).view(-1) will return a float dtype Tensor, when exec the "line 321: fairseq/fairseq/models/fconv.py " will throw a RuntimeError Pull Request resolved: https://github.com/pytorch/fairseq/pull/393 Differential Revision: D13276496 Pulled By: myleott fbshipit-source-id: e7986246fbe2c79fff61bcab0e5bec9dd63e0afd
-
- 30 Sep, 2018 1 commit
-
-
myleott authored
-
- 25 Sep, 2018 5 commits
-
-
Alexei Baevski authored
-
Stephen Roller authored
-
Myle Ott authored
-
Stephen Roller authored
-
Stephen Roller authored
-
- 03 Sep, 2018 2 commits
- 25 Jul, 2018 2 commits
-
-
Myle Ott authored
-
Alexei Baevski authored
-
- 21 Jun, 2018 1 commit
-
-
Myle Ott authored
-
- 15 Jun, 2018 10 commits
-
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss. Changes: - Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator. - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position - Remove LEFT_PAD_* constants and make them configurable per task
-
Myle Ott authored
Co-authored-by:pmichel31415 <pmichel@fb.com>
-
Myle Ott authored
-
Angela Fan authored
-
alexeib authored
This implements convolutional language model from https://arxiv.org/pdf/1612.08083.pdf There are 3 modes for constructing batches: - token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper - complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper - eos: one sentence per sample (skip blank lines) some results: GCNN-13 - GBW - 37.46 GCNN-14B - GBW - 33.88 GCNN-8 - Wiki103 - 43.76 GCNN-14 - Wiki103 - 35.66 train: python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500 eval: python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
-
Alexei Baevski authored
-
Myle Ott authored
-