- 25 Apr, 2019 1 commit
-
ankur6ue authored
Summary: Added a link to a blog post about incremental decoding to the FairseqIncrementalDecoder class description. Pull Request resolved: https://github.com/pytorch/fairseq/pull/662 Differential Revision: D15077845 Pulled By: myleott fbshipit-source-id: f23294721739600e14feb2cca4ece95f2b968f44
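For readers following the link, here is a minimal sketch of the caching pattern incremental decoding relies on, using a toy GRU decoder (the model itself is an illustrative assumption, not fairseq code):

```python
import torch.nn as nn
from fairseq import utils
from fairseq.models import FairseqIncrementalDecoder

class ToyIncrementalDecoder(FairseqIncrementalDecoder):
    """Caches its recurrent state so each generation step is O(1), not O(n)."""

    def __init__(self, dictionary, embed_dim=32):
        super().__init__(dictionary)
        self.embed = nn.Embedding(len(dictionary), embed_dim)
        self.rnn = nn.GRU(embed_dim, embed_dim, batch_first=True)
        self.out = nn.Linear(embed_dim, len(dictionary))

    def forward(self, prev_output_tokens, encoder_out=None, incremental_state=None):
        if incremental_state is not None:
            # Incremental mode: only the newest token needs to be processed.
            prev_output_tokens = prev_output_tokens[:, -1:]
        x = self.embed(prev_output_tokens)
        hidden = utils.get_incremental_state(self, incremental_state, 'hidden')
        x, hidden = self.rnn(x, hidden)
        if incremental_state is not None:
            utils.set_incremental_state(self, incremental_state, 'hidden', hidden)
        return self.out(x), None
```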
-
- 16 Apr, 2019 1 commit
-
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/635 Adds a task and the relevant models, datasets, and criteria needed for training cross-lingual language models, similar to the masked language model used in XLM (Lample and Conneau, 2019 - https://arxiv.org/abs/1901.07291). Reviewed By: liezl200 Differential Revision: D14943776 fbshipit-source-id: 3e416a730303d1dd4f5b92550c78db989be27073
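For orientation, a hedged sketch of the masked-LM objective at the heart of this task: roughly 15% of tokens are replaced with a mask symbol and become the prediction targets (names like `ignore_idx` are illustrative, not the PR's API):

```python
import torch

def mask_tokens(tokens, mask_idx, mask_prob=0.15, ignore_idx=-100):
    # tokens: LongTensor of token ids; mask_idx: id of the <mask> symbol
    mask = torch.rand(tokens.shape) < mask_prob
    targets = tokens.masked_fill(~mask, ignore_idx)  # compute loss only on masked slots
    inputs = tokens.masked_fill(mask, mask_idx)      # hide the chosen tokens
    return inputs, targets
```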
-
- 15 Apr, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/625 Differential Revision: D14822123 Pulled By: myleott fbshipit-source-id: 8a263d30020588577ee02fb8c6959ff918705103
-
- 12 Apr, 2019 1 commit
-
Liezl Puzon authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/633 Pull Request resolved: https://github.com/pytorch/translate/pull/456 This diff makes it easier to upgrade the state dict for components that use TransformerEncoderLayer Reviewed By: jhcross Differential Revision: D14916941 fbshipit-source-id: 6d0258c8a9492a720684dadce59c90fc87cbf5cf
-
- 10 Apr, 2019 2 commits
-
Liezl Puzon authored
Summary: Added an `upgrade_state_dict` function so that loading old models still works; it renames `layer_norms[0]` --> `self_attn_layer_norm` and `layer_norms[1]` --> `final_layer_norm`. Reviewed By: pipibjc Differential Revision: D14689849 fbshipit-source-id: b2809262c11fe9d083e571fa31044798aefd48ce
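A minimal sketch of what such an upgrade amounts to, assuming old checkpoints stored both layer norms in a `layer_norms` list (the helper below is illustrative, not the exact diff):

```python
def upgrade_state_dict(state_dict):
    mapping = {
        'layer_norms.0': 'self_attn_layer_norm',
        'layer_norms.1': 'final_layer_norm',
    }
    for key in list(state_dict.keys()):
        for old, new in mapping.items():
            if old in key:
                # Move the tensor to its new name so old checkpoints still load.
                state_dict[key.replace(old, new)] = state_dict.pop(key)
                break
    return state_dict
```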
-
Peng-Jen Chen authored
Summary:
- Add language token to MultilingualTranslation task (see the sketch below)
- Add back translation and denoising loss to MultilingualTranslation task

Pull Request resolved: https://github.com/pytorch/fairseq/pull/620 Reviewed By: liezl200 Differential Revision: D14756873 Pulled By: pipibjc fbshipit-source-id: 89d668db26848fd95f446edf5923bab2113636f7
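As referenced in the first bullet, a small sketch of what adding a language token means in practice (function and token names are illustrative assumptions):

```python
import torch

def prepend_lang_token(tokens, lang_token_idx):
    # tokens: 1-D LongTensor for one sentence; lang_token_idx: id of e.g. '__en__'
    return torch.cat([tokens.new_tensor([lang_token_idx]), tokens])
```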
-
- 15 Mar, 2019 1 commit
-
Myle Ott authored
Summary: Changelog:
- `998ba4f`: Add language models from Baevski & Auli (2018)
- `4294c4f6`: Add mixture of experts code from Shen et al. (2019)
- `00493490`: Add example for multilingual training
- `48d9afbe`: Speed improvements, including fused operators from apex
- `44d27e64`: Add Tensorboard support
- `d17fa851`: Add Adadelta optimizer
- `9e1c880f`: Add `FairseqEncoderModel`
- `b65c579b`: Add `FairseqTask.inference_step` to modularize generate.py (see the sketch below)
- `2ad1178e`: Add back `--curriculum`
- Misc bug fixes and other features

Pull Request resolved: https://github.com/pytorch/fairseq/pull/577 Differential Revision: D14481233 Pulled By: myleott fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7
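As flagged in the `b65c579b` bullet, generation now routes through a task hook, so subclasses can override it. A hedged sketch (the body mirrors the likely default behavior, but is an assumption):

```python
import torch
from fairseq.tasks import FairseqTask

class MyTask(FairseqTask):
    def inference_step(self, generator, models, sample, prefix_tokens=None):
        # Constrain or post-process hypotheses here before returning them.
        with torch.no_grad():
            return generator.generate(models, sample, prefix_tokens=prefix_tokens)
```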
-
- 14 Mar, 2019 1 commit
-
Myle Ott authored
Summary:
* Add FusedLayerNorm and FusedAdam
* Softmax and zero grad optimizations

Pull Request resolved: https://github.com/pytorch/fairseq/pull/531 Differential Revision: D14218457 Pulled By: myleott fbshipit-source-id: 5656b2d0152cd85f77dc21ec0e1439ec04b9fa89
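A common guarded-import pattern for the apex fused ops named above; a sketch of the idea, not necessarily fairseq's exact wiring:

```python
try:
    # apex provides a fused CUDA kernel with the same interface as nn.LayerNorm
    from apex.normalization import FusedLayerNorm as LayerNorm
except ImportError:
    from torch.nn import LayerNorm  # portable fallback when apex is absent
```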
-
- 12 Mar, 2019 1 commit
-
Dmytro Okhonko authored
Summary: Base class for encoder-only models, since some models don't have a decoder. Reviewed By: myleott Differential Revision: D14413406 fbshipit-source-id: f36473b91dcf3c835fd6d50e2eb6002afa75f11a
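A minimal sketch of an encoder-only model on top of the new base class (the toy encoder is an illustrative assumption):

```python
import torch.nn as nn
from fairseq.models import FairseqEncoder, FairseqEncoderModel

class ToyEncoder(FairseqEncoder):
    def __init__(self, dictionary, embed_dim=32):
        super().__init__(dictionary)
        self.embed = nn.Embedding(len(dictionary), embed_dim)

    def forward(self, src_tokens, src_lengths=None):
        return {'encoder_out': self.embed(src_tokens)}

class ToyEncoderModel(FairseqEncoderModel):
    # The base class delegates forward(src_tokens, src_lengths) to the
    # encoder, so an encoder-only model needs nothing beyond its encoder.
    pass
```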
-
- 16 Feb, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/505 Differential Revision: D14110201 Pulled By: myleott fbshipit-source-id: 099ce61fa386c016f3a1d1815c6fe1a9a6c9005d
-
- 30 Jan, 2019 1 commit
-
Myle Ott authored
Summary: Changelog:
- `4889802`: can now detokenize sentencepiece output with `--remove-bpe=sentencepiece` (fixes #331). Also added `--sacrebleu` for computing detokenized BLEU.
- `0d76427`: fix assertion error when training a language model on a dataset containing empty sentences
- minor bug and style fixes

Pull Request resolved: https://github.com/pytorch/fairseq/pull/483 Differential Revision: D13867899 Pulled By: myleott fbshipit-source-id: 25c940b847fe270262ac8f5ac838407b3977fdda
-
- 25 Jan, 2019 2 commits
-
Myle Ott authored
Summary: Changelog:
- `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper
- `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu
- update READMEs
- misc fixes

Pull Request resolved: https://github.com/pytorch/fairseq/pull/473 Differential Revision: D13819717 Pulled By: myleott fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/471 Differential Revision: D13818918 Pulled By: myleott fbshipit-source-id: d3b8dc50e81ee1d2dcc5efc5815998be8461085f
-
- 24 Jan, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/470 Differential Revision: D13803964 Pulled By: myleott fbshipit-source-id: 91b66599e9a539833fcedea07c608b349ba3b449
-
- 17 Jan, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/454 Differential Revision: D13708565 Pulled By: myleott fbshipit-source-id: 5cd0e07e3e1885eef14e3a5e8074f24cf4bde632
-
- 14 Jan, 2019 1 commit
-
Huihui Fan authored
Summary: Minor fixes:
1. Add fairseq logo
2. Encoder padding for fconv self-attention
3. Legacy DDP change

Pull Request resolved: https://github.com/pytorch/fairseq/pull/442 Differential Revision: D13651715 Pulled By: myleott fbshipit-source-id: ac93c80f1dbffdfe03fbd4b8a8ea527aecb576a7
-
- 10 Jan, 2019 1 commit
-
Wei Ho authored
Summary: https://github.com/pytorch/fairseq/blob/master/fairseq/trainer.py#L164 calls `train()` without any argument Reviewed By: myleott Differential Revision: D13599203 fbshipit-source-id: 3a096a6dd35a7a3f8309fbda3b54a36f606475e3
-
- 05 Jan, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/283 Pull Request resolved: https://github.com/pytorch/fairseq/pull/428 Differential Revision: D13564190 Pulled By: myleott fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
-
- 26 Dec, 2018 1 commit
-
Emanuele Bugliarello authored
Summary: Add argument `--no-token-positional-embeddings` to TransformerModel (currently only available in TransformerLanguageModel) to disable positional embeddings. Pull Request resolved: https://github.com/pytorch/fairseq/pull/421 Differential Revision: D13548450 Pulled By: myleott fbshipit-source-id: b352c702ed1609e3b84d9a8404941d3274a7f883
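Roughly how such a flag gets registered on an argument parser (help text paraphrased; a sketch, not the PR's exact code):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--no-token-positional-embeddings', action='store_true',
                    help='if set, disables positional embeddings')
args = parser.parse_args(['--no-token-positional-embeddings'])
assert args.no_token_positional_embeddings
```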
-
- 06 Dec, 2018 3 commits
-
Myle Ott authored
Summary: Not switching to Black formatting just yet, but adding fmt: off directives in case we decide to later. Pull Request resolved: https://github.com/pytorch/fairseq/pull/399 Differential Revision: D13364674 Pulled By: myleott fbshipit-source-id: a20a11a18be3d583ee30eff770278fb4bd05b93c
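For reference, Black honors these directives, so manually aligned regions survive a later switch to auto-formatting:

```python
# fmt: off
LAYER_SIZES = [
    512,  1024,
    2048, 4096,
]
# fmt: on
```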
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/398 Differential Revision: D13358876 Pulled By: myleott fbshipit-source-id: 57673f2643aac01492cb8f5728bb9f1a34ba6aa7
-
Teng Li authored
Summary: As the title says, it's better to enable this for certain use cases, to make sure things are right. Reviewed By: myleott, pietern Differential Revision: D13351753 fbshipit-source-id: cf495960fda71ebd679c23212e19703c93a9dbdc
-
- 27 Nov, 2018 1 commit
-
Liezl Puzon authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/386 Pull Request resolved: https://github.com/pytorch/translate/pull/266 This allows decoder embedding sharing for denoising autoencoder modules with different decoders (one for src decoding and one for tgt decoding) Reviewed By: dpacgopinath Differential Revision: D13133015 fbshipit-source-id: 3c98be639d705744ccf5ba3a8fd7d10ddc7aef4a
-
- 26 Nov, 2018 1 commit
-
Myle Ott authored
Fix some recursive functions (e.g., reorder_incremental_state) to only touch each module once (#379) Summary: This can happen if a module is registered in more than one place in the network. Pull Request resolved: https://github.com/pytorch/fairseq/pull/379 Differential Revision: D13154498 Pulled By: myleott fbshipit-source-id: a35575d1956a46cd35ac8b16a719ad20ac3e380a
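A sketch of the de-duplication this describes, assuming submodules are walked with `nn.Module.apply` (names are illustrative):

```python
import torch.nn as nn

def reorder_incremental_state(decoder: nn.Module, incremental_state, new_order):
    seen = set()

    def apply_reorder(module):
        # Guard with a seen-set so a module registered in two places in the
        # network is only reordered once.
        if module is not decoder and hasattr(module, 'reorder_incremental_state') \
                and id(module) not in seen:
            seen.add(id(module))
            module.reorder_incremental_state(incremental_state, new_order)

    decoder.apply(apply_reorder)
```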
-
- 17 Nov, 2018 1 commit
-
Myle Ott authored
Summary: This should bring back the speedup with --update-freq that we reported in the Scaling Neural Machine Translation paper. Pull Request resolved: https://github.com/pytorch/fairseq/pull/370 Differential Revision: D13100281 Pulled By: myleott fbshipit-source-id: 4a81b51bb7390a197add314a4be5512bbf68c085
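For intuition, `--update-freq` amounts to gradient accumulation; a generic sketch (assuming standard `model`, `criterion`, `optimizer`, and `loader` objects), not fairseq's trainer code:

```python
def train_epoch(model, criterion, optimizer, loader, update_freq=4):
    optimizer.zero_grad()
    for i, (x, y) in enumerate(loader):
        loss = criterion(model(x), y) / update_freq  # scale so the step averages
        loss.backward()                              # gradients accumulate in .grad
        if (i + 1) % update_freq == 0:
            optimizer.step()                         # one update per update_freq batches
            optimizer.zero_grad()
```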
-
- 07 Nov, 2018 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/352 Differential Revision: D12956930 Pulled By: myleott fbshipit-source-id: 39334a79544bac570feb04be9103269d7c1563f9
-
- 21 Oct, 2018 1 commit
-
Peng-Jen Chen authored
Summary: Manually ported fairinternal fairseq-py pull request #385 [1] to fbcode. Resolved the merge conflict from removing fp16_trainer, per offline discussion with Myle. Also updated the code to make generate.py work. [1] https://github.com/fairinternal/fairseq-py/pull/385/commits/18fa6e154781cf0c4b1596429dba7e753a545069 Reviewed By: liezl200 Differential Revision: D10052908 fbshipit-source-id: c3c378d78dc1e9ac087c815f359e78c0048ff2f5
-
- 19 Oct, 2018 1 commit
-
Peng-Jen Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/317 The `upgrade_state_dict` function in TransformerEncoder/TransformerDecoder doesn't handle multiple encoders/decoders when upgrading the `state_dict` variable, which will be the case after D10052908. Before this change, we hit the error message [1] when loading a checkpoint for the multilingual_transformer model in D10052908; this diff fixes it. Reviewed By: myleott, liezl200 Differential Revision: D10375418 fbshipit-source-id: 7104c1a463e78f3fa33d8479a37c51608be50610
-
- 03 Oct, 2018 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/302 Differential Revision: D10174608 Pulled By: myleott fbshipit-source-id: 4e2dfc76eae97afc5488f29b47e74f9897a643ff
-
- 25 Sep, 2018 4 commits
-
Myle Ott authored
-
Alexei Baevski authored
-
Myle Ott authored
-
Sergey Edunov authored
- No more FP16Trainer; we just have an FP16Optimizer wrapper.
- Most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time.
- Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 (see the sketch below).
- Trainer.train_step now takes a list of samples, which will allow a cleaner --update-freq.
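As referenced in the third bullet, a hedged sketch of the dummy-batch trick (names like `is_dummy_batch` and the sample layout are assumptions):

```python
def backward_step(model, criterion, sample, is_dummy_batch):
    loss = criterion(model(**sample['net_input']), sample['target'])
    if is_dummy_batch:
        # Zero out the contribution: gradients become exactly zero, but the
        # distributed allreduce still fires, keeping workers in lockstep.
        loss = loss * 0
    loss.backward()
```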
-
- 03 Sep, 2018 7 commits