- 19 May, 2019 1 commit
-
-
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/570 Pull Request resolved: https://github.com/pytorch/fairseq/pull/731 Currently the LearnedPositionalEmbedding module computes the position tensor based on the input data. However, this doesn't work for XLM, where the behavior differs between the Masked LM and Translation LM. In this diff I keep the same default behavior for LearnedPositionalEmbedding as before but add the ability for these models to work with pre-computed position tensors. Reviewed By: myleott Differential Revision: D15305474 fbshipit-source-id: de7d908245a2a620b58d36055211600a08f2d1dc
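The idea of keeping the default behavior while accepting pre-computed positions can be sketched in plain Python. This is only an illustration, not the module's actual code: the function names, the list-based representation, and the convention that real tokens are numbered from `padding_idx + 1` while padding keeps `padding_idx` are all assumptions made for the example.

```python
from typing import List, Optional


def make_positions(tokens: List[int], padding_idx: int) -> List[int]:
    """Derive positions from the input: padding keeps padding_idx,
    real tokens are numbered padding_idx + 1, padding_idx + 2, ..."""
    positions = []
    seen = 0  # number of non-padding tokens seen so far
    for tok in tokens:
        if tok == padding_idx:
            positions.append(padding_idx)
        else:
            seen += 1
            positions.append(padding_idx + seen)
    return positions


def resolve_positions(
    tokens: List[int],
    padding_idx: int,
    positions: Optional[List[int]] = None,
) -> List[int]:
    """Use caller-supplied positions when given, else fall back to the default."""
    if positions is not None:
        if len(positions) != len(tokens):
            raise ValueError("positions must have the same length as tokens")
        return list(positions)
    return make_positions(tokens, padding_idx)
```

With this shape, a caller that needs different numbering (for example, restarting the count for a second sentence) passes its own `positions`, and every other caller is unaffected.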
-
- 17 May, 2019 2 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/588 Differential Revision: D15389638 Pulled By: myleott fbshipit-source-id: 4632ce22d51dc2c74d250bae999630095d849701
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/586 Differential Revision: D15372949 Pulled By: myleott fbshipit-source-id: c1cf1c645e8d55fc8568f23a47c45677ac9ab1da
-
- 16 May, 2019 5 commits
-
-
Jingfei Du authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/744 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/587 After we added additional prediction layers for language model predictions, fine-tuning broke for two reasons: 1. the checkpoint cannot be loaded since we didn't update state_dict names; 2. lm_output_learned_bias is not initialized if load_softmax is false. Reviewed By: myleott Differential Revision: D15377380 fbshipit-source-id: d58544b1d2c549586abef42fec19ec8bf27a994a
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/743 Original commit changeset: 0afe37c9a031 According to edunov: "We need to be careful here with shared parameters, I believe right now it is broken if you have shared encoder/decoder input embeddings (encoder.embed_tokens.weight and decoder.embed_tokens.weight) as they get updated several times" We also have OSS issues that look related, e.g., https://github.com/pytorch/fairseq/issues/732. Backing this out until we can confirm the correct behavior for shared params. Differential Revision: D15372673 fbshipit-source-id: 8683c0f2514e21fa1e9d2fe6dfc48d98957a2831
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/585 Differential Revision: D15372416 fbshipit-source-id: add226a4558ae4d84dd261e9317b80c43970f771
-
Peng-Jen Chen authored
Summary: Similar to TranslationTask, we want to enable multilingual translation task to be able to load 'train{k}' datasets from data-bin folder. Reviewed By: lematt1991 Differential Revision: D15363481 fbshipit-source-id: 5fed7be19383023b792ed2fd38e655cbcecc8b90
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/584 Reviewed By: myleott Differential Revision: D15360774 Pulled By: myleott fbshipit-source-id: b18efbb6ff5a8832c61b689f3d87c958cbd908e9
-
- 15 May, 2019 7 commits
-
-
Ruty Rinott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/583 D14610694 fixed issues in layerNorm exporting by making it conditional. D15260838 changed the implementation of TransformerDecoderLayer to the one under transformer, thus losing the fix. Bringing it back here. Reviewed By: myleott, geof90, liaimi Differential Revision: D15357119 fbshipit-source-id: e29e053ca5beca0008d7a8dad9880a483a14c7b9
-
Naman Goyal authored
Summary: Added shuffle as an arg for masked_lm for experimenting with pad-efficient batching. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/582 Reviewed By: jingfeidu Differential Revision: D15355105 Pulled By: jingfeidu fbshipit-source-id: 9925271a0bc2f9d283f354d158bd4b5ec8788b39
-
Naman Goyal authored
Summary: 1) Added pooled_output for sentence classification as `Tanh(Linear())`. 2) Added lm_head_transform as `LayerNorm(GeLU(Linear(x)))` 3) `act_dropout = 0.0` 4) added `lm_output_learned_bias` Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/581 Reviewed By: borguz Differential Revision: D15353575 Pulled By: borguz fbshipit-source-id: 4ff64c6ceed23f3e99348f73d189546f1d84452e
-
Myle Ott authored
Summary: - `FairseqModel` -> `FairseqEncoderDecoderModel` - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer` - `encoder_out_dict` -> `encoder_out` - rm unused `remove_head` functions - update docs Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/561 Differential Revision: D15271142 Pulled By: myleott fbshipit-source-id: 8e8864e399336020f0271c780598e968ff51a264
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/578 Differential Revision: D15352060 Pulled By: myleott fbshipit-source-id: 7dc2fceca37ec96c89356662831b0d82f28bef6f
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/579 Differential Revision: D15352058 Pulled By: myleott fbshipit-source-id: cebef02edcfcb203ef2e32c64f7f28e08c4e46b0
-
Myle Ott authored
Summary: Various fixes for Masked LM - use --activation-fn instead of --gelu - use --dataset-impl instead of --lazy-load - add embed_scale option to TransformerSentenceEncoder - fix encoder_normalize_before to include a final layer norm - delete BertLayerNorm Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/573 Reviewed By: borguz Differential Revision: D15317933 Pulled By: myleott fbshipit-source-id: 8ecb46556ad43e76e92d41ed8f5a62e8516fd375
-
- 14 May, 2019 3 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/575 Differential Revision: D15318004 Pulled By: myleott fbshipit-source-id: ad918d71b1bd8074decf5ec3463dd9bc9487bbe9
-
Nayan Singhal authored
Summary: 1. Define an EpochMinibatchIterator which extends the EpochBatchIterator. It has the same functionality as EpochBatchIterator except for two major changes: it uses static batching and uses MiniBatchIterator for getting the indices. 2. SplitSeqCollater is used instead of Seq2SeqCollater. 3. LSTM_subsample now stores the previous states and resets them once the sample is over. Reviewed By: jay-mahadeokar Differential Revision: D15209023 fbshipit-source-id: 900b8bd1f25159ffc77f8106e26729a3e7422a1f
-
Dmytro Okhonko authored
Summary: Move `load_checkpoint`, `save_checkpoint` and `reload_train` from train.py to checkpoint_utils.py. Move `get_perplexity` from train.py to utils.py. This will make train.py lighter and allow us to reuse this utility functionality when fairseq is used as an external library. Reviewed By: myleott Differential Revision: D15289607 fbshipit-source-id: 4b7c95225ac22e402bcda3497811361809110df1
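For readers unfamiliar with what a helper like `get_perplexity` computes, here is a minimal stand-alone sketch. It assumes the loss is a per-token cross-entropy measured in bits (base 2); the signature, return type, and overflow handling are illustrative choices and may differ from the function that was moved.

```python
import math


def get_perplexity(loss: float) -> float:
    """Perplexity for a base-2 cross-entropy loss: 2 ** loss."""
    try:
        return math.pow(2, loss)
    except OverflowError:
        # Very large losses overflow a float; report infinity instead.
        return float("inf")
```

Keeping such a function in a utilities module means it can be imported without pulling in the training script.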
-
- 13 May, 2019 4 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/576 Differential Revision: D15318086 Pulled By: myleott fbshipit-source-id: c6587737ca7b97edc97ad4aef5c5c9ac7e92b2f2
-
Myle Ott authored
Summary: This was named gelu_fast after the original implementation: https://github.com/hendrycks/GELUs/blob/master/mnist_ae.py#L62-L63 But in practice it's actually slower and uses more memory. Rename to gelu_accurate. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/571 Differential Revision: D15317874 Pulled By: myleott fbshipit-source-id: c96fbc89bf91b27ced1ab8d5b25a8f23f922ec24
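To make the naming concrete, the two GELU variants can be compared in plain Python. This sketch assumes that the renamed function is the tanh-based approximation from the linked implementation and that the reference form is the erf-based definition; both functions below are written for scalars and are illustrative only.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_accurate(x: float) -> float:
    """Tanh-based approximation of GELU (assumed to be the renamed variant)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))
```

The approximation evaluates a cubic polynomial and a `tanh` per element, which is consistent with the observation that it is not actually the cheaper of the two.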
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/574 Differential Revision: D15317984 Pulled By: myleott fbshipit-source-id: 09a66229cc6b4c95678ca1ca13c9e0da25b203de
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/572 Differential Revision: D15317928 Pulled By: myleott fbshipit-source-id: b3f0e9229737a63b49937e7c5b918470f18ddc45
-
- 12 May, 2019 2 commits
-
-
zhiqiang authored
Summary: `--output-format` -> `--dataset-impl` in Tutorial: Classifying Names with a Character-Level RNN Pull Request resolved: https://github.com/pytorch/fairseq/pull/735 Differential Revision: D15314625 Pulled By: myleott fbshipit-source-id: 65b8efd1a367ca754e5b9dca088aefbc648864dd
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/736 Differential Revision: D15314626 Pulled By: myleott fbshipit-source-id: 1e0c32529afee57e43fe5d6c7991cd13eb8a52c4
-
- 11 May, 2019 2 commits
-
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/568 Differential Revision: D15308483 Pulled By: myleott fbshipit-source-id: 9d898ce523e46e6b6fb444274f478da0b577b603
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/560 Differential Revision: D15260838 Pulled By: myleott fbshipit-source-id: 5f80dd82775c10ce46a3e1c451ccaf0ef55bfa31
-
- 10 May, 2019 2 commits
-
-
Jay Mahadeokar authored
Summary: As in title. Reviewed By: skritika Differential Revision: D15299135 fbshipit-source-id: 2fd513b32c0ab41911cdf0b0186f6c3bb5256285
-
myleott authored
-
- 09 May, 2019 5 commits
-
-
Myle Ott authored
Set initial learning rate in LR schedulers by calling step_update(0) at init
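A minimal sketch of the pattern: the constructor calls `step_update(0)` so the learning rate is defined before the first training update. The class name, parameters, and the inverse square root schedule are hypothetical and chosen only to illustrate the idea.

```python
import math


class InverseSqrtSchedule:
    """Hypothetical schedule: linear warmup, then inverse square root decay."""

    def __init__(self, peak_lr: float, warmup_updates: int, warmup_init_lr: float = 0.0):
        if warmup_updates <= 0:
            raise ValueError("warmup_updates must be positive")
        self.peak_lr = peak_lr
        self.warmup_updates = warmup_updates
        self.warmup_init_lr = warmup_init_lr
        self.lr_step = (peak_lr - warmup_init_lr) / warmup_updates
        # Calling step_update(0) here makes self.lr valid before any update runs.
        self.lr = self.step_update(0)

    def step_update(self, num_updates: int) -> float:
        """Set and return the learning rate for the given update count."""
        if num_updates < self.warmup_updates:
            self.lr = self.warmup_init_lr + num_updates * self.lr_step
        else:
            self.lr = self.peak_lr * math.sqrt(self.warmup_updates / num_updates)
        return self.lr
```

Without the call in `__init__`, anything that reads the learning rate before the first update (logging, for instance) would see an unset or stale value.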
-
Myle Ott authored
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/564 Differential Revision: D15278017 Pulled By: myleott fbshipit-source-id: b6fba1b62145ea533b40f5eb9b134e6aa122e546
-
Jingfei Du authored
Summary: The old no_bias_kv argument for masked_lm models is not used. Split it into 2 arguments and expose them. Reviewed By: myleott Differential Revision: D15266154 fbshipit-source-id: 60b041f8370ca1d8869ed3402fb9a67d1cd8e0e8
-
- 08 May, 2019 7 commits
-
-
Myle Ott authored
Reviewed By: jmp84 Differential Revision: D15264847 fbshipit-source-id: 4ba9224d1b35c3de0d26c9b4c1ee6d641d3d8535
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/723 Differential Revision: D15260870 Pulled By: myleott fbshipit-source-id: 73d9b138b9ab44f96824076258f1a6319193d0f7
-
Naman Goyal authored
Summary: 1) Made the model compatible with using either `masked_lm_dataset` or `monolingual_dataset`. 2) Fixed the default args setting the task (`bert` vs `masked_lm`). myleott should we keep both? 3) Fixed a bug in setting the default value of `sentence_class_num`. 4) Fixed a bug in the padding mask in `fp16`. Pull Request resolved: https://github.com/pytorch/fairseq/pull/721 Differential Revision: D15259885 fbshipit-source-id: 9dbf7fb8192992c1251670287bed719e41c08fcc
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/720 Differential Revision: D15259091 Pulled By: myleott fbshipit-source-id: 06a35996c06ccddb49fdc9e01e348ff3c9da334e
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/719 Differential Revision: D15258483 Pulled By: myleott fbshipit-source-id: dd00daa6f1c87264c1196a77dfffc8c876ebde7f
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/717 Differential Revision: D15254560 Pulled By: myleott fbshipit-source-id: 2a07614e8d294636f706939e60f0091c73115494
-
Jay Mahadeokar authored
Summary: D15214049 introduced a bug such that if a task's args do not contain data, it raises an error:
```
File "/data/users/jaym/fbsource/fbcode/buck-out/dev/gen/deeplearning/projects/fairspeq/train#link-tree/train.py", line 119, in reload_train
    if len(args.data.split(":")) == 1:
AttributeError: 'Namespace' object has no attribute 'data'
```
This diff checks if data is in args to avoid the above error. Reviewed By: myleott, jmp84 Differential Revision: D15253373 fbshipit-source-id: 14fb9ad878ee50f1b7583349bb17e29c03c40815
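One possible shape for such a guard, shown as a stand-alone sketch: the helper name `has_single_data_path` is hypothetical, and treating a missing `data` attribute as "not a single path" is an assumption about the desired behavior, not a description of the actual fix.

```python
from argparse import Namespace


def has_single_data_path(args: Namespace) -> bool:
    """Return True only if args has a `data` string with exactly one path.

    Uses getattr with a default so that a Namespace without `data`
    does not raise AttributeError.
    """
    data = getattr(args, "data", None)
    if data is None:
        return False
    return len(data.split(":")) == 1
```

For example, `has_single_data_path(Namespace())` returns `False` instead of raising, while `Namespace(data="a:b")` is recognized as having more than one path.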
-