- 10 May, 2019 (2 commits)
-
Jay Mahadeokar authored
Summary: As in title.
Reviewed By: skritika
Differential Revision: D15299135
fbshipit-source-id: 2fd513b32c0ab41911cdf0b0186f6c3bb5256285
-
myleott authored
-
- 09 May, 2019 (5 commits)
-
Myle Ott authored
Set initial learning rate in LR schedulers by calling step_update(0) at init
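A minimal sketch of the pattern, using an illustrative warmup-then-inverse-sqrt schedule rather than the exact fairseq scheduler classes:

```
import torch

class InverseSqrtSchedule:
    """Illustrative LR schedule: linear warmup, then decay as 1/sqrt(step)."""

    def __init__(self, optimizer, warmup_updates=4000, warmup_init_lr=1e-7, peak_lr=5e-4):
        self.optimizer = optimizer
        self.warmup_updates = warmup_updates
        self.warmup_init_lr = warmup_init_lr
        self.peak_lr = peak_lr
        # The change described above: set the initial LR right away, so the
        # first updates don't run at the optimizer's construction-time LR.
        self.step_update(0)

    def step_update(self, num_updates):
        if num_updates < self.warmup_updates:
            slope = (self.peak_lr - self.warmup_init_lr) / self.warmup_updates
            lr = self.warmup_init_lr + num_updates * slope
        else:
            lr = self.peak_lr * (self.warmup_updates / num_updates) ** 0.5
        for group in self.optimizer.param_groups:
            group["lr"] = lr
        return lr

# Usage sketch:
opt = torch.optim.SGD([torch.nn.Parameter(torch.zeros(1))], lr=123.0)
sched = InverseSqrtSchedule(opt)
print(opt.param_groups[0]["lr"])  # 1e-07, set by step_update(0) at init
```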
-
Myle Ott authored
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/564
Differential Revision: D15278017
Pulled By: myleott
fbshipit-source-id: b6fba1b62145ea533b40f5eb9b134e6aa122e546
-
Jingfei Du authored
Summary: The old no_bias_kv argument for masked_lm models is unused. Split it into two arguments and expose them.
Reviewed By: myleott
Differential Revision: D15266154
fbshipit-source-id: 60b041f8370ca1d8869ed3402fb9a67d1cd8e0e8
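For reference, these likely map onto the two independent options on PyTorch's MultiheadAttention, add_bias_kv and add_zero_attn; a hedged sketch of exposing them separately (the CLI flag names here are assumptions):

```
import argparse

import torch.nn as nn

parser = argparse.ArgumentParser()
# Illustrative flags replacing a single coupled no_bias_kv argument.
parser.add_argument("--bias-kv", action="store_true",
                    help="add a learnable bias to the key/value sequences")
parser.add_argument("--zero-attn", action="store_true",
                    help="append a zero vector to the keys/values")
args = parser.parse_args(["--bias-kv"])

attn = nn.MultiheadAttention(
    embed_dim=512,
    num_heads=8,
    add_bias_kv=args.bias_kv,      # formerly tied together...
    add_zero_attn=args.zero_attn,  # ...now controlled independently
)
```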
-
- 08 May, 2019 (7 commits)
-
Myle Ott authored
Reviewed By: jmp84
Differential Revision: D15264847
fbshipit-source-id: 4ba9224d1b35c3de0d26c9b4c1ee6d641d3d8535
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/723
Differential Revision: D15260870
Pulled By: myleott
fbshipit-source-id: 73d9b138b9ab44f96824076258f1a6319193d0f7
-
Naman Goyal authored
Summary:
1) Made the model compatible with using either `masked_lm_dataset` or `monolingual_dataset`.
2) Fixed the default args-setting task (`bert` vs `masked_lm`). myleott, should we keep both?
3) Fixed a bug in setting the default value of `sentence_class_num`.
4) Fixed a bug in the padding mask under `fp16`.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/721
Differential Revision: D15259885
fbshipit-source-id: 9dbf7fb8192992c1251670287bed719e41c08fcc
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/720
Differential Revision: D15259091
Pulled By: myleott
fbshipit-source-id: 06a35996c06ccddb49fdc9e01e348ff3c9da334e
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/719
Differential Revision: D15258483
Pulled By: myleott
fbshipit-source-id: dd00daa6f1c87264c1196a77dfffc8c876ebde7f
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/717
Differential Revision: D15254560
Pulled By: myleott
fbshipit-source-id: 2a07614e8d294636f706939e60f0091c73115494
-
Jay Mahadeokar authored
Summary: D15214049 introduced a bug: if a task's args does not contain `data`, training fails with

```
File "/data/users/jaym/fbsource/fbcode/buck-out/dev/gen/deeplearning/projects/fairspeq/train#link-tree/train.py", line 119, in reload_train
    if len(args.data.split(":")) == 1:
AttributeError: 'Namespace' object has no attribute 'data'
```

This diff checks whether `data` is in args to avoid the above error.
Reviewed By: myleott, jmp84
Differential Revision: D15253373
fbshipit-source-id: 14fb9ad878ee50f1b7583349bb17e29c03c40815
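A minimal sketch of the guard (the helper name and return convention are illustrative, not the exact fairseq code):

```
def data_is_sharded(args):
    # Guard first: some tasks' args have no `data` attribute at all, and
    # touching args.data directly raises the AttributeError shown above.
    if not hasattr(args, "data"):
        return False
    # A colon-separated path list means the dataset is sharded.
    return len(args.data.split(":")) > 1
```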
-
- 07 May, 2019 (5 commits)
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/715
Differential Revision: D15240723
fbshipit-source-id: 11d7280cb187d68f107902822e878f2a04b840c7
-
taineleau authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/711
Differential Revision: D15239618
Pulled By: myleott
fbshipit-source-id: 82f3f79501a13a967324b8a66281cd134bf1ef23
-
Davide Caroselli authored
Summary: Following the discussion in https://github.com/pytorch/fairseq/issues/574:
- Implemented MMapIndexedDataset and MMapIndexedDatasetBuilder, compatible with IndexedDataset/IndexedDatasetBuilder
- Updated scripts/read_binarized.py to support the new MMapIndexedDataset
- Replaced the '--raw-text' and '--lazy-load' options with '--dataset-impl', and moved the option definition from custom task args to the more general options.add_dataset_args() (more appropriate)
- Also implemented utility functions in indexed_dataset: make_dataset(), dataset_exists()
Pull Request resolved: https://github.com/pytorch/fairseq/pull/589
Differential Revision: D14597128
Pulled By: myleott
fbshipit-source-id: 4e92d99920cbaa52cfe5a0f1f5d9ae5c92d4268e
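A hedged sketch of the resulting dispatch, assuming the classes and helpers named in the summary are importable from fairseq (constructor signatures simplified):

```
import os

from fairseq.data.indexed_dataset import (
    IndexedDataset, IndexedRawTextDataset, MMapIndexedDataset,
)

def dataset_exists(path, impl):
    # Raw text is a single file; the binary impls use .idx/.bin pairs.
    if impl == "raw":
        return os.path.exists(path)
    return os.path.exists(path + ".idx") and os.path.exists(path + ".bin")

def make_dataset(path, impl, dictionary=None):
    # One --dataset-impl switch instead of separate --raw-text/--lazy-load flags.
    if impl == "raw":
        return IndexedRawTextDataset(path, dictionary)
    if impl == "lazy":
        return IndexedDataset(path)
    if impl == "mmap":
        return MMapIndexedDataset(path)
    raise ValueError("unknown dataset implementation: " + impl)
```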
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/704
Differential Revision: D15221549
Pulled By: myleott
fbshipit-source-id: b0021acdc2d7792ce51421f1432e1f2bd8218f7b
-
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/710
Previously there was a bug in how we dealt with padding when computing the input representation from the segment and position embeddings. D15144912 fixed this by adding an offset based on the padding id. However, this makes assumptions about the padding id that may not hold for vocabularies built outside of pyText and fairseq. Based on a discussion with barlaso, this diff zeroes out all the embeddings associated with the padding.
Reviewed By: borguz
Differential Revision: D15209395
fbshipit-source-id: 5573020e610f5466e673fe3845c3ed34ebb5c44d
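A minimal sketch of the padding-id-agnostic approach: mask the summed embeddings at every padded position instead of offsetting by an assumed pad id.

```
import torch

def zero_out_padding(x, tokens, padding_idx):
    """x: (batch, seq_len, dim) summed token/position/segment embeddings;
    tokens: (batch, seq_len) input ids."""
    pad_mask = tokens.eq(padding_idx).unsqueeze(-1)  # (batch, seq_len, 1)
    return x.masked_fill(pad_mask, 0.0)

# Usage sketch (pad id 1 is illustrative):
x = torch.randn(2, 5, 8)
tokens = torch.tensor([[4, 5, 6, 1, 1], [7, 8, 1, 1, 1]])
x = zero_out_padding(x, tokens, padding_idx=1)
```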
-
- 06 May, 2019 (5 commits)
-
Naman Goyal authored
Summary:
Co-authored-by: myleott <myleott@fb.com>

Changed `data` to be a `str` holding a colon-separated list of paths, for loading sharded datasets. This is useful for large datasets that cannot fit into memory: the dataset can be sharded, and each shard is then loaded in one epoch in a round-robin manner. For example, with `5` shards of data and `10` epochs, the shards will be iterated over as `[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]`. myleott, we need to look into `translation.py`, as it currently already expects a list and then concats the datasets.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/696
Differential Revision: D15214049
fbshipit-source-id: 03e43a7b69c7aefada2ca668abf1eac1969fe013
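The shard-selection logic this implies, as a minimal sketch:

```
def get_shard_path(data, epoch):
    """Pick the shard for a 1-indexed epoch from a colon-separated list,
    round-robin across epochs."""
    paths = data.split(":")
    return paths[(epoch - 1) % len(paths)]

# 5 shards over 10 epochs -> shards [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]:
shards = [get_shard_path("s0:s1:s2:s3:s4", e) for e in range(1, 11)]
assert shards == ["s0", "s1", "s2", "s3", "s4"] * 2
```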
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/707
Differential Revision: D15219014
Pulled By: myleott
fbshipit-source-id: f38f2cf817d05e0871ff9084a810d109848e827c
-
Naman Goyal authored
Summary:
Co-authored-by: jingfeidu <jingfeidu@fb.com>

1) Added a `masked_lm` task for BERT-like training; code mostly taken from jingfeidu's implementation.
2) Added a `has_eos` option to `block_pair_dataset` for working with datasets that have been preprocessed to include `eos`.
Depends on: https://github.com/pytorch/fairseq/pull/696
Pull Request resolved: https://github.com/pytorch/fairseq/pull/697
Differential Revision: D15214050
fbshipit-source-id: c179ce2d70e59d2ddc941b13ceda99d929878931
-
Maksym Del authored
Summary: Pass the required "sample_key" argument to the forward-backward call in the semi-supervised task.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/706
Differential Revision: D15217957
Pulled By: pipibjc
fbshipit-source-id: bf943d566c5caa67682dfb16ff8b7c432323cdba
-
Liezl Puzon authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/705
This adds functionality in fairseq to load a pretrained encoder or decoder from another pretrained model into the current model.
Reviewed By: jmp84
Differential Revision: D15207084
fbshipit-source-id: 32a710ff77389928e20793c71d312863df9dd8ae
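A hedged sketch of the idea (the helper name and the 'encoder.' key-prefix convention are assumptions; fairseq checkpoints store parameters under a 'model' key):

```
import torch

def load_pretrained_component(component, checkpoint_path, prefix):
    """Copy e.g. all 'encoder.*' parameters from a full-model checkpoint
    into a standalone encoder (or 'decoder.*' into a decoder)."""
    state = torch.load(checkpoint_path, map_location="cpu")["model"]
    component_state = {
        key[len(prefix):]: value
        for key, value in state.items()
        if key.startswith(prefix)
    }
    component.load_state_dict(component_state)

# Usage sketch:
# load_pretrained_component(model.encoder, "pretrained.pt", prefix="encoder.")
```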
-
- 05 May, 2019 (3 commits)
-
Myle Ott authored
Reviewed By: chenyangyu1988
Differential Revision: D14784219
fbshipit-source-id: 273888d6e3d22a01d5e7edfbc786195e7b78efef
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/703
It's better to write one checkpoint and copy it, rather than repeatedly pickling the model via torch.save.
Differential Revision: D15213778
fbshipit-source-id: 27dad39853b09dab7f0e11c030313019f035dbb0
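A minimal sketch of the write-once-then-copy pattern (file names are illustrative):

```
import shutil

import torch

def save_checkpoints(model, paths):
    # Pickle the model exactly once...
    torch.save(model.state_dict(), paths[0])
    # ...then copy the file, which is much cheaper than re-serializing
    # for each extra name (checkpoint_last.pt, checkpoint_best.pt, ...).
    for path in paths[1:]:
        shutil.copyfile(paths[0], path)

# Usage sketch:
# save_checkpoints(model, ["checkpoint42.pt", "checkpoint_last.pt", "checkpoint_best.pt"])
```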
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/695
Differential Revision: D15182613
Pulled By: myleott
fbshipit-source-id: 4196346517d8e75ed9e903e9e01ab943d086f6f1
-
- 04 May, 2019 (4 commits)
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/508
The previous version applied the temperature after the softmax. Fix that, and also generalize so it works with other search approaches.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/694
Differential Revision: D15175160
Pulled By: myleott
fbshipit-source-id: cc87ff0e97a8a1dd37f9983163f58a8641155ab0
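The fix in miniature: scale the logits before the softmax, so the temperature actually reshapes the sampling distribution.

```
import torch
import torch.nn.functional as F

def sampling_probs(logits, temperature=1.0):
    # Correct order: divide the logits by T, then normalize. Dividing the
    # *probabilities* by T after the softmax either leaves them
    # unnormalized or, once renormalized, changes nothing.
    return F.softmax(logits / temperature, dim=-1)

logits = torch.tensor([2.0, 1.0, 0.5])
print(sampling_probs(logits, temperature=0.5))  # sharper than T=1.0
print(sampling_probs(logits, temperature=2.0))  # flatter than T=1.0
```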
-
Myle Ott authored
Summary: It was tedious defining these; let's try just taking the first batch lazily instead.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/699
Differential Revision: D15188266
Pulled By: myleott
fbshipit-source-id: a4c9f7ee3111278faaffa8a22ba91ed5f50e143d
-
Naman Goyal authored
Summary: We can later get rid of `BertLayerNorm` as well, as I think its implementation is exactly the same as `LayerNorm` (will confirm with jingfeidu on that). But this should be a drop-in replacement.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/702
Differential Revision: D15213116
Pulled By: myleott
fbshipit-source-id: ba5c00e1129a4443ef5d3d8bebd0bb6c6ee3b188
-
Kritika Singh authored
Summary: See comment.
Reviewed By: jay-mahadeokar
Differential Revision: D15070187
fbshipit-source-id: ffefca0effb2cc866ce6fa22a59d5419b592fb7b
-
- 03 May, 2019 (2 commits)
-
Yongqiang Wang authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairspeq/pull/2
Pull Request resolved: https://github.com/pytorch/fairseq/pull/689
We found that not raising OOM during trainer.train_step causes various issues, including NCCL hangs / gloo sync errors, because gradients are not synced properly. Until we find the root cause, let's give users an option to raise OOMs.
Reviewed By: jmp84
Differential Revision: D15170357
fbshipit-source-id: 3e15e4e111a8380612157955509c39821a216ec4
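A hedged sketch of such an opt-in, in generic PyTorch training code (the flag plumbing and warning message are illustrative, not the exact fairseq trainer):

```
def train_step(model, criterion, batch, raise_oom=False):
    try:
        loss = criterion(model(batch["input"]), batch["target"])
        loss.backward()
        return loss
    except RuntimeError as e:
        if "out of memory" in str(e) and not raise_oom:
            # Silently skipping the batch can desynchronize gradient
            # all-reduce across workers (NCCL hangs / gloo sync errors).
            print("| WARNING: ran out of memory, skipping batch")
            return None
        # With raise_oom=True the OOM propagates and training fails fast
        # instead of hanging in collective communication.
        raise
```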
-
Naman Goyal authored
Summary: Added the bert_large architecture.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/698
Differential Revision: D15198698
Pulled By: myleott
fbshipit-source-id: 1dc9e8d4c8c877d15afffe5fe581b4b93eefbc66
-
- 02 May, 2019 (5 commits)
-
Peng-Jen Chen authored
Summary:
- Added a learned-positional-embedding binary flag to the masked LM model.
- Added a base arch config for the masked LM model that sets all the binary parameters to False; otherwise some of the binary flag parameters will always be overridden by the config in `xlm_architecture` (e.g. encoder_learned_pos).
Reviewed By: liezl200
Differential Revision: D15054487
fbshipit-source-id: d78827f352b9160a89c9dc4f45b9fce15a2f234d
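Fairseq architecture functions set defaults via getattr, which is why an explicit base config matters; a minimal sketch with illustrative flag names:

```
def base_architecture(args):
    # Explicit False defaults for every binary flag: getattr only fills in
    # values that haven't already been set on args.
    args.encoder_learned_pos = getattr(args, "encoder_learned_pos", False)
    args.no_token_positional_embeddings = getattr(
        args, "no_token_positional_embeddings", False
    )

def xlm_architecture(args):
    # A derived architecture can flip a default, but getattr still
    # respects any value the user set explicitly.
    args.encoder_learned_pos = getattr(args, "encoder_learned_pos", True)
    base_architecture(args)
```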
-
Myle Ott authored
Summary: This should make rendezvous happen as lazily as possible.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/687
Differential Revision: D15151145
Pulled By: myleott
fbshipit-source-id: d70816a85414c5d509a6b12e2b339b4736db2c88
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/693
Differential Revision: D15174831
fbshipit-source-id: 98688b1269ead5694e5116659ff64507d3c0d1c0
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/692
Differential Revision: D15174954
fbshipit-source-id: 1a7bff9aeed3e2cc658577be9d79e8c9f72314c2
-
Kritika Singh authored
Summary: Changes include:
1. Added get_normalized_probabilities to the encoder-only base class FairseqEncoderModel.
2. Made CTCCriterion work for both batch-first (LSTMSubsampleEncoderModel) and batch-second (LSTMEncoderOnly) encoder types.
3. Added tests for different encoder and CTC combinations.
TODO: CTC still doesn't work for VGGLSTMEncoderModel, so it is disabled for now; a fix will follow in another diff.
Reviewed By: jay-mahadeokar
Differential Revision: D15158818
fbshipit-source-id: acb484bad705c937d676d2c3dcde3e3562d68ed9
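A minimal sketch of point 2: normalize both layouts to the time-major (T, N, C) shape that torch.nn.functional.ctc_loss expects (the batch_first switch is illustrative):

```
import torch.nn.functional as F

def ctc_loss_any_layout(log_probs, targets, input_lengths, target_lengths,
                        batch_first=False):
    # F.ctc_loss wants log-probabilities shaped (T, N, C); transpose if
    # the encoder emitted batch-first (N, T, C) output.
    if batch_first:
        log_probs = log_probs.transpose(0, 1)
    return F.ctc_loss(log_probs, targets, input_lengths, target_lengths)
```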
-
- 01 May, 2019 (2 commits)
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/691
Differential Revision: D15172543
Pulled By: myleott
fbshipit-source-id: f2b626ff7f5e95f0ddc83c105af7ab9d092a135e
-
taineleau authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/684
Differential Revision: D15154631
Pulled By: myleott
fbshipit-source-id: 5e7dd9651d9ed239b60c51b9a11d08c80307d3ba
-