"src/vscode:/vscode.git/clone" did not exist on "b2b3b1a8ab83b020ecaf32f45de3ef23644331cf"
- 03 Oct, 2018 1 commit
-
Liezl Puzon authored
Summary: This generalizes BacktranslationDataset to allow us to use any SequenceGenerator class. For example, if we want to use this model in PyTorch Translate, we can pass the following to BacktranslationDataset init: (1) a PyTorch Translate SequenceGenerator class as generator_class and (2) the appropriate args for initializing that class as kwargs. Reviewed By: xianxl Differential Revision: D10156552 fbshipit-source-id: 0495d825bf4727da96d0d9a40dc434135ff3486c
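As a rough sketch of the pattern this summary describes, with illustrative names rather than the verified fairseq signature:

    # Minimal sketch of the idea above: a dataset that accepts any generator
    # class plus the kwargs needed to construct it. All names here are
    # illustrative assumptions, not the actual fairseq API.
    class BacktranslationDatasetSketch:
        def __init__(self, tgt_dataset, tgt_dict, backtranslation_model,
                     generator_class, **generator_kwargs):
            self.tgt_dataset = tgt_dataset
            self.tgt_dict = tgt_dict
            # Instantiate whatever generator the caller chose, e.g. a
            # PyTorch Translate SequenceGenerator.
            self.generator = generator_class(
                models=[backtranslation_model], **generator_kwargs
            )

A PyTorch Translate caller would then pass its own SequenceGenerator class as generator_class and that class's init args as the remaining keyword arguments.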
-
- 02 Oct, 2018 2 commits
-
Michael Auli authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/300 Differential Revision: D10154711 Pulled By: edunov fbshipit-source-id: 859d1ac59923b67c1547b6f7acb94f801b0c3318
-
Liezl Puzon authored
Summary: Using an argparse Namespace hides the actual args that are expected and makes the code harder to read. Note the difference in style for the args list:

    def __init__(
        self,
        tgt_dataset,
        tgt_dict,
        backtranslation_model,
        unkpen,
        sampling,
        beam,
        max_len_a,
        max_len_b,
    ):

instead of:

    def __init__(self, tgt_dataset, tgt_dict, backtranslation_model, args):

Reviewed By: dpacgopinath Differential Revision: D10152331 fbshipit-source-id: 6539ccba09d48acf23759996b7e32fb329b3e3f6
-
- 01 Oct, 2018 1 commit
-
alexeib authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/296 Differential Revision: D10121830 Pulled By: alexeib fbshipit-source-id: 1b73430bdfdcb20a9a6123abfca3472a0d307b3b
-
- 30 Sep, 2018 3 commits
-
Myle Ott authored
Summary: Changelog:
- `90f52a1`: Support loading subsets of the data on each worker with the `--fix-batches-to-gpus` flag. This should fix #217 and #266.
- `6eda0a9`: Update the README for replicating the "Scaling Neural Machine Translation" paper.
- `b14c7cf`: Fall back to the no_c10d backend for PyTorch 0.4.1 (fixes #294).
Pull Request resolved: https://github.com/pytorch/fairseq/pull/295 Differential Revision: D10121559 Pulled By: myleott fbshipit-source-id: 41c84d0ee4cdd113544b5d3aa38ae8b23acc2c27
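The new flag simply gets appended to an existing training command; a hypothetical invocation (the dataset path and the rest of the setup are placeholders, only --fix-batches-to-gpus comes from the changelog above):

    # Hypothetical command line; everything except --fix-batches-to-gpus is
    # a placeholder for an existing fairseq training setup.
    python train.py data-bin/<dataset> --fix-batches-to-gpus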
-
myleott authored
-
myleott authored
-
- 25 Sep, 2018 18 commits
-
Myle Ott authored
Co-authored-by: liezl200 <lie@fb.com>
-
Sergey Edunov authored
-
Myle Ott authored
-
alexeib authored
-
Alexei Baevski authored
-
Myle Ott authored
-
Myle Ott authored
-
Sergey Edunov authored
-
Sergey Edunov authored
-
Myle Ott authored
-
Myle Ott authored
-
Stephen Roller authored
-
Myle Ott authored
-
Myle Ott authored
-
Sergey Edunov authored
- No more FP16Trainer; we just have an FP16Optimizer wrapper.
- Most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time.
- Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 (sketched below).
- Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq handling.
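A minimal sketch of the dummy-batch and list-of-samples behavior, under assumed names (this is not the actual Trainer implementation):

    # Sketch of the two Trainer changes above: train_step accumulates
    # gradients over a list of samples, and dummy batches still run fwd/bwd
    # (so distributed workers stay in lockstep for collective ops) but have
    # their loss multiplied by 0 so they contribute no gradient.
    import torch

    def train_step(
        model: torch.nn.Module,
        criterion: torch.nn.Module,
        optimizer: torch.optim.Optimizer,
        samples,  # list of (sample, is_dummy) pairs; sample is a dict
    ) -> None:
        optimizer.zero_grad()
        for sample, is_dummy in samples:
            loss = criterion(model(sample["input"]), sample["target"])
            if is_dummy:
                loss = loss * 0.0  # keeps the all-reduce aligned, zeroes grads
            loss.backward()
        optimizer.step()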
-
Myle Ott authored
-
Stephen Roller authored
-
Stephen Roller authored
-
- 24 Sep, 2018 2 commits
-
Sergey Edunov authored
Update readme with WMT'18 model (#433)
-
Sergey Edunov authored
-
- 18 Sep, 2018 4 commits
-
Sergey Edunov authored
Oss master
-
Sergey Edunov authored
-
Sergey Edunov authored
-
Sergey Edunov authored
-
- 07 Sep, 2018 1 commit
-
Angela Fan authored
-
- 04 Sep, 2018 1 commit
-
Myle Ott authored
-
- 03 Sep, 2018 7 commits