- 16 Mar, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/580 Differential Revision: D14494390 Pulled By: myleott fbshipit-source-id: 524cc16a106f2af630357e2ebdf7dde35fa7d494
-
- 15 Mar, 2019 1 commit
-
-
Myle Ott authored
Summary: Changelog: - 998ba4f: Add language models from Baevski & Auli (2018) - 4294c4f6: Add mixture of experts code from Shen et al. (2019) - 00493490: Add example for multilingual training - 48d9afbe: Speed improvements, including fused operators from apex - 44d27e64: Add Tensorboard support - d17fa851: Add Adadelta optimizer - 9e1c880f: Add `FairseqEncoderModel` - b65c579b: Add `FairseqTask.inference_step` to modularize generate.py - 2ad1178e: Add back `--curriculum` - Misc bug fixes and other features Pull Request resolved: https://github.com/pytorch/fairseq/pull/577 Differential Revision: D14481233 Pulled By: myleott fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7
-
- 28 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/542 Differential Revision: D14258895 Pulled By: myleott fbshipit-source-id: 950a840e1d001a472be8d4737c9e4de5224137b3
-
- 22 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/351 This makes it easier for tasks to plugin to generate.py/interactive.py Pull Request resolved: https://github.com/pytorch/fairseq/pull/520 Differential Revision: D14183881 Pulled By: myleott fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33
-
- 09 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: - fairseq can now be installed via pip: `pip install fairseq` - command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc. Pull Request resolved: https://github.com/pytorch/fairseq/pull/495 Differential Revision: D14017761 Pulled By: myleott fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
-
- 05 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/489 Differential Revision: D13956810 Pulled By: myleott fbshipit-source-id: 61ace179d1d3790226c38b3f3e47f5452b5ec514
-
- 25 Sep, 2018 1 commit
-
-
Sergey Edunov authored
- no more FP16Trainer, we just have an FP16Optimizer wrapper - most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time - Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 - Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq
-
- 15 Jun, 2018 1 commit
-
-
Myle Ott authored
-
- 02 Mar, 2018 1 commit
-
-
James Reed authored
Remove custom ConvTBC code
-
- 27 Feb, 2018 1 commit
-
-
Myle Ott authored
This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes. Changes: - c7033ef: add support for distributed training! See updated README for usage. - e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc. - 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf - 90c2973 and 1da6265: improve unit test coverage
-
- 22 Jan, 2018 1 commit
-
-
Myle Ott authored
-
- 12 Nov, 2017 1 commit
-
-
Myle Ott authored
Release notes: - 5c7f4954: Added simple LSTM model with input feeding and attention - 6e4b7e22: Refactored model definitions and incremental generation to be cleaner - 7ae79c12: Split interactive generation out of generate.py and into a new binary: interactive.py - 19a3865d: Subtle correctness fix in beam search decoder. Previously, for a beam size of k, we might emit a hypotheses if the <eos> was among the top 2*k candidates. Now we only emit hypotheses for which the <eos> is among the top-k candidates. This may subtly change generation results, and in the case of k=1 we will now produce strictly greedy outputs. - 97d7fcb9: Fixed bug in padding direction, where previously we right-padded the source and left-padded the target. We now left-pad the source and right-pad the target. This should not effect existing trained models, but may change (usually improves) the quality of new models. - f442f896: Add support for batching based on the number of sentences (`--max-sentences`) in addition to the number of tokens (`--max-tokens`). When batching by the number of sentences, one can optionally normalize the gradients by the number of sentences with `--sentence-avg` (the default is to normalize by the number of tokens). - c6d6256b: Add `--log-format` option and JSON logger
-
- 24 Oct, 2017 1 commit
-
-
James Reed authored
-
- 19 Oct, 2017 1 commit
-
-
Louis Martin authored
-
- 15 Sep, 2017 1 commit
-
-
Sergey Edunov authored
-