1. 05 Feb, 2019 1 commit
  2. 25 Sep, 2018 1 commit
      Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0 · 1082ba35
      Sergey Edunov authored
      - no more FP16Trainer, we just have an FP16Optimizer wrapper
      - most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
      - Trainer now requires an extra dummy_batch argument at initialization, which we run fwd/bwd on whenever there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 (see the sketch after this list)
      - Trainer.train_step now takes a list of samples, which allows cleaner handling of --update-freq (gradient accumulation)
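
      The dummy-batch trick above is easy to get wrong, so here is a minimal sketch of the idea in plain PyTorch. The function name, arguments, and sample keys are hypothetical simplifications of the description above; the real Trainer does considerably more bookkeeping.

      ```python
      def train_step(model, criterion, optimizer, samples, dummy_batch):
          """Accumulate gradients over a list of samples (hypothetical sketch).

          When this worker runs out of real batches, it substitutes dummy_batch
          so that every worker performs the same number of fwd/bwd passes
          (keeping distributed all-reduces in sync), and multiplies the loss
          by 0 so the dummy batch contributes no gradient.
          """
          optimizer.zero_grad()
          for sample in samples:
              is_dummy = sample is None  # this worker had no real batch left
              sample = dummy_batch if is_dummy else sample
              loss = criterion(model(sample["net_input"]), sample["target"])
              if is_dummy:
                  loss = loss * 0.0  # gradients from the dummy batch are hidden
              loss.backward()
          optimizer.step()
      ```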
  3. 15 Jun, 2018 1 commit
  4. 02 Mar, 2018 1 commit
  5. 27 Feb, 2018 1 commit
      fairseq-py goes distributed (#106) · 66415206
      Myle Ott authored
      This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.
      
      Changes:
      - c7033ef: add support for distributed training! See updated README for usage.
      - e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc. (a registry sketch follows this list)
      - 154e440: update the LSTM implementation to use PackedSequence objects in the encoder, better following PyTorch best practices and improving performance (a sketch also follows this list)
      - 90c2973 and 1da6265: improve unit test coverage
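
      For readers unfamiliar with the registry pattern behind register_model and friends, here is a minimal sketch. It is a deliberate simplification (the real decorators also wire up architecture-specific command-line arguments), and the model name is made up.

      ```python
      MODEL_REGISTRY = {}

      def register_model(name):
          """Class decorator that registers a model class under `name`."""
          def register_model_cls(cls):
              if name in MODEL_REGISTRY:
                  raise ValueError(f"Cannot register duplicate model ({name})")
              MODEL_REGISTRY[name] = cls
              return cls
          return register_model_cls

      @register_model("toy_lstm")  # hypothetical model name
      class ToyLSTMModel:
          pass

      # Look up by name, e.g. from an --arch command-line flag:
      model_cls = MODEL_REGISTRY["toy_lstm"]
      ```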
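      And a minimal sketch of the PackedSequence idiom the LSTM encoder moved to, assuming a recent PyTorch (enforce_sorted requires >= 1.1). The module here is illustrative, not fairseq's actual encoder.

      ```python
      import torch
      import torch.nn as nn
      from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

      class ToyEncoder(nn.Module):
          def __init__(self, vocab_size, embed_dim=32, hidden_dim=64, pad_idx=0):
              super().__init__()
              self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
              self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

          def forward(self, tokens, lengths):
              x = self.embed(tokens)
              # Pack so the LSTM skips pad positions entirely instead of
              # computing (and later masking) hidden states for them.
              packed = pack_padded_sequence(x, lengths, batch_first=True,
                                            enforce_sorted=False)
              packed_out, (h, c) = self.lstm(packed)
              out, _ = pad_packed_sequence(packed_out, batch_first=True)
              return out, h

      enc = ToyEncoder(vocab_size=100)
      tokens = torch.tensor([[4, 5, 6], [7, 8, 0]])  # second row is right-padded
      out, h = enc(tokens, lengths=torch.tensor([3, 2]))
      ```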
  6. 22 Jan, 2018 1 commit
  7. 12 Nov, 2017 1 commit
      Version 0.1.0 -> 0.2.0 · 13a3c811
      Myle Ott authored
      Release notes:
      - 5c7f4954: Added simple LSTM model with input feeding and attention
      - 6e4b7e22: Refactored model definitions and incremental generation to be cleaner
      - 7ae79c12: Split interactive generation out of generate.py and into a new binary: interactive.py
      - 19a3865d: Subtle correctness fix in the beam search decoder. Previously, for a beam size of k, we might emit a
                 hypothesis if the <eos> was among the top 2*k candidates. Now we only emit hypotheses for which the
                 <eos> is among the top-k candidates (sketched after these notes). This may subtly change generation
                 results, and in the case of k=1 we will now produce strictly greedy outputs.
      - 97d7fcb9: Fixed a bug in the padding direction, where previously we right-padded the source and left-padded the
                 target. We now left-pad the source and right-pad the target (sketched after these notes). This should
                 not affect existing trained models, but may change (usually improve) the quality of new models.
      - f442f896: Add support for batching based on the number of sentences (`--max-sentences`) in addition to the
                 number of tokens (`--max-tokens`). When batching by the number of sentences, one can optionally
                 normalize the gradients by the number of sentences with `--sentence-avg` (the default is to normalize
                 by the number of tokens; see the normalization sketch after these notes).
      - c6d6256b: Add `--log-format` option and JSON logger
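
      A minimal sketch of the corrected <eos> check in 19a3865d, with hypothetical names; the real decoder tracks scores per beam and across time steps. Here `lprobs` is a 1-D tensor of candidate log-probabilities for a single decoding step.

      ```python
      import torch

      def finalize_step(lprobs, beam_size, eos_idx):
          """Consider 2*k candidates, but only finalize <eos> within the top-k."""
          scores, indices = lprobs.topk(2 * beam_size)
          finalized, continuations = [], []
          for rank in range(2 * beam_size):
              idx, score = indices[rank].item(), scores[rank].item()
              if idx == eos_idx:
                  if rank < beam_size:  # the fix: previously any rank < 2*k was emitted
                      finalized.append(score)
              else:
                  continuations.append((idx, score))
          return finalized, continuations[:beam_size]

      lprobs = torch.log_softmax(torch.randn(10), dim=-1)
      done, beams = finalize_step(lprobs, beam_size=2, eos_idx=3)
      ```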
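      The padding convention in 97d7fcb9 is easiest to see in a toy example (helper name hypothetical):

      ```python
      def pad_to(tokens, length, pad_idx=0, left=False):
          """Pad a token list to `length`, on the left or the right."""
          padding = [pad_idx] * (length - len(tokens))
          return padding + tokens if left else tokens + padding

      src = pad_to([4, 5, 6], 5, left=True)   # source: [0, 0, 4, 5, 6]
      tgt = pad_to([7, 8, 9], 5, left=False)  # target: [7, 8, 9, 0, 0]
      ```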
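      And the gradient-normalization choice behind `--sentence-avg`, reduced to its core (function and variable names are illustrative, not fairseq's API):

      ```python
      def normalized_loss(total_loss, ntokens, nsentences, sentence_avg=False):
          """Normalize the summed loss by tokens (default) or by sentences."""
          sample_size = nsentences if sentence_avg else ntokens
          return total_loss / sample_size

      print(normalized_loss(120.0, ntokens=60, nsentences=4))                     # 2.0 per token
      print(normalized_loss(120.0, ntokens=60, nsentences=4, sentence_avg=True))  # 30.0 per sentence
      ```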
  8. 24 Oct, 2017 1 commit
  9. 19 Oct, 2017 1 commit
  10. 15 Sep, 2017 1 commit