- 13 Nov, 2017 3 commits
- 12 Nov, 2017 7 commits
-
-
Myle Ott authored
Release notes:
- 5c7f4954: Added a simple LSTM model with input feeding and attention
- 6e4b7e22: Refactored model definitions and incremental generation to be cleaner
- 7ae79c12: Split interactive generation out of generate.py and into a new binary: interactive.py
- 19a3865d: Subtle correctness fix in the beam search decoder. Previously, for a beam size of k, we might emit a hypothesis if the <eos> was among the top 2*k candidates. Now we only emit hypotheses for which the <eos> is among the top-k candidates. This may subtly change generation results, and in the case of k=1 we now produce strictly greedy outputs.
- 97d7fcb9: Fixed a bug in the padding direction: previously we right-padded the source and left-padded the target. We now left-pad the source and right-pad the target. This should not affect existing trained models, but may change (and usually improves) the quality of new models.
- f442f896: Added support for batching based on the number of sentences (`--max-sentences`) in addition to the number of tokens (`--max-tokens`). When batching by the number of sentences, one can optionally normalize the gradients by the number of sentences with `--sentence-avg` (the default is to normalize by the number of tokens).
- c6d6256b: Added a `--log-format` option and a JSON logger
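The 19a3865d fix above amounts to shrinking the candidate window in which <eos> may finish a hypothesis from 2*k to k. A minimal sketch of the rule (function and variable names are ours for illustration, not fairseq's actual API):

```python
def may_emit_eos(lprobs, eos_idx, beam_size, legacy=False):
    """Return True if this beam step may emit a finished <eos> hypothesis.

    Corrected rule: <eos> must rank within the top-k candidates.
    Legacy rule (legacy=True): ranking within the top 2*k was enough.
    """
    window = 2 * beam_size if legacy else beam_size
    # Rank vocabulary indices by log-probability, highest first.
    ranked = sorted(range(len(lprobs)), key=lambda i: lprobs[i], reverse=True)
    return eos_idx in ranked[:window]
```

With beam_size=1 the corrected rule is strictly greedy: <eos> is emitted only when it is the single most probable token.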
-
Myle Ott authored
We previously assumed that once a model parameter's gradient buffer was allocated, it stayed fixed during training. However, recent versions of PyTorch violate this assumption (i.e., the gradient buffer may be reallocated during training), so it is no longer safe to rely on. This primarily matters for the all-reduce, since we all-reduce a flattened (i.e., contiguous) copy of the gradients. We make this more robust by copying the result of the all-reduce back into each model parameter's gradient buffer after every update. Intra-device copies are cheap, so this does not affect performance.
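The copy-back pattern described above can be sketched in pure Python (the function name and the injected `all_reduce` callable are our stand-ins; the real code operates on CUDA tensors via torch.distributed):

```python
def all_reduce_and_copy_back(param_grads, all_reduce):
    """Flatten per-parameter gradients, all-reduce the flat buffer, then
    copy the result back into each parameter's own gradient buffer.

    Copying back (rather than pointing parameters at slices of the flat
    buffer) stays correct even if a gradient buffer is reallocated later.
    """
    # Flatten all gradients into one contiguous buffer.
    flat = [g for grads in param_grads for g in grads]
    flat = all_reduce(flat)  # e.g., summed or averaged across workers
    # Copy the reduced values back, in place, into each original buffer.
    offset = 0
    for grads in param_grads:
        n = len(grads)
        grads[:] = flat[offset:offset + n]
        offset += n
    return param_grads
```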
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
- 09 Nov, 2017 1 commit
-
-
Myle Ott authored
-
- 08 Nov, 2017 24 commits
-
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
Louis Martin authored
* Add <eos> for unk replacement
* Add IndexedRawTextDataset to load raw text files
* Replace unk with original string
* Add load_raw_text_dataset() and --output-format
* Move has_binary_files to data.py
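The idea behind a raw-text dataset that supports unk replacement is to keep the original strings alongside the tokenized form. A minimal sketch, assuming a vocabulary given as a set of known words (this class and its interface are illustrative, not fairseq's IndexedRawTextDataset):

```python
class RawTextDataset:
    """Load raw (untokenized) lines and keep the original strings so that
    <unk> tokens can later be mapped back to their source words."""

    def __init__(self, lines, vocab):
        self.lines = list(lines)  # original strings, kept for unk replacement
        # Out-of-vocabulary words become the <unk> symbol.
        self.tokens = [
            [w if w in vocab else "<unk>" for w in line.split()]
            for line in self.lines
        ]

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, index):
        return self.tokens[index]
```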
-
Myle Ott authored
-
Myle Ott authored
-
Louis Martin authored
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
Louis Martin authored
* Split generate.py into generate.py and interactive.py and refactor the code. The main motivation behind these changes is to decouple the use cases in order to enable future improvements, such as replacing unk with the original string during evaluation on test and writing predictions to an output file. The previous implementation worked well, but I found it difficult to integrate these improvements into it.
* Add a --replace-unk arg that can be used without an align dict. Replacing <unk> tokens can be beneficial even without an alignment dictionary.
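Unk replacement substitutes each <unk> in a hypothesis with the source token it aligns to, optionally mapped through an alignment dictionary. A simplified sketch (the function name and argument layout are ours, not fairseq's exact utility):

```python
def replace_unk(hypo_tokens, src_tokens, alignment, align_dict=None, unk="<unk>"):
    """Replace each <unk> in the hypothesis with its aligned source token.

    alignment[i] gives the source position aligned to hypothesis position i.
    If align_dict is given, the source token is mapped through it; otherwise
    the original source string is copied as-is (the --replace-unk-without-
    dict case described above).
    """
    out = []
    for i, tok in enumerate(hypo_tokens):
        if tok == unk:
            src_tok = src_tokens[alignment[i]]
            out.append(align_dict.get(src_tok, src_tok) if align_dict else src_tok)
        else:
            out.append(tok)
    return out
```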
-
Myle Ott authored
-
Myle Ott authored
-
Louis Martin authored
-
Myle Ott authored
-
Michael Auli authored
-
Myle Ott authored
* Move some functionality out of FConvModel into FairseqModel base class
* Move incremental decoding functionality into FairseqIncrementalDecoder module
* Refactor positional embeddings to be more specific to FConvModel
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
-
- 02 Nov, 2017 1 commit
-
-
Sergey Edunov authored
-
- 01 Nov, 2017 1 commit
-
-
Myle Ott authored
-
- 24 Oct, 2017 1 commit
-
-
James Reed authored
-
- 19 Oct, 2017 2 commits