Commits · 7fc9a3be80e8417bc177df9f8717aec4ae53aacb · OpenDAS / Fairseq

11 Mar, 2019 1 commit

Matt Le authored Mar 11, 2019

Summary: This allows one to call fairseq_cli functions from within Python without dispatching to bash.
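
A minimal sketch of the pattern this summary describes, assuming a CLI module that keeps its work in `main(args)` and exposes a thin `cli_main()` entry point. The names `get_parser`, `main`, `cli_main` and the toy options are illustrative assumptions, not the actual fairseq_cli interface.

```python
import argparse


def get_parser():
    # Hypothetical parser; the real command-line options are much richer.
    parser = argparse.ArgumentParser(description="toy generate command")
    parser.add_argument("data", help="path to the data directory")
    parser.add_argument("--beam", type=int, default=5, help="beam size")
    return parser


def main(args):
    # All real work lives here, so Python callers can invoke it directly.
    return {"data": args.data, "beam": args.beam}


def cli_main(argv=None):
    # Console entry point: parse argv (sys.argv when None), then delegate.
    args = get_parser().parse_args(argv)
    return main(args)
```

With this layout, a Python caller can run `cli_main(["data-bin/toy", "--beam", "3"])` in-process instead of starting a shell command.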

Reviewed By: myleott

Differential Revision: D14404719

fbshipit-source-id: 044eb652045bb15fc40e72ecbaf6fb10df9f8c61

7fc9a3be

28 Feb, 2019 1 commit

Move string line encoding logic from tokenizer to Dictionary (unified diff). (#541) · f296824f

Vladimir Karpukhin authored Feb 28, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/541

Just a combination of the stacked pair D14057943 & D14176011.
Made this a separate diff because there seems to be some issue with porting a stacked change into the GitHub repo.

Differential Revision: D14251048

fbshipit-source-id: 0a47f534a69d6ab2ebe035fba40fd51748cccfb8

f296824f

22 Feb, 2019 1 commit

Modularize generate.py (#351) · b65c579b

Myle Ott authored Feb 22, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/351

This makes it easier for tasks to plug in to generate.py/interactive.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/520

Differential Revision: D14183881

Pulled By: myleott

fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33

b65c579b

16 Feb, 2019 1 commit

Merge internal changes · 9998bbfa

Myle Ott authored Feb 15, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/505

Differential Revision: D14110201

Pulled By: myleott

fbshipit-source-id: 099ce61fa386c016f3a1d1815c6fe1a9a6c9005d

9998bbfa

05 Feb, 2019 1 commit

Add standalone binaries · 829bd8ce

Myle Ott authored Feb 05, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/489

Differential Revision: D13956810

Pulled By: myleott

fbshipit-source-id: 61ace179d1d3790226c38b3f3e47f5452b5ec514

829bd8ce

30 Jan, 2019 1 commit

Merge internal changes (#483) · 42be3ebd

Myle Ott authored Jan 30, 2019

Summary:
Changelog:
- `4889802`: can now detokenize sentencepiece output with `--remove-bpe=sentencepiece` (fixes #331). Also added `--sacrebleu` for computing detokenized BLEU.
- `0d76427`: fix assertion error when training language model with dataset containing empty sentences
- minor bug and style fixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/483

Differential Revision: D13867899

Pulled By: myleott

fbshipit-source-id: 25c940b847fe270262ac8f5ac838407b3977fdda

42be3ebd

16 Jan, 2019 1 commit

FIX: '--user-dir' on multi-gpu (#449) · 7853818c

Davide Caroselli authored Jan 16, 2019

Summary:
On a multi-gpu training scenario, the `train.py` script spawns new processes with `torch.multiprocessing.spawn`. Unfortunately those child processes don't inherit the modules imported with `--user-dir`.

This pull request fixes this problem: the custom module import is now explicit in every `main()` function.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/449
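
A minimal sketch of the fix described above, assuming the user directory is an importable package. The helper name `import_user_module` and the toy `main()` signature are assumptions made for illustration; the point is only that each process performs the import itself rather than relying on modules loaded in the parent.

```python
import importlib
import os
import sys


def import_user_module(user_dir):
    """Import the package at `user_dir` so its registration side effects run."""
    if user_dir is None:
        return None
    module_path = os.path.abspath(user_dir)
    parent, name = os.path.split(module_path)
    if name in sys.modules:
        # Already imported in this process; nothing more to do.
        return sys.modules[name]
    # Make the parent directory importable, then import the package by name.
    sys.path.insert(0, parent)
    return importlib.import_module(name)


def main(user_dir=None):
    # Called in every process (parent and each spawned worker), so the
    # import no longer depends on state inherited from the parent.
    return import_user_module(user_dir)
```

Because `main()` is the function handed to each spawned worker, the custom modules are loaded again in every child process.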

Differential Revision: D13676922

Pulled By: myleott

fbshipit-source-id: 520358d66155697885b878a37e7d0484bddbc1c6

7853818c

05 Jan, 2019 1 commit

Merge internal changes (#283) · 7633129b

Myle Ott authored Jan 04, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283

Pull Request resolved: https://github.com/pytorch/fairseq/pull/428

Differential Revision: D13564190

Pulled By: myleott

fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5

7633129b

26 Dec, 2018 1 commit

Merge internal changes (#422) · 8ce6499d

Myle Ott authored Dec 26, 2018

Summary:
- 04cc608: Add `--match-source-len` option to generate.py for sequence-tagging tasks
- 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking
Pull Request resolved: https://github.com/pytorch/fairseq/pull/422
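
For readers unfamiliar with n-gram blocking, here is a minimal sketch of one common formulation: before each decoding step, ban any token that would complete an n-gram already present in the hypothesis. The function name `banned_next_tokens` is hypothetical, and whether generate.py implements the option exactly this way is not stated in this commit.

```python
def banned_next_tokens(prefix, n):
    """Return tokens that would complete an n-gram already present in `prefix`."""
    if n <= 0 or len(prefix) < n:
        return set()
    # The last n-1 tokens form the context that the next token would extend.
    context = tuple(prefix[len(prefix) - (n - 1):])
    banned = set()
    for i in range(len(prefix) - n + 1):
        ngram = prefix[i:i + n]
        if tuple(ngram[:-1]) == context:
            banned.add(ngram[-1])
    return banned
```

A decoder would typically set the scores of the banned tokens to negative infinity before choosing the next token.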

Differential Revision: D13548445

Pulled By: myleott

fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9

8ce6499d

03 Sep, 2018 4 commits
- Add documentation · 6381cc97
  Myle Ott authored Sep 03, 2018
  
  6381cc97
- Clean up FairseqTask so that it's easier to extend/add new tasks · 2e507d3c
  Myle Ott authored Aug 30, 2018
  
  2e507d3c
- Diverse Beam Search · 8c0ca1a0
  Myle Ott authored Aug 10, 2018
  
  8c0ca1a0
- Factor out search logic in SequenceGenerator · ef43da72
  Myle Ott authored Aug 09, 2018
  
  ef43da72
25 Jul, 2018 2 commits
- Iterate on need_attn and fix tests · bb5f15d1
  Myle Ott authored Jul 12, 2018
  
  bb5f15d1
- disable printing alignment by default (for perf) and add a flag to enable it · 89e19d42
  Alexei Baevski authored Jul 10, 2018
  
  89e19d42
08 Jul, 2018 1 commit
- add model override argument from load_ensemble_for_inference at generation... · 30ef667d
  Angela Fan authored Jul 07, 2018
```
add model override argument from load_ensemble_for_inference at generation time, updating readme for stories
```
  30ef667d
21 Jun, 2018 1 commit
- Support FP16 during inference · 930c9580
  Myle Ott authored Jun 19, 2018
  
  930c9580
15 Jun, 2018 10 commits

Change --path to be colon-separated instead of comma-separated · 16caed31
Myle Ott authored Jun 14, 2018

16caed31

Add FairseqTask · ff68a9ef

Myle Ott authored Jun 12, 2018

A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task
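
A minimal sketch of how a decorator-based registry such as `@register_task` can work, under the assumption that tasks are looked up by name. `TASK_REGISTRY`, `setup_task` and the toy classes below are illustrative and do not reproduce the code added by this commit.

```python
TASK_REGISTRY = {}


def register_task(name):
    """Decorator that records a task class under `name`."""
    def wrapper(cls):
        if name in TASK_REGISTRY:
            raise ValueError(f"Cannot register duplicate task: {name}")
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper


class Task:
    """Toy base class: holds shared state such as dictionaries."""

    def __init__(self, dictionary):
        self.dictionary = dictionary


@register_task("toy_language_modeling")
class ToyLanguageModelingTask(Task):
    pass


def setup_task(name, dictionary):
    # Look up the class registered under `name` and instantiate it.
    return TASK_REGISTRY[name](dictionary)
```

New tasks then only need the decorator; code that selects a task by name does not have to change.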

ff68a9ef

Unify various sharding into ShardedIterator · 24d7de44
Myle Ott authored May 30, 2018

24d7de44
Migrate all binaries to use options.parse_args_and_arch · 76b5ecab
Myle Ott authored May 30, 2018

76b5ecab
Nits · cf1c64a5
Myle Ott authored May 30, 2018

cf1c64a5
added multiscale gated self attention layer with multiple heads, and pretrained fusion models · b59815bc
Angela Fan authored May 09, 2018

b59815bc

Conv lm implementation · 4c2ef2de

alexeib authored May 25, 2018

This implements convolutional language model from https://arxiv.org/pdf/1612.08083.pdf

There are 3 modes for constructing batches:

- token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper
- complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper
- eos: one sentence per sample (skip blank lines)
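
The three modes above can be sketched on plain token lists as follows. This is an illustration under stated assumptions, not the code from this commit: the function name `build_samples` and the mode strings are hypothetical, and a sentence longer than the block size simply becomes its own oversized sample in "complete" mode.

```python
def build_samples(sentences, block_size, mode):
    """Group tokenized sentences into samples according to `mode`."""
    if mode == "eos":
        # One sentence per sample; blank lines (empty sentences) are skipped.
        return [list(s) for s in sentences if s]
    if mode == "token_block":
        # Ignore sentence boundaries: cut the token stream into fixed blocks.
        stream = [tok for s in sentences for tok in s]
        return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]
    if mode == "complete":
        samples, current = [], []
        for s in sentences:
            if not s:
                continue
            # Start a new sample if this sentence would overflow the block.
            if current and len(current) + len(s) > block_size:
                samples.append(current)
                current = []
            current = current + list(s)
        if current:
            samples.append(current)
        return samples
    raise ValueError(f"unknown mode: {mode}")
```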

some results:

GCNN-13 - GBW - 37.46
GCNN-14B - GBW - 33.88
GCNN-8 - Wiki103 - 43.76
GCNN-14 - Wiki103 - 35.66

train:

python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500

eval:

python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'

4c2ef2de

also report sentence/s timing when generating · bf47b956
Alexei Baevski authored May 17, 2018

bf47b956
allow specifying max_tokens for generation · 67af40c9
Alexei Baevski authored May 15, 2018

67af40c9

remove completed sentences from batch · 2a84f46b

Alexei Baevski authored Apr 12, 2018

remove completed sentences from batch and allow batching uneven lengths (with fixes to make padded sequences work correctly in all models)

2a84f46b

02 Apr, 2018 1 commit

Merge internal changes (#136) · d3795d6c

Myle Ott authored Apr 02, 2018

Changes:
- 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
- c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model
- 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
- small bugfixes for distributed training, LSTM, inverse square root LR scheduler
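
As a rough illustration of what averaging checkpoints means, the sketch below averages parameters element-wise across several checkpoints. It assumes each checkpoint is a plain dict mapping parameter names to flat lists of floats; the actual `scripts/average_checkpoints.py` works on saved model files and is not reproduced here.

```python
def average_checkpoints(state_dicts):
    """Average parameters element-wise across checkpoints with identical keys."""
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    keys = list(state_dicts[0].keys())
    for sd in state_dicts[1:]:
        if list(sd.keys()) != keys:
            raise KeyError("checkpoints have mismatched parameter names")
    count = len(state_dicts)
    averaged = {}
    for key in keys:
        # Pair up the i-th value of this parameter from every checkpoint.
        columns = zip(*(sd[key] for sd in state_dicts))
        averaged[key] = [sum(values) / count for values in columns]
    return averaged
```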

d3795d6c

05 Mar, 2018 1 commit
- Allow more flexible pre-processing and generation (#227) · b03b53b4
  Sergey Edunov authored Mar 05, 2018
```
* Allow more flexible pre-processing and generation

* Addressing CR comments

* small fix
```
  b03b53b4
01 Mar, 2018 1 commit
- More fixes for recent PyTorch (incl. topk issue) (#113) · 3bde773d
  Myle Ott authored Mar 01, 2018
  
  3bde773d
27 Feb, 2018 3 commits

Add support to prefixes (#221) · 866b27d5

Dario Pavllo authored Feb 23, 2018

* Add prefix

* Fixes

* Keep original scores with prefix

* Improve prefix code

* Replace 'repeat' with 'expand'

866b27d5

More unit test fixes · 0d90e35f
Myle Ott authored Feb 15, 2018

0d90e35f

fairseq-py goes distributed (#106) · 66415206

Myle Ott authored Feb 27, 2018

This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage

66415206

22 Jan, 2018 2 commits
- Support deprecation of volatile Variables in latest PyTorch · 7da4e062
  Myle Ott authored Dec 22, 2017
  
  7da4e062
- Add support for sharded generation · 9f7c3ec6
  Myle Ott authored Dec 21, 2017
  
  9f7c3ec6
13 Nov, 2017 1 commit
- Fallback to `--log-format=simple` for non-TTY terminals · 1b42c8c4
  Myle Ott authored Nov 12, 2017
  
  1b42c8c4
12 Nov, 2017 1 commit
- Add `--log-format` option and JSON logger · c6d6256b
  Myle Ott authored Nov 11, 2017
  
  c6d6256b
08 Nov, 2017 3 commits
- Replace unk with original string · 42a0150c
  Louis Martin authored Nov 06, 2017
```
* Add <eos> for unk replacement
* Add IndexedRawTextDataset to load raw text files
* Replace unk with original string
* Add load_raw_text_dataset() and --output-format
* Move has_binary_files to data.py
```
  42a0150c
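
A minimal sketch of unk replacement as described in the commit message above, assuming each hypothesis position comes with the index of its aligned source token. The function name `replace_unk`, its signature and the alignment format are assumptions for illustration; the source sentence is extended with `<eos>` so that an alignment to the end-of-sentence position stays valid.

```python
def replace_unk(hypo_tokens, src_tokens, alignment, unk="<unk>", eos="<eos>"):
    """Replace each unk in the hypothesis with its aligned source token."""
    # Append eos so alignments pointing one past the last word are valid.
    src = list(src_tokens) + [eos]
    return [
        src[alignment[i]] if tok == unk else tok
        for i, tok in enumerate(hypo_tokens)
    ]
```

For example, `replace_unk(["hello", "<unk>"], ["hallo", "Welt"], [0, 1])` copies the source word "Welt" into the output in place of the unknown token.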
- Add --max-sentence option for batching based on # sentences · f442f896
  Myle Ott authored Nov 04, 2017
  
  f442f896
- Update README with interactive.py and fix it · 2ef422f6
  Louis Martin authored Nov 02, 2017
  
  2ef422f6