1. 03 Dec, 2019 1 commit
    • v0.8.0 -> v0.9.0 (#1452) · df2f84ce
      Myle Ott authored
      Summary:
      Possibly breaking changes:
      - Set global numpy seed (4a7cd582)
      - Split `in_proj_weight` into separate k, v, q projections in MultiheadAttention (fdf4c3e9); see the migration sketch after this list
      - TransformerEncoder returns namedtuples instead of dict (27568a7e)
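
      For code or checkpoints that still use the packed layout, a minimal migration sketch, assuming the packed weight stacks the q, k, v blocks along dim 0 (as `in_proj_weight` does); the helper name and key prefix are hypothetical:

      ```python
      import torch

      def split_in_proj(state_dict, prefix):
          # Hypothetical helper: split a packed (3*embed_dim, embed_dim)
          # projection into the separate q/k/v projections from fdf4c3e9.
          w = state_dict.pop(prefix + 'in_proj_weight')
          b = state_dict.pop(prefix + 'in_proj_bias')
          for name, wc, bc in zip('qkv', w.chunk(3, dim=0), b.chunk(3, dim=0)):
              state_dict[prefix + name + '_proj.weight'] = wc
              state_dict[prefix + name + '_proj.bias'] = bc
          return state_dict
      ```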
      
      New features:
      - Add `--fast-stat-sync` option (e1ba32aa)
      - Add `--empty-cache-freq` option (315c463d)
      - Support criterions with parameters (ba5f829f)
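
      "Criterions with parameters" means the trainer now also optimizes `criterion.parameters()`. A minimal sketch under that assumption; the criterion name and the learnable loss scale are made up for illustration, and the constructor follows the pre-1.0 `FairseqCriterion(args, task)` API:

      ```python
      import torch
      import torch.nn.functional as F
      from fairseq.criterions import FairseqCriterion, register_criterion

      @register_criterion('toy_weighted_nll')
      class ToyWeightedNLL(FairseqCriterion):
          def __init__(self, args, task):
              super().__init__(args, task)
              # Learnable state inside the criterion: only meaningful now
              # that the trainer optimizes criterion parameters too.
              self.log_scale = torch.nn.Parameter(torch.zeros(()))

          def forward(self, model, sample, reduce=True):
              net_output = model(**sample['net_input'])
              lprobs = model.get_normalized_probs(net_output, log_probs=True)
              target = model.get_targets(sample, net_output)
              loss = F.nll_loss(
                  lprobs.view(-1, lprobs.size(-1)), target.view(-1),
                  ignore_index=self.padding_idx,
                  reduction='sum' if reduce else 'none',
              )
              loss = loss * self.log_scale.exp()  # the learnable part
              sample_size = sample['ntokens']
              logging_output = {'loss': loss.data, 'ntokens': sample['ntokens'],
                                'sample_size': sample_size}
              return loss, sample_size, logging_output
      ```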
      
      New papers:
      - Simple and Effective Noisy Channel Modeling for Neural Machine Translation (49177c99)
      - Levenshtein Transformer (86857a58, ...)
      - Cross+Self-Attention for Transformer Models (4ac2c5f2)
      - Jointly Learning to Align and Translate with Transformer Models (1c667929)
      - Reducing Transformer Depth on Demand with Structured Dropout (dabbef46)
      - Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (e23e5eaa)
      - BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a92bcdad)
      - CamemBERT: a French BERT (b31849aa)
      
      Speed improvements:
      - Add CUDA kernels for LightConv and DynamicConv (f840564d)
      - Cythonization of various dataloading components (4fc39538, ...)
      - Don't project mask tokens for MLM training (718677eb)
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/1452
      
      Differential Revision: D18798409
      
      Pulled By: myleott
      
      fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727
  2. 26 Nov, 2019 1 commit
  3. 13 Nov, 2019 1 commit
  4. 02 Nov, 2019 1 commit
  5. 27 Sep, 2019 1 commit
    • Levenshtein Transformer paper code · 86857a58
      Changhan Wang authored
      Summary:
      Code for our NeurIPS paper [Levenshtein Transformer](https://arxiv.org/abs/1905.11006)
      * Added the Levenshtein Transformer model, task, and criterion classes
      * Added the iterative NAT Transformer, Insertion Transformer, and CMLM Transformer model classes as baselines
      * Added an option for prepending BOS to the dictionary and translation task classes
      
      Reviewed By: myleott
      
      Differential Revision: D17297372
      
      fbshipit-source-id: 54eca60831ae95dc721c2c34e882e1810ee575c7
  6. 03 Sep, 2019 1 commit
  7. 31 Aug, 2019 2 commits
  8. 27 Aug, 2019 2 commits
  9. 26 Aug, 2019 1 commit
  10. 23 Aug, 2019 1 commit
    • Cythonize token block dataset (#834) · 4fc39538
      Naman Goyal authored
      Summary:
      Cythonized the token block dataset code; it is now `> 100x` faster. Building token blocks for the entire `bookwiki+CC+stories+openweb` corpus takes just ~`39.9` seconds.
      
      TODO:
      1) I think I can make it another 2x faster.
      2) Cleanup.
      
      EDIT History:
      ~~First pass at parallelizing `token_block_dataset`. The code feels somewhat complicated and cluttered. This is 2-3x faster though in my tests on the `bookwiki` dataset with both `complete` and `complete_doc` modes. myleott, can you take a look for correctness, as I am still not 100% sure I am not missing corner cases.~~
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/834
      
      Test Plan:
      Imported from GitHub, without a `Test Plan:` line.
      
      Test workflow: f133816198
      
      Reviewed By: myleott
      
      Differential Revision: D16970257
      
      Pulled By: myleott
      
      fbshipit-source-id: ec45a308193c9e9f3e7075336c15df4723228d6f
  11. 14 Aug, 2019 1 commit
    • v0.7.2 -> v0.8.0 (#1017) · ffffe04e
      Myle Ott authored
      Summary:
      Changelog:
      - Relicensed under MIT license
      - Add RoBERTa
      - Add wav2vec
      - Add WMT'19 models
      - Add initial ASR code
      - Changed torch.hub interface (`generate` renamed to `translate`); see the sketch after this list
      - Add `--tokenizer` and `--bpe`
      - f812e529: Renamed data.transforms -> data.encoders
      - 654affc0: New Dataset API (optional)
      - 47fd9852: Deprecate old Masked LM components
      - 5f78106a: Set mmap as default dataset format and infer format automatically
      - Misc fixes for sampling
      - Misc fixes to support PyTorch 1.2
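
      Putting the hub change and the new `--tokenizer`/`--bpe` abstractions together, usage now looks roughly like this. A sketch, not a definitive recipe: the available entry names can be listed with `torch.hub.list('pytorch/fairseq')`, and the model is downloaded on first use.

      ```python
      import torch

      # Load one of the released WMT'19 models through torch.hub; `tokenizer`
      # and `bpe` select the new pluggable encoders (data.encoders).
      en2de = torch.hub.load('pytorch/fairseq',
                             'transformer.wmt19.en-de.single_model',
                             tokenizer='moses', bpe='fastbpe')
      en2de.eval()

      # `generate` was renamed to `translate` in this release.
      print(en2de.translate('Hello world!'))
      ```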
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/1017
      
      Differential Revision: D16799880
      
      Pulled By: myleott
      
      fbshipit-source-id: 45ad8bc531724a53063cbc24ca1c93f715cdc5a7
  12. 13 Aug, 2019 1 commit
  13. 02 Aug, 2019 1 commit
  14. 30 Jul, 2019 1 commit
  15. 19 Jul, 2019 1 commit
    • v0.7.1 -> v0.7.2 (#891) · b002d009
      Myle Ott authored
      Summary:
      No major API changes since the last release. Cutting a new release since we'll be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation soon.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/891
      
      Differential Revision: D16377132
      
      Pulled By: myleott
      
      fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7
  16. 06 Jul, 2019 1 commit
  17. 20 Jun, 2019 2 commits
    • v0.7.1: fix PyPI setup and tests · 881381cf
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/818
      
      Differential Revision: D15916265
      
      Pulled By: myleott
      
      fbshipit-source-id: c66c0bd988d3472c4150226952f34ee8d4c3db86
    • v0.7.0 (#817) · bd710e75
      Myle Ott authored
      Summary:
      Notable (possibly breaking) changes:
      - d45db804: Move checkpoint utility functions from utils.py into checkpoint_utils.py
      - f2563c21: Move LM definitions into separate files
      - dffb1674: Updates to model API:
        - `FairseqModel` -> `FairseqEncoderDecoderModel`
        - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer` (see the sketch below)
        - `encoder_out_dict` -> `encoder_out`
        - remove unused `remove_head` functions
      - 34726d56: Move `distributed_init` into `DistributedFairseqModel`
      - cf17068a: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
      - d45db804: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed`
      - 96ac28d3: Rename `--sampling-temperature` -> `--temperature`
      - fc1a19a3: Deprecate dummy batches
      - a1c997bd: Add memory mapped datasets
      - 0add50c2: Allow cycling over multiple datasets, where each one becomes an "epoch"
      
      Plus many additional features and bugfixes
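
      A toy sketch of the dffb1674 decoder contract (layer sizes and module choices are illustrative only): `forward` is now just `extract_features` followed by `output_layer`, so callers can get pre-projection features cheaply.

      ```python
      import torch
      from fairseq.models import FairseqDecoder

      class ToyDecoder(FairseqDecoder):
          def __init__(self, dictionary, embed_dim=16):
              super().__init__(dictionary)
              self.embed = torch.nn.Embedding(len(dictionary), embed_dim)
              self.proj = torch.nn.Linear(embed_dim, len(dictionary))

          def extract_features(self, prev_output_tokens, encoder_out=None, **kwargs):
              # Return features *before* the output projection, plus extras.
              return self.embed(prev_output_tokens), {'attn': None}

          def output_layer(self, features, **kwargs):
              # Project features to the vocabulary.
              return self.proj(features)

          def forward(self, prev_output_tokens, encoder_out=None, **kwargs):
              x, extra = self.extract_features(prev_output_tokens, encoder_out, **kwargs)
              return self.output_layer(x), extra
      ```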
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/817
      
      Differential Revision: D15913844
      
      Pulled By: myleott
      
      fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b
  18. 11 Jun, 2019 1 commit
  19. 16 Mar, 2019 1 commit
  20. 15 Mar, 2019 1 commit
    • 0.6.1 -> 0.6.2 (#577) · e6422528
      Myle Ott authored
      Summary:
      Changelog:
      - 998ba4f: Add language models from Baevski & Auli (2018)
      - 4294c4f6: Add mixture of experts code from Shen et al. (2019)
      - 00493490: Add example for multilingual training
      - 48d9afbe: Speed improvements, including fused operators from apex
      - 44d27e64: Add Tensorboard support
      - d17fa851: Add Adadelta optimizer
      - 9e1c880f: Add `FairseqEncoderModel`
      - b65c579b: Add `FairseqTask.inference_step` to modularize generate.py; see the sketch after this list
      - 2ad1178e: Add back `--curriculum`
      - Misc bug fixes and other features
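
      A rough sketch of how generate.py now delegates decoding; the function and variable names here are illustrative, with `generator` a SequenceGenerator-like object and `itr` a batch iterator:

      ```python
      import torch

      def decode(task, generator, models, itr):
          # The task owns the decoding call, so a custom FairseqTask can
          # override inference_step (e.g. for constrained decoding)
          # without touching generate.py.
          for sample in itr:
              with torch.no_grad():
                  hypos = task.inference_step(generator, models, sample)
              for i, sample_id in enumerate(sample['id'].tolist()):
                  best = hypos[i][0]  # highest-scoring hypothesis for this input
                  yield sample_id, best['tokens'], best['score']
      ```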
      
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/577
      
      Differential Revision: D14481233
      
      Pulled By: myleott
      
      fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7
  21. 28 Feb, 2019 1 commit
  22. 22 Feb, 2019 1 commit
  23. 09 Feb, 2019 1 commit
    • Add fairseq to PyPI (#495) · fbd4cef9
      Myle Ott authored
      Summary:
      - fairseq can now be installed via pip: `pip install fairseq`
      - command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/495
      
      Differential Revision: D14017761
      
      Pulled By: myleott
      
      fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
  24. 05 Feb, 2019 1 commit
  25. 25 Sep, 2018 1 commit
    • Switch to DistributedDataParallelC10d and bump version 0.5.0 -> 0.6.0 · 1082ba35
      Sergey Edunov authored
      - No more FP16Trainer; we just have an FP16Optimizer wrapper
      - Most of the distributed code has moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time
      - Trainer now requires an extra dummy_batch argument at initialization, which we do a forward/backward pass on whenever workers have an uneven number of batches. We hide the gradients from these dummy batches by multiplying the loss by 0 (see the sketch below)
      - Trainer.train_step now takes a list of samples, which will allow cleaner `--update-freq` support
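
      A schematic sketch of the dummy-batch trick, not the actual Trainer code: every worker must run the same number of forward/backward passes or the gradient all-reduce in DistributedDataParallel-style wrappers falls out of sync, so a worker that runs out of real data steps on a dummy batch whose loss is zeroed.

      ```python
      def train_step(model, criterion, sample, is_dummy_batch):
          # Schematic: a worker with no real data left still runs
          # forward/backward (so the collective gradient all-reduce
          # fires on every worker), but scales its loss to zero so
          # the dummy batch contributes nothing.
          loss, sample_size, logging_output = criterion(model, sample)
          if is_dummy_batch:
              loss = loss * 0.0
              sample_size = 0
          loss.backward()
          return sample_size, logging_output
      ```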
  26. 15 Jun, 2018 1 commit
  27. 02 Mar, 2018 1 commit
  28. 27 Feb, 2018 1 commit
    • fairseq-py goes distributed (#106) · 66415206
      Myle Ott authored
      This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.
      
      Changes:
      - c7033ef: add support for distributed training! See updated README for usage.
      - e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
      - 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving performance
      - 90c2973 and 1da6265: improve unit test coverage
  29. 22 Jan, 2018 1 commit
  30. 12 Nov, 2017 1 commit
    • Version 0.1.0 -> 0.2.0 · 13a3c811
      Myle Ott authored
      Release notes:
      - 5c7f4954: Added simple LSTM model with input feeding and attention
      - 6e4b7e22: Refactored model definitions and incremental generation to be cleaner
      - 7ae79c12: Split interactive generation out of generate.py and into a new binary: interactive.py
      - 19a3865d: Subtle correctness fix in the beam search decoder. Previously, for a beam size of k, we might emit a hypothesis if the <eos> was among the top 2*k candidates. Now we only emit hypotheses for which the <eos> is among the top-k candidates. This may subtly change generation results, and in the case of k=1 we will now produce strictly greedy outputs. (See the sketch after these notes.)
      - 97d7fcb9: Fixed a bug in padding direction, where previously we right-padded the source and left-padded the target. We now left-pad the source and right-pad the target. This should not affect existing trained models, but may change (and usually improves) the quality of new models.
      - f442f896: Add support for batching based on the number of sentences (`--max-sentences`) in addition to the number of
                 tokens (`--max-tokens`). When batching by the number of sentences, one can optionally normalize the gradients
                 by the number of sentences with `--sentence-avg` (the default is to normalize by the number of tokens).
      - c6d6256b: Add `--log-format` option and JSON logger
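
      The 19a3865d fix in schematic form (toy sizes, not the actual decoder code): the search still gathers 2*k candidates so that k of them can continue even if some end in <eos>, but it now finalizes only the <eos> candidates ranked within the top k.

      ```python
      import torch

      beam_size, vocab_size, eos = 2, 5, 4
      # Flattened log-probs over (beam x vocab) continuations at one step.
      lprobs = torch.log_softmax(torch.randn(beam_size * vocab_size), dim=-1)

      # Keep 2*k candidates so k survive even if some of them are <eos>.
      scores, indices = torch.topk(lprobs, k=2 * beam_size)
      tokens = indices % vocab_size

      # Old: finalize any <eos> among all 2*k candidates.
      # New: finalize only <eos> candidates ranked in the top k; with k=1
      # the search therefore becomes strictly greedy.
      finalize = (tokens == eos) & (torch.arange(2 * beam_size) < beam_size)
      ```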
  31. 24 Oct, 2017 1 commit
  32. 19 Oct, 2017 1 commit
  33. 15 Sep, 2017 1 commit