1. 08 Nov, 2019 1 commit
  2. 24 Oct, 2019 1 commit
  3. 12 Oct, 2019 1 commit
  4. 27 Sep, 2019 1 commit
    • Levenshtein Transformer paper code · 86857a58
      Changhan Wang authored
      Summary:
      Code for our NeurIPS paper [Levenshtein Transformer](https://arxiv.org/abs/1905.11006)
      * Added Levenshtein Transformer model, task and criterion classes
      * Added iterative NAT Transformer, Insertion Transformer and CMLM Transformer model classes as baselines
      * Added an option for prepending BOS to the dictionary class and the translation task class
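      The prepend-BOS option can be sketched as follows (a hypothetical helper for illustration only; the real dictionary/task API differs):

      ```python
      def maybe_prepend_bos(token_ids, bos_idx, prepend_bos=False):
          # Illustrative sketch of the new option: when enabled,
          # prepend the BOS index to an already-encoded sentence.
          return [bos_idx] + list(token_ids) if prepend_bos else list(token_ids)
      ```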
      
      Reviewed By: myleott
      
      Differential Revision: D17297372
      
      fbshipit-source-id: 54eca60831ae95dc721c2c34e882e1810ee575c7
  5. 20 Sep, 2019 1 commit
    • added multilingual masked LM training (#849) · 32335404
      Naman Goyal authored
      Summary:
      The multilingual RoBERTa training is working with aconneau's XLM data.
      
      Two pieces remaining:
      
      1) `XLM` limits each batch to examples from the same language. I am not 100% sure of the reason for that, but it should be easy to implement: we can add a `batch_by_size_and_language` function instead of the default `batch_by_size`. If it's not critical, I would rather leave it out, as that keeps the code clean and simple.
      
      2) `sample_ratio` in `ConcatDataset` works with `int` values by tiling the datasets according to the ratio. Currently I handle float ratios by rounding them off to the first decimal and then multiplying by `10`. We can see whether such simple heuristics are good enough; there are other options (we can talk about them offline).
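      The rounding heuristic in 2) can be sketched like this (hypothetical helper name, not the actual fairseq code):

      ```python
      def float_ratios_to_int_tiles(ratios):
          # ConcatDataset's sample_ratio expects ints (it tiles datasets),
          # so round each float ratio to the first decimal and scale by 10.
          return [int(round(r * 10)) for r in ratios]
      ```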
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/849
      
      Differential Revision: D17162460
      
      fbshipit-source-id: d967f3d872f7a1f0aa4ea418bd362b68af9e432f
  6. 31 Aug, 2019 1 commit
  7. 30 Aug, 2019 1 commit
  8. 29 Aug, 2019 1 commit
  9. 27 Aug, 2019 1 commit
    • wav2vec everstore support fix · 3ab8e0fd
      Alexei Baevski authored
      Summary: fixes some merge issues that prevented wav2vec from training properly
      
      Reviewed By: myleott
      
      Differential Revision: D16981120
      
      fbshipit-source-id: cad39aaf2f44daabcbafe7b4e8735d055b3842a7
  10. 21 Aug, 2019 1 commit
    • Multiset (#838) · a2f5361d
      alexeib authored
      Summary:
      Adds ability to tag individual examples with the names of their datasets, along with some minor miscellaneous fixes and improvements
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/838
      
      Differential Revision: D16919175
      
      Pulled By: alexeib
      
      fbshipit-source-id: 4bf493299645bae63f3ee6382e15f18a9f73666c
  11. 30 Jul, 2019 2 commits
  12. 29 Jul, 2019 1 commit
    • adding glue data preprocessing scripts (#771) · 138dc8e4
      Naman Goyal authored
      Summary:
      1) Added GLUE data pre-processing script.
      2) Updated README with usage.
      
      TODO:
      1) Release the fairseq dictionary and remove the hardcoded path.
      2) Remove the hard-coded path for BPE encoding.
      
      myleott, what do you recommend for the above TODOs?
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/771
      
      Reviewed By: myleott
      
      Differential Revision: D16547679
      
      Pulled By: myleott
      
      fbshipit-source-id: 6a6562d9b6215523d048fdf3daee63ffac21e231
  13. 24 Jul, 2019 1 commit
    • check save_dir before beginning training · b49ea81c
      Spencer Poff authored
      Summary: I sadly discovered that my checkpoint directory wasn't globally readable after 8 hours of training. Adding this check at the beginning of the train loop to keep that from happening again!
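      A fail-fast check along these lines can be sketched as follows (a minimal sketch, assuming a write probe is an acceptable proxy for directory accessibility; the helper name is hypothetical):

      ```python
      import os

      def assert_save_dir_usable(save_dir):
          # Probe the checkpoint directory up front so an inaccessible
          # save_dir fails before training, not 8 hours into it.
          os.makedirs(save_dir, exist_ok=True)
          probe = os.path.join(save_dir, ".write_probe")
          with open(probe, "w") as f:
              f.write("ok")
          os.remove(probe)
      ```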
      
      Reviewed By: myleott
      
      Differential Revision: D16455394
      
      fbshipit-source-id: 35959aa058150b2afb63710c468d01ebc8a12b0c
  14. 02 Jul, 2019 1 commit
    • add --max-tokens-valid option for validation · bccfddbb
      Xutai Ma authored
      Summary: Add the --max-tokens-valid option. Sometimes a separate max batch-token limit for validation can be helpful, for example when the validation set contains a sequence longer than max_tokens (rare in MT, but it can happen in ASR or AST).
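      The option's fallback behaviour, sketched (hypothetical helper, not the actual fairseq code):

      ```python
      def resolve_max_tokens_valid(max_tokens, max_tokens_valid=None):
          # --max-tokens-valid defaults to --max-tokens when not given,
          # so existing configs behave exactly as before.
          return max_tokens if max_tokens_valid is None else max_tokens_valid
      ```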
      
      Reviewed By: myleott
      
      Differential Revision: D16076951
      
      fbshipit-source-id: ae7f4218594580b9450a8196d7afa1e7e2018aee
  15. 30 Jun, 2019 1 commit
  16. 26 Jun, 2019 1 commit
    • Fix dataset loading when there are multiple valid subsets (#835) · 8b514b9f
      Liang Wang authored
      Summary:
      When we have multiple valid subsets, say `valid`, `valid1` and `valid2`, and `combine=True` holds, then loading the `valid` subset will try to locate and load `valid`, `valid1`, `valid2`, ... and combine them into one dataset. Setting `combine` to `False` solves this issue.
      
      In my experiment, I have 3 valid subsets with 3000, 5000 and 8701 examples. With the argument `--valid-subset valid,valid1,valid2`, the log is as follows:
      
      ```
      ......
      | ./mix_data/bin valid src-trg 3000 examples
      | ./mix_data/bin valid1 src-trg 5000 examples
      | ./mix_data/bin valid2 src-trg 7801 examples
      | ./mix_data/bin valid1 src-trg 5000 examples
      | ./mix_data/bin valid2 src-trg 7801 examples
      ......
      ```
      
      As shown above, `valid1` and `valid2` subsets are incorrectly loaded twice.
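      The globbing behaviour behind the bug can be sketched like this (a hypothetical helper mimicking the described `combine` semantics, not fairseq's actual loader):

      ```python
      def splits_to_load(available, split, combine):
          # With combine=True, loading "valid" also pulls in "valid1",
          # "valid2", ... until a name is missing; with combine=False,
          # only the exact split is loaded -- which is the fix here.
          if not combine:
              return [split]
          found, k = [], 0
          while True:
              name = split + (str(k) if k > 0 else "")
              if name not in available:
                  break
              found.append(name)
              k += 1
          return found
      ```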
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/835
      
      Differential Revision: D16006343
      
      Pulled By: myleott
      
      fbshipit-source-id: ece7fee3a00f97a6b3409defbf7f7ffaf0a54fdc
  17. 21 May, 2019 1 commit
  18. 20 May, 2019 2 commits
  19. 17 May, 2019 1 commit
  20. 14 May, 2019 1 commit
    • Move save/load checkpoint functions to utils · cd1e5c09
      Dmytro Okhonko authored
      Summary:
      Move `load_checkpoint`, `save_checkpoint` and `reload_train` from train.py to checkpoint_utils.py.
      Move `get_perplexity` from train.py to utils.py.
      This makes train.py lighter and lets us reuse all of this utility functionality when fairseq is used as an external library.
      
      Reviewed By: myleott
      
      Differential Revision: D15289607
      
      fbshipit-source-id: 4b7c95225ac22e402bcda3497811361809110df1
  21. 08 May, 2019 2 commits
    • Cleanup LM + Flake8 · f2563c21
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/720
      
      Differential Revision: D15259091
      
      Pulled By: myleott
      
      fbshipit-source-id: 06a35996c06ccddb49fdc9e01e348ff3c9da334e
    • bugfix data not in args · 6a7eb6ce
      Jay Mahadeokar authored
      Summary:
      D15214049 introduced a bug: if a task's args does not contain `data`, training fails with
      ```
      File "/data/users/jaym/fbsource/fbcode/buck-out/dev/gen/deeplearning/projects/fairspeq/train#link-tree/train.py", line 119, in reload_train
         if len(args.data.split(":")) == 1:
      AttributeError: 'Namespace' object has no attribute 'data'
      ```
      
      This diff checks whether `data` is in args to avoid the above error.
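      A defensive version of the failing line can be sketched as (hypothetical helper name):

      ```python
      from argparse import Namespace

      def num_data_shards(args):
          # Guard against task args that have no `data` attribute
          # before splitting on ":" (the line that raised above).
          data = getattr(args, "data", None)
          return 0 if data is None else len(data.split(":"))
      ```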
      
      Reviewed By: myleott, jmp84
      
      Differential Revision: D15253373
      
      fbshipit-source-id: 14fb9ad878ee50f1b7583349bb17e29c03c40815
  22. 06 May, 2019 1 commit
    • allowing sharded dataset (#696) · 0add50c2
      Naman Goyal authored
      
      
      Summary:
      Co-authored-by: myleott <myleott@fb.com>
      
      Changing `data` to be a `str` with a colon-separated list for loading sharded datasets. This change is useful for loading large datasets that cannot fit into memory. The large dataset can be sharded, and each shard is then loaded in one epoch in a round-robin manner.
      
      For example, if there are `5` shards of data and `10` epochs then the shards will be iterated upon `[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]`.
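      The round-robin shard selection can be sketched as follows (hypothetical helper; epochs assumed 1-based):

      ```python
      def shard_path_for_epoch(data, epoch):
          # `data` is a colon-separated list of shard paths; each epoch
          # loads exactly one shard, cycling round-robin over the list.
          paths = data.split(":")
          return paths[(epoch - 1) % len(paths)]
      ```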
      
      myleott, we need to look into `translation.py`, as it currently already expects a list and then concatenates the datasets.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/696
      
      Differential Revision: D15214049
      
      fbshipit-source-id: 03e43a7b69c7aefada2ca668abf1eac1969fe013
  23. 05 May, 2019 2 commits
  24. 04 May, 2019 1 commit
  25. 02 May, 2019 2 commits
  26. 30 Apr, 2019 1 commit
  27. 24 Apr, 2019 1 commit
  28. 15 Apr, 2019 1 commit
    • fix checkpoint timer (#634) · de8aeab5
      freewym authored
      Summary:
      If args.keep_interval_updates or args.keep_last_epochs > 0, `checkpoints` refers to a list of checkpoint files to be removed, which can be empty. So the logging code was moved to the right position.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/634
      
      Differential Revision: D14933655
      
      Pulled By: myleott
      
      fbshipit-source-id: 68182ee99d9701e1536833d31e0a7c5d2eb2d679
  29. 09 Apr, 2019 1 commit
    • Fix save_dir creation while training on multiple nodes (#626) · 94e9d77c
      Kartikay Khandelwal authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/626
      
      While training a model on multiple GPUs, the current fairseq train workflow fails while creating the directory from which to load a checkpoint. This seems to happen because multiple nodes attempt to create the same directory, causing a weird interaction with the os.makedirs option `exist_ok=True`. Fix this by making sure only rank 0 creates the directory.
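      A sketch of the fix (assuming the distributed rank is available when the directory is created; helper name is illustrative):

      ```python
      import os

      def create_save_dir(save_dir, distributed_rank):
          # Only rank 0 creates the checkpoint directory, avoiding the racy
          # multi-node interaction with os.makedirs(..., exist_ok=True).
          if distributed_rank == 0:
              os.makedirs(save_dir, exist_ok=True)
      ```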
      
      Reviewed By: myleott
      
      Differential Revision: D14841304
      
      fbshipit-source-id: c9b73ba804de97e2cb19a616189fefce476d8c74
  30. 07 Apr, 2019 1 commit
    • move distributed_init after get_batch_iterator · 34028c63
      Haoran Li authored
      Summary: There are constant wait-timeout issues when using multiple nodes; even setting copylocallytempdir:/ doesn't help (e.g. f105637629). It seems to work after moving distributed_init after get_batch_iterator (e.g. f106520580).
      
      Reviewed By: myleott
      
      Differential Revision: D14817769
      
      fbshipit-source-id: edbb101a28d8082241c7bdd8c5500c9dad27647c
  31. 02 Apr, 2019 2 commits
  32. 12 Mar, 2019 1 commit
    • Handle 3+ dimensional input in sequence_generator + nits · 860010e9
      Dmytro Okhonko authored
      Summary: sequence_generator assumes that the model input is a 2-d tensor of longs, but it can also be something like a 3-d tensor of floats. We should be able to handle this as long as the first dimension is the batch size, followed by source lengths.
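      The relaxed shape assumption, sketched on plain shape tuples (illustrative only, not the actual sequence_generator code):

      ```python
      def batch_and_src_len(shape):
          # Only assume dim 0 = batch and dim 1 = source length; any
          # trailing feature dims (e.g. 3-d float inputs) are ignored.
          if len(shape) < 2:
              raise ValueError("expected at least (batch, src_len)")
          return shape[0], shape[1]
      ```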
      
      Reviewed By: myleott
      
      Differential Revision: D14420044
      
      fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015
  33. 11 Mar, 2019 1 commit
  34. 04 Mar, 2019 1 commit