Commits · a2f5361d7021ca49184545170d6dddc032647f3d · OpenDAS / Fairseq

21 Aug, 2019 2 commits

alexeib authored Aug 21, 2019

Summary:
Adds ability to tag individual examples with the names of their datasets, along with some minor miscellaneous fixes and improvements
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/838

Differential Revision: D16919175

Pulled By: alexeib

fbshipit-source-id: 4bf493299645bae63f3ee6382e15f18a9f73666c

a2f5361d

vggblock support without pooling and pooling_kernel_size missing self (#839) · 7a31fe06

Siddharth Dalmia authored Aug 20, 2019

Summary:
1) VggBlock was not supported if pooling kernel size was None.
2) Since we modify pooling kernel size by using _pair. We should use self.pooling_kernel_size. But I agree it doesn't matter as pytorch is robust to this.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/839

Differential Revision: D16934112

Pulled By: okhonko

fbshipit-source-id: b6b95163b0e7f7203d76d535f01a41912382bdc3

7a31fe06

20 Aug, 2019 2 commits

Give path when checkpoint can't be found (#1040) · 9e5edc10

Arya McCarthy authored Aug 20, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1040

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/836

Reviewed By: myleott, liezl200

Differential Revision: D16889252

fbshipit-source-id: 45a1b6c1217fb099f0350096e38e1c7d83ea0a64

9e5edc10

Fix method has same name as property · 4812f64b

Dmytro Okhonko authored Aug 20, 2019

Summary:
Training is failing sometimes because `self.collater` can be both method and property for AsrDataset
https://github.com/pytorch/fairseq/issues/1036

Reviewed By: jcai1

Differential Revision: D16919945

fbshipit-source-id: b34ba54e4dae315b7c723996610a348a8e3031af

4812f64b

19 Aug, 2019 6 commits

Back out "[fairseq][PR] Fix bug (the returned value has a dimension mismatch)... · c81fed46

Myle Ott authored Aug 19, 2019

Back out "[fairseq][PR] Fix bug (the returned value has a dimension mismatch) in label-smoothed-cross-entropy for MoE" (#837)

Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/837

Original commit changeset: a73bc03d2280

Differential Revision: D16904372

fbshipit-source-id: b4c4047b2686ba47258cdf0783059726134c920a

c81fed46

Small fixes · 6ce55e4b

Myle Ott authored Aug 19, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/835

Differential Revision: D16904038

Pulled By: myleott

fbshipit-source-id: 2c9d0b913f8d688297ac80fcabd905bd1397f66a

6ce55e4b

Add instructions to resume training from released RoBERTa models (fixes #1034) · 2eb53b8e

Myle Ott authored Aug 19, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1041

Differential Revision: D16904073

Pulled By: myleott

fbshipit-source-id: 22e5e25a15f7a0b6f2d827d98c953a6cec07610e

2eb53b8e

add constrains when checking multiple consecutive blank lines (#1031) · 79460d34

Trinkle23897 authored Aug 19, 2019

Summary:
It will cause runtime error on some standard datasets (e.g. wikitext-103).

Details:
After preprocessing to wikitext-103 folder with current master branch, I use fairseq-train and get the following Error:
```bash
Traceback (most recent call last):
  File "/home/trinkle/.local/bin/fairseq-train", line 11, in <module>
    load_entry_point('fairseq', 'console_scripts', 'fairseq-train')()
  File "/data/git/Transformer/fairseq/fairseq_cli/train.py", line 321, in cli_main
    main(args)
  File "/data/git/Transformer/fairseq/fairseq_cli/train.py", line 46, in main
    task.load_dataset(valid_sub_split, combine=False, epoch=0)
  File "/data/git/Transformer/fairseq/fairseq/tasks/language_modeling.py", line 167, in load_dataset
    break_mode=self.args.sample_break_mode, include_targets=True,
  File "/data/git/Transformer/fairseq/fairseq/data/token_block_dataset.py", line 54, in init
    "Found multiple blank lines in the dataset, please remove them"
AssertionError: Found multiple blank lines in the dataset, please remove them (eg. cat -s raw.txt) and preprocess the data again.
```

It's because these datasets have multiple blank lines. The assertion is added in https://github.com/pytorch/fairseq/commit/851c022610b27da3beaa4e40a6834b5fb3b44f44, however, adding this assertion is not a good way.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1031

Differential Revision: D16892942

Pulled By: myleott

fbshipit-source-id: 90c41b7d98a7b78f506bb57320f9f6b901e05d5b

79460d34

remove shlex.quote in scripts/spm_train.py (#972) · 02cb5a43

freewym authored Aug 19, 2019

Summary:
to resolve the issue https://github.com/pytorch/fairseq/issues/971
Pull Request resolved: https://github.com/pytorch/fairseq/pull/972

Differential Revision: D16892827

Pulled By: myleott

fbshipit-source-id: baf277961f1e292f4593eefe31e3541aa9d0d8c4

02cb5a43

Fix bug (the returned value has a dimension mismatch) in... · 0c75c760

Chunting Zhou authored Aug 19, 2019

Fix bug (the returned value has a dimension mismatch) in label-smoothed-cross-entropy for MoE (#1037)

Summary:
MoE will encounter a dimension mismatch bug when using label-smoothed cross entropy as the criterion, which occurs at [https://github.com/pytorch/fairseq/blob/master/fairseq/tasks/translation_moe.py#L125](url). This is a fix to the bug.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1037

Differential Revision: D16892674

Pulled By: myleott

fbshipit-source-id: a73bc03d2280356667d02422d22ad11d968d0c65

0c75c760

17 Aug, 2019 1 commit

implement tri-stage lr_scheduler (#1028) · 732d15a9

Yongqiang Wang authored Aug 16, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1028

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/831

tri-stage lr-scheduler consisted of 3 stages: 1. warmup; 2. hold; 3.
(exponentially) decay; used in https://arxiv.org/pdf/1904.08779.pdf

Reviewed By: myleott

Differential Revision: D16806206

fbshipit-source-id: 40e472ec382449a0fb711f8ee980f14d27d2114a

732d15a9

16 Aug, 2019 2 commits

added check in token block dataset for multiple consecutive blank lines · 851c0226

Naman Goyal authored Aug 16, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/830

Differential Revision: D16861799

fbshipit-source-id: d85deaf78ec5b9c23eafd4145a96252e3901fa22

851c0226

added hf bert bpe · a3cfd51d

Naman Goyal authored Aug 16, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/829

Differential Revision: D16856693

fbshipit-source-id: 545bbf4815f5c40e72a6ed241312a51dc90e34a1

a3cfd51d

15 Aug, 2019 5 commits

BMUF Resetting local state param · ed27ed8b

Nayan Singhal authored Aug 15, 2019

Summary:
BMUF
1) Resetting BMUF parameters after warmup.
2) Resetting local param state after warmup.
3) Allowing user to pass block momentum value instead of gpu derived Block Momentum.

Reviewed By: skritika, mrshenli

Differential Revision: D16692026

fbshipit-source-id: d02eaf29d0e4b37007418166ec937d4bf5fe6aca

ed27ed8b

Update README · a8e32111

Myle Ott authored Aug 15, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/827

Differential Revision: D16833252

Pulled By: myleott

fbshipit-source-id: 8eded8cc651002dfd60869fc2383d305ed335d3a

a8e32111

Backward reranking public (#667) · 49177c99

Nathan Ng authored Aug 15, 2019

Summary:
Implementation of noisy channel model reranking for release with paper
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/667

Reviewed By: michaelauli

Differential Revision: D15901665

Pulled By: nng555

fbshipit-source-id: 2de2c518be8e5828ffad72db3e741b0940623373

49177c99

Update README · ac66df47

Myle Ott authored Aug 15, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/826

Differential Revision: D16830402

Pulled By: myleott

fbshipit-source-id: 25afaa6d9de7b51cc884e3f417c8e6b349f5a7bc

ac66df47

added effcient wsc task/criterion for winogrande (#825) · 1d44cc85

ngoyal2707 authored Aug 15, 2019

Summary:
1) So far getting `78%`  on winogrande validation dataset comapred to `63.5%` in the paper.
2) Will upgrade readme once everything is finalized.

Questions:

1) Should I just call `binary_wsc_task` instead of `winogrande` to be less specific to dataset and be generic?
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/825

Differential Revision: D16810159

fbshipit-source-id: cfde73561fa4caaaa63a4773c0aecd12ce1fa518

1d44cc85

14 Aug, 2019 5 commits

initial light and dynamic convolution kernels (#547) · f840564d

Nathan Ng authored Aug 14, 2019

Summary:
CUDA code for light/dynamicconv kernels, including pytorch modules. Modules can be built by running setup.py in each respective folder, and can then be imported and used like any other module.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/547

Reviewed By: myleott, shubho

Differential Revision: D15703660

Pulled By: nng555

fbshipit-source-id: e9c913753be3a1cd571965f7200df6678b644520

f840564d

Update READMEs · b8704686

Myle Ott authored Aug 14, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/823

Differential Revision: D16804995

Pulled By: myleott

fbshipit-source-id: abac5dc0ed6b7bfe2309ba273456e54b37340b2c

b8704686

v0.7.2 -> v0.8.0 (#1017) · ffffe04e

Myle Ott authored Aug 14, 2019

Summary:
Changelog:
- Relicensed under MIT license
- Add RoBERTa
- Add wav2vec
- Add WMT'19 models
- Add initial ASR code
- Changed torch.hub interface (`generate` renamed to `translate`)
- Add `--tokenizer` and `--bpe`
- f812e529: Renamed data.transforms -> data.encoders
- 654affc0: New Dataset API (optional)
- `47fd9852`: Deprecate old Masked LM components
- `5f78106a`: Set mmap as default dataset format and infer format automatically
- Misc fixes for sampling
- Misc fixes to support PyTorch 1.2
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1017

Differential Revision: D16799880

Pulled By: myleott

fbshipit-source-id: 45ad8bc531724a53063cbc24ca1c93f715cdc5a7

ffffe04e

Fix tests · 7c89e13f

Myle Ott authored Aug 13, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/822

Differential Revision: D16800078

Pulled By: myleott

fbshipit-source-id: b86e08e01f2fe13c64b77f1d23a5f6800f252bf7

7c89e13f

Updates for PyTorch 1.2 masking/bool behavior · baa8ce11

Myle Ott authored Aug 13, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/821

Differential Revision: D16790120

Pulled By: myleott

fbshipit-source-id: 2fb5070172636561d08596a29f08c93df07548bf

baa8ce11

13 Aug, 2019 4 commits

Add fairseq-validate · d015d23a

Myle Ott authored Aug 13, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/765

Differential Revision: D16763357

Pulled By: myleott

fbshipit-source-id: 758b03158e486ee82786e2d5bf4e46073b50c503

d015d23a

Add Commonsense QA task · a33ac060

Myle Ott authored Aug 13, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1014

Differential Revision: D16784120

Pulled By: myleott

fbshipit-source-id: 946c0e33b594f8378e4ab6482ce49efcb36e1743

a33ac060

added readme code for inference with GLUE finetuned model · a171c2dd

Naman Goyal authored Aug 13, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/820

Differential Revision: D16783469

fbshipit-source-id: d5af8ba6a6685608d67b72d584952b8e43eabf9f

a171c2dd

fix cosine scheduler docstring · 577e4fa7

Siddharth Shah authored Aug 12, 2019

Summary: as title

Reviewed By: myleott

Differential Revision: D16773845

fbshipit-source-id: 2d10e197c31f94d894430559327289a4d03e33f7

577e4fa7

12 Aug, 2019 5 commits

ignore files starting with . e.g. .ipynb_checkpoints (#819) · 0563d879

Ilia Kulikov authored Aug 12, 2019

Summary:
.ipynb_checkpoints folder in models folders crashed the importlib
now there is a check for this
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/819

Differential Revision: D16772192

Pulled By: myleott

fbshipit-source-id: 01c956aef4ed312bc7645c31c83dbf98af89d931

0563d879

Minor fixes for RACE finetuning (#818) · d0036640

Myle Ott authored Aug 12, 2019

Summary:
- remove unnecessary extra spaces in RACE data in preprocessing
- fix finetuning instructions (add `--truncate-sequence` and add `--dropout` params)
- close file handle in SentenceRankingTask
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/818

Differential Revision: D16770055

Pulled By: myleott

fbshipit-source-id: 2c80084e92cdf8692f2ea7e43f7c344c402b9e61

d0036640

Lint · 2b68e91f

Myle Ott authored Aug 12, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/817

Differential Revision: D16762905

Pulled By: myleott

fbshipit-source-id: d920595bec44ed26b72dfc6fbc15c0aa107b4e56

2b68e91f

Remove LAMB optimizer (at least until we can test it more) · 969f4474

Myle Ott authored Aug 12, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1008

Differential Revision: D16763315

Pulled By: myleott

fbshipit-source-id: d4bad8384eec273f2d5de4ed29fb8d158ab9187c

969f4474

Update --restore-file logic (partially fixes #999) · 3bbdc554

Myle Ott authored Aug 12, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1007

Differential Revision: D16762490

Pulled By: myleott

fbshipit-source-id: d67137bcf581887850323d188bb4ea643a35ac9e

3bbdc554

10 Aug, 2019 3 commits

Fix torch.hub for MNLI · c0a5d29e

Myle Ott authored Aug 10, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1006

Differential Revision: D16753078

Pulled By: myleott

fbshipit-source-id: 970055632edffcce4e75931ed93b42a249120a4a

c0a5d29e

Add WSC task and criterion · 83249196

Myle Ott authored Aug 10, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1004

Differential Revision: D16751443

Pulled By: myleott

fbshipit-source-id: f70acd6c7be6d69da45b5b32fe4c4eff021539ab

83249196

Fix Python 3.5 compat · a00ce132

Myle Ott authored Aug 09, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1005

Differential Revision: D16751489

Pulled By: myleott

fbshipit-source-id: 6e372ac23643e32a3791044c13f4466bdc28f049

a00ce132

09 Aug, 2019 3 commits

added sentence ranking task and loss (#809) · b6c55b62

Jingfei Du authored Aug 09, 2019

Summary:
This task and loss are used for sentence ranking and multiple choice tasks such as RACE
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/809

Reviewed By: myleott

Differential Revision: D16715745

Pulled By: jingfeidu

fbshipit-source-id: cb4d1c7b26ebb3e2382449ba51af5745ef56f30f

b6c55b62

MacOS requires c++ flag (#1000) · 838e108a

Vincent Quenneville-Belair authored Aug 09, 2019

Summary:
To install on MacOS, `-stdlib=libc++` needs to be specified.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1000

Differential Revision: D16733819

Pulled By: myleott

fbshipit-source-id: 7a1ed11e2b4e1071e61c64c379c84f72e02ad2b5

838e108a

added superglue dev set results to readme · 3563e59a

Naman Goyal authored Aug 09, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/815

Differential Revision: D16733633

fbshipit-source-id: 0a5029e41b6dbb9fb28e9703ad057d939d489d90

3563e59a

08 Aug, 2019 2 commits

replace 'mkdir' with 'mkdir -p' (#997) · 6398aa9e

Hafiz Shafruddin authored Aug 08, 2019

Summary:
Allow shell script to create sub directories with -p flag. Amends readme file too.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/997

Differential Revision: D16710813

Pulled By: myleott

fbshipit-source-id: 89abefa27e8fac99d212fc9b7b0dbc3690043ba0

6398aa9e

Integrate with Apache Arrow/Plasma in-memory store for large datasets (#995) · 439ead5a

Myle Ott authored Aug 08, 2019

Summary:
Datasets with many examples can generate very large indexes in TokenBlockDataset (and possibly elsewhere). When using `--num-workers>0` these indexes are pickled and transferred via a multiprocessing pipe, which is slow and can fail if the index grows beyond 4GB (~0.5B examples). Apache Arrow has an in-memory store called Plasma that will offload these arrays to shared memory, which both reduces duplication of the data and avoids needing to pickle.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/995

Differential Revision: D16697219

Pulled By: myleott

fbshipit-source-id: 1b679ee5b3d2726af54ff418f6159a3671173fb8

439ead5a