"vscode:/vscode.git/clone" did not exist on "81d8f4a9e157fd247addb815433cc8d9b5e59e35"
- 30 Jun, 2019 3 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/698 Differential Revision: D16068477 Pulled By: myleott fbshipit-source-id: a68f6f519dc5481f857d8e10cc443249eccb2545
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/699 Differential Revision: D16068551 Pulled By: myleott fbshipit-source-id: dddd8768b531032af7c4598af9dae3c6c00ff9ac
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/697 Differential Revision: D16068465 Pulled By: myleott fbshipit-source-id: c2563c3c682e7e8406e6d7c8e895d8afbec551eb
-
- 27 Jun, 2019 2 commits
-
-
Bao-Yu authored
Summary: Repeated use of the loop variable 'i' in evaluate can shadow an earlier value and cause subtle bugs. Pull Request resolved: https://github.com/pytorch/fairseq/pull/831 Differential Revision: D15980227 Pulled By: myleott fbshipit-source-id: 7b6b54a6b54938ad63ed1720d930505b56e5c84b
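The class of bug fixed here can be sketched as follows (hypothetical `evaluate_buggy`/`evaluate_fixed` functions, not the PR's actual code): reusing `i` for an inner loop leaves the outer index clobbered when control returns to the outer body.

```python
def evaluate_buggy(batches):
    # BUG (illustrative): the inner loop reuses 'i', so after it finishes
    # 'i' holds the inner index instead of the batch index.
    processed = []
    for i in range(len(batches)):
        for i in range(len(batches[i])):
            pass  # per-example work would go here
        processed.append(i)  # records the wrong index
    return processed


def evaluate_fixed(batches):
    # FIX: distinct loop variables keep the outer index intact.
    processed = []
    for i in range(len(batches)):
        for j in range(len(batches[i])):
            pass
        processed.append(i)
    return processed
```

With `batches = [[0, 0, 0], [0]]`, the buggy version records inner indices while the fixed version records batch indices.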
-
Nayan Singhal authored
Summary: Added BMUF implementation. Todo: 1) Add unit test case for testing model averaging and BMUF 2) Add warm-up before actually starting to train the model Reviewed By: jay-mahadeokar Differential Revision: D15871477 fbshipit-source-id: 866b0aba2d5bea5b65b4438acb49c886c4a87924
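The model-averaging step that BMUF builds on can be sketched in plain Python (a toy `average_parameters` helper, not fairseq's implementation; BMUF additionally applies a block momentum term to the averaged update, which this sketch omits):

```python
def average_parameters(replicas):
    """Elementwise average of per-replica parameter lists.

    This is plain model averaging: every `sync_iter` updates, each replica
    replaces its parameters with the average across replicas. BMUF extends
    this synchronization step with block momentum.
    """
    n = len(replicas)
    return [sum(vals) / n for vals in zip(*replicas)]
```

Each replica would train independently between synchronization points, which is why the choice of `--global-sync-iter` trades communication cost against staleness.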
-
- 26 Jun, 2019 3 commits
-
-
Alexander Rives authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/684 Differential Revision: D16006333 Pulled By: myleott fbshipit-source-id: 95bd4215734281194008fa029e81407d63b335ac
-
Liang Wang authored
Summary: When we have multiple valid subsets, say `valid`, `valid1` and `valid2`, and `combine=True`, loading the `valid` subset will try to locate and load `valid`, `valid1`, `valid2`, ... and then combine them into one dataset. Setting `combine` to `False` solves this issue. In my experiment, I have 3 valid subsets with 3000, 5000 and 8701 examples; with the argument `--valid-subset valid,valid1,valid2`, the log is as follows: ``` ...... | ./mix_data/bin valid src-trg 3000 examples | ./mix_data/bin valid1 src-trg 5000 examples | ./mix_data/bin valid2 src-trg 7801 examples | ./mix_data/bin valid1 src-trg 5000 examples | ./mix_data/bin valid2 src-trg 7801 examples ...... ``` As shown above, the `valid1` and `valid2` subsets are incorrectly loaded twice. Pull Request resolved: https://github.com/pytorch/fairseq/pull/835 Differential Revision: D16006343 Pulled By: myleott fbshipit-source-id: ece7fee3a00f97a6b3409defbf7f7ffaf0a54fdc
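A toy reconstruction of the failure mode (hypothetical `load_split` helper; the real loading logic lives in fairseq's task/dataset code): with `combine=True`, loading a split also pulls in its numbered shards, so explicitly listing `valid1`/`valid2` loads them a second time.

```python
def load_split(split, combine, available=("valid", "valid1", "valid2")):
    # Illustrative model of combine-style loading: the base split plus
    # any numbered shards split1, split2, ... that exist on disk.
    available = set(available)
    loaded = [split] if split in available else []
    if combine:
        k = 1
        while f"{split}{k}" in available:
            loaded.append(f"{split}{k}")
            k += 1
    return loaded


# combine=True over explicitly named subsets duplicates valid1/valid2:
duplicated = []
for s in ["valid", "valid1", "valid2"]:
    duplicated += load_split(s, combine=True)

# combine=False loads exactly the subsets that were asked for:
correct = []
for s in ["valid", "valid1", "valid2"]:
    correct += load_split(s, combine=False)
```

`duplicated` ends up as `valid, valid1, valid2, valid1, valid2`, matching the log above, while `correct` contains each subset once.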
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/687 Differential Revision: D16005399 Pulled By: myleott fbshipit-source-id: bf099c17e2095394acc452e9abcb4ee04afd0426
-
- 25 Jun, 2019 1 commit
-
-
freewym authored
Summary: … enabled. When doing multi-GPU training with --use-bmuf turned on and --global-sync-iter > 1, each replica may not sync with the other replicas at every iteration, so its logging_outputs only contains its own stats. On the other hand, logging_outputs may be empty at the end of an epoch after "a dummy iteration", because the number of replicas does not divide the number of batches of the training data. If this happens, sample_size and ntokens would be 0 for some replica and cause a division-by-zero error. This fix sets *loss to 0 if sample_size/ntokens is 0. Pull Request resolved: https://github.com/pytorch/fairseq/pull/812 Reviewed By: myleott, yqwangustc Differential Revision: D15908614 Pulled By: nayansinghal fbshipit-source-id: c92e8e095f012bdb4ef753a3c627fd215afa215d
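The idea of the guard can be sketched as follows (a simplified `aggregate` function in the spirit of the fix, not fairseq's actual criterion code): when a replica only ran dummy iterations, the denominators are zero, so the losses are reported as 0 instead of dividing.

```python
def aggregate(logging_outputs):
    # Sum stats across whatever logging outputs this replica has; the list
    # may be empty after a dummy iteration at the end of an epoch.
    sample_size = sum(log.get("sample_size", 0) for log in logging_outputs)
    ntokens = sum(log.get("ntokens", 0) for log in logging_outputs)
    loss_sum = sum(log.get("loss", 0.0) for log in logging_outputs)

    # Guard against division by zero: report 0 loss when there is no data.
    loss = loss_sum / sample_size if sample_size > 0 else 0.0
    nll_loss = loss_sum / ntokens if ntokens > 0 else 0.0
    return {"loss": loss, "nll_loss": nll_loss}
```

An empty `logging_outputs` now yields zeros rather than raising `ZeroDivisionError`.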
-
- 24 Jun, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/679 Test Plan: https://our.intern.facebook.com/intern/chronos/jobinstance/?jobinstanceid=5191319216&smc=chronos_gp_admin_client&log_type=stdout&offset=0&pretty_logs=false Differential Revision: D15961008 Pulled By: myleott fbshipit-source-id: cf214de96665b33887ef64cfcb45a51f81002ed1
-
- 23 Jun, 2019 3 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/678 Differential Revision: D15956712 Pulled By: myleott fbshipit-source-id: 5048d06ddfbec0045558a22c777a966cca1ec396
-
Alex Mathai authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/830 Differential Revision: D15960624 Pulled By: myleott fbshipit-source-id: ecfef5c51b886e3162bb8e07d232c6e9ea1169b0
-
Qian Wang authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/828 Differential Revision: D15960629 Pulled By: myleott fbshipit-source-id: ca631651e9a90ce8ed90ca23987519001fea3656
-
- 21 Jun, 2019 2 commits
-
-
James Cross authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/673 This function breaks when the argument `max_positions` is left at its default value `None`, which is presumably not the intended behavior. Reviewed By: theweiho, myleott Differential Revision: D15937221 fbshipit-source-id: 1f5dc1c27ad9b6a89501d2dc015de12181059349
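The intended behavior can be sketched like this (a simplified stand-in for fairseq's position-limit resolution; the real helper also handles per-component tuples, which is an assumption worth checking against the source): `None` means "no limit" and must be filtered out before taking a minimum.

```python
def resolve_max_positions(*limits):
    # Combine several position limits (model, task, CLI), where None
    # means "no limit". Passing None straight into min() would break.
    concrete = [l for l in limits if l is not None]
    return min(concrete) if concrete else None
```

So `resolve_max_positions(None, 1024, 512)` picks the tightest limit, and all-`None` inputs mean unbounded rather than an error.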
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/671 Differential Revision: D15925248 fbshipit-source-id: 9eeea8a257929347e2458afdfc1def8dbb925a72
-
- 20 Jun, 2019 6 commits
-
-
Matt Le authored
Summary: Use BERT init for xlm_base. This seems to be much closer to what is done in the [XLM](https://github.com/facebookresearch/XLM/blob/master/src/model/transformer.py#L44) repo. At update 10 with BERT init (f121471600), loss starts at 14.234; at update 10 without BERT init (f121471612), loss starts at 154.423. Reviewed By: liezl200, pipibjc Differential Revision: D15874836 fbshipit-source-id: f81bf83a078992d7476ba7fdf263b731a9f5b66d
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/818 Differential Revision: D15916265 Pulled By: myleott fbshipit-source-id: c66c0bd988d3472c4150226952f34ee8d4c3db86
-
davidecaroselli authored
Summary: I have upgraded my previous implementation of MMapIndexedDataset; now: - It uses up to **4 times less memory and disk space** - Words per second is slightly improved thanks to fewer memory accesses Pull Request resolved: https://github.com/pytorch/fairseq/pull/816 Differential Revision: D15899848 Pulled By: myleott fbshipit-source-id: 9ddeb4809729ef69cc6b0867b33ee71184d845e6
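The memory saving plausibly comes from storing token ids with the narrowest dtype the vocabulary allows (e.g. uint16 instead of int64 is a 4x-or-better reduction) and reading them through a memory map. A stdlib-only sketch of that idea (hypothetical `write_tokens`/`read_tokens`, not the PR's actual on-disk format):

```python
import mmap
import os
import struct
import tempfile


def write_tokens(path, token_ids):
    # Flat binary layout: one little-endian uint16 per token. Works for
    # vocabularies up to 65536 entries; larger vocabs would need uint32.
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(token_ids)}H", *token_ids))


def read_tokens(path, n):
    # Memory-map the file so the OS pages data in on demand instead of
    # loading the whole dataset into Python objects up front.
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return list(struct.unpack(f"<{n}H", mm[: 2 * n]))
```

A round trip through a temp file recovers the original token ids while the file stays 2 bytes per token on disk.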
-
Peng-Jen Chen authored
Summary: In https://github.com/pytorch/fairseq/issues/656, people are often confused about how to set multilingual translation parameters at inference time. This diff add more checks to ensure the arguments (`--lang-pairs`, `--encoder-langtok`, `--decoder-langtok`) load from checkpoint are consistent with arguments specified in generate/interactive command line. We also add a section in example page to explain how to set the arguments Reviewed By: myleott Differential Revision: D15682169 fbshipit-source-id: 64e6db94cd72ea7ce2d0aa1067c9c2dcd3b8a2ac
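A consistency check in the spirit of this diff can be sketched like so (hypothetical `check_generation_args` helper and dict-based arguments, not fairseq's actual code): arguments that affect tokenization and model structure must match what the checkpoint was trained with.

```python
def check_generation_args(ckpt_args, cli_args):
    # Arguments that change the vocabulary or model structure; a mismatch
    # between training and generation silently degrades output quality.
    for key in ("lang_pairs", "encoder_langtok", "decoder_langtok"):
        if key in cli_args and cli_args[key] != ckpt_args.get(key):
            flag = "--" + key.replace("_", "-")
            raise ValueError(
                f"{flag}={cli_args[key]!r} conflicts with the value "
                f"{ckpt_args.get(key)!r} stored in the checkpoint"
            )
```

Failing fast with a named flag tells the user exactly which command-line argument to fix, instead of producing garbled translations.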
-
alexeib authored
Summary: Merging wav2vec to master. Includes renames (Cpc -> wav2vec) and some light example files. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/654 Differential Revision: D15913409 Pulled By: alexeib fbshipit-source-id: f723e6f211706cd9431c7d76dc12c4e80c9cfc80
-
Myle Ott authored
Summary: Notable (possibly breaking) changes:
- d45db804: Move checkpoint utility functions from utils.py into checkpoint_utils.py
- f2563c21: Move LM definitions into separate files
- dffb1674: Updates to model API:
  - `FairseqModel` -> `FairseqEncoderDecoderModel`
  - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
  - `encoder_out_dict` -> `encoder_out`
  - rm unused `remove_head` functions
- 34726d56: Move `distributed_init` into `DistributedFairseqModel`
- cf17068a: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
- d45db804: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed`
- 96ac28d3: Rename `--sampling-temperature` -> `--temperature`
- fc1a19a3: Deprecate dummy batches
- a1c997bd: Add memory mapped datasets
- 0add50c2: Allow cycling over multiple datasets, where each one becomes an "epoch"
Plus many additional features and bugfixes. Pull Request resolved: https://github.com/pytorch/fairseq/pull/817 Differential Revision: D15913844 Pulled By: myleott fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b
-
- 19 Jun, 2019 4 commits
-
-
Michael Wu authored
Summary: add flags to freeze embedding parameters and transformer layer parameters in `TransformerSentenceEncoder`. Reviewed By: myleott Differential Revision: D15866135 fbshipit-source-id: e634d7adfd5e81eacccf2b9cf6bc15bad30bd1fe
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/811 Differential Revision: D15880880 Pulled By: myleott fbshipit-source-id: c47e09a90c945aca82b26edb4a8af93e063d5b00
-
freewym authored
Summary: …rch.distributed.ReduceOp Pull Request resolved: https://github.com/pytorch/fairseq/pull/804 Differential Revision: D15877033 Pulled By: myleott fbshipit-source-id: 58e7c39a88b67345a55b761fee4d9f211a5ee82c
-
Arya McCarthy authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/813 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/663 Pull Request resolved: https://github.com/fairinternal/fairspeq/pull/4 Introduce new training for speech models which accept additional training data. Reviewed By: liezl200 Differential Revision: D15846661 fbshipit-source-id: 8b2cbfd56a86cf03c0b34c4a025bebdd5db7204e
-
- 15 Jun, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/655 Differential Revision: D15816573 fbshipit-source-id: ac0118a1d407dc132cc7d82e029eac6c8ec76d2a
-
- 13 Jun, 2019 1 commit
-
-
Myle Ott authored
Summary: It's so much faster to extract (3 minutes instead of 20). Pull Request resolved: https://github.com/pytorch/fairseq/pull/803 Differential Revision: D15795810 Pulled By: myleott fbshipit-source-id: 3b2ae8bd7924a77ac8e795f5e1a7da0c4ae27374
-
- 12 Jun, 2019 3 commits
-
-
Nayan Singhal authored
Summary: Implemented model averaging for fairseq. Removed the DDP wrapper if a global optimizer is provided. Syncing all the models based on the iteration provided in the input. TODO: 1) Fix throughput and wps meters. Need to check other meters too. 2) Replace the model averaging code with the BMUF algorithm implementation. Reviewed By: myleott Differential Revision: D15711044 fbshipit-source-id: 58a4af74db2a61d06762597b95836cbeb1ed82cc
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/801 Differential Revision: D15781975 Pulled By: myleott fbshipit-source-id: b86276cd3a40138c09494637c43ce52a56c4aced
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/799 Differential Revision: D15773932 Pulled By: myleott fbshipit-source-id: 650c0621bedb3b7ecebc0654d8e10d7692c50994
-
- 11 Jun, 2019 7 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/793 Differential Revision: D15758755 Pulled By: myleott fbshipit-source-id: b93e4ac11bde36a0b59b4d6d1c84d31c3124d767
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/797 Differential Revision: D15761071 Pulled By: myleott fbshipit-source-id: 257d4a2297e83da7e59baed154dbafd6bfe614bf
-
Myle Ott authored
Summary: This is a temporary workaround to support sampling after https://github.com/pytorch/fairseq/issues/713. We'll need to revisit this to support sampling and beam more generally. Pull Request resolved: https://github.com/pytorch/fairseq/pull/796 Differential Revision: D15760808 Pulled By: myleott fbshipit-source-id: ecaf4f161b0c30de037f32007e4610a559a49230
-
Bairen Yi authored
Summary: See #467. Ping myleott to review. This is a work-related contribution. Ping lark to review. Pull Request resolved: https://github.com/pytorch/fairseq/pull/794 Differential Revision: D15756816 Pulled By: myleott fbshipit-source-id: 6dce3ff3a713bf5f60e5782bc260b2ca9d2c0a9b
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/792 Differential Revision: D15741781 Pulled By: myleott fbshipit-source-id: c256c7900c307d485904e69b1526b9acbe08fec9
-
yilinyang7 authored
When given prefix_tokens, the sequence generator would generate (exactly the) same finished candidates (#713) Summary: https://github.com/pytorch/fairseq/issues/712 Pull Request resolved: https://github.com/pytorch/fairseq/pull/713 Differential Revision: D15242432 Pulled By: myleott fbshipit-source-id: a230ee48f4bf891c805609c428d7233a0ad21179
-
Sergey Edunov authored
Summary: Multi-Head attention is currently not TPU-friendly, specifically .data_ptr() is not supported and should not be used. Also there are potential issues with correctness of existing code (e.g. data_ptr() can point to the same storage for different tensors). Rather than rely on data_ptr() we should explicitly set self_attention or encoder_decoder_attention flags. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/636 Reviewed By: myleott Differential Revision: D15709898 Pulled By: edunov fbshipit-source-id: f931713193c51be848a5de20da730ac3a3ce0187
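The design change can be sketched as follows (a stripped-down, tensor-free stand-in for the real `MultiheadAttention`): the attention variant is declared once at construction time, so no storage-pointer comparison is needed at forward time.

```python
class MultiheadAttention:
    """Sketch: attention type is an explicit constructor flag instead of
    being inferred by comparing tensor storage pointers (data_ptr), which
    is unsupported on TPUs and can be wrong when tensors share storage."""

    def __init__(self, self_attention=False, encoder_decoder_attention=False):
        assert not (self_attention and encoder_decoder_attention)
        self.self_attention = self_attention
        self.encoder_decoder_attention = encoder_decoder_attention

    def qkv_same(self, query, key, value):
        if self.self_attention:
            return True   # q, k, v are all projections of the same input
        if self.encoder_decoder_attention:
            return False  # k, v come from the encoder output
        # Fallback for generic use: object identity, not pointer identity.
        return query is key and key is value
```

The explicit flags make the layer's behavior a property of the model definition rather than of runtime memory layout.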
-
- 10 Jun, 2019 2 commits
-
-
Myle Ott authored
Summary: - make it possible to load file_utils.py without the dependencies - add some more demo features Pull Request resolved: https://github.com/pytorch/fairseq/pull/791 Differential Revision: D15739950 Pulled By: myleott fbshipit-source-id: 38df5209973a6fe2e3651575b97134e096aaf5bf
-
freewym authored
Summary: In the current progress bar, the counter for log_interval will always start from 0, which is not correct if reloading from a checkpoint in the middle of an epoch. This fix obtains the offset from the iterator to set the counter correctly. Pull Request resolved: https://github.com/pytorch/fairseq/pull/778 Differential Revision: D15739953 Pulled By: myleott fbshipit-source-id: a1d13403ec5783b22e01d7cb63874fd8dea7f8b0
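The idea of the fix, sketched with plain `enumerate` (a hypothetical `epoch_steps` helper, not the actual progress-bar code): start the step counter at the iterator's offset so the logged position is correct after a mid-epoch resume.

```python
def epoch_steps(iterable, offset=0):
    # Number steps from `offset` (the position stored in the checkpointed
    # iterator) instead of from 0, so logs after resume show the true
    # position within the epoch.
    return enumerate(iterable, start=offset)
```

Resuming at step 500 of an epoch with three batches left then yields steps 500, 501, 502 rather than 0, 1, 2.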
-
- 07 Jun, 2019 1 commit
-
-
Ning Dong authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/770 Without this change comment here https://fburl.com/w1cejgw9 is inconsistent with the implementation. Reviewed By: xianxl Differential Revision: D15582826 fbshipit-source-id: 16d8368560153b251beed8b290f51fcdd8a8faee
-