- 03 Dec, 2019 1 commit
-
-
Myle Ott authored
Summary:

Possibly breaking changes:
- Set global numpy seed (4a7cd582)
- Split `in_proj_weight` into separate k, v, q projections in MultiheadAttention (fdf4c3e9)
- TransformerEncoder returns namedtuples instead of dict (27568a7e)

New features:
- Add `--fast-stat-sync` option (e1ba32aa)
- Add `--empty-cache-freq` option (315c463d)
- Support criterions with parameters (ba5f829f)

New papers:
- Simple and Effective Noisy Channel Modeling for Neural Machine Translation (49177c99)
- Levenshtein Transformer (86857a58, ...)
- Cross+Self-Attention for Transformer Models (4ac2c5f2)
- Jointly Learning to Align and Translate with Transformer Models (1c667929)
- Reducing Transformer Depth on Demand with Structured Dropout (dabbef46)
- Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (e23e5eaa)
- BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a92bcdad)
- CamemBERT: a French BERT (b31849aa)

Speed improvements:
- Add CUDA kernels for LightConv and DynamicConv (f840564d)
- Cythonization of various dataloading components (4fc39538, ...)
- Don't project mask tokens for MLM training (718677eb)

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1452
Differential Revision: D18798409
Pulled By: myleott
fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727
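One of the breaking changes above, TransformerEncoder returning namedtuples instead of dicts, can be illustrated without any framework. A minimal sketch, assuming hypothetical field names and a toy `encode` function (not fairseq's actual definitions); the point is that a namedtuple fixes the set of fields and gives attribute access instead of string-keyed lookup:

```python
from collections import namedtuple

# Hypothetical encoder output type; the real field set in fairseq may differ.
EncoderOut = namedtuple("EncoderOut", ["encoder_out", "encoder_padding_mask"])

def encode(tokens):
    # Stand-in for a real forward pass: transform tokens and mark padding (0).
    states = [t * 2 for t in tokens]
    padding_mask = [t == 0 for t in tokens]
    return EncoderOut(encoder_out=states, encoder_padding_mask=padding_mask)

out = encode([3, 5, 0])
print(out.encoder_out)           # attribute access by name, not string key
print(out.encoder_padding_mask)
```

Callers that previously wrote `out["encoder_out"]` would switch to `out.encoder_out`, and typos in field names fail loudly at attribute lookup.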
-
- 26 Nov, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/926 Differential Revision: D18685772 Pulled By: myleott fbshipit-source-id: 0f99d79ed6ee72e9d3ced786d75ab9504d0dfcf0
-
- 13 Nov, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/907 Differential Revision: D18480215 Pulled By: myleott fbshipit-source-id: b02002f631f6d47380f309d4f464bd135d623280
-
- 02 Nov, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1340 Differential Revision: D18289455 Pulled By: myleott fbshipit-source-id: a1c8163a35273b6c646d300142701e8a317d7378
-
- 27 Sep, 2019 1 commit
-
-
Changhan Wang authored
Summary: Code for our NeurIPS paper [Levenshtein Transformer](https://arxiv.org/abs/1905.11006)
* Added Levenshtein Transformer model, task and criterion classes
* Added iterative NAT Transformer, insertion Transformer and CMLM Transformer model classes for baselines
* Added an option for prepending BOS to the dictionary class and translation task class

Reviewed By: myleott
Differential Revision: D17297372
fbshipit-source-id: 54eca60831ae95dc721c2c34e882e1810ee575c7
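The Levenshtein Transformer generates text through learned insertion and deletion operations rather than left-to-right decoding. For intuition only, here is the textbook Levenshtein edit-distance DP that the model is named after; this is not the model's training or decoding code, just a sketch of the insert/delete/substitute operations it builds on:

```python
def levenshtein(a, b):
    """Classic dynamic program for edit distance between sequences a and b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,        # delete a[i-1]
                dp[i][j - 1] + 1,        # insert b[j-1]
                dp[i - 1][j - 1] + cost, # keep or substitute
            )
    return dp[m][n]

print(levenshtein("kitten", "sitting"))  # 3
```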
-
- 03 Sep, 2019 1 commit
-
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/856 Reviewed By: myleott Differential Revision: D17162411 Pulled By: myleott fbshipit-source-id: e70ecc802398bbba2b5326e9700f2121c422fd18
-
- 31 Aug, 2019 2 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/853 Differential Revision: D17147879 Pulled By: myleott fbshipit-source-id: b1f5e838533de62ade52fa82112ea5308734c70f
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/852 Differential Revision: D17147452 Pulled By: myleott fbshipit-source-id: 5fd9c7da3cc019c7beec98d41db1aef1329ee57a
-
- 27 Aug, 2019 2 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1078 Differential Revision: D17072514 Pulled By: myleott fbshipit-source-id: 69a8c8c9cc7caa7e04c414329a5d79e6e1a6621c
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/848 Differential Revision: D17060283 fbshipit-source-id: c7e61cae76a0566cc3e2ddc3ab4d48f8dec9d777
-
- 26 Aug, 2019 1 commit
-
-
Naman Goyal authored
Summary: Fixes the broken build for `pytext` caused by https://github.com/pytorch/fairseq/commit/4fc39538aec5141aa41f5d6d7dc0097e7c0f7b48. Earlier versions of setuptools required `cython` to be installed before setup.py could even start; this commit fixes that. More details: https://github.com/pypa/setuptools/blob/master/CHANGES.rst#180 and https://stackoverflow.com/questions/37471313/setup-requires-with-cython

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/847
Differential Revision: D16997450
fbshipit-source-id: 5f65026c228a1b94280ca73937078ee3e21ce4f8
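Per the setuptools 18.0 changelog linked above, setuptools can fetch Cython at build time and compile `.pyx` sources directly, so Cython no longer has to be preinstalled before `setup.py` runs. A minimal config sketch under that assumption; the package and module names are made up, and fairseq's real setup.py differs:

```python
# setup.py (illustrative fragment, not fairseq's actual setup.py)
from setuptools import setup, Extension

setup(
    name="example",
    # setuptools >= 18.0 resolves this at build time, so importing
    # Cython at the top of setup.py is no longer necessary:
    setup_requires=["cython"],
    ext_modules=[Extension("example.fast", ["example/fast.pyx"])],
)
```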
-
- 23 Aug, 2019 1 commit
-
-
Naman Goyal authored
Summary: Cythonized the token block dataset code; it is >100x faster. Building token blocks for the entire `bookwiki+CC+stories+openweb` corpus takes just ~39.9 seconds.

TODO:
1) I think I can make it another 2x faster.
2) Cleanup.

EDIT History: ~~First pass at parallelizing `token_block_dataset`. The code feels somewhat complicated and cluttered. This is 2-3x faster though on my tests on the `bookwiki` dataset with both `complete` and `complete_doc` modes. myleott Can you take a look for correctness as I am still not 100% sure that I am not missing corner cases.~~

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/834
Test Plan: Imported from GitHub, without a `Test Plan:` line. Test workflow: f133816198
Reviewed By: myleott
Differential Revision: D16970257
Pulled By: myleott
fbshipit-source-id: ec45a308193c9e9f3e7075336c15df4723228d6f
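The kind of bookkeeping a token-block dataset precomputes, and which this commit moved to Cython for speed, is essentially slicing one long token stream into fixed-size blocks. A simplified pure-Python sketch; the function name is made up, and the real implementation additionally respects document boundaries in its `complete` and `complete_doc` modes:

```python
def block_slices(total_tokens, block_size):
    """Split a stream of total_tokens into half-open (start, end) slices
    of at most block_size tokens each (the trivial "none" break mode)."""
    slices = []
    start = 0
    while start < total_tokens:
        end = min(start + block_size, total_tokens)
        slices.append((start, end))
        start = end
    return slices

print(block_slices(10, 4))  # [(0, 4), (4, 8), (8, 10)]
```

Done naively in Python per token over billions of tokens, this dominates dataset startup time, which is why compiling it pays off so dramatically.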
-
- 14 Aug, 2019 1 commit
-
-
Myle Ott authored
Summary: Changelog:
- Relicensed under MIT license
- Add RoBERTa
- Add wav2vec
- Add WMT'19 models
- Add initial ASR code
- Changed torch.hub interface (`generate` renamed to `translate`)
- Add `--tokenizer` and `--bpe`
- f812e529: Renamed data.transforms -> data.encoders
- 654affc0: New Dataset API (optional)
- 47fd9852: Deprecate old Masked LM components
- 5f78106a: Set mmap as default dataset format and infer format automatically
- Misc fixes for sampling
- Misc fixes to support PyTorch 1.2

Pull Request resolved: https://github.com/pytorch/fairseq/pull/1017
Differential Revision: D16799880
Pulled By: myleott
fbshipit-source-id: 45ad8bc531724a53063cbc24ca1c93f715cdc5a7
-
- 13 Aug, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/765 Differential Revision: D16763357 Pulled By: myleott fbshipit-source-id: 758b03158e486ee82786e2d5bf4e46073b50c503
-
- 02 Aug, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/795 Differential Revision: D16620488 Pulled By: myleott fbshipit-source-id: 1998a9ccd8816fc7f590861fb4898f910a36bc1e
-
- 30 Jul, 2019 1 commit
-
-
Myle Ott authored
Summary: The previous BSD+PATENTS license was controversial. We have been approved to relicense fairseq under the MIT license. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786 Differential Revision: D16560654 Pulled By: myleott fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034
-
- 19 Jul, 2019 1 commit
-
-
Myle Ott authored
Summary: No major API changes since the last release. Cutting a new release since we'll soon be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation.

Pull Request resolved: https://github.com/pytorch/fairseq/pull/891
Differential Revision: D16377132
Pulled By: myleott
fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7
-
- 06 Jul, 2019 1 commit
-
-
Louis MARTIN authored
Summary: Fairseq wouldn't install on macOS. A workaround was found here: https://github.com/pytorch/fairseq/issues/289. This is now automatic in setup.py; maybe there's a cleaner way to do it. I checked that it compiles fine on Linux and macOS.

Pull Request resolved: https://github.com/pytorch/fairseq/pull/862
Differential Revision: D16142105
Pulled By: myleott
fbshipit-source-id: 998ac7781d7a1ac047f4f9239c1fe16eab4be0dd
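Platform-conditional compile flags in setup.py are the usual shape of this kind of fix: on macOS, clang typically needs `-stdlib=libc++` to build C++ extensions. A hedged sketch only; the helper name and the exact flag set are assumptions, not necessarily what fairseq's setup.py does:

```python
import sys

def extra_compile_args():
    """Illustrative macOS workaround: add clang-specific flags on darwin.
    The specific flags here are assumptions, not fairseq's exact fix."""
    args = ["-O3"]
    if sys.platform == "darwin":
        args += ["-stdlib=libc++"]
    return args

print(extra_compile_args())
```

The returned list would then be passed as `extra_compile_args` to each `setuptools.Extension`.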
-
- 20 Jun, 2019 2 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/818 Differential Revision: D15916265 Pulled By: myleott fbshipit-source-id: c66c0bd988d3472c4150226952f34ee8d4c3db86
-
Myle Ott authored
Summary: Notable (possibly breaking) changes:
- d45db804: Remove checkpoint utility functions from utils.py into checkpoint_utils.py
- f2563c21: Move LM definitions into separate files
- dffb1674: Updates to the model API:
  - `FairseqModel` -> `FairseqEncoderDecoderModel`
  - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
  - `encoder_out_dict` -> `encoder_out`
  - rm unused `remove_head` functions
- 34726d56: Move `distributed_init` into `DistributedFairseqModel`
- cf17068a: Simplify distributed launch by automatically launching multiprocessing on each node for all visible GPUs (allows launching just one job per node instead of one per GPU)
- d45db804: Change default LR scheduler from `reduce_lr_on_plateau` to `fixed`
- 96ac28d3: Rename `--sampling-temperature` -> `--temperature`
- fc1a19a3: Deprecate dummy batches
- a1c997bd: Add memory-mapped datasets
- 0add50c2: Allow cycling over multiple datasets, where each one becomes an "epoch"

Plus many additional features and bugfixes.

Pull Request resolved: https://github.com/pytorch/fairseq/pull/817
Differential Revision: D15913844
Pulled By: myleott
fbshipit-source-id: d5b5d678efdd9dd3e4d7ca848ddcf1ec2b21bf6b
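The decoder API change above splits the forward pass into `extract_features` (the hidden states) and `output_layer` (the projection to the vocabulary), so callers can get features without paying for the output projection. A framework-free sketch of that pattern; the method names come from the changelog, but the toy class and its arithmetic are invented for illustration:

```python
class ToyDecoder:
    """Sketch of the split decoder API: forward = output_layer(extract_features(x))."""

    def extract_features(self, tokens):
        # Stand-in for the decoder layers: produce "hidden states".
        return [t + 0.5 for t in tokens]

    def output_layer(self, features):
        # Stand-in for the projection onto the vocabulary.
        return [f * 10 for f in features]

    def forward(self, tokens):
        return self.output_layer(self.extract_features(tokens))

dec = ToyDecoder()
print(dec.forward([1, 2]))  # output of both stages composed
```

Tasks like feature extraction for probing or reranking can then call `extract_features` directly and skip the output layer.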
-
- 11 Jun, 2019 1 commit
-
-
Bairen Yi authored
Summary: See #467. Ping myleott to review. This is a work-related contribution. Ping lark to review. Pull Request resolved: https://github.com/pytorch/fairseq/pull/794 Differential Revision: D15756816 Pulled By: myleott fbshipit-source-id: 6dce3ff3a713bf5f60e5782bc260b2ca9d2c0a9b
-
- 16 Mar, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/580 Differential Revision: D14494390 Pulled By: myleott fbshipit-source-id: 524cc16a106f2af630357e2ebdf7dde35fa7d494
-
- 15 Mar, 2019 1 commit
-
-
Myle Ott authored
Summary: Changelog:
- 998ba4f: Add language models from Baevski & Auli (2018)
- 4294c4f6: Add mixture of experts code from Shen et al. (2019)
- 00493490: Add example for multilingual training
- 48d9afbe: Speed improvements, including fused operators from apex
- 44d27e64: Add Tensorboard support
- d17fa851: Add Adadelta optimizer
- 9e1c880f: Add `FairseqEncoderModel`
- b65c579b: Add `FairseqTask.inference_step` to modularize generate.py
- 2ad1178e: Add back `--curriculum`
- Misc bug fixes and other features

Pull Request resolved: https://github.com/pytorch/fairseq/pull/577
Differential Revision: D14481233
Pulled By: myleott
fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7
-
- 28 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/542 Differential Revision: D14258895 Pulled By: myleott fbshipit-source-id: 950a840e1d001a472be8d4737c9e4de5224137b3
-
- 22 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/351. This makes it easier for tasks to plug in to generate.py/interactive.py.

Pull Request resolved: https://github.com/pytorch/fairseq/pull/520
Differential Revision: D14183881
Pulled By: myleott
fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33
-
- 09 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary:
- fairseq can now be installed via pip: `pip install fairseq`
- command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc.

Pull Request resolved: https://github.com/pytorch/fairseq/pull/495
Differential Revision: D14017761
Pulled By: myleott
fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
-
- 05 Feb, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/489 Differential Revision: D13956810 Pulled By: myleott fbshipit-source-id: 61ace179d1d3790226c38b3f3e47f5452b5ec514
-
- 25 Sep, 2018 1 commit
-
-
Sergey Edunov authored
- No more FP16Trainer; we just have an FP16Optimizer wrapper.
- Most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time.
- Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0.
- Trainer.train_step now takes a list of samples, which will allow cleaner --update-freq.
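The dummy-batch trick above exists because in synchronous data-parallel training every worker must run the same number of fwd/bwd steps, or the gradient all-reduce deadlocks. A pure-Python sketch of the scheduling side, with made-up helper and variable names; the real Trainer does this with actual batches and multiplies the dummy batch's loss by 0 so it contributes no gradient:

```python
def pad_with_dummies(batches_per_worker, dummy):
    """Pad each worker's batch list to the same length with (dummy, weight=0)
    entries; real batches get weight=1. Weight 0 models 'loss * 0'."""
    steps = max(len(b) for b in batches_per_worker)
    padded = []
    for batches in batches_per_worker:
        weights = [1] * len(batches) + [0] * (steps - len(batches))
        filled = batches + [dummy] * (steps - len(batches))
        padded.append(list(zip(filled, weights)))
    return padded

# Worker 1 is one batch short, so it gets a zero-weighted dummy step.
workers = pad_with_dummies([["A", "B"], ["C"]], dummy="D")
print(workers)
```

Every worker now executes the same number of steps, and the zero weight guarantees the dummy step changes nothing numerically.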
-
- 15 Jun, 2018 1 commit
-
-
Myle Ott authored
-
- 02 Mar, 2018 1 commit
-
-
James Reed authored
Remove custom ConvTBC code
-
- 27 Feb, 2018 1 commit
-
-
Myle Ott authored
This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See the updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update the LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf.
- 90c2973 and 1da6265: improve unit test coverage.
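The register_model/register_criterion/register_optimizer hooks mentioned above follow the standard decorator-registry plugin pattern. A minimal sketch of that pattern, with invented names and a toy class; fairseq's real registries carry more metadata (architectures, default args, etc.):

```python
MODEL_REGISTRY = {}

def register_model(name):
    """Record a model class under a name so it can later be looked up
    from a CLI flag (e.g. --arch) instead of being hard-coded."""
    def decorator(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"model {name!r} already registered")
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator

@register_model("toy_lstm")
class ToyLSTM:
    pass

print(MODEL_REGISTRY["toy_lstm"].__name__)  # ToyLSTM
```

New models can then live in their own files and become available simply by being imported, which is what makes the codebase modular.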
-
- 22 Jan, 2018 1 commit
-
-
Myle Ott authored
-
- 12 Nov, 2017 1 commit
-
-
Myle Ott authored
Release notes:
- 5c7f4954: Added a simple LSTM model with input feeding and attention
- 6e4b7e22: Refactored model definitions and incremental generation to be cleaner
- 7ae79c12: Split interactive generation out of generate.py and into a new binary: interactive.py
- 19a3865d: Subtle correctness fix in the beam search decoder. Previously, for a beam size of k, we might emit a hypothesis if the <eos> was among the top 2*k candidates. Now we only emit hypotheses for which the <eos> is among the top-k candidates. This may subtly change generation results, and in the case of k=1 we will now produce strictly greedy outputs.
- 97d7fcb9: Fixed a bug in padding direction, where previously we right-padded the source and left-padded the target. We now left-pad the source and right-pad the target. This should not affect existing trained models, but may change (usually improve) the quality of new models.
- f442f896: Added support for batching based on the number of sentences (`--max-sentences`) in addition to the number of tokens (`--max-tokens`). When batching by the number of sentences, one can optionally normalize the gradients by the number of sentences with `--sentence-avg` (the default is to normalize by the number of tokens).
- c6d6256b: Added `--log-format` option and a JSON logger
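The padding-direction fix in 97d7fcb9 is easy to picture with a toy helper. A sketch under stated assumptions: the helper name, the pad index 0, and the length-4 batch are all illustrative, not fairseq code. Sources are now left-padded (so the most recent tokens sit next to the decoder's starting position) and targets right-padded:

```python
def pad(tokens, length, pad_idx=0, left=False):
    """Pad a token list to `length` with pad_idx, on the left or the right."""
    padding = [pad_idx] * (length - len(tokens))
    return padding + tokens if left else tokens + padding

src = pad([5, 6], 4, left=True)      # source: left-padded
tgt = pad([7, 8, 9], 4, left=False)  # target: right-padded
print(src, tgt)
```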
-
- 24 Oct, 2017 1 commit
-
-
James Reed authored
-
- 19 Oct, 2017 1 commit
-
-
Louis Martin authored
-
- 15 Sep, 2017 1 commit
-
-
Sergey Edunov authored
-