- 21 Jun, 2018 1 commit
Myle Ott authored

- 15 Jun, 2018 10 commits
Myle Ott authored

Myle Ott authored
A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with the @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position.
- Remove LEFT_PAD_* constants and make them configurable per task.
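The per-task padding-side option can be illustrated with a minimal collate sketch (a pure-Python stand-in; pad_batch and its arguments are hypothetical names, not fairseq's API):

```python
def pad_batch(seqs, pad_idx, left_pad=False):
    """Pad a list of token-id lists to equal length.

    left_pad=True puts padding before the tokens (as the old
    LEFT_PAD_* constants did globally), left_pad=False pads after.
    """
    max_len = max(len(s) for s in seqs)
    padded = []
    for s in seqs:
        padding = [pad_idx] * (max_len - len(s))
        padded.append(padding + s if left_pad else s + padding)
    return padded

batch = [[4, 5, 6], [7, 8]]
print(pad_batch(batch, pad_idx=1, left_pad=True))   # [[4, 5, 6], [1, 7, 8]]
print(pad_batch(batch, pad_idx=1, left_pad=False))  # [[4, 5, 6], [7, 8, 1]]
```

Making the padding side a per-task setting lets, e.g., a translation task left-pad sources while a language-modeling task right-pads everything.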
Myle Ott authored

Myle Ott authored

Myle Ott authored

Angela Fan authored

alexeib authored
This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf

There are 3 modes for constructing batches:
- token block: fill each sample with a specified number of tokens without regard for sentence delimiters; this is what was used for training in the paper
- complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if the next sentence goes over the token block limit, move it to the next sample); this was used for evaluation in the paper
- eos: one sentence per sample (skip blank lines)

Some results:
GCNN-13 - GBW - 37.46
GCNN-14B - GBW - 33.88
GCNN-8 - Wiki103 - 43.76
GCNN-14 - Wiki103 - 35.66

train:
python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500

eval:
python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
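The first two batching modes can be sketched in plain Python (a simplified illustration with hypothetical function names, not fairseq's implementation):

```python
def token_block(stream, block_size):
    """'token block' mode: chop a flat token stream into fixed-size
    samples, ignoring sentence boundaries."""
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]

def complete(sentences, block_size):
    """'complete' mode: pack whole sentences into a sample until the
    next sentence would overflow the limit, then start a new sample."""
    samples, cur = [], []
    for sent in sentences:
        if cur and len(cur) + len(sent) > block_size:
            samples.append(cur)
            cur = []
        cur = cur + sent
    if cur:
        samples.append(cur)
    return samples

sents = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
stream = [t for s in sents for t in s]
print(token_block(stream, 4))  # [[1, 2, 3, 4], [5, 6, 7, 8], [9]]
print(complete(sents, 5))      # [[1, 2, 3, 4, 5], [6, 7, 8, 9]]
```

The 'eos' mode is simply one (non-blank) sentence per sample, so it needs no packing logic.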
Alexei Baevski authored

Alexei Baevski authored

Alexei Baevski authored
Remove completed sentences from the batch and allow batching uneven lengths (with fixes to make padded sequences work correctly in all models)
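Pruning finished sentences from the active batch during decoding can be sketched like this (a simplified stand-in with hypothetical names, not the actual fairseq generator code):

```python
EOS = 2  # assumed end-of-sentence token id

def prune_finished(batch, outputs, last_tokens):
    """Move sequences whose last emitted token is EOS out of the
    active batch, so later decode steps run on fewer rows."""
    active = []
    for i, seq in enumerate(batch):
        if last_tokens[i] == EOS:
            outputs.append(seq + [EOS])  # finished: collect the result
        else:
            active.append(seq)           # still decoding
    return active

done = []
batch = [[5, 6], [7, 8], [9, 4]]
batch = prune_finished(batch, done, last_tokens=[2, 3, 2])
print(batch)  # [[7, 8]]
print(done)   # [[5, 6, 2], [9, 4, 2]]
```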
- 02 Apr, 2018 1 commit
Myle Ott authored
Changes:
- 7d19e36: Add --sampling flag to generate.py to sample instead of doing beam search
- c777340: Add scripts/average_checkpoints.py to average multiple checkpoints into a combined model
- 3ea882c: Add --max-update option to train.py to stop training after a given number of updates
- Small bugfixes for distributed training, LSTM, and the inverse square root LR scheduler
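Checkpoint averaging as in scripts/average_checkpoints.py amounts to an element-wise mean over matching parameters; a minimal pure-Python sketch (real checkpoints hold torch tensors, and average_params is a hypothetical name):

```python
def average_params(state_dicts):
    """Element-wise mean of matching parameters across checkpoints.
    Parameters are modeled here as flat lists of floats."""
    n = len(state_dicts)
    return {
        key: [sum(vals) / n for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }

ckpt_a = {"w": [1.0, 2.0], "b": [0.0]}
ckpt_b = {"w": [3.0, 4.0], "b": [2.0]}
print(average_params([ckpt_a, ckpt_b]))  # {'w': [2.0, 3.0], 'b': [1.0]}
```

Averaging the last few checkpoints often gives a small quality boost over using the final checkpoint alone.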
- 05 Mar, 2018 1 commit
Sergey Edunov authored
* Allow more flexible pre-processing and generation
* Addressing CR comments
* Small fix
- 01 Mar, 2018 1 commit
Myle Ott authored
- 27 Feb, 2018 3 commits
Dario Pavllo authored
* Add prefix
* Fixes
* Keep original scores with prefix
* Improve prefix code
* Replace 'repeat' with 'expand'
Myle Ott authored

Myle Ott authored
This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See the updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update the LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage
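The register_* extension points follow the standard decorator-registry pattern; a minimal sketch of the idea (names and structure simplified, not fairseq's actual code):

```python
MODEL_REGISTRY = {}

def register_model(name):
    """Decorator that records a model class under a string name,
    so it can later be looked up from a command-line flag like --arch."""
    def wrapper(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"duplicate model name: {name}")
        MODEL_REGISTRY[name] = cls
        return cls
    return wrapper

@register_model("toy_lstm")
class ToyLSTMModel:
    pass

print(MODEL_REGISTRY["toy_lstm"].__name__)  # ToyLSTMModel
```

register_criterion and register_optimizer work the same way against their own registries, which is what lets new components plug in without touching the trainer.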
- 22 Jan, 2018 2 commits

- 13 Nov, 2017 1 commit
Myle Ott authored
- 12 Nov, 2017 1 commit
Myle Ott authored
- 08 Nov, 2017 8 commits
Louis Martin authored
* Add <eos> for unk replacement
* Add IndexedRawTextDataset to load raw text files
* Replace unk with original string
* Add load_raw_text_dataset() and --output-format
* Move has_binary_files to data.py
Myle Ott authored

Louis Martin authored

Myle Ott authored

Myle Ott authored

Louis Martin authored
* Split generate.py into generate.py and interactive.py and refactor code

The main motivation behind these changes is to try to decorrelate use cases in order to implement future improvements, such as unk replacement with the original string during evaluation on test and writing predictions to an output file. The previous implementation worked well, but I found it difficult to integrate these future improvements.

* Add --replace-unk arg to be used without an align dict

Replacing <unk> tokens can be beneficial even without an alignment dictionary.
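Unknown-token replacement substitutes each <unk> in the hypothesis with the source word it aligns to, optionally translated through an alignment dictionary (a simplified sketch; replace_unk and its signature are illustrative, not fairseq's API):

```python
def replace_unk(hypo_tokens, src_tokens, alignment, align_dict=None):
    """For each <unk> in the hypothesis, copy the aligned source word;
    if an alignment dictionary is given, translate that word first.
    alignment[i] is the source position aligned to hypothesis position i."""
    out = []
    for i, tok in enumerate(hypo_tokens):
        if tok == "<unk>":
            src_word = src_tokens[alignment[i]]
            if align_dict:
                src_word = align_dict.get(src_word, src_word)
            out.append(src_word)
        else:
            out.append(tok)
    return out

hypo = ["the", "<unk>", "is", "red"]
src = ["la", "voiture", "est", "rouge"]
align = [0, 1, 2, 3]
print(replace_unk(hypo, src, align))                      # ['the', 'voiture', 'is', 'red']
print(replace_unk(hypo, src, align, {"voiture": "car"}))  # ['the', 'car', 'is', 'red']
```

Without an align dict, copying the source word verbatim is still useful for names, numbers, and other pass-through tokens.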
Michael Auli authored

Myle Ott authored
* Move some functionality out of FConvModel into FairseqModel base class
* Move incremental decoding functionality into FairseqIncrementalDecoder module
* Refactor positional embeddings to be more specific to FConvModel
- 19 Oct, 2017 5 commits
Myle Ott authored

Myle Ott authored

Louis Martin authored

Myle Ott authored

Myle Ott authored
- 11 Oct, 2017 3 commits
Sergey Edunov authored

Sergey Edunov authored

Myle Ott authored
- 26 Sep, 2017 1 commit
Myle Ott authored

- 18 Sep, 2017 1 commit
Sergey Edunov authored

- 15 Sep, 2017 1 commit
Sergey Edunov authored