- 30 Sep, 2018 1 commit
  - myleott authored
- 25 Sep, 2018 8 commits
  - Myle Ott authored (Co-authored-by: liezl200 <lie@fb.com>)
  - Alexei Baevski authored
  - Myle Ott authored
  - Myle Ott authored
  - Stephen Roller authored
  - Myle Ott authored
  - Myle Ott authored
  - Stephen Roller authored
- 03 Sep, 2018 9 commits
- 25 Jul, 2018 1 commit
  - Myle Ott authored
- 25 Jun, 2018 2 commits
- 24 Jun, 2018 1 commit
  - Myle Ott authored
- 21 Jun, 2018 2 commits
- 15 Jun, 2018 11 commits
  - Myle Ott authored
  - Myle Ott authored
  - Myle Ott authored
    A Task defines the data format, stores shared state (e.g., dictionaries), and provides helpers for building the model/criterion and computing the loss. Changes:
    - Add TranslationTask and LanguageModelingTask; new tasks can be registered with the @register_task decorator.
    - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position.
    - Remove the LEFT_PAD_* constants and make them configurable per task.
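The @register_task decorator mentioned above follows a common registry pattern. A minimal sketch of how such a registry could work (the names TASK_REGISTRY and setup_task are illustrative here, not necessarily fairseq's actual internals):

```python
# Hypothetical sketch of a @register_task-style registry.
TASK_REGISTRY = {}

def register_task(name):
    """Decorator that registers a Task class under a string name."""
    def wrapper(cls):
        if name in TASK_REGISTRY:
            raise ValueError(f"task {name!r} already registered")
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

@register_task("translation")
class TranslationTask:
    def __init__(self, args=None):
        self.args = args

def setup_task(name, args=None):
    """Look up a registered task by name and instantiate it."""
    return TASK_REGISTRY[name](args)
```

The benefit of this design is that new tasks live entirely in their own module: importing the module runs the decorator, which makes the task discoverable by name (e.g., from a command-line flag) without editing any central dispatch code.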
  - Myle Ott authored
  - Myle Ott authored
  - Myle Ott authored
  - alexeib authored
  - alexeib authored
  - alexeib authored
    Implement the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf. There are three modes for constructing batches:
    - token block: fill each sample with a specified number of tokens without regard for sentence delimiters; this is what was used for training in the paper.
    - complete: fill each sample with a specified number of tokens, but make sure it contains only complete sentences (i.e., if the next sentence would go over the token-block limit, move it to the next sample); this was used for evaluation in the paper.
    - eos: one sentence per sample (skip blank lines).
    Some results:
    - GCNN-13 - GBW - 37.46
    - GCNN-14B - GBW - 33.88
    - GCNN-8 - Wiki103 - 43.76
    - GCNN-14 - Wiki103 - 35.66
    Train: python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(...
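The three batching modes above can be illustrated with a short sketch. This is not fairseq's TokenBlockDataset, just a toy function (batch_by_mode is a name invented here) showing the intended grouping behavior on lists of token IDs:

```python
def batch_by_mode(sentences, block_size, mode):
    """Group token sequences into samples per the three modes.

    sentences: list of token lists, one per sentence.
    mode "token": pack exactly block_size tokens, ignoring sentence
        boundaries (the training setting).
    mode "complete": pack only whole sentences up to block_size; a
        sentence that would overflow starts the next sample (evaluation).
    mode "eos": one sentence per sample, skipping blank lines.
    """
    if mode == "eos":
        return [s for s in sentences if s]
    stream = [tok for s in sentences for tok in s]  # flatten to one stream
    if mode == "token":
        return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]
    if mode == "complete":
        samples, cur = [], []
        for s in sentences:
            if cur and len(cur) + len(s) > block_size:
                samples.append(cur)  # sentence would overflow: close sample
                cur = []
            cur.extend(s)
        if cur:
            samples.append(cur)
        return samples
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, with sentences [[1, 2, 3], [4, 5], [6]] and block_size 4, "token" mode yields [[1, 2, 3, 4], [5, 6]] while "complete" mode yields [[1, 2, 3], [4, 5, 6]], since sentence [4, 5] would overflow the first block.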
  - Myle Ott authored
  - Myle Ott authored
- 24 May, 2018 1 commit
  - Myle Ott authored
- 02 Apr, 2018 1 commit
  - Myle Ott authored
    Changes:
    - 7d19e36: add a `--sampling` flag to generate.py to sample instead of doing beam search.
    - c777340: add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model.
    - 3ea882c: add a `--max-update` option to train.py to stop training after a given number of updates.
    - Small bug fixes for distributed training, the LSTM model, and the inverse square root LR scheduler.
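Checkpoint averaging, as in the `scripts/average_checkpoints.py` change above, boils down to an element-wise mean of the same-named parameters across checkpoints. A minimal sketch, using plain dicts of float lists in place of torch state dicts (the function name average_checkpoints here is illustrative, not the script's actual API):

```python
def average_checkpoints(state_dicts):
    """Element-wise mean of parameters across checkpoints.

    state_dicts: list of {param_name: [float, ...]} mappings, all with
    the same keys and shapes. Returns one averaged mapping.
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    n = len(state_dicts)
    averaged = {}
    for key in state_dicts[0]:
        # zip aligns the i-th element of this parameter across checkpoints
        columns = zip(*(sd[key] for sd in state_dicts))
        averaged[key] = [sum(col) / n for col in columns]
    return averaged
```

The appeal of averaging the last few checkpoints is that it smooths out noise from the final optimization steps, often giving a small quality gain at no extra training cost.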
- 05 Mar, 2018 1 commit
  - Myle Ott authored
- 01 Mar, 2018 1 commit
  - Myle Ott authored
- 27 Feb, 2018 1 commit
  - Myle Ott authored