# Hierarchical Neural Story Generation (Fan et al., 2018)

The following commands provide an example of pre-processing data, training a model, and generating text for story generation with the WritingPrompts dataset.

## Pre-trained models

Description | Dataset | Model | Test set(s)
---|---|---|---
Stories with Convolutional Model <br> ([Fan et al., 2018](https://arxiv.org/abs/1805.04833)) | [WritingPrompts](https://arxiv.org/abs/1805.04833) | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/stories_checkpoint.tar.bz2) | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/data/stories_test.tar.bz2)
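
To try the pre-trained model, you can download and extract the archives above, for example as follows (run this wherever you want the files to live; the directory layout inside the archives is not reproduced here):

```bash
# Download and extract the pre-trained checkpoint and the test set (.tar.bz2 archives).
curl https://dl.fbaipublicfiles.com/fairseq/models/stories_checkpoint.tar.bz2 | tar xvjf -
curl https://dl.fbaipublicfiles.com/fairseq/data/stories_test.tar.bz2 | tar xvjf -
```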

We provide sample stories generated by the [convolutional seq2seq model](https://dl.fbaipublicfiles.com/fairseq/data/seq2seq_stories.txt) and [fusion model](https://dl.fbaipublicfiles.com/fairseq/data/fusion_stories.txt) from [Fan et al., 2018](https://arxiv.org/abs/1805.04833). The corresponding prompts for the fusion model can be found [here](https://dl.fbaipublicfiles.com/fairseq/data/fusion_prompts.txt). Note that the files contain `<unk>` tokens, since we modeled a limited word-level vocabulary (no BPE or pre-training). We did not use prompts containing `<unk>` for human evaluation.

## Dataset

The dataset can be downloaded like this:

```bash
cd examples/stories
curl https://dl.fbaipublicfiles.com/fairseq/data/writingPrompts.tar.gz | tar xvzf -
```

The archive contains train, test, and valid splits. The dataset is described here: https://arxiv.org/abs/1805.04833. We model only the first 1000 words of each story, including one newline token.
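
After extraction, each split consists of a prompt (source) file and a story (target) file. The file names below are inferred from the preprocessing commands later in this README; the archive may also contain other files:

```bash
ls examples/stories/writingPrompts
# Expect at least one source/target pair per split:
# train.wp_source  train.wp_target
# valid.wp_source  valid.wp_target
# test.wp_source   test.wp_target
```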

## Example usage

First we will preprocess the dataset. Note that the released dataset contains the full stories, but the paper models only the first 1000 words of each story. Here is example code that trims each story to its first 1000 words:
```python
# Run this from the directory containing the extracted data
# (examples/stories/writingPrompts in the steps below).
data = ["train", "test", "valid"]
for name in data:
    with open(name + ".wp_target") as f:
        stories = f.readlines()
    # Keep only the first 1000 whitespace-separated tokens of each story.
    stories = [" ".join(i.split()[0:1000]) for i in stories]
    # Overwrite the target file with the trimmed stories.
    with open(name + ".wp_target", "w") as o:
        for line in stories:
            o.write(line.strip() + "\n")
```

Once we've trimmed the data, we can binarize it and train our model:
```bash
# Binarize the dataset:
export TEXT=examples/stories/writingPrompts
fairseq-preprocess --source-lang wp_source --target-lang wp_target \
    --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
    --destdir data-bin/writingPrompts --padding-factor 1 --thresholdtgt 10 --thresholdsrc 10

# Train the model:
fairseq-train data-bin/writingPrompts -a fconv_self_att_wp \
    --lr 0.25 --clip-norm 0.1 --max-tokens 1500 --lr-scheduler reduce_lr_on_plateau \
    --decoder-attention True --encoder-attention False \
    --criterion label_smoothed_cross_entropy --weight-decay .0000001 --label-smoothing 0 \
    --source-lang wp_source --target-lang wp_target \
    --gated-attention True --self-attention True --project-input True --pretrained False

# Train a fusion model:
# add the arguments: --pretrained True --pretrained-checkpoint path/to/checkpoint

# Generate:
# Note: to load the pretrained model at generation time, you need to pass the
# --model-overrides argument, which tells the fusion model where you have placed
# the pretrained checkpoint. By default, it will load the exact path to the
# pretrained model that was used at training time. Use --model-overrides if you
# have moved the pretrained model (or are using our provided models). If you are
# generating from a non-fusion model, the --model-overrides argument is not necessary.

fairseq-generate data-bin/writingPrompts \
    --path /path/to/trained/model/checkpoint_best.pt \
    --batch-size 32 --beam 1 --sampling --sampling-topk 10 --temperature 0.8 --nbest 1 \
    --model-overrides "{'pretrained_checkpoint':'/path/to/pretrained/model/checkpoint'}"
```
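
To collect just the sampled stories from the generation log, you can filter the hypothesis lines. This is a sketch assuming the standard fairseq-generate output format (hypotheses printed on lines prefixed with `H-`) and a non-fusion model, so `--model-overrides` is omitted; `gen.out` and `gen.stories` are just example file names:

```bash
# Save the full generation log, then keep only the hypothesis text.
fairseq-generate data-bin/writingPrompts \
    --path /path/to/trained/model/checkpoint_best.pt \
    --batch-size 32 --beam 1 --sampling --sampling-topk 10 --temperature 0.8 --nbest 1 \
    > gen.out

# Hypothesis lines look like "H-<id><TAB><score><TAB><text>"; keep just the text column.
grep ^H- gen.out | cut -f3 > gen.stories
```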

## Citation
```bibtex
@inproceedings{fan2018hierarchical,
  title = {Hierarchical Neural Story Generation},
  author = {Fan, Angela and Lewis, Mike and Dauphin, Yann},
  booktitle = {Conference of the Association for Computational Linguistics (ACL)},
  year = 2018,
}
```