Commit e6d5c392 authored by zihanl

update README.md

parent 10962f0c
# Multi-Stage Prompting for Knowledgeable Dialogue Generation

We present the steps to run our multi-stage dialogue prompting (MSDP), as well as the finetuning-based models, i.e., the finetuning-based knowledge generation (FKG) model and the finetuning-based conversational model (FCM).

## Multi-Stage Dialogue Prompting (MSDP)
### Knowledge Generation
1. The script ```tasks/knwl_dialo/scripts/prompt_knwl_gen.sh``` provides an example of how to perform the knowledge generation prompting.
2. The F1/FK1 scores can be evaluated through ```tasks/knwl_dialo/scripts/eval_generation.sh```. The other automatic metrics (i.e., BLEU, METEOR, and ROUGE-L) follow [nlg-eval](https://github.com/Maluuba/nlg-eval); see the example after this list.
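As an illustration of step 2 (an illustrative sketch, not a command shipped with this repo), the nlg-eval toolkit provides a command-line interface that computes BLEU, METEOR, and ROUGE-L (among other metrics) from line-aligned hypothesis and reference files; the file names below are placeholders:

```bash
# Placeholders: one generated knowledge sentence per line in the hypothesis file,
# one ground-truth knowledge sentence per line in the reference file.
nlg-eval --hypothesis=generated_knowledge.txt --references=reference_knowledge.txt
```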
### Response Generation
1. Prepare the input file for the response generation (based on the knowledge file generated in the previous stage); a sketch is given after this list.
2. The script ```tasks/knwl_dialo/scripts/prompt_resp_gen.sh``` provides an example of how to perform the response generation prompting.
3. The automatic evaluations are the same as those described above for the knowledge generation.
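For step 1, a minimal sketch is shown below, assuming the simplest case where the response-generation input pairs each dialogue context with the knowledge sentence generated for it; the file names are placeholders, and the exact input format expected by ```prompt_resp_gen.sh``` is defined by the scripts in this repo:

```bash
# Placeholders: the two files must be line-aligned, i.e., the i-th generated
# knowledge sentence corresponds to the i-th dialogue context.
wc -l dialogue_contexts.txt generated_knowledge.txt    # sanity check: equal line counts
paste dialogue_contexts.txt generated_knowledge.txt > prompt_resp_gen_input.txt
```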
## Finetuning-based Models

### FKG
The script ```tasks/knwl_dialo/scripts/finetune_knwl_gen.sh``` provides an example of how to train the finetuning-based knowledge generation (FKG) model.

### FCM
The script ```tasks/knwl_dialo/scripts/finetune_resp_gen.sh``` provides an example of how to train the finetuning-based conversational model (FCM).
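The two scripts that follow appear to be these finetuning scripts (the first uses ```--module knowledge```, the second ```--module response```). As a minimal usage sketch, assuming a single node with 8 GPUs as configured in the scripts, fill in the path placeholders at the top of each script and launch it directly:

```bash
# Edit CHECKPOINT_PATH, OUTPUT_MODEL_PATH, VOCAB_PATH, MERGE_PATH,
# TRAIN_PATH, and TEST_PATH inside each script before running.
bash tasks/knwl_dialo/scripts/finetune_knwl_gen.sh   # FKG
bash tasks/knwl_dialo/scripts/finetune_resp_gen.sh   # FCM
```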
#!/bin/bash
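# Finetuning script for the knowledge generation model (FKG); the README above
# refers to this as tasks/knwl_dialo/scripts/finetune_knwl_gen.sh (association
# inferred from "--module knowledge" below).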
WORLD_SIZE=8
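# torch.distributed.launch arguments: single node, 8 processes (one per GPU).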
DISTRIBUTED_ARGS="--nproc_per_node $WORLD_SIZE \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6000"
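# Paths to fill in before running (placeholders, not defaults).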
CHECKPOINT_PATH=<Specify path for the language model>
OUTPUT_MODEL_PATH=<Specify path for the saved model>
VOCAB_PATH=<Specify path for the vocab file>
MERGE_PATH=<Specify path for the merge file>
TRAIN_PATH=<Specify path for the training dataset>
TEST_PATH=<Specify path for the test dataset>
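# Finetune the pretrained checkpoint on the knowledge generation data
# (KNWL-DIALO-FINETUNE task, --module knowledge).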
python -m torch.distributed.launch $DISTRIBUTED_ARGS ./tasks/main.py \
--tensor-model-parallel-size 1 \
--pipeline-model-parallel-size 1 \
--num-layers 24 \
--hidden-size 1024 \
--num-attention-heads 16 \
--seq-length 2048 \
--max-position-embeddings 2048 \
--micro-batch-size 4 \
--global-batch-size 64 \
--train-samples 61000 \
--lr-decay-samples 50000 \
--lr-warmup-samples 5000 \
--lr 1.5e-5 \
--min-lr 1.0e-5 \
--lr-decay-style cosine \
--log-interval 100 \
--vocab-file ${VOCAB_PATH} \
--merge-file ${MERGE_PATH} \
--save-interval 10000 \
--save ${OUTPUT_MODEL_PATH} \
--pretrained-checkpoint ${CHECKPOINT_PATH} \
--clip-grad 1.0 \
--weight-decay 0.1 \
--adam-beta1 0.9 \
--adam-beta2 0.95 \
--init-method-std 0.02 \
--log-params-norm \
--log-num-zeros-in-grad \
--fp16 \
--DDP-impl torch \
--checkpoint-activations \
--epochs 4 \
--task KNWL-DIALO-FINETUNE \
--module knowledge \
--spec-toks [SEP],[CTRL],[PAD] \
--train-data-path ${TRAIN_PATH} \
--test-data-path ${TEST_PATH} \
--max-seq-len 1024 \
--tokenizer-type GPT2BPETokenizer
#!/bin/bash
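# Finetuning script for the conversational model (FCM); the README above
# refers to this as tasks/knwl_dialo/scripts/finetune_resp_gen.sh (association
# inferred from "--module response" below).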
WORLD_SIZE=8
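# torch.distributed.launch arguments: single node, 8 processes (one per GPU).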
DISTRIBUTED_ARGS="--nproc_per_node $WORLD_SIZE \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6000"
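# Paths to fill in before running (placeholders, not defaults).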
CHECKPOINT_PATH=<Specify path for the language model>
OUTPUT_MODEL_PATH=<Specify path for the saved model>
VOCAB_PATH=<Specify path for the vocab file>
MERGE_PATH=<Specify path for the merge file>
TRAIN_PATH=<Specify path for the training dataset>
TEST_PATH=<Specify path for the test dataset>
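# Finetune the pretrained checkpoint on the response generation data
# (KNWL-DIALO-FINETUNE task, --module response).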
python -m torch.distributed.launch $DISTRIBUTED_ARGS ./tasks/main.py \
--tensor-model-parallel-size 1 \
--pipeline-model-parallel-size 1 \
--num-layers 24 \
--hidden-size 1024 \
--num-attention-heads 16 \
--seq-length 2048 \
--max-position-embeddings 2048 \
--micro-batch-size 4 \
--global-batch-size 64 \
--train-samples 142000 \
--lr-decay-samples 10000 \
--lr-warmup-samples 3000 \
--lr 1.0e-5 \
--min-lr 5.0e-6 \
--lr-decay-style cosine \
--log-interval 100 \
--vocab-file ${VOCAB_PATH} \
--merge-file ${MERGE_PATH} \
--save-interval 10000 \
--save ${OUTPUT_MODEL_PATH} \
--pretrained-checkpoint ${CHECKPOINT_PATH} \
--clip-grad 1.0 \
--weight-decay 0.1 \
--adam-beta1 0.9 \
--adam-beta2 0.95 \
--init-method-std 0.02 \
--log-params-norm \
--log-num-zeros-in-grad \
--fp16 \
--DDP-impl torch \
--checkpoint-activations \
--epochs 3 \
--task KNWL-DIALO-FINETUNE \
--module response \
--spec-toks [SEP],[CTRL],[PAD] \
--train-data-path ${TRAIN_PATH} \
--test-data-path ${TEST_PATH} \
--max-seq-len 1024 \
--tokenizer-type GPT2BPETokenizer