# Multi-Stage Prompting for Knowledgeable Dialogue Generation
We present the steps to run our multi-stage dialogue prompting (MSDP), as well as the finetuning-based models, i.e., the finetuning-based knowledge generation (FKG) model and the finetuning-based conversation model (FCM).
## Multi-Stage Dialogue Prompting (MSDP)
### Knowledge Generation
1. The script ```tasks/knwl_dialo/scripts/prompt_knwl_gen.sh``` provides an example of how to perform the knowledge generation prompting.
2. The F1/KF1 score can be evaluated through ```tasks/knwl_dialo/scripts/eval_generation.sh```. Other automatic metrics (i.e., BLEU, METEOR, and ROUGE-L) follow the [nlg-eval](https://github.com/Maluuba/nlg-eval).
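For reference, the F1 score here is the standard unigram-overlap F1 between a generated sentence and its reference (KF1 is commonly the same computation taken against the ground-truth knowledge). The snippet below is a minimal, self-contained sketch of that computation; the function name and file paths are illustrative placeholders rather than the repository's API, and ```eval_generation.sh``` may apply additional text normalization.

```python
from collections import Counter

def unigram_f1(prediction: str, reference: str) -> float:
    """Standard token-level (unigram) F1 between a generated string and a reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Overlap count, clipped by how often each token appears in either side.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    # Placeholder file names: one generated/reference sentence per line.
    with open("generated_knowledge.txt") as f_gen, open("reference_knowledge.txt") as f_ref:
        scores = [unigram_f1(gen, ref) for gen, ref in zip(f_gen, f_ref)]
    print(f"F1: {sum(scores) / max(len(scores), 1):.4f}")
```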
### Response Generation
1. Prepare the input file for the response generation (based on the previously generated knowledge file); one possible way to assemble this file is sketched after this list.
2. The script ```tasks/knwl_dialo/scripts/prompt_resp_gen.sh``` provides an example of how to perform the response generation prompting.
3. The automatic evaluations are the same as those described above for the knowledge generation.
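One possible way to assemble the response-generation input file (step 1 above) is to pair each dialogue context with the knowledge generated in the previous stage, as in the minimal sketch below. The file names and the ```[SEP]```-joined line format are illustrative assumptions, not a format mandated by the repository; the exact input expected by ```tasks/knwl_dialo/scripts/prompt_resp_gen.sh``` should be taken from the script itself.

```python
def build_response_inputs(context_path: str, knowledge_path: str, output_path: str) -> None:
    """Write one line per example: dialogue context followed by its generated knowledge.

    Hypothetical preprocessing sketch; the joining convention is an assumption,
    not the format required by the repository.
    """
    with open(context_path) as f_ctx, \
         open(knowledge_path) as f_knwl, \
         open(output_path, "w") as f_out:
        for context, knowledge in zip(f_ctx, f_knwl):
            f_out.write(f"{context.strip()} [SEP] {knowledge.strip()}\n")

if __name__ == "__main__":
    build_response_inputs(
        "dialogue_contexts.txt",         # placeholder: one dialogue context per line
        "generated_knowledge.txt",       # knowledge file produced in the previous stage
        "response_generation_input.txt", # input file for the response generation prompting
    )
```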
## Finetuning-based Models
### FKG
The script ```tasks/knwl_dialo/scripts/finetune_knwl_gen.sh``` provides an example of how to train a finetuning-based knowledge generation (FKG) model.
### FCM
The script ```tasks/knwl_dialo/scripts/finetune_resp_gen.sh``` provides an example of how to train a finetuning-based conversation model (FCM).
#!/bin/bash

# Finetunes the knowledge generation (FKG) module on top of a pretrained
# language model checkpoint, using single-node, 8-GPU distributed training
# (cf. tasks/knwl_dialo/scripts/finetune_knwl_gen.sh referenced in the README
# above). Fill in the placeholder paths below before running.

WORLD_SIZE=8
DISTRIBUTED_ARGS="--nproc_per_node $WORLD_SIZE \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6000"
CHECKPOINT_PATH=<Specify path for the language model>
OUTPUT_MODEL_PATH=<Specify path for the saved model>
VOCAB_PATH=<Specify path for the vocab file>
MERGE_PATH=<Specify path for the merge file>
TRAIN_PATH=<Specify path for the training dataset>
TEST_PATH=<Specify path for the test dataset>
# Launch distributed finetuning through ./tasks/main.py with the
# KNWL-DIALO-FINETUNE task and the knowledge module.
python -m torch.distributed.launch $DISTRIBUTED_ARGS ./tasks/main.py \
--tensor-model-parallel-size 1 \
--pipeline-model-parallel-size 1 \
--num-layers 24 \
--hidden-size 1024 \
--num-attention-heads 16 \
--seq-length 2048 \
--max-position-embeddings 2048 \
--micro-batch-size 4 \
--global-batch-size 64 \
--train-samples 61000 \
--lr-decay-samples 50000 \
--lr-warmup-samples 5000 \
--lr 1.5e-5 \
--min-lr 1.0e-5 \
--lr-decay-style cosine \
--log-interval 100 \
--vocab-file ${VOCAB_PATH} \
--merge-file ${MERGE_PATH} \
--save-interval 10000 \
--save ${OUTPUT_MODEL_PATH} \
--pretrained-checkpoint ${CHECKPOINT_PATH} \
--clip-grad 1.0 \
--weight-decay 0.1 \
--adam-beta1 0.9 \
--adam-beta2 0.95 \
--init-method-std 0.02 \
--log-params-norm \
--log-num-zeros-in-grad \
--fp16 \
--DDP-impl torch \
--checkpoint-activations \
--epochs 4 \
--task KNWL-DIALO-FINETUNE \
--module knowledge \
--spec-toks [SEP],[CTRL],[PAD] \
--train-data-path ${TRAIN_PATH} \
--test-data-path ${TEST_PATH} \
--max-seq-len 1024 \
--tokenizer-type GPT2BPETokenizer
#!/bin/bash

# Finetunes the conversation (FCM) model, i.e., the response generation
# module, on top of a pretrained language model checkpoint, using single-node,
# 8-GPU distributed training (cf. tasks/knwl_dialo/scripts/finetune_resp_gen.sh
# referenced in the README above). Fill in the placeholder paths below before
# running.

WORLD_SIZE=8
DISTRIBUTED_ARGS="--nproc_per_node $WORLD_SIZE \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6000"
CHECKPOINT_PATH=<Specify path for the language model>
OUTPUT_MODEL_PATH=<Specify path for the saved model>
VOCAB_PATH=<Specify path for the vocab file>
MERGE_PATH=<Specify path for the merge file>
TRAIN_PATH=<Specify path for the training dataset>
TEST_PATH=<Specify path for the test dataset>
# Launch distributed finetuning through ./tasks/main.py with the
# KNWL-DIALO-FINETUNE task and the response module.
python -m torch.distributed.launch $DISTRIBUTED_ARGS ./tasks/main.py \
--tensor-model-parallel-size 1 \
--pipeline-model-parallel-size 1 \
--num-layers 24 \
--hidden-size 1024 \
--num-attention-heads 16 \
--seq-length 2048 \
--max-position-embeddings 2048 \
--micro-batch-size 4 \
--global-batch-size 64 \
--train-samples 142000 \
--lr-decay-samples 10000 \
--lr-warmup-samples 3000 \
--lr 1.0e-5 \
--min-lr 5.0e-6 \
--lr-decay-style cosine \
--log-interval 100 \
--vocab-file ${VOCAB_PATH} \
--merge-file ${MERGE_PATH} \
--save-interval 10000 \
--save ${OUTPUT_MODEL_PATH} \
--pretrained-checkpoint ${CHECKPOINT_PATH} \
--clip-grad 1.0 \
--weight-decay 0.1 \
--adam-beta1 0.9 \
--adam-beta2 0.95 \
--init-method-std 0.02 \
--log-params-norm \
--log-num-zeros-in-grad \
--fp16 \
--DDP-impl torch \
--checkpoint-activations \
--epochs 3 \
--task KNWL-DIALO-FINETUNE \
--module response \
--spec-toks [SEP],[CTRL],[PAD] \
--train-data-path ${TRAIN_PATH} \
--test-data-path ${TEST_PATH} \
--max-seq-len 1024 \
--tokenizer-type GPT2BPETokenizer