# Multi-Stage Prompting for Knowledgeable Dialogue Generation
We present the steps to run our multi-stage dialogue prompting (MSDP), as well as the finetuning-based models, i.e., the finetuning-based knowledge generation (FKG) model and the finetuning-based conversation model (FCM).
## Multi-Stage Dialogue Prompting (MSDP)
### Knowledge Generation
1. The script ```tasks/knwl_dialo/scripts/prompt_knwl_gen.sh``` provides an example of how to perform the knowledge generation prompting.
2. The F1/KF1 score can be evaluated through ```tasks/knwl_dialo/scripts/eval_generation.sh```. Other automatic metrics (i.e., BLEU, METEOR, and ROUGE-L) follow the [nlg-eval](https://github.com/Maluuba/nlg-eval).
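For reference, the F1 score here is the standard unigram-overlap F1 between a generated sentence and its reference (KF1 is commonly the same computation taken against the ground-truth knowledge). The snippet below is a minimal, self-contained sketch of that computation; the function name and file paths are illustrative placeholders rather than the repository's API, and ```eval_generation.sh``` may apply additional text normalization.

```python
from collections import Counter

def unigram_f1(prediction: str, reference: str) -> float:
    """Standard token-level (unigram) F1 between a generated string and a reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Overlap count, clipped by how often each token appears in either side.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    # Placeholder file names: one generated/reference sentence per line.
    with open("generated_knowledge.txt") as f_gen, open("reference_knowledge.txt") as f_ref:
        scores = [unigram_f1(gen, ref) for gen, ref in zip(f_gen, f_ref)]
    print(f"F1: {sum(scores) / max(len(scores), 1):.4f}")
```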
### Response Generation
1. Prepare the input file for the response generation (based on the previously generated knowledge file); one possible way to assemble this file is sketched after this list.
2. The script ```tasks/knwl_dialo/scripts/prompt_resp_gen.sh``` provides an example of how to perform the response generation prompting.
3. The automatic evaluations are the same as those described above for the knowledge generation.
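One possible way to assemble the response-generation input file (step 1 above) is to pair each dialogue context with the knowledge generated in the previous stage, as in the minimal sketch below. The file names and the ```[SEP]```-joined line format are illustrative assumptions, not a format mandated by the repository; the exact input expected by ```tasks/knwl_dialo/scripts/prompt_resp_gen.sh``` should be taken from the script itself.

```python
def build_response_inputs(context_path: str, knowledge_path: str, output_path: str) -> None:
    """Write one line per example: dialogue context followed by its generated knowledge.

    Hypothetical preprocessing sketch; the joining convention is an assumption,
    not the format required by the repository.
    """
    with open(context_path) as f_ctx, \
         open(knowledge_path) as f_knwl, \
         open(output_path, "w") as f_out:
        for context, knowledge in zip(f_ctx, f_knwl):
            f_out.write(f"{context.strip()} [SEP] {knowledge.strip()}\n")

if __name__ == "__main__":
    build_response_inputs(
        "dialogue_contexts.txt",         # placeholder: one dialogue context per line
        "generated_knowledge.txt",       # knowledge file produced in the previous stage
        "response_generation_input.txt", # input file for the response generation prompting
    )
```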
## Finetuning-based Models
### FKG
The script ```tasks/knwl_dialo/scripts/finetune_knwl_gen.sh``` provides an example of how to train a finetuning-based knowledge generation (FKG) model.
### FCM
The script ```tasks/knwl_dialo/scripts/finetune_resp_gen.sh``` provides an example of how to train a finetuning-based conversation model (FCM).
#!/bin/bash

# Finetunes the knowledge generation (FKG) module on top of a pretrained
# language model checkpoint, using single-node, 8-GPU distributed training
# (cf. tasks/knwl_dialo/scripts/finetune_knwl_gen.sh referenced in the README
# above). Fill in the placeholder paths below before running.

WORLD_SIZE=8
DISTRIBUTED_ARGS="--nproc_per_node $WORLD_SIZE \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6000"
CHECKPOINT_PATH=<Specify path for the language model>
OUTPUT_MODEL_PATH=<Specify path for the saved model>
VOCAB_PATH=<Specify path for the vocab file>
MERGE_PATH=<Specify path for the merge file>
TRAIN_PATH=<Specify path for the training dataset>
TEST_PATH=<Specify path for the test dataset>
# Launch distributed finetuning through ./tasks/main.py with the
# KNWL-DIALO-FINETUNE task and the knowledge module.
python -m torch.distributed.launch $DISTRIBUTED_ARGS ./tasks/main.py \
--tensor-model-parallel-size 1 \
--pipeline-model-parallel-size 1 \
--num-layers 24 \
--hidden-size 1024 \
--num-attention-heads 16 \
--seq-length 2048 \
--max-position-embeddings 2048 \
--micro-batch-size 4 \
--global-batch-size 64 \
--train-samples 61000 \
--lr-decay-samples 50000 \
--lr-warmup-samples 5000 \
--lr 1.5e-5 \
--min-lr 1.0e-5 \
--lr-decay-style cosine \
--log-interval 100 \
--vocab-file ${VOCAB_PATH} \
--merge-file ${MERGE_PATH} \
--save-interval 10000 \
--save ${OUTPUT_MODEL_PATH} \
--pretrained-checkpoint ${CHECKPOINT_PATH} \
--clip-grad 1.0 \
--weight-decay 0.1 \
--adam-beta1 0.9 \
--adam-beta2 0.95 \
--init-method-std 0.02 \
--log-params-norm \
--log-num-zeros-in-grad \
--fp16 \
--DDP-impl torch \
--checkpoint-activations \
--epochs 4 \
--task KNWL-DIALO-FINETUNE \
--module knowledge \
--spec-toks [SEP],[CTRL],[PAD] \
--train-data-path ${TRAIN_PATH} \
--test-data-path ${TEST_PATH} \
--max-seq-len 1024 \
--tokenizer-type GPT2BPETokenizer
#!/bin/bash

# Finetunes the conversation (FCM) model, i.e., the response generation
# module, on top of a pretrained language model checkpoint, using single-node,
# 8-GPU distributed training (cf. tasks/knwl_dialo/scripts/finetune_resp_gen.sh
# referenced in the README above). Fill in the placeholder paths below before
# running.

WORLD_SIZE=8
DISTRIBUTED_ARGS="--nproc_per_node $WORLD_SIZE \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6000"
CHECKPOINT_PATH=<Specify path for the language model>
OUTPUT_MODEL_PATH=<Specify path for the saved model>
VOCAB_PATH=<Specify path for the vocab file>
MERGE_PATH=<Specify path for the merge file>
TRAIN_PATH=<Specify path for the training dataset>
TEST_PATH=<Specify path for the test dataset>
# Launch distributed finetuning through ./tasks/main.py with the
# KNWL-DIALO-FINETUNE task and the response module.
python -m torch.distributed.launch $DISTRIBUTED_ARGS ./tasks/main.py \
--tensor-model-parallel-size 1 \
--pipeline-model-parallel-size 1 \
--num-layers 24 \
--hidden-size 1024 \
--num-attention-heads 16 \
--seq-length 2048 \
--max-position-embeddings 2048 \
--micro-batch-size 4 \
--global-batch-size 64 \
--train-samples 142000 \
--lr-decay-samples 10000 \
--lr-warmup-samples 3000 \
--lr 1.0e-5 \
--min-lr 5.0e-6 \
--lr-decay-style cosine \
--log-interval 100 \
--vocab-file ${VOCAB_PATH} \
--merge-file ${MERGE_PATH} \
--save-interval 10000 \
--save ${OUTPUT_MODEL_PATH} \
--pretrained-checkpoint ${CHECKPOINT_PATH} \
--clip-grad 1.0 \
--weight-decay 0.1 \
--adam-beta1 0.9 \
--adam-beta2 0.95 \
--init-method-std 0.02 \
--log-params-norm \
--log-num-zeros-in-grad \
--fp16 \
--DDP-impl torch \
--checkpoint-activations \
--epochs 3 \
--task KNWL-DIALO-FINETUNE \
--module response \
--spec-toks [SEP],[CTRL],[PAD] \
--train-data-path ${TRAIN_PATH} \
--test-data-path ${TEST_PATH} \
--max-seq-len 1024 \
--tokenizer-type GPT2BPETokenizer