OpenDAS / Megatron-LM
Commit 4af94b3f, authored Dec 06, 2021 by zihanl
delete finetune_knwl_gen.sh
parent 243e5a4e
Showing 1 changed file with 0 additions and 54 deletions
tasks/knwl_dialo/scripts/finetune_knwl_gen.sh deleted 100644 → 0
#!/bin/bash
# Finetune a pretrained language model to generate the context-relevant knowledge
# The input is the dialogue context, and output is the relevant knowledge
# The size of the pretrained language model is 357M

WORLD_SIZE=8

DISTRIBUTED_ARGS="--nproc_per_node $WORLD_SIZE \
                  --nnodes 1 \
                  --node_rank 0 \
                  --master_addr localhost \
                  --master_port 6000"

CHECKPOINT_PATH=<PATH_OF_THE_LANGUAGE_MODEL>
OUTPUT_MODEL_PATH=<PATH_OF_THE_SAVED_MODEL>
VOCAB_PATH=<PATH_OF_THE_VOCAB_FILE>
MERGE_PATH=<PATH_OF_THE_MERGE_FILE>
TRAIN_PATH=<PATH_OF_THE_TRAINING_DATASET>
TEST_PATH=<PATH_OF_THE_TEST_DATASET>

python -m torch.distributed.launch $DISTRIBUTED_ARGS ./tasks/main.py \
        --num-layers 24 \
        --hidden-size 1024 \
        --num-attention-heads 16 \
        --seq-length 2048 \
        --max-position-embeddings 2048 \
        --micro-batch-size 4 \
        --global-batch-size 64 \
        --train-samples 61000 \
        --lr-decay-samples 50000 \
        --lr-warmup-samples 5000 \
        --lr 1.5e-5 \
        --min-lr 1.0e-5 \
        --lr-decay-style cosine \
        --vocab-file ${VOCAB_PATH} \
        --merge-file ${MERGE_PATH} \
        --save-interval 10000 \
        --save ${OUTPUT_MODEL_PATH} \
        --pretrained-checkpoint ${CHECKPOINT_PATH} \
        --weight-decay 0.1 \
        --adam-beta2 0.95 \
        --log-params-norm \
        --log-num-zeros-in-grad \
        --fp16 \
        --DDP-impl torch \
        --checkpoint-activations \
        --epochs 4 \
        --task KNWL-DIALO-FINETUNE \
        --module knowledge \
        --spec-toks [SEP],[CTRL],[PAD] \
        --train-data ${TRAIN_PATH} \
        --test-data ${TEST_PATH} \
        --tokenizer-type GPT2BPETokenizer
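
Reviewer note on the batch settings in the deleted script: the sketch below is not part of the commit. It only works through the arithmetic implied by WORLD_SIZE, --micro-batch-size, --global-batch-size and --train-samples. It assumes no tensor or pipeline parallelism is configured (the script passes no such flags), so all 8 processes act as data-parallel replicas. The variable names other than WORLD_SIZE are illustrative, and how --train-samples interacts with --epochs in ./tasks/main.py is not shown in this diff, so the last figure is just a count of full global batches, not a claimed iteration count.

```shell
#!/bin/bash
# Values copied from the deleted finetune_knwl_gen.sh.
WORLD_SIZE=8
MICRO_BATCH_SIZE=4
GLOBAL_BATCH_SIZE=64
TRAIN_SAMPLES=61000

# Assumption: every rank is a data-parallel replica
# (no tensor/pipeline parallel flags appear in the script).
DATA_PARALLEL_SIZE=$WORLD_SIZE

# Samples consumed by one forward/backward pass across all replicas.
SAMPLES_PER_PASS=$((MICRO_BATCH_SIZE * DATA_PARALLEL_SIZE))

# The global batch must be a whole multiple of that figure.
if [ $((GLOBAL_BATCH_SIZE % SAMPLES_PER_PASS)) -ne 0 ]; then
    echo "global batch size is not divisible by micro batch size * data parallel size" >&2
    exit 1
fi

# Micro-batches accumulated per optimizer step on each replica.
ACCUM_STEPS=$((GLOBAL_BATCH_SIZE / SAMPLES_PER_PASS))

# Full global batches that fit in --train-samples (integer division).
FULL_BATCHES=$((TRAIN_SAMPLES / GLOBAL_BATCH_SIZE))

echo "samples per pass:        $SAMPLES_PER_PASS"
echo "accumulation steps:      $ACCUM_STEPS"
echo "full global batches:     $FULL_BATCHES"
```

Under these assumptions each optimizer step accumulates gradients over two micro-batches per replica.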