OpenDAS / Megatron-LM
Commit 423c51b0, authored Mar 31, 2020 by Neel Kant
Bugfix and remove unneeded script
parent e949a5c5
Showing 2 changed files with 1 addition and 35 deletions
megatron/data_utils/datasets.py  +1 -1
run_bert_ict.sh  +0 -34
megatron/data_utils/datasets.py
...
@@ -966,7 +966,7 @@ class InverseClozeDataset(data.Dataset):
         padless_max_len = self.max_seq_len - 2
 
         # select a random sentence from the document as input
-        input_sentence_idx = rng.randint(num_sentences)
+        input_sentence_idx = rng.randint(0, num_sentences - 1)
         tokens, token_types = self.sentence_tokenize(doc[input_sentence_idx], 0)
         input_tokens, input_token_types = tokens[:target_seq_length], token_types[:target_seq_length]
         if not len(input_tokens) > 0:
...
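The one-line change above can be illustrated in isolation. The sketch below assumes that `rng` is a Python `random.Random` instance, which the hunk does not show; under that assumption, `randint(a, b)` needs both bounds and includes both endpoints, so the pre-fix single-argument call raises `TypeError`. The helper name `pick_sentence_index`, the seed, and the sample document are hypothetical and are not part of the commit.

```python
import random


def pick_sentence_index(rng, num_sentences):
    """Return a random index in [0, num_sentences - 1] (hypothetical helper)."""
    # random.Random.randint(a, b) includes both endpoints, hence the "- 1".
    return rng.randint(0, num_sentences - 1)


rng = random.Random(1234)  # seed chosen arbitrarily for this example
doc = ["first sentence", "second sentence", "third sentence"]

# Every sampled index is a valid position in doc.
samples = [pick_sentence_index(rng, len(doc)) for _ in range(1000)]
assert min(samples) >= 0 and max(samples) <= len(doc) - 1

# The pre-fix call shape is rejected by random.Random: randint needs both bounds.
try:
    rng.randint(len(doc))
    one_arg_call_failed = False
except TypeError:
    one_arg_call_failed = True
```

If `rng` were instead a NumPy `RandomState`, the upper bound of `randint` would be exclusive, and the new call would never select the last sentence; which generator is passed in is worth checking at the call site.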
run_bert_ict.sh (deleted, mode 100755 → 0, parent e949a5c5)
-#!/bin/bash
-LENGTH=512
-CHKPT="chkpts/debug"
-COMMAND="/home/scratch.gcf/adlr-utils/release/cluster-interface/latest/mp_launch python pretrain_bert_ict.py \
-    --num-layers 6 \
-    --hidden-size 768 \
-    --num-attention-heads 12 \
-    --batch-size 1 \
-    --checkpoint-activations \
-    --seq-length $LENGTH \
-    --max-position-embeddings $LENGTH \
-    --train-iters 1000 \
-    --no-save-optim --no-save-rng \
-    --save $CHKPT \
-    --resume-dataloader \
-    --train-data /home/universal-lm-data.cosmos549/datasets/wikipedia/wikidump_lines.json \
-    --presplit-sentences \
-    --loose-json \
-    --text-key text \
-    --data-loader lazy \
-    --tokenizer-type BertWordPieceTokenizer \
-    --cache-dir cache \
-    --split 58,1,1 \
-    --distributed-backend nccl \
-    --lr 0.00015 \
-    --num-workers 0 \
-    --no-load-optim --finetune \
-    --lr-decay-style cosine \
-    --weight-decay 1e-2 \
-    --clip-grad 1.0 \
-    --warmup .01 \
-    --save-interval 1000 \
-    --fp16 --adlr-autoresume --adlr-autoresume-interval 5000"
-submit_job --image 'http://gitlab-master.nvidia.com/adlr/megatron-lm/megatron:rouge_score' --mounts /home/universal-lm-data.cosmos549,/home/raulp -c "${COMMAND}" --name ict_test --partition interactive --gpu 8 --nodes 2 --autoresume_timer 300 -i