chenpangpang / transformers
Commit 2ae98336, authored Feb 18, 2020 by VictorSanh
fix vocab size in binarized_data (distil): int16 vs int32
Parent: 0dbddba6
Showing 1 changed file with 6 additions and 2 deletions

examples/distillation/scripts/binarized_data.py
@@ -75,13 +75,17 @@ def main():
         iter += 1
         if iter % interval == 0:
             end = time.time()
-            logger.info(f"{iter} examples processed. - {(end-start)/interval:.2f}s/expl")
+            logger.info(f"{iter} examples processed. - {(end-start):.2f}s/{interval}expl")
             start = time.time()
     logger.info("Finished binarization")
     logger.info(f"{len(data)} examples processed.")
 
     dp_file = f"{args.dump_file}.{args.tokenizer_name}.pickle"
-    rslt_ = [np.uint16(d) for d in rslt]
+    vocab_size = tokenizer.vocab_size
+    if vocab_size < (1 << 16):
+        rslt_ = [np.uint16(d) for d in rslt]
+    else:
+        rslt_ = [np.int32(d) for d in rslt]
     random.shuffle(rslt_)
     logger.info(f"Dump to {dp_file}")
     with open(dp_file, "wb") as handle:
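
The dtype branch in this change can be illustrated in isolation. The sketch below is not part of the commit: `pick_token_dtype` and `to_compact_ids` are hypothetical helper names, and the vocabulary sizes and token ids are made-up illustrative values. It assumes only NumPy, and it shows why an unconditional `uint16` cast is unsafe once token ids reach 65536: the value wraps around silently instead of raising.

```python
import numpy as np


def pick_token_dtype(vocab_size: int):
    """Hypothetical helper mirroring the branch added in this commit."""
    # uint16 holds values 0..65535, so it is used only when vocab_size < 1 << 16.
    return np.uint16 if vocab_size < (1 << 16) else np.int32


def to_compact_ids(token_ids, vocab_size: int) -> np.ndarray:
    """Store one sequence of token ids in the dtype chosen for the vocabulary."""
    ids = np.asarray(token_ids, dtype=np.int64)
    return ids.astype(pick_token_dtype(vocab_size))


# Illustrative values: a small vocabulary fits in 16 bits, a large one does not.
small = to_compact_ids([101, 2023, 102], vocab_size=30522)
large = to_compact_ids([0, 70000, 2], vocab_size=250002)

# What an unconditional uint16 cast does to an id >= 65536: it wraps modulo 65536.
wrapped = np.asarray([70000], dtype=np.int64).astype(np.uint16)

print(small.dtype, large.dtype, int(wrapped[0]))
```

The sketch casts through `astype` on an `int64` array so that the wrap-around is reproducible; constructing `np.uint16` directly from an out-of-range Python integer behaves differently across NumPy versions.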