chenpangpang / transformers / Commits / 3e20c2e8
Commit 3e20c2e8
Authored Nov 12, 2019 by Louis MARTIN
Committed by Julien Chaumond, Nov 16, 2019

Update demo_camembert.py with new classes

parent f12e4d8d
Showing 1 changed file with 3 additions and 14 deletions

examples/contrib/demo_camembert.py (+3 -14)
@@ -5,7 +5,7 @@ import urllib.request
 import torch

 from transformers.tokenization_camembert import CamembertTokenizer
-from transformers.modeling_roberta import RobertaForMaskedLM
+from transformers.modeling_camembert import CamembertForMaskedLM


 def fill_mask(masked_input, model, tokenizer, topk=5):
@@ -40,19 +40,8 @@ def fill_mask(masked_input, model, tokenizer, topk=5):
     return topk_filled_outputs


-model_path = Path('camembert.v0.pytorch')
-if not model_path.exists():
-    compressed_path = model_path.with_suffix('.tar.gz')
-    url = 'http://dl.fbaipublicfiles.com/camembert/camembert.v0.pytorch.tar.gz'
-    print('Downloading model...')
-    urllib.request.urlretrieve(url, compressed_path)
-    print('Extracting model...')
-    with tarfile.open(compressed_path) as f:
-        f.extractall(model_path.parent)
-    assert model_path.exists()
-tokenizer_path = model_path / 'sentencepiece.bpe.model'
-tokenizer = CamembertTokenizer.from_pretrained(tokenizer_path)
-model = RobertaForMaskedLM.from_pretrained(model_path)
+tokenizer = CamembertTokenizer.from_pretrained('camembert-base')
+model = CamembertForMaskedLM.from_pretrained('camembert-base')
 model.eval()

 masked_input = "Le camembert est <mask> :)"