Commit 0ed7c00b authored Aug 13, 2020 by cedspam, committed by GitHub (unverified)
Update README.md (#6435)
parent e983da0e

Showing 1 changed file with 39 additions and 0 deletions.

model_cards/cedpsam/chatbot_fr/README.md
---
language: fr
tags:
- conversational
widget:
- text: "bonjour."
- text: "mais encore"
- text: "est ce que l'argent achete le bonheur?"
---
## A DialoGPT model trained on French OpenSubtitles with a custom tokenizer

Trained with this notebook:
https://colab.research.google.com/drive/1pfCV3bngAmISNZVfDvBMyEhQKuYw37Rl#scrollTo=AyImj9qZYLRi&uniqifier=3

Config taken from microsoft/DialoGPT-medium.
### How to use
Now we are ready to try out how the model works as a chatting partner!
```python
import torch
from transformers import AutoTokenizer, AutoModelWithLMHead

tokenizer = AutoTokenizer.from_pretrained("cedpsam/chatbot_fr")
model = AutoModelWithLMHead.from_pretrained("cedpsam/chatbot_fr")

# chat for 6 turns
for step in range(6):
    # encode the new user input, add the eos_token and return a tensor in PyTorch
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt')

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # generate a response while limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(
        bot_input_ids, max_length=1000,
        pad_token_id=tokenizer.eos_token_id,
        top_p=0.92, top_k=50
    )

    # pretty print last output tokens from bot
    print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
```