Commit 769e6ba0 (chenpangpang/transformers), parent fd347e0d
Authored Jul 28, 2020 by Ramsri Goutham Golla; committed by GitHub, Jul 27, 2020

Create README.md (#6032)
Adding model card - readme

model_cards/ramsrigouthamg/t5_paraphraser/README.md (new file, 84 additions, 0 deletions)

## Model in Action 🚀
```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer


def set_seed(seed):
    # Seed the CPU and (if present) GPU generators so sampling is repeatable
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)


set_seed(42)

model = T5ForConditionalGeneration.from_pretrained('ramsrigouthamg/t5_paraphraser')
tokenizer = T5Tokenizer.from_pretrained('ramsrigouthamg/t5_paraphraser')

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("device ", device)
model = model.to(device)

sentence = "Which course should I take to get started in data science?"
# sentence = "What are the ingredients required to bake a perfect cake?"
# sentence = "What is the best possible approach to learn aeronautical engineering?"
# sentence = "Do apples taste better than oranges in general?"

# The model expects the "paraphrase: " task prefix and an end-of-sequence token
text = "paraphrase: " + sentence + " </s>"

max_len = 256

encoding = tokenizer.encode_plus(text, pad_to_max_length=True, return_tensors="pt")
input_ids, attention_masks = encoding["input_ids"].to(device), encoding["attention_mask"].to(device)

# Sample with top_k = 120 and top_p = 0.98, returning 10 candidate sequences
beam_outputs = model.generate(
    input_ids=input_ids, attention_mask=attention_masks,
    do_sample=True,
    max_length=max_len,
    top_k=120,
    top_p=0.98,
    early_stopping=True,
    num_return_sequences=10
)

print("\nOriginal Question ::")
print(sentence)
print("\n")
print("Paraphrased Questions :: ")

# Keep each candidate once, skipping any that merely repeat the input sentence
final_outputs = []
for beam_output in beam_outputs:
    sent = tokenizer.decode(beam_output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    if sent.lower() != sentence.lower() and sent not in final_outputs:
        final_outputs.append(sent)

for i, final_output in enumerate(final_outputs):
    print("{}: {}".format(i, final_output))
```
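
The filtering loop at the end of the script can also be tried on its own, without loading the model. The sketch below is only an illustration: the helper name `filter_paraphrases` and the sample candidate strings are assumptions, not part of the model card. It also compares candidates to each other case-insensitively, which is slightly stricter than the script above (the script ignores case only when comparing against the original sentence).

```python
def filter_paraphrases(candidates, original):
    """Drop candidates that repeat the original sentence or an earlier candidate.

    Hypothetical helper: comparison ignores case and surrounding whitespace.
    """
    kept = []
    seen = {original.strip().lower()}
    for candidate in candidates:
        key = candidate.strip().lower()
        if key not in seen:
            seen.add(key)
            kept.append(candidate)
    return kept


# Made-up candidates standing in for decoded model outputs
original = "Which course should I take to get started in data science?"
candidates = [
    "How do I get started with data science?",
    "Which course should I take to get started in data science?",
    "how do i get started with data science?",
    "How can I start learning data science?",
]

result = filter_paraphrases(candidates, original)
for i, paraphrase in enumerate(result):
    print("{}: {}".format(i, paraphrase))
```

Because sampling with `num_return_sequences=10` can return the same sentence more than once, a filter like this may leave fewer than 10 paraphrases.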
## Output
```
Original Question ::
Which course should I take to get started in data science?
Paraphrased Questions ::
0: What should I learn to become a data scientist?
1: How do I get started with data science?
2: How would you start a data science career?
3: How can I start learning data science?
4: How do you get started in data science?
5: What's the best course for data science?
6: Which course should I start with for data science?
7: What courses should I follow to get started in data science?
8: What degree should be taken by a data scientist?
9: Which course should I follow to become a Data Scientist?
```
## Detailed blog post available here:
https://towardsdatascience.com/paraphrase-any-question-with-t5-text-to-text-transfer-transformer-pretrained-model-and-cbb9e35f1555