chenpangpang / transformers
Commit 25895056
Authored Jun 23, 2020 by furunkel; committed by GitHub Jun 22, 2020
Add model card for StackOBERTflow-comments-small (#5008)
* Create README.md
* Update README.md
Parent: d8c26ed1

1 changed file with 39 additions and 0 deletions: `model_cards/giganticode/StackOBERTflow-comments-small-v1/README.md` (new file, mode 100644)
# StackOBERTflow-comments-small
StackOBERTflow is a RoBERTa model trained on StackOverflow comments.
A byte-level BPE tokenizer with dropout was used (via the `tokenizers` package).

The model is *small*, i.e. it has only 6 layers, and the maximum sequence length was restricted to 256 tokens.
The model was trained for 6 epochs on several GBs of comments from the StackOverflow corpus.
## Quick start: masked language modeling prediction
```python
from transformers import pipeline
from pprint import pprint

COMMENT = "You really should not do it this way, I would use <mask> instead."

fill_mask = pipeline(
    "fill-mask",
    model="giganticode/StackOBERTflow-comments-small-v1",
    tokenizer="giganticode/StackOBERTflow-comments-small-v1"
)

pprint(fill_mask(COMMENT))

# [{'score': 0.019997311756014824,
#   'sequence': '<s> You really should not do it this way, I would use jQuery instead.</s>',
#   'token': 1738},
#  {'score': 0.01693696901202202,
#   'sequence': '<s> You really should not do it this way, I would use arrays instead.</s>',
#   'token': 2844},
#  {'score': 0.013411642983555794,
#   'sequence': '<s> You really should not do it this way, I would use CSS instead.</s>',
#   'token': 2254},
#  {'score': 0.013224546797573566,
#   'sequence': '<s> You really should not do it this way, I would use it instead.</s>',
#   'token': 300},
#  {'score': 0.011984303593635559,
#   'sequence': '<s> You really should not do it this way, I would use classes instead.</s>',
#   'token': 1779}]
```
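The raw pipeline output shown above is a list of dictionaries with `score`, `sequence` and `token` keys. As a small sketch of how it might be tidied up for display, the hypothetical helper `summarize_fills` below strips the `<s>` and `</s>` markers and sorts by score. It runs on two entries copied from the sample output, so it does not need to download the model; the exact keys and the presence of special tokens may differ between `transformers` versions.

```python
def summarize_fills(results, specials=("<s>", "</s>")):
    """Return (sequence, score) pairs, highest score first, with special tokens removed."""
    rows = []
    for item in results:
        text = item["sequence"]
        for tok in specials:
            text = text.replace(tok, "")
        rows.append((text.strip(), item["score"]))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Two entries copied from the sample output above, deliberately out of order.
sample = [
    {"score": 0.01693696901202202,
     "sequence": "<s> You really should not do it this way, I would use arrays instead.</s>",
     "token": 2844},
    {"score": 0.019997311756014824,
     "sequence": "<s> You really should not do it this way, I would use jQuery instead.</s>",
     "token": 1738},
]

for text, score in summarize_fills(sample):
    print(f"{score:.4f}  {text}")
```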