chenpangpang / transformers
Commit 0eecacea (Unverified)
BR_BERTo model card (#6793)
Authored Aug 30, 2020 by Rodolfo De Nadai; committed by GitHub, Aug 30, 2020
Parent: d176aaad
Showing 1 changed file with 9 additions and 5 deletions
model_cards/rdenadai/BR_BERTo/README.md (+9 -5)
@@ -14,13 +14,17 @@ Portuguese (Brazil) model for text inference.
 ## Params

-Trained on a corpus of 5_258_624 sentences, with 132_807_374 non unique tokens (992_418 unique tokens).
+Trained on a corpus of 6_993_330 sentences.

-- Vocab size: 220_000
-- RobertaForMaskedLM size : 32
-- Num train epochs: 2
-- Time to train: ~23hs (on GCP with a Nvidia T4)
+- Vocab size: 150_000
+- RobertaForMaskedLM size : 512
+- Num train epochs: 3
+- Time to train: ~10days (on GCP with a Nvidia T4)

 I follow the great tutorial from HuggingFace team:

 [How to train a new language model from scratch using Transformers and Tokenizers](https://huggingface.co/blog/how-to-train)
+
+More infor here:
+
+[BR_BERTo](https://github.com/rdenadai/BR-BERTo)
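For a quick sense of how much the stated parameters moved in this commit, the numeric values from the diff can be compared directly. This is only a sketch: the numbers are copied from the model card, but the dictionary keys (`sentences`, `vocab_size`, `model_size`, `epochs`) and the helper `param_changes` are hypothetical names chosen for illustration, and the card does not say what "RobertaForMaskedLM size" measures. The training time (~23hs to ~10days) is left out because it is not a plain number.

```python
# Values copied from the model card diff; the key names are illustrative assumptions.
old = {"sentences": 5_258_624, "vocab_size": 220_000, "model_size": 32, "epochs": 2}
new = {"sentences": 6_993_330, "vocab_size": 150_000, "model_size": 512, "epochs": 3}


def param_changes(before, after):
    """Return {name: (before, after, after - before)} for every key whose value changed."""
    return {
        key: (before[key], after[key], after[key] - before[key])
        for key in before
        if before[key] != after[key]
    }


changes = param_changes(old, new)
for name, (b, a, delta) in changes.items():
    # The "_" grouping option prints numbers in the same style the model card uses.
    print(f"{name}: {b:_} -> {a:_} ({delta:+_})")
```

Read this way, the corpus grew by 1_734_706 sentences while the vocabulary shrank by 70_000 entries.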