Commit 7ccd973e
Authored Dec 08, 2020 by Nguyen Van Nha; committed by GitHub, Dec 07, 2020
Update README.txt (#8957)
Parent: 37f4c24f
Showing 1 changed file with 18 additions and 1 deletion

model_cards/NlpHUST/vibert4news-base-cased/README.md
---
language: vn
---
# BERT for Vietnamese, trained on more than 20 GB of news data
Applied to the sentiment analysis task using [AIViVN's comments dataset](https://www.aivivn.com/contests/6)
@@ -19,7 +18,24 @@ You can download trained model:
- [tensorflow](https://drive.google.com/file/d/1X-sRDYf7moS_h61J3L79NkMVGHP-P-k5/view?usp=sharing).
- [pytorch](https://drive.google.com/file/d/11aFSTpYIurn-oI2XpAmcCTccB_AonMOu/view?usp=sharing).
Use with huggingface/transformers
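```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and the pretrained encoder from the model hub.
tokenizer = AutoTokenizer.from_pretrained("NlpHUST/vibert4news-base-cased")
bert_model = AutoModel.from_pretrained("NlpHUST/vibert4news-base-cased")

line = "Tôi là sinh viên trường Bách Khoa Hà Nội ."

# Encode the sentence to token ids, including the special tokens.
input_id = tokenizer.encode(line, add_special_tokens=True)
# Attend to every token whose id is non-zero (id 0 is treated as padding).
att_mask = [int(token_id > 0) for token_id in input_id]

# Add a batch dimension of size 1.
input_ids = torch.tensor([input_id])
att_masks = torch.tensor([att_mask])

# Run the encoder without tracking gradients (inference only).
with torch.no_grad():
    features = bert_model(input_ids, attention_mask=att_masks)
print(features)
```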
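The snippet above builds the attention mask for a single sentence by treating token id 0 as padding. When several sentences of different lengths are batched together, the ids have to be padded to a common length and the mask has to mark the padded positions. A minimal sketch of that step in plain Python follows; the helper name `pad_batch`, the example token ids, and the default `pad_id=0` are illustrative assumptions, not part of the model card (check `tokenizer.pad_token_id` for the real padding id).

```python
from typing import List, Tuple


def pad_batch(
    sequences: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token-id lists to a common length and build matching masks.

    pad_id=0 is an assumption that mirrors the `token_id > 0` check above.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, att_masks = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Real tokens keep their ids; padded positions get pad_id.
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding.
        att_masks.append([1] * len(seq) + [0] * n_pad)
    return input_ids, att_masks


# Hypothetical token ids standing in for tokenizer.encode(...) output.
batch_ids, batch_masks = pad_batch([[2, 15, 27, 3], [2, 42, 3]])
print(batch_ids)
print(batch_masks)
```

The two returned lists can be wrapped with `torch.tensor(...)` and passed to the model in the same way as `input_ids` and `att_masks` above.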
Run training with base config
@@ -36,3 +52,4 @@ python train_pytorch.py \
### Contact information
For personal communication related to this project, please contact Nha Nguyen Van (nha282@gmail.com).