Update README.txt (#8957)

Commit 7ccd973e authored Dec 08, 2020 by Nguyen Van Nha; committed by GitHub Dec 07, 2020

Showing with 18 additions and 1 deletion

model_cards/NlpHUST/vibert4news-base-cased/README.md +18 -1

--- a/model_cards/NlpHUST/vibert4news-base-cased/README.md
+++ b/model_cards/NlpHUST/vibert4news-base-cased/README.md
 ---
 language: vn
 ---
 # BERT for Vietnamese, trained on more than 20 GB of news data
 Applied to the sentiment analysis task using [AIViVN's comments dataset](https://www.aivivn.com/contests/6)
@@ -19,7 +18,24 @@ You can download trained model:
 - [tensorflow](https://drive.google.com/file/d/1X-sRDYf7moS_h61J3L79NkMVGHP-P-k5/view?usp=sharing).
 - [pytorch](https://drive.google.com/file/d/11aFSTpYIurn-oI2XpAmcCTccB_AonMOu/view?usp=sharing).
+Use with huggingface/transformers
+```python
+import torch
+from transformers import AutoModel, AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("NlpHUST/vibert4news-base-cased")
+bert_model = AutoModel.from_pretrained("NlpHUST/vibert4news-base-cased")
+
+line = "Tôi là sinh viên trường Bách Khoa Hà Nội ."
+# Encode the sentence to token ids, adding the special tokens
+input_id = tokenizer.encode(line, add_special_tokens=True)
+# Mask is 1 for real tokens and 0 for padding (assumes the pad token id is 0)
+att_mask = [int(token_id > 0) for token_id in input_id]
+input_ids = torch.tensor([input_id])
+att_masks = torch.tensor([att_mask])
+with torch.no_grad():
+    features = bert_model(input_ids, attention_mask=att_masks)
+print(features)
+```
 Run training with base config
@@ -36,3 +52,4 @@ python train_pytorch.py \
 ### Contact information
 For personal communication related to this project, please contact Nha Nguyen Van (nha282@gmail.com).
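
Review note on the usage snippet added above: the attention mask there is computed as `token_id > 0`, which only works if the padding token id is 0 and which makes no difference for a single unpadded sentence. When batching sentences of different lengths, the mask can instead be derived from each sequence's length. The following is a minimal sketch of that idea in plain Python; the helper name `pad_and_mask`, the default `pad_token_id=0`, and the token ids in the example are all hypothetical and are not taken from the model card or produced by the real tokenizer.

```python
from typing import List, Tuple


def pad_and_mask(
    sequences: List[List[int]], pad_token_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token-id sequences to equal length and build attention masks.

    pad_token_id=0 is an assumption; check the tokenizer's actual pad id.
    """
    max_len = max(len(seq) for seq in sequences)
    padded, masks = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append(seq + [pad_token_id] * n_pad)
        # 1 marks a real token, 0 marks padding. The mask comes from the
        # sequence length, not from comparing token ids against zero.
        masks.append([1] * len(seq) + [0] * n_pad)
    return padded, masks


# Hypothetical token ids for two sentences of different lengths
batch = [[101, 7, 8, 9, 102], [101, 7, 102]]
input_ids, att_masks = pad_and_mask(batch)
```

The two returned lists can be passed to `torch.tensor(...)` in the same way as `input_ids` and `att_masks` in the snippet from this commit.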