# BERT-large inference

### Inference with BERT on SQuAD 1.0

The [`run_qa.py`](https://github.com/huggingface/transformers/blob/main/examples/pytorch/question-answering/run_qa.py) script
allows you to fine-tune any model from our [hub](https://huggingface.co/models) (as long as its architecture has a `ForQuestionAnswering` version in the library) on a question-answering dataset: SQuAD, any other QA dataset available in the `datasets` library, or your own CSV/JSON Lines files, as long as they are structured the same way as SQuAD. You might need to tweak the data processing inside the script if your data is structured differently.
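
For custom files, the structure in question is SQuAD's: each record carries a context, a question, and answers given as text plus a character offset into the context. A minimal sketch of one such record (the sample text and id are made up for illustration):

```python
import json

# One SQuAD-style example: "answer_start" is the character offset of the
# answer text inside "context".
example = {
    "id": "qid-0001",
    "title": "Example",
    "context": "BERT was introduced by researchers at Google in 2018.",
    "question": "Who introduced BERT?",
    "answers": {"text": ["researchers at Google"], "answer_start": [23]},
}

# Sanity-check that each answer_start actually points at its answer text.
for text, start in zip(example["answers"]["text"],
                       example["answers"]["answer_start"]):
    assert example["context"][start:start + len(text)] == text

# For jsonlines input, each line of the file is one such JSON object.
line = json.dumps(example)
print(line)
```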

**Note:** This script only works with models that have a fast tokenizer (backed by the 🤗 Tokenizers library) as it
uses special features of those tokenizers. You can check if your favorite model has a fast tokenizer in
[this table](https://huggingface.co/transformers/index.html#supported-frameworks); if it doesn't, you can still use the old version of the script, which can be found [here](https://github.com/huggingface/transformers/tree/main/examples/legacy/question-answering).

Note that if your dataset contains samples with no possible answers (like SQuAD version 2), you need to pass along the flag `--version_2_with_negative`.

- [train-v1.1.json](https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json)
- [dev-v1.1.json](https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json)
- [evaluate-v1.1.py](https://github.com/allenai/bi-att-flow/blob/master/squad/evaluate-v1.1.py)
- This fine-tuned model is available as a checkpoint under the reference [`bert-large-uncased-whole-word-masking-finetuned-squad`](https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad).
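
The `evaluate-v1.1.py` script linked above scores predictions with exact match (EM) and token-level F1 after normalizing answers (lowercasing, stripping punctuation and the articles a/an/the). A condensed stdlib sketch of that scoring, not the official script:

```python
import re
import string
from collections import Counter

def normalize_answer(s):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction, ground_truth):
    return normalize_answer(prediction) == normalize_answer(ground_truth)

def f1_score(prediction, ground_truth):
    """Token-overlap F1 between normalized prediction and ground truth."""
    pred_tokens = normalize_answer(prediction).split()
    gt_tokens = normalize_answer(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gt_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gt_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))            # True
print(round(f1_score("Eiffel Tower in Paris", "the Eiffel Tower"), 2))  # 0.67
```

The official script additionally takes the max over all ground-truth answers for each question and averages over the dataset.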

The following command runs BERT inference on the SQuAD 1.0 dataset.

```bash
python /nx/transformers/examples/pytorch/question-answering/run_qa.py \
  --model_name_or_path /models/google-bert/bert-large-uncased-whole-word-masking-finetuned-squad \
  --dataset_name squad \
  --do_eval \
  --per_device_train_batch_size 12 \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /nx/transformers/debug_squad/
```
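
With `--max_seq_length 384` and `--doc_stride 128`, contexts too long for one sequence are split into overlapping windows: consecutive windows share `doc_stride` tokens, so an answer cut off at one window boundary still appears whole in the next. The real script does this via the fast tokenizer's `return_overflowing_tokens`/`stride` options; a stdlib sketch of the idea over a plain token list (ignoring the tokens the question itself consumes):

```python
def sliding_windows(tokens, max_len, overlap):
    """Split tokens into windows of at most max_len; consecutive
    windows share `overlap` tokens."""
    step = max_len - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
    return windows

# 1000 fake context tokens, chunked with the command's settings.
tokens = list(range(1000))
wins = sliding_windows(tokens, max_len=384, overlap=128)
print(len(wins))                          # 4 windows cover all 1000 tokens
print(len(set(wins[0]) & set(wins[1])))   # 128 tokens shared between neighbors
```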

### Multi-GPU inference

```bash
bash inference_SQuAD1.0.sh
```
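
The contents of `inference_SQuAD1.0.sh` are not reproduced here. Conceptually, multi-GPU inference shards the evaluation set so each process scores a disjoint slice; a stdlib sketch of the round-robin split that a `DistributedSampler`-style sharder performs (names are illustrative, and the real sampler also pads shards to equal length):

```python
def shard(examples, rank, world_size):
    """Round-robin shard: process `rank` of `world_size` takes every
    world_size-th example starting from its own rank."""
    return examples[rank::world_size]

examples = [f"qid-{i}" for i in range(10)]
world_size = 4  # e.g. 4 GPUs, one process per GPU

shards = [shard(examples, r, world_size) for r in range(world_size)]
# Every example lands in exactly one shard.
assert sorted(sum(shards, [])) == sorted(examples)
print(shards[0])  # ['qid-0', 'qid-4', 'qid-8']
```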