Evaluated on the SQuAD 2.0 English dev set with the [official eval script](https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/).
```
"exact": 79.45759285774446,
"f1": 83.79259828925511,
"total": 11873,
"HasAns_exact": 71.96356275303644,
"HasAns_f1": 80.6460053117963,
"HasAns_total": 5928,
"NoAns_exact": 86.93019343986543,
"NoAns_f1": 86.93019343986543,
"NoAns_total": 5945
```
Evaluated on German [MLQA: test-context-de-question-de.json](https://github.com/facebookresearch/MLQA)
```
"exact": 49.34691166703564,
"f1": 66.15582561674236,
"total": 4517,
```
Evaluated on German [XQuAD: xquad.de.json](https://github.com/deepmind/xquad)
For doing QA at scale (i.e. many docs instead of single paragraph), you can load the model also in [haystack](https://github.com/deepset-ai/haystack/):