# XLM-R + NER

This model is [XLM-Roberta-base](https://arxiv.org/abs/1911.02116) fine-tuned on the 40 languages proposed in [XTREME](https://github.com/google-research/xtreme), using NER data from [WikiAnn](https://aclweb.org/anthology/P17-1178). This is still a work in progress, and the results will be updated every time an improvement is made.

The covered labels are:
```
LOC
ORG
PER
O
```

## Metrics on evaluation set:
### Average over the 40 languages
Number of documents: 262300
```
           precision    recall  f1-score   support

      ORG       0.81      0.81      0.81    102452
      PER       0.90      0.91      0.91    108978
      LOC       0.86      0.89      0.87    121868

micro avg       0.86      0.87      0.87    333298
macro avg       0.86      0.87      0.87    333298
```
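To read the tables: the macro average is the unweighted mean of the per-class scores, while the micro average pools the raw counts over all classes before computing precision, recall, and F1. A minimal sketch with made-up counts (not the model's) illustrating the difference:

```python
# Hedged sketch: how micro and macro F1 are derived from per-class counts.
# The counts below are invented for illustration; they are not this model's.
def prf(tp, fp, fn):
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)

per_class = {          # class: (true positives, false positives, false negatives)
    "ORG": (80, 20, 20),
    "PER": (90, 10, 10),
    "LOC": (85, 15, 15),
}

# Macro average: compute F1 per class, then take the unweighted mean.
macro_f1 = sum(prf(*c)[2] for c in per_class.values()) / len(per_class)

# Micro average: pool the raw counts over all classes first.
tp, fp, fn = (sum(c[i] for c in per_class.values()) for i in range(3))
micro_f1 = prf(tp, fp, fn)[2]
print(round(macro_f1, 2), round(micro_f1, 2))
```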

### Afrikaans
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.89      0.88      0.88       582
      PER       0.89      0.97      0.93       369
      LOC       0.84      0.90      0.86       518

micro avg       0.87      0.91      0.89      1469
macro avg       0.87      0.91      0.89      1469
``` 

### Arabic
Number of documents: 10000
```
           precision    recall  f1-score   support

      ORG       0.83      0.84      0.84      3507
      PER       0.90      0.91      0.91      3643
      LOC       0.88      0.89      0.88      3604

micro avg       0.87      0.88      0.88     10754
macro avg       0.87      0.88      0.88     10754
```

### Basque
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.88      0.93      0.91      5228
      ORG       0.86      0.81      0.83      3654
      PER       0.91      0.91      0.91      4072

micro avg       0.89      0.89      0.89     12954
macro avg       0.89      0.89      0.89     12954
```

### Bengali
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.86      0.89      0.87       325
      LOC       0.91      0.91      0.91       406
      PER       0.96      0.95      0.95       364

micro avg       0.91      0.92      0.91      1095
macro avg       0.91      0.92      0.91      1095
```

### Bulgarian
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.86      0.83      0.84      3661
      PER       0.92      0.95      0.94      4006
      LOC       0.92      0.95      0.94      6449

micro avg       0.91      0.92      0.91     14116
macro avg       0.91      0.92      0.91     14116
```

### Burmese
Number of documents: 100
```
           precision    recall  f1-score   support

      LOC       0.60      0.86      0.71        37
      ORG       0.68      0.63      0.66        30
      PER       0.44      0.44      0.44        36

micro avg       0.57      0.65      0.61       103
macro avg       0.57      0.65      0.60       103
```

### Chinese
Number of documents: 10000
```
           precision    recall  f1-score   support

      ORG       0.70      0.69      0.70      4022
      LOC       0.76      0.81      0.78      3830
      PER       0.84      0.84      0.84      3706

micro avg       0.76      0.78      0.77     11558
macro avg       0.76      0.78      0.77     11558
```

### Dutch
Number of documents: 10000
```
           precision    recall  f1-score   support

      ORG       0.87      0.87      0.87      3930
      PER       0.95      0.95      0.95      4377
      LOC       0.91      0.92      0.91      4813

micro avg       0.91      0.92      0.91     13120
macro avg       0.91      0.92      0.91     13120
```

### English
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.83      0.84      0.84      4781
      PER       0.89      0.90      0.89      4559
      ORG       0.75      0.75      0.75      4633

micro avg       0.82      0.83      0.83     13973
macro avg       0.82      0.83      0.83     13973
```

### Estonian
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.89      0.92      0.91      5654
      ORG       0.85      0.85      0.85      3878
      PER       0.94      0.94      0.94      4026

micro avg       0.90      0.91      0.90     13558
macro avg       0.90      0.91      0.90     13558
```

### Finnish
Number of documents: 10000
```
           precision    recall  f1-score   support

      ORG       0.84      0.83      0.84      4104
      LOC       0.88      0.90      0.89      5307
      PER       0.95      0.94      0.94      4519

micro avg       0.89      0.89      0.89     13930
macro avg       0.89      0.89      0.89     13930
```

### French
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.90      0.89      0.89      4808
      ORG       0.84      0.87      0.85      3876
      PER       0.94      0.93      0.94      4249

micro avg       0.89      0.90      0.90     12933
macro avg       0.89      0.90      0.90     12933
```

### Georgian
Number of documents: 10000
```
           precision    recall  f1-score   support

      PER       0.90      0.91      0.90      3964
      ORG       0.83      0.77      0.80      3757
      LOC       0.82      0.88      0.85      4894

micro avg       0.84      0.86      0.85     12615
macro avg       0.84      0.86      0.85     12615
```

### German
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.85      0.90      0.87      4939
      PER       0.94      0.91      0.92      4452
      ORG       0.79      0.78      0.79      4247

micro avg       0.86      0.86      0.86     13638
macro avg       0.86      0.86      0.86     13638
```

### Greek
Number of documents: 10000
```
           precision    recall  f1-score   support

      ORG       0.86      0.85      0.85      3771
      LOC       0.88      0.91      0.90      4436
      PER       0.91      0.93      0.92      3894

micro avg       0.88      0.90      0.89     12101
macro avg       0.88      0.90      0.89     12101
```

### Hebrew
Number of documents: 10000
```
           precision    recall  f1-score   support

      PER       0.87      0.88      0.87      4206
      ORG       0.76      0.75      0.76      4190
      LOC       0.85      0.85      0.85      4538

micro avg       0.83      0.83      0.83     12934
macro avg       0.82      0.83      0.83     12934
```

### Hindi
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.78      0.81      0.79       362
      LOC       0.83      0.85      0.84       422
      PER       0.90      0.95      0.92       427

micro avg       0.84      0.87      0.85      1211
macro avg       0.84      0.87      0.85      1211
```

### Hungarian
Number of documents: 10000
```
           precision    recall  f1-score   support

      PER       0.95      0.95      0.95      4347
      ORG       0.87      0.88      0.87      3988
      LOC       0.90      0.92      0.91      5544

micro avg       0.91      0.92      0.91     13879
macro avg       0.91      0.92      0.91     13879
```

### Indonesian
Number of documents: 10000
```
           precision    recall  f1-score   support

      ORG       0.88      0.89      0.88      3735
      LOC       0.93      0.95      0.94      3694
      PER       0.93      0.93      0.93      3947

micro avg       0.91      0.92      0.92     11376
macro avg       0.91      0.92      0.92     11376
```

### Italian
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.88      0.88      0.88      4592
      ORG       0.86      0.86      0.86      4088
      PER       0.96      0.96      0.96      4732

micro avg       0.90      0.90      0.90     13412
macro avg       0.90      0.90      0.90     13412
```

### Japanese
Number of documents: 10000
```
           precision    recall  f1-score   support

      ORG       0.62      0.61      0.62      4184
      PER       0.76      0.81      0.78      3812
      LOC       0.68      0.74      0.71      4281

micro avg       0.69      0.72      0.70     12277
macro avg       0.69      0.72      0.70     12277
```

### Javanese
Number of documents: 100
```
           precision    recall  f1-score   support

      ORG       0.79      0.80      0.80        46
      PER       0.81      0.96      0.88        26
      LOC       0.75      0.75      0.75        40

micro avg       0.78      0.82      0.80       112
macro avg       0.78      0.82      0.80       112
```

### Kazakh
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.76      0.61      0.68       307
      LOC       0.78      0.90      0.84       461
      PER       0.87      0.91      0.89       367

micro avg       0.81      0.83      0.82      1135
macro avg       0.81      0.83      0.81      1135
```

### Korean
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.86      0.89      0.88      5097
      ORG       0.79      0.74      0.77      4218
      PER       0.83      0.86      0.84      4014

micro avg       0.83      0.83      0.83     13329
macro avg       0.83      0.83      0.83     13329
```

### Malay
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.87      0.89      0.88       368
      PER       0.92      0.91      0.91       366
      LOC       0.94      0.95      0.95       354

micro avg       0.91      0.92      0.91      1088
macro avg       0.91      0.92      0.91      1088
```

### Malayalam
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.75      0.74      0.75       347
      PER       0.84      0.89      0.86       417
      LOC       0.74      0.75      0.75       391

micro avg       0.78      0.80      0.79      1155
macro avg       0.78      0.80      0.79      1155
```

### Marathi
Number of documents: 1000
```
           precision    recall  f1-score   support

      PER       0.89      0.94      0.92       394
      LOC       0.82      0.84      0.83       457
      ORG       0.84      0.78      0.81       339

micro avg       0.85      0.86      0.85      1190
macro avg       0.85      0.86      0.85      1190
```

### Persian
Number of documents: 10000
```
           precision    recall  f1-score   support

      PER       0.93      0.92      0.93      3540
      LOC       0.93      0.93      0.93      3584
      ORG       0.89      0.92      0.90      3370

micro avg       0.92      0.92      0.92     10494
macro avg       0.92      0.92      0.92     10494
```

### Portuguese
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.90      0.91      0.91      4819
      PER       0.94      0.92      0.93      4184
      ORG       0.84      0.88      0.86      3670

micro avg       0.89      0.91      0.90     12673
macro avg       0.90      0.91      0.90     12673
```

### Russian
Number of documents: 10000
```
           precision    recall  f1-score   support

      PER       0.93      0.96      0.95      3574
      LOC       0.87      0.89      0.88      4619
      ORG       0.82      0.80      0.81      3858

micro avg       0.87      0.88      0.88     12051
macro avg       0.87      0.88      0.88     12051
```

### Spanish
Number of documents: 10000
```
           precision    recall  f1-score   support

      PER       0.95      0.93      0.94      3891
      ORG       0.86      0.88      0.87      3709
      LOC       0.89      0.91      0.90      4553

micro avg       0.90      0.91      0.90     12153
macro avg       0.90      0.91      0.90     12153
```

### Swahili
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.82      0.85      0.83       349
      PER       0.95      0.92      0.94       403
      LOC       0.86      0.89      0.88       450

micro avg       0.88      0.89      0.88      1202
macro avg       0.88      0.89      0.88      1202
```

### Tagalog
Number of documents: 1000
```
           precision    recall  f1-score   support

      LOC       0.90      0.91      0.90       338
      ORG       0.83      0.91      0.87       339
      PER       0.96      0.93      0.95       350

micro avg       0.90      0.92      0.91      1027
macro avg       0.90      0.92      0.91      1027
```

### Tamil
Number of documents: 1000
```
           precision    recall  f1-score   support

      PER       0.90      0.92      0.91       392
      ORG       0.77      0.76      0.76       370
      LOC       0.78      0.81      0.79       421

micro avg       0.82      0.83      0.82      1183
macro avg       0.82      0.83      0.82      1183
```

### Telugu
Number of documents: 1000
```
           precision    recall  f1-score   support

      ORG       0.67      0.55      0.61       347
      LOC       0.78      0.87      0.82       453
      PER       0.73      0.86      0.79       393

micro avg       0.74      0.77      0.76      1193
macro avg       0.73      0.77      0.75      1193
```

### Thai
Number of documents: 10000
```
           precision    recall  f1-score   support

      LOC       0.63      0.76      0.69      3928
      PER       0.78      0.83      0.80      6537
      ORG       0.59      0.59      0.59      4257

micro avg       0.68      0.74      0.71     14722
macro avg       0.68      0.74      0.71     14722
```

### Turkish
Number of documents: 10000
```
           precision    recall  f1-score   support

      PER       0.94      0.94      0.94      4337
      ORG       0.88      0.89      0.88      4094
      LOC       0.90      0.92      0.91      4929

micro avg       0.90      0.92      0.91     13360
macro avg       0.91      0.92      0.91     13360
```

### Urdu
Number of documents: 1000
```
           precision    recall  f1-score   support

      LOC       0.90      0.95      0.93       352
      PER       0.96      0.96      0.96       333
      ORG       0.91      0.90      0.90       326

micro avg       0.92      0.94      0.93      1011
macro avg       0.92      0.94      0.93      1011
```

### Vietnamese
Number of documents: 10000
```
           precision    recall  f1-score   support

      ORG       0.86      0.87      0.86      3579
      LOC       0.88      0.91      0.90      3811
      PER       0.92      0.93      0.93      3717

micro avg       0.89      0.90      0.90     11107
macro avg       0.89      0.90      0.90     11107
```

### Yoruba
Number of documents: 100
```
           precision    recall  f1-score   support

      LOC       0.54      0.72      0.62        36
      ORG       0.58      0.31      0.41        35
      PER       0.77      1.00      0.87        36

micro avg       0.64      0.68      0.66       107
macro avg       0.63      0.68      0.63       107
```

## Reproduce the results
Download and prepare the dataset from the [XTREME repo](https://github.com/google-research/xtreme#download-the-data). Next, from the root of the transformers repo run:
```
cd examples/ner
python run_tf_ner.py \
--data_dir . \
--labels ./labels.txt \
--model_name_or_path jplu/tf-xlm-roberta-base \
--output_dir model \
--max_seq_length 128 \
--num_train_epochs 2 \
--per_gpu_train_batch_size 16 \
--per_gpu_eval_batch_size 32 \
--do_train \
--do_eval \
--logging_dir logs \
--mode token-classification \
--evaluate_during_training \
--optimizer_name adamw
```
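The command above also expects a `./labels.txt` file with one label per line. A minimal sketch that writes the four labels this card lists; note that if the training data uses IOB2 prefixes (e.g. `B-LOC`/`I-LOC`), the file would need those variants instead — this label set is an assumption based on the list above:

```shell
# Write the four labels listed in this card, one per line, to labels.txt.
cat > labels.txt <<'EOF'
O
LOC
ORG
PER
EOF
```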

## Usage with pipelines
```python
from transformers import pipeline

nlp_ner = pipeline(
    "ner",
    model="jplu/tf-xlm-r-ner-40-lang",
    tokenizer=(
        'jplu/tf-xlm-r-ner-40-lang',
        {"use_fast": True}),
    framework="tf"
)

text_fr = "Barack Obama est né à Hawaï."
text_en = "Barack Obama was born in Hawaii."
text_es = "Barack Obama nació en Hawai."
text_zh = "巴拉克·奧巴馬(Barack Obama)出生於夏威夷。"
text_ar = "ولد باراك أوباما في هاواي."

nlp_ner(text_fr)
#Output: [{'word': '▁Barack', 'score': 0.9894659519195557, 'entity': 'PER'}, {'word': '▁Obama', 'score': 0.9888848662376404, 'entity': 'PER'}, {'word': '▁Hawa', 'score': 0.998701810836792, 'entity': 'LOC'}, {'word': 'ï', 'score': 0.9987035989761353, 'entity': 'LOC'}]
nlp_ner(text_en)
#Output: [{'word': '▁Barack', 'score': 0.9929141998291016, 'entity': 'PER'}, {'word': '▁Obama', 'score': 0.9930834174156189, 'entity': 'PER'}, {'word': '▁Hawaii', 'score': 0.9986202120780945, 'entity': 'LOC'}]
nlp_ner(text_es)
#Output: [{'word': '▁Barack', 'score': 0.9944776296615601, 'entity': 'PER'}, {'word': '▁Obama', 'score': 0.9949177503585815, 'entity': 'PER'}, {'word': '▁Hawa', 'score': 0.9987911581993103, 'entity': 'LOC'}, {'word': 'i', 'score': 0.9984861612319946, 'entity': 'LOC'}]
nlp_ner(text_zh)
#Output: [{'word': '夏威夷', 'score': 0.9988449215888977, 'entity': 'LOC'}]
nlp_ner(text_ar)
#Output: [{'word': '▁با', 'score': 0.9903655648231506, 'entity': 'PER'}, {'word': 'راك', 'score': 0.9850614666938782, 'entity': 'PER'}, {'word': '▁أوباما', 'score': 0.9850308299064636, 'entity': 'PER'}, {'word': '▁ها', 'score': 0.9477543234825134, 'entity': 'LOC'}, {'word': 'وا', 'score': 0.9428229928016663, 'entity': 'LOC'}, {'word': 'ي', 'score': 0.9319471716880798, 'entity': 'LOC'}]

```
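As the outputs above show, the pipeline returns one prediction per SentencePiece word piece (e.g. `▁Hawa` + `ï`), not per entity. A minimal sketch of merging adjacent pieces that share a label into whole entities; the grouping rule (pieces starting with `▁` begin a new word, others continue the previous one) is an assumption about the tokenizer's convention, not part of this card:

```python
# Hedged sketch: merge consecutive word-piece predictions with the same
# entity label into whole entity strings, stripping the "▁" word marker.
def group_entities(predictions):
    groups = []  # list of (text, label) tuples
    for p in predictions:
        piece, label = p["word"], p["entity"]
        starts_word = piece.startswith("▁")
        text = piece.lstrip("▁")
        if groups and groups[-1][1] == label:
            # Same entity: join with a space for a new word, directly otherwise.
            sep = " " if starts_word else ""
            groups[-1] = (groups[-1][0] + sep + text, label)
        else:
            groups.append((text, label))
    return groups

# Example: the French output from above (scores omitted for brevity).
preds = [
    {"word": "▁Barack", "entity": "PER"},
    {"word": "▁Obama", "entity": "PER"},
    {"word": "▁Hawa", "entity": "LOC"},
    {"word": "ï", "entity": "LOC"},
]
print(group_entities(preds))  # [('Barack Obama', 'PER'), ('Hawaï', 'LOC')]
```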