"packaging/git@developer.sourcefind.cn:OpenDAS/vision.git" did not exist on "d2460a75de237cfef8e5c3415f7bb0ad8467c0e5"
Commit b651b000 authored by Nathan Ng, committed by Facebook GitHub Bot

Wmt19 models (#767)

Summary:
Release of the WMT 19 pretrained models
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/767

Reviewed By: edunov

Differential Revision: D16472717

Pulled By: nng555

fbshipit-source-id: acf0fa3548c33f2bf2b5f71e551c782ad8c31a42
# WMT 19
This page provides pointers to the models of Facebook-FAIR's WMT'19 news translation task submission [(Ng et al., 2019)](https://arxiv.org/abs/1907.06616).
## Pre-trained models
Description | Model
---|---
En->De Ensemble | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/wmt19.en-de.joined-dict.ensemble.tar.bz2)
De->En Ensemble | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/wmt19.de-en.joined-dict.ensemble.tar.bz2)
En->Ru Ensemble | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/wmt19.en-ru.ensemble.tar.bz2)
Ru->En Ensemble | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/wmt19.ru-en.ensemble.tar.bz2)
En LM | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/lm/wmt19.en.tar.bz2)
De LM | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/lm/wmt19.de.tar.bz2)
Ru LM | [download (.tar.bz2)](https://dl.fbaipublicfiles.com/fairseq/models/lm/wmt19.ru.tar.bz2)
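
Each archive bundles the checkpoints, the dictionaries, and the BPE codes. If you prefer working from a local download rather than torch.hub, something like the following should work; this is a minimal sketch, and the extracted directory name is an assumption based on the archive names above:

```python
# Minimal sketch (not from the release notes): load a locally extracted
# WMT'19 ensemble with fairseq's from_pretrained helper.
# 'wmt19.en-de.joined-dict.ensemble' is assumed to be the directory produced
# by unpacking wmt19.en-de.joined-dict.ensemble.tar.bz2 -- adjust as needed.
from fairseq.models.transformer import TransformerModel

en2de = TransformerModel.from_pretrained(
    'wmt19.en-de.joined-dict.ensemble',
    checkpoint_file='model1.pt:model2.pt:model3.pt:model4.pt',
    data_name_or_path='.',  # dictionaries and BPE codes sit next to the checkpoints
    tokenizer='moses',
    bpe='fastbpe',
)
```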
## Example usage (torch.hub)
```python
>>> import torch
>>> en2de = torch.hub.load(
... 'pytorch/fairseq',
... 'transformer.wmt19.en-de',
... checkpoint_file='model1.pt:model2.pt:model3.pt:model4.pt',
... tokenizer='moses',
... bpe='fastbpe',
... )
>>> en2de.generate("Machine learning is great!")
'Maschinelles Lernen ist großartig!'
>>> de2en = torch.hub.load(
... 'pytorch/fairseq',
... 'transformer.wmt19.de-en',
... checkpoint_file='model1.pt:model2.pt:model3.pt:model4.pt',
... tokenizer='moses',
... bpe='fastbpe',
... )
>>> de2en.generate("Maschinelles Lernen ist großartig!")
'Machine learning is great!'
>>> en2ru = torch.hub.load(
... 'pytorch/fairseq',
... 'transformer.wmt19.en-ru',
... checkpoint_file='model1.pt:model2.pt:model3.pt:model4.pt',
... tokenizer='moses',
... bpe='fastbpe',
... )
>>> en2ru.generate("Machine learning is great!")
'Машинное обучение - это здорово!'
>>> ru2en = torch.hub.load(
... 'pytorch/fairseq',
... 'transformer.wmt19.ru-en',
... checkpoint_file='model1.pt:model2.pt:model3.pt:model4.pt',
... tokenizer='moses',
... bpe='fastbpe',
... )
>>> ru2en.generate("Машинное обучение - это здорово!")
'Machine learning is great!'
>>> en_lm = torch.hub.load(
... 'pytorch/fairseq',
... 'transformer_lm.wmt19.en',
... tokenizer='moses',
... bpe='fastbpe',
... )
>>> en_lm.generate("Machine learning is")
'Machine learning is the future of computing, says Microsoft boss Satya Nadella ...'
>>> de_lm = torch.hub.load(
... 'pytorch/fairseq',
... 'transformer_lm.wmt19.de',
... tokenizer='moses',
... bpe='fastbpe',
... )
>>> de_lm.generate("Maschinelles lernen ist")
'Maschinelles lernen ist das A und O (neues-deutschland.de) Die Arbeitsbedingungen für Lehrerinnen und Lehrer sind seit Jahren verbesserungswürdig ...'
>>> ru_lm = torch.hub.load(
... 'pytorch/fairseq',
... 'transformer_lm.wmt19.ru',
... tokenizer='moses',
... bpe='fastbpe',
... )
>>> ru_lm.generate("машинное обучение это")
'машинное обучение это то, что мы называем "искусственным интеллектом".'
```
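
torch.hub caches each downloaded archive, so the models are only fetched once. The full four-model ensembles are large; for quick experiments you can load a single member of an ensemble, and because the returned hub interface is a regular `nn.Module` it can be moved to GPU. A small sketch (assuming torch.hub can reach the `pytorch/fairseq` repository):

```python
import torch

# List the model names fairseq exposes through torch.hub (standard torch.hub API);
# the WMT'19 entries above should appear among them.
print(torch.hub.list('pytorch/fairseq'))

# Load only the first checkpoint of the En->De ensemble for faster iteration.
en2de = torch.hub.load(
    'pytorch/fairseq',
    'transformer.wmt19.en-de',
    checkpoint_file='model1.pt',
    tokenizer='moses',
    bpe='fastbpe',
)
en2de.cuda()  # the hub interface is an nn.Module, so it can run on GPU
```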
## Citation
```bibtex
@inproceedings{ng2019facebook,
  title = {Facebook FAIR's WMT19 News Translation Task Submission},
  author = {Ng, Nathan and Yee, Kyra and Baevski, Alexei and Ott, Myle and Auli, Michael and Edunov, Sergey},
  booktitle = {Conference of the Association for Computational Linguistics (ACL)},
  year = 2019,
}
```
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

from fairseq import file_utils
from fairseq.data.encoders import register_bpe


@register_bpe('fastbpe')
class fastBPE(object):

    @staticmethod
    def add_args(parser):
        # fmt: off
        parser.add_argument('--bpe-codes', type=str,
                            help='path to fastBPE BPE codes file')
        # fmt: on

    def __init__(self, args):
        if args.bpe_codes is None:
            raise ValueError('--bpe-codes is required for --bpe=fastbpe')
        codes = file_utils.cached_path(args.bpe_codes)
        try:
            import fastBPE
            self.bpe = fastBPE.fastBPE(codes)
            self.bpe_symbol = "@@ "
        except ImportError:
            raise ImportError('Please install fastBPE from https://github.com/glample/fastBPE')

    def encode(self, x: str) -> str:
        # Apply BPE to a single (already tokenized) sentence.
        return self.bpe.apply([x])[0]

    def decode(self, x: str) -> str:
        # Undo BPE by stripping the continuation markers ("@@ ").
        return (x + ' ').replace(self.bpe_symbol, '').rstrip()
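
A quick usage sketch for the encoder above, separate from the module itself; the codes path is a placeholder for the `bpecodes` file shipped with one of the pretrained archives, and the segmented output shown in the comment is only illustrative:

```python
from argparse import Namespace
from fairseq.data.encoders.fastbpe import fastBPE  # the class defined above

# Placeholder path -- point this at a real fastBPE codes file.
bpe = fastBPE(Namespace(bpe_codes='wmt19.en-de.joined-dict.ensemble/bpecodes'))

tokenized = "Machine learning is great !"   # Moses-tokenized input
segmented = bpe.encode(tokenized)           # e.g. "Mach@@ ine learning is great !"
restored = bpe.decode(segmented)            # strips the "@@ " continuation markers
assert restored == tokenized
```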
@@ -53,6 +53,10 @@ class TransformerModel(FairseqEncoderDecoderModel):
            'transformer.wmt14.en-fr': 'https://dl.fbaipublicfiles.com/fairseq/models/wmt14.en-fr.joined-dict.transformer.tar.bz2',
            'transformer.wmt16.en-de': 'https://dl.fbaipublicfiles.com/fairseq/models/wmt16.en-de.joined-dict.transformer.tar.bz2',
            'transformer.wmt18.en-de': 'https://dl.fbaipublicfiles.com/fairseq/models/wmt18.en-de.ensemble.tar.gz',
+           'transformer.wmt19.en-de': 'https://dl.fbaipublicfiles.com/fairseq/models/wmt19.en-de.joined-dict.ensemble.tar.bz2',
+           'transformer.wmt19.en-ru': 'https://dl.fbaipublicfiles.com/fairseq/models/wmt19.en-ru.ensemble.tar.bz2',
+           'transformer.wmt19.de-en': 'https://dl.fbaipublicfiles.com/fairseq/models/wmt19.de-en.joined-dict.ensemble.tar.bz2',
+           'transformer.wmt19.ru-en': 'https://dl.fbaipublicfiles.com/fairseq/models/wmt19.ru-en.ensemble.tar.bz2',
        }

    def __init__(self, encoder, decoder):
...
@@ -29,6 +29,9 @@ class TransformerLanguageModel(FairseqLanguageModel):
        return {
            'transformer_lm.gbw.adaptive_huge': 'https://dl.fbaipublicfiles.com/fairseq/models/lm/adaptive_lm_gbw_huge.tar.bz2',
            'transformer_lm.wiki103.adaptive': 'https://dl.fbaipublicfiles.com/fairseq/models/lm/adaptive_lm_wiki103.tar.bz2',
+           'transformer_lm.wmt19.en': 'https://dl.fbaipublicfiles.com/fairseq/models/lm/wmt19.en.tar.bz2',
+           'transformer_lm.wmt19.de': 'https://dl.fbaipublicfiles.com/fairseq/models/lm/wmt19.de.tar.bz2',
+           'transformer_lm.wmt19.ru': 'https://dl.fbaipublicfiles.com/fairseq/models/lm/wmt19.ru.tar.bz2',
        }

    def __init__(self, decoder):
...
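
The names released above are resolved against these `hub_models()` maps when a model is requested through torch.hub or `from_pretrained`. A quick sketch (not part of the diff) of inspecting them directly:

```python
# Print the download URLs registered for the new WMT'19 entries.
from fairseq.models.transformer import TransformerModel
from fairseq.models.transformer_lm import TransformerLanguageModel

print(TransformerModel.hub_models()['transformer.wmt19.en-de'])
print(TransformerLanguageModel.hub_models()['transformer_lm.wmt19.en'])
```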