Commit d5d7f1c3 authored by limm

Merge branch 'fix_readme' into 'v0.9.0-fastpt'

update README.md

See merge request !1
parents df2f84ce 603d579a
# <div align="center"><strong>Fairseq</strong></div>

## Introduction

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The Fairseq release in the DAS software stack not only keeps the component's core functionality available on DCU accelerator cards, but is also deeply optimized for the DCU hardware architecture, allowing developers to migrate applications to DCU accelerators and improve performance at very low cost.

## Installation

Supported component combinations:

| PyTorch version | fastpt version | Fairseq version | DTK version | Python version  | Recommended build method   |
| --------------- | -------------- | --------------- | ----------- | --------------- | -------------------------- |
| 2.5.1           | 2.1.0          | 0.9.0           | >= 25.04    | 3.8, 3.10, 3.11 | fastpt without transcoding |
| 2.4.1           | 2.0.1          | 0.9.0           | >= 25.04    | 3.8, 3.10, 3.11 | fastpt without transcoding |
| other           | other          | other           | other       | 3.8, 3.10, 3.11 | HIP transcoding            |

+ For PyTorch versions above 2.4.1 with DTK versions above 25.04, building with fastpt without transcoding is recommended.

### 1. Install with pip

Download the fairseq whl package from the [光合开发者社区](https://download.sourcefind.cn:65024/4/main/fairseq), choosing the fairseq whl that matches your PyTorch and Python versions.
```shell
pip install torch*              # the torch whl package downloaded above
pip install fastpt* --no-deps   # the fastpt whl package downloaded above
source /usr/local/bin/fastpt -E
pip install fairseq*            # the fairseq-fastpt whl package downloaded above
```
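After installation, a quick sanity check can be run (a suggested snippet, not part of the original instructions; on the DCU stack the accelerator is exposed through the `torch.cuda` API of the ROCm/HIP build):

```python
import torch
import fairseq

print(fairseq.__version__)        # e.g. 0.9.0
print(torch.cuda.is_available())  # True if a DCU accelerator is visible
```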
### 2. Build and install from source

#### Build environment setup

The fastpt no-transcoding build is supported, and the environment can be prepared in either of two ways:

1. Use a 光源 PyTorch base image: download an image from the [光合开发者社区](https://sourcefind.cn/#/image/dcu/pytorch), choosing the version that matches your PyTorch, Python, DTK, and operating system.
2. Use an existing Python environment: install PyTorch and fastpt from whl packages downloaded from the [光合开发者社区](https://sourcefind.cn/#/image/dcu/pytorch), matching your Python and DTK versions, with the following commands (install torch first, then fastpt):
```shell
pip install torch*              # the torch whl package downloaded above
pip install fastpt* --no-deps   # install after torch
pip install wheel
```
#### Building and installing from source

- Download the code:

```shell
git clone http://developer.sourcefind.cn/codes/OpenDAS/fairseq.git   # switch branches as needed for your build
```

- Two build methods are provided (run the commands inside the fairseq directory). For either method, first set the environment variable in step 1, then use step 2 or step 3:
1. Set the no-transcoding build environment variable:

```shell
source /usr/local/bin/fastpt -C
```

2. Build a whl package and install it:

```shell
python3 setup.py -v bdist_wheel
pip install dist/fairseq*
```
3. Or install directly from source in editable mode:

```shell
pip3 install --editable ./
```
#### Notes

+ If downloads via pip install are slow, add the Tsinghua PyPI mirror, e.g. `pip install fairseq* -i https://pypi.tuna.tsinghua.edu.cn/simple/`
+ ROCM_PATH is the DTK installation path, which defaults to /opt/dtk.
## Verification

- Run `python -c "import fairseq; print(fairseq.__version__)"` to query the installed version, e.g. 0.9.0. The version number tracks the corresponding upstream release.

## Known Issues

- None at present.

## References

- [README_ORIGIN](README_ORIGIN.md)
- [https://github.com/facebookresearch/fairseq](https://github.com/facebookresearch/fairseq)
# <img src="fairseq_logo.png" width="30"> Introduction
Fairseq(-py) is a sequence modeling toolkit that allows researchers and
developers to train custom models for translation, summarization, language
modeling and other text generation tasks.
### What's New:
- November 2019: [CamemBERT model and code released](examples/camembert/README.md)
- November 2019: [BART model and code released](examples/bart/README.md)
- November 2019: [XLM-R models and code released](examples/xlmr/README.md)
- September 2019: [Nonautoregressive translation code released](examples/nonautoregressive_translation/README.md)
- August 2019: [WMT'19 models released](examples/wmt19/README.md)
- July 2019: fairseq relicensed under MIT license
- July 2019: [RoBERTa models and code released](examples/roberta/README.md)
- June 2019: [wav2vec models and code released](examples/wav2vec/README.md)
### Features:
Fairseq provides reference implementations of various sequence-to-sequence models, including:
- **Convolutional Neural Networks (CNN)**
  - [Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)](examples/language_model/conv_lm/README.md)
  - [Convolutional Sequence to Sequence Learning (Gehring et al., 2017)](examples/conv_seq2seq/README.md)
  - [Classical Structured Prediction Losses for Sequence to Sequence Learning (Edunov et al., 2018)](https://github.com/pytorch/fairseq/tree/classic_seqlevel)
  - [Hierarchical Neural Story Generation (Fan et al., 2018)](examples/stories/README.md)
  - [wav2vec: Unsupervised Pre-training for Speech Recognition (Schneider et al., 2019)](examples/wav2vec/README.md)
- **LightConv and DynamicConv models**
  - [Pay Less Attention with Lightweight and Dynamic Convolutions (Wu et al., 2019)](examples/pay_less_attention_paper/README.md)
- **Long Short-Term Memory (LSTM) networks**
  - Effective Approaches to Attention-based Neural Machine Translation (Luong et al., 2015)
- **Transformer (self-attention) networks**
  - Attention Is All You Need (Vaswani et al., 2017)
  - [Scaling Neural Machine Translation (Ott et al., 2018)](examples/scaling_nmt/README.md)
  - [Understanding Back-Translation at Scale (Edunov et al., 2018)](examples/backtranslation/README.md)
  - [Adaptive Input Representations for Neural Language Modeling (Baevski and Auli, 2018)](examples/language_model/transformer_lm/README.md)
  - [Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)](examples/translation_moe/README.md)
  - [RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al., 2019)](examples/roberta/README.md)
  - [Facebook FAIR's WMT19 News Translation Task Submission (Ng et al., 2019)](examples/wmt19/README.md)
  - [Jointly Learning to Align and Translate with Transformer Models (Garg et al., 2019)](examples/joint_alignment_translation/README.md)
- **Non-autoregressive Transformers**
  - Non-Autoregressive Neural Machine Translation (Gu et al., 2017)
  - Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement (Lee et al., 2018)
  - Insertion Transformer: Flexible Sequence Generation via Insertion Operations (Stern et al., 2019)
  - Mask-Predict: Parallel Decoding of Conditional Masked Language Models (Ghazvininejad et al., 2019)
  - [Levenshtein Transformer (Gu et al., 2019)](examples/nonautoregressive_translation/README.md)
**Additionally:**
- multi-GPU (distributed) training on one machine or across multiple machines
- fast generation on both CPU and GPU with multiple search algorithms implemented:
- beam search
- Diverse Beam Search ([Vijayakumar et al., 2016](https://arxiv.org/abs/1610.02424))
- sampling (unconstrained, top-k and top-p/nucleus)
- large mini-batch training even on a single GPU via delayed updates
- mixed precision training (trains faster with less GPU memory on [NVIDIA tensor cores](https://developer.nvidia.com/tensor-cores))
- extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers (a minimal registration sketch follows this list)
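Below is a minimal sketch of that registration mechanism (assuming the fairseq 0.9.x plugin API; the model name `toy_lm` and its argument are illustrative, not part of fairseq):

```python
from fairseq.models import BaseFairseqModel, register_model, register_model_architecture

@register_model('toy_lm')  # exposes the model to fairseq-train via --arch
class ToyLM(BaseFairseqModel):
    @staticmethod
    def add_args(parser):
        # Model-specific command-line arguments.
        parser.add_argument('--toy-embed-dim', type=int, metavar='N',
                            help='embedding dimension')

    @classmethod
    def build_model(cls, args, task):
        # Build the model instance from parsed args and the current task.
        return cls()

@register_model_architecture('toy_lm', 'toy_lm_base')
def toy_lm_base(args):
    # Named architecture: fills in default hyperparameters.
    args.toy_embed_dim = getattr(args, 'toy_embed_dim', 128)
```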
We also provide [pre-trained models for translation and language modeling](#pre-trained-models-and-examples)
with a convenient `torch.hub` interface:
```python
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'
```
See the PyTorch Hub tutorials for [translation](https://pytorch.org/hub/pytorch_fairseq_translation/)
and [RoBERTa](https://pytorch.org/hub/pytorch_fairseq_roberta/) for more examples.
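For instance, RoBERTa can be loaded through the same `torch.hub` interface (a short sketch following the upstream tutorial; the printed shape assumes the base model's 768-dimensional features):

```python
import torch

# Load a pre-trained RoBERTa model and switch to evaluation mode.
roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
roberta.eval()  # disable dropout

tokens = roberta.encode('Hello world!')      # BPE-encode into a tensor of token ids
features = roberta.extract_features(tokens)  # last-layer hidden states
print(features.shape)                        # e.g. torch.Size([1, 5, 768])
```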
![Model](fairseq.gif)
# Requirements and Installation
* [PyTorch](http://pytorch.org/) version >= 1.2.0
* Python version >= 3.5
* For training new models, you'll also need an NVIDIA GPU and [NCCL](https://github.com/NVIDIA/nccl)
* **For faster training** install NVIDIA's [apex](https://github.com/NVIDIA/apex) library with the `--cuda_ext` option
To install fairseq:
```bash
pip install fairseq
```
On macOS:
```bash
CFLAGS="-stdlib=libc++" pip install fairseq
```
If you use Docker make sure to increase the shared memory size either with
`--ipc=host` or `--shm-size` as command line options to `nvidia-docker run`.
**Installing from source**
To install fairseq from source and develop locally:
```bash
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable .
```
# Getting Started
The [full documentation](https://fairseq.readthedocs.io/) contains instructions
for getting started, training new models and extending fairseq with new model
types and tasks.
# Pre-trained models and examples
We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below,
as well as example training and evaluation commands.
- [Translation](examples/translation/README.md): convolutional and transformer models are available
- [Language Modeling](examples/language_model/README.md): convolutional and transformer models are available
- [wav2vec](examples/wav2vec/README.md): wav2vec large model is available
We also have more detailed READMEs to reproduce results from specific papers:
- [Jointly Learning to Align and Translate with Transformer Models (Garg et al., 2019)](examples/joint_alignment_translation/README.md)
- [Levenshtein Transformer (Gu et al., 2019)](examples/nonautoregressive_translation/README.md)
- [Facebook FAIR's WMT19 News Translation Task Submission (Ng et al., 2019)](examples/wmt19/README.md)
- [RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al., 2019)](examples/roberta/README.md)
- [wav2vec: Unsupervised Pre-training for Speech Recognition (Schneider et al., 2019)](examples/wav2vec/README.md)
- [Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)](examples/translation_moe/README.md)
- [Pay Less Attention with Lightweight and Dynamic Convolutions (Wu et al., 2019)](examples/pay_less_attention_paper/README.md)
- [Understanding Back-Translation at Scale (Edunov et al., 2018)](examples/backtranslation/README.md)
- [Classical Structured Prediction Losses for Sequence to Sequence Learning (Edunov et al., 2018)](https://github.com/pytorch/fairseq/tree/classic_seqlevel)
- [Hierarchical Neural Story Generation (Fan et al., 2018)](examples/stories/README.md)
- [Scaling Neural Machine Translation (Ott et al., 2018)](examples/scaling_nmt/README.md)
- [Convolutional Sequence to Sequence Learning (Gehring et al., 2017)](examples/conv_seq2seq/README.md)
- [Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)](examples/language_model/conv_lm/README.md)
# Join the fairseq community
* Facebook page: https://www.facebook.com/groups/fairseq.users
* Google group: https://groups.google.com/forum/#!forum/fairseq-users
# License
fairseq(-py) is MIT-licensed.
The license applies to the pre-trained models as well.
# Citation
Please cite as:
```bibtex
@inproceedings{ott2019fairseq,
title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
year = {2019},
}
```
@@ -86,7 +86,7 @@ dtypes = {
     3: np.int16,
     4: np.int32,
     5: np.int64,
-    6: np.float,
+    6: np.float32,
     7: np.double,
     8: np.uint16
 }
@@ -289,7 +289,7 @@ class IndexedDatasetBuilder(object):
     np.int16: 2,
     np.int32: 4,
     np.int64: 8,
-    np.float: 4,
+    np.float32: 4,
     np.double: 8
 }
......
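For context: `np.float` was a deprecated alias for Python's built-in `float` and was removed in NumPy 1.24, so it no longer resolves on current NumPy releases. Since dtype code 6 is serialized with an element size of 4 bytes (second hunk), `np.float32` is the matching replacement. A minimal sketch of the mapping these two tables implement (variable names are illustrative):

```python
import numpy as np

# Mirrors the two tables in the diff: dtype code -> numpy dtype, and
# numpy dtype -> element size in bytes.
dtypes = {3: np.int16, 4: np.int32, 5: np.int64,
          6: np.float32, 7: np.double, 8: np.uint16}
element_sizes = {np.int16: 2, np.int32: 4, np.int64: 8,
                 np.float32: 4, np.double: 8}

code = 6
dtype = dtypes[code]
# The fix keeps the serialized layout consistent: code 6 stores 4-byte floats.
assert np.dtype(dtype).itemsize == element_sizes[dtype]
```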