We provide a modeling library to allow users to train custom models for new
research ideas. Detailed instructions can be found in the READMEs in each
folder.

*   [modeling/](modeling): modeling library that provides building blocks
    (e.g., Layers, Networks, and Models) that can be assembled into
    transformer-based architectures; see the sketch below.
*   [data/](data): binaries and utils for input preprocessing, tokenization,
    etc.
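
As a quick illustration of how these pieces compose, here is a minimal sketch
of building a Network from the library's Layers and wrapping it in a task
Model. The class names and constructor arguments are assumptions based on one
version of `official.nlp.modeling`; see the [modeling/](modeling) README for
the exact API in your installation.

```python
# Minimal sketch (assumed API): assemble a small Transformer encoder Network
# and wrap it in a classification Model.
from official.nlp.modeling import models, networks

# Network: a small Transformer encoder built from the library's Layers
# (embeddings, self-attention blocks, etc.). Sizes are illustrative only.
encoder = networks.BertEncoder(
    vocab_size=30522,        # WordPiece vocabulary size
    hidden_size=256,         # deliberately small for illustration
    num_layers=4,
    num_attention_heads=4)

# Model: the encoder plus a task head, here a sentence classifier.
classifier = models.BertClassifier(network=encoder, num_classes=2)
classifier.summary()
```
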
### State-of-the-Art models and examples

We provide SoTA model implementations, pre-trained models, training and
evaluation examples, and command lines. Detailed instructions can be found in
the READMEs for specific papers.

1.  [BERT](bert): [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805) by Devlin et al., 2018
2.  [ALBERT](albert): [A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942) by Lan et al., 2019
3.  [XLNet](xlnet): [XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237) by Yang et al., 2019
4.  [Transformer for translation](transformer): [Attention Is All You Need](https://arxiv.org/abs/1706.03762) by Vaswani et al., 2017
5.  [NHNet](nhnet): [Generating Representative Headlines for News Stories](https://arxiv.org/abs/2001.09386) by Gu et al., 2020

### Common Training Driver
We provide a single common driver [train.py](train.py) to train the above SoTA
models on popular tasks. Please see [train.md](train.md) for more details.
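
For reference, an invocation of the driver might look like the sketch below.
The flag names and the experiment name are assumptions based on one version of
the driver; treat [train.md](train.md) as the authoritative reference.

```bash
# Hypothetical invocation of the common driver. Flag names and the experiment
# name are assumptions; consult train.md for the flags and experiments
# actually supported by your copy of train.py.
python3 train.py \
  --experiment=bert/sentence_prediction \
  --mode=train_and_eval \
  --model_dir=/tmp/bert_sentence_prediction \
  --config_file=my_experiment.yaml
```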