# Copyright 2021 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# TensorFlow NLP Modelling Toolkit

This codebase provides a Natural Language Processing modeling toolkit written in
[TF2](https://www.tensorflow.org/guide/effective_tf2). It allows researchers and
developers to reproduce state-of-the-art model results and to train custom
models to experiment with new research ideas.

## Features

* Reusable and modularized modeling building blocks
* Reproducible state-of-the-art results
* Easy to customize and extend
* End-to-end training
* Distributed training on both GPUs and TPUs

## Major components

### Libraries

We provide a modeling library that allows users to train custom models for new
research ideas. Detailed instructions can be found in the READMEs in each
folder, and a short usage sketch follows the list below.

*   [modeling/](modeling): modeling library that provides building blocks
    (e.g., Layers, Networks, and Models) that can be assembled into
    transformer-based architectures.
*   [data/](data): binaries and utilities for input preprocessing,
    tokenization, etc.
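
As a quick illustration of how these building blocks compose, the sketch below
constructs a small BERT-style encoder and wraps it with a classification head.
It assumes `official.nlp.modeling` is on the Python path; the hyperparameters
are illustrative only, and exact constructor arguments and input formats can
vary between releases, so treat this as a sketch rather than a canonical
recipe.

```python
import numpy as np
from official.nlp.modeling import models, networks

# Build a (deliberately small) BERT-style Transformer encoder from the
# `networks` building blocks. Hyperparameters here are illustrative only.
encoder = networks.BertEncoder(
    vocab_size=30522,
    hidden_size=128,
    num_layers=2,
    num_attention_heads=2,
    max_sequence_length=64)

# Wrap the encoder with a classification head from `models`.
classifier = models.BertClassifier(network=encoder, num_classes=2)

# Run a dummy batch through the model. Recent releases accept a dict of
# int32 tensors keyed by input name; older releases expect a list instead.
batch = dict(
    input_word_ids=np.zeros((2, 64), dtype=np.int32),
    input_mask=np.ones((2, 64), dtype=np.int32),
    input_type_ids=np.zeros((2, 64), dtype=np.int32))
logits = classifier(batch)  # shape: (2, 2)
```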

### State-of-the-Art models and examples

We provide SoTA model implementations, pre-trained models, training and
evaluation examples, and command lines. Detailed instructions can be found in
the READMEs for specific papers.

1.  [BERT](bert): [BERT: Pre-training of Deep Bidirectional Transformers for
    Language Understanding](https://arxiv.org/abs/1810.04805) by Devlin et al.,
    2018
2.  [ALBERT](albert):
    [A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942)
    by Lan et al., 2019
3.  [XLNet](xlnet):
    [XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237)
    by Yang et al., 2019
4.  [Transformer for translation](transformer):
    [Attention Is All You Need](https://arxiv.org/abs/1706.03762) by Vaswani et
    al., 2017
5.  [NHNet](nhnet):
    [Generating Representative Headlines for News Stories](https://arxiv.org/abs/2001.09386)
    by Gu et al., 2020

### Common Training Driver

We provide a single common driver, [train.py](train.py), to train the above
SoTA models on popular tasks. Please see [docs/train.md](docs/train.md) for
more details.
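
The driver selects a model/task combination through a registered experiment
name. As a rough illustration of that mechanism (not a substitute for the
documented command line), the snippet below looks up an experiment
configuration from the registry that `train.py` reads; the experiment name
`bert/sentence_prediction` and the override keys are examples and may differ
across releases.

```python
from official.core import exp_factory

# Look up a registered experiment configuration. The name is an example;
# the set of registered experiments depends on the release.
config = exp_factory.get_exp_config('bert/sentence_prediction')

# The returned ExperimentConfig bundles task, trainer, and runtime settings,
# and can be overridden much like --params_override on the command line.
config.override({'trainer': {'train_steps': 1000}})
print(config.task)
```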


### Pre-trained models with checkpoints and TF-Hub

We provide a large collection of baselines and checkpoints for NLP pre-trained
models. Please see [docs/pretrained_models.md](docs/pretrained_models.md) for
more details.
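
For loading a pre-trained encoder through TF-Hub, a minimal sketch along the
following lines is typical. The module handle is one published BERT-Base
encoder and is shown only as an example; the exact input and output signature
(e.g., the `pooled_output` key) can differ between hub module versions.

```python
import numpy as np
import tensorflow_hub as hub

# Example TF-Hub handle for a published BERT-Base encoder; substitute the
# checkpoint or hub module listed in docs/pretrained_models.md that you need.
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=False)

# BERT-style hub encoders take int32 token ids, a padding mask, and segment ids.
inputs = dict(
    input_word_ids=np.zeros((1, 128), dtype=np.int32),
    input_mask=np.ones((1, 128), dtype=np.int32),
    input_type_ids=np.zeros((1, 128), dtype=np.int32))
outputs = encoder(inputs)
print(outputs["pooled_output"].shape)  # (1, 768)
```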