# Sequence Projection Models

This repository contains implementations of the following papers:

* [*PRADO: Projection Attention Networks for Document Classification On-Device*](https://www.aclweb.org/anthology/D19-1506/)
* [*Self-Governing Neural Networks for On-Device Short Text Classification*](https://www.aclweb.org/anthology/D18-1105/)

## Description

We provide a family of models that project sequences to fixed-size features.
The idea is to build embedding-free models that minimize model size.
Instead of looking up embeddings in an embedding table, sequence projection
models compute them on the fly.
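
The snippet below is a rough sketch of this idea in plain Python with NumPy; it is an illustration only, not the implementation in this repository, and the function names, feature size, and hashing scheme are all hypothetical. Each token is hashed into a fixed-size ternary feature vector at inference time, so the model never stores an embedding table.

```python
import hashlib

import numpy as np


def project_token(token: str, num_features: int = 64, seed: int = 0) -> np.ndarray:
    """Map a token to a fixed-size ternary feature vector in {-1, 0, +1} via hashing.

    Illustrative stand-in for an embedding lookup: the features are derived
    from the token's bytes on the fly, so no embedding table is needed.
    """
    digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
    features = np.zeros(num_features, dtype=np.float32)
    for i in range(num_features):
        # Take two bits of the hash per feature: 01 -> +1, 10 -> -1, else 0.
        byte = digest[(i // 4) % len(digest)]
        two_bits = (byte >> (2 * (i % 4))) & 0b11
        if two_bits == 0b01:
            features[i] = 1.0
        elif two_bits == 0b10:
            features[i] = -1.0
    return features


def project_sequence(text: str, num_features: int = 64) -> np.ndarray:
    """Project a whitespace-tokenized sequence to a [num_tokens, num_features] array."""
    return np.stack([project_token(tok, num_features) for tok in text.split()])


if __name__ == "__main__":
    feats = project_sequence("this is a short comment")
    print(feats.shape)  # (5, 64)
```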


## History

### August 24, 2020
* Add PRADO and SGNN implementations.

## Authors or Maintainers

* Prabhu Kaliamoorthi
* Yicheng Fan ([@thunderfyc](https://github.com/thunderfyc))


## Requirements

[![TensorFlow 2.3](https://img.shields.io/badge/TensorFlow-2.3-FF6F00?logo=tensorflow)](https://github.com/tensorflow/tensorflow/releases/tag/v2.3.0)
[![Python 3.6](https://img.shields.io/badge/Python-3.6-3776AB)](https://www.python.org/downloads/release/python-360/)


## Training

Train a PRADO model on the civil comments dataset:

```shell
bazel run -c opt :trainer -- \
--config_path=$(pwd)/configs/civil_comments_prado.txt \
--runner_mode=train --logtostderr --output_dir=/tmp/prado
```

Train an SGNN model to detect languages:

```shell
bazel run -c opt sgnn:train -- --logtostderr --output_dir=/tmp/sgnn
```

## Evaluation

Evaluate the PRADO model:

```shell
bazel run -c opt :trainer -- \
--config_path=$(pwd)/configs/civil_comments_prado.txt \
--runner_mode=eval --logtostderr --output_dir=/tmp/prado
```

Evaluate the SGNN model:
```shell
bazel run -c opt sgnn:run_tflite -- --model=/tmp/sgnn/model.tflite "Hello world"
```


## References

1.  **Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift**<br />
    Sergey Ioffe, Christian Szegedy <br />
    [[link]](https://arxiv.org/abs/1502.03167). In ICML, 2015.

2.  **Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference**<br />
    Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko <br />
    [[link]](https://arxiv.org/abs/1712.05877). In CVPR, 2018.

3.  **PRADO: Projection Attention Networks for Document Classification On-Device**<br/>
    Prabhu Kaliamoorthi, Sujith Ravi, Zornitsa Kozareva <br />
    [[link]](https://www.aclweb.org/anthology/D19-1506/). In EMNLP-IJCNLP, 2019.

4.  **Self-Governing Neural Networks for On-Device Short Text Classification**<br />
    Sujith Ravi, Zornitsa Kozareva <br />
    [[link]](https://www.aclweb.org/anthology/D18-1105). In EMNLP, 2018.

## License

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

This project is licensed under the terms of the **Apache License 2.0**.