# Exporting Frozen Graphs in TensorFlow 1

## Description
This example demonstrates how to export a frozen graph protobuf in TensorFlow 1.x that can be used as input to MIGraphX. Specifically, it walks through exporting a frozen protobuf of a TensorFlow BERT model.

## How to Use this Example


To support BERT from the official [repository](https://github.com/google-research/bert), a serving_input_fn for the estimator must be implemented in [run_classifier.py](https://github.com/google-research/bert/blob/master/run_classifier.py). In this script, insert the following function after the library imports and flag definitions:

```
#...
flags.DEFINE_integer(
    "num_tpu_cores", 8,
    "Only used if `use_tpu` is True. Total number of TPU cores to use.")

# insert function here
def serving_input_fn():
    label_ids = tf.placeholder(tf.int32, [None], name='label_ids')
    input_ids = tf.placeholder(tf.int32, [None, FLAGS.max_seq_length], name='input_ids')
    input_mask = tf.placeholder(tf.int32, [None, FLAGS.max_seq_length], name='input_mask')
    segment_ids = tf.placeholder(tf.int32, [None, FLAGS.max_seq_length], name='segment_ids')
    input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn({
        'label_ids': label_ids,
        'input_ids': input_ids,
        'input_mask': input_mask,
        'segment_ids': segment_ids,
    }, default_batch_size=1)()
    return input_fn
```

Since the placeholders passed in serving_input_fn have a dynamic batch dimension (None), the default_batch_size value determines the batch size that ends up fixed in the exported graph.
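
As a quick illustration (a sketch, not part of run_classifier.py; TF 1.x and a max_seq_length of 128 are assumed), the receiver re-creates each placeholder with the batch dimension pinned to default_batch_size:

```
# Sketch: demonstrate how default_batch_size fixes the batch dimension.
# Assumes TF 1.x and max_seq_length=128, matching the run command below.
import tensorflow as tf

max_seq_length = 128
input_ids = tf.placeholder(tf.int32, [None, max_seq_length], name='input_ids')
receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
    {'input_ids': input_ids}, default_batch_size=1)
receiver = receiver_fn()
# The receiver builds a fresh placeholder with the batch dimension set to 1.
# The name clash with the placeholder above is also the likely reason the
# SavedModel inputs show up later as input_ids_1:0, input_mask_1:0, etc.
print(receiver.features['input_ids'])  # expected shape: (1, 128)
```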

For inference we care about the output of the "probabilities" layer, and we can name this op by modifying the following [line](https://github.com/google-research/bert/blob/master/run_classifier.py#L608) (because the line sits inside the "loss" variable scope, the resulting node name becomes loss/output, which appears again below):

```
probabilities = tf.nn.softmax(logits, axis=-1, name="output")
```

Next, we need to export the SavedModel after training. At the end of main in run_classifier.py (the predict-results block is shown for context), add the export call:
```
def main(_):
# ...
    with tf.gfile.GFile(output_predict_file, "w") as writer:
      num_written_lines = 0
      tf.logging.info("***** Predict results *****")
      for (i, prediction) in enumerate(result):
        probabilities = prediction["probabilities"]
        if i >= num_actual_predict_examples:
          break
        output_line = "\t".join(
            str(class_probability)
            for class_probability in probabilities) + "\n" 
        writer.write(output_line)
        num_written_lines += 1 
    assert num_written_lines == num_actual_predict_examples

  # insert code here (attaching the export to the training flag is optional)
  if FLAGS.do_train:
    estimator._export_to_tpu = False
    estimator.export_savedmodel('saved_models', serving_input_fn)
# ...
```
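
Optionally (a variation on the snippet above, not part of the original script), export_savedmodel returns the path of the timestamped export directory, so it can be logged right away instead of being searched for in the console output later:

```
  # Variation: capture the export path returned by export_savedmodel
  # (returned as bytes in TF 1.x) and log it for the later steps.
  if FLAGS.do_train:
    estimator._export_to_tpu = False
    export_dir = estimator.export_savedmodel('saved_models', serving_input_fn)
    tf.logging.info("SavedModel exported to %s", export_dir)
```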

Run BERT with the suggested arguments (adjust --train_batch_size to whatever fits on your GPU):
```
export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
export GLUE_DIR=/path/to/glue

python run_classifier.py \
  --task_name=MRPC \
  --do_train=true \
  --do_eval=true \
  --data_dir=$GLUE_DIR/MRPC \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --max_seq_length=128 \
  --train_batch_size=32 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --output_dir=/tmp/mrpc_output/
```

When running, search for the following lines in the output:
```
INFO:tensorflow:Restoring parameters from /tmp/model.ckpt-1603
INFO:tensorflow:SavedModel written to: saved_models/temp-1564086017/saved_model.pb
```

Note the ID that follows "temp-" (in this case, 1564086017). A directory named with this ID should exist under saved_models/.
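
If you would rather not copy the ID from the log by hand, a small sketch like the following (assuming run_classifier.py was launched from the directory that contains saved_models/) picks out the most recent export:

```
# Sketch: find the most recent timestamped export under saved_models/.
import os

export_root = "saved_models"
export_ids = [d for d in os.listdir(export_root)
              if d.isdigit() and os.path.isdir(os.path.join(export_root, d))]
latest = max(export_ids, key=int)
print(os.path.join(export_root, latest))  # e.g. saved_models/1564086017
```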

We also need to record the name of the output node in BERT's graph. This can be done by inspecting the SavedModel with saved_model_cli:
```
saved_model_cli show --dir saved_models/1564086017 --tag_set serve --signature_def serving_default
```
The output should look like this:
```
The given SavedModel SignatureDef contains the following input(s):
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (1, 128)
      name: input_ids_1:0
  inputs['input_mask'] tensor_info:
      dtype: DT_INT32
      shape: (1, 128)
      name: input_mask_1:0
  inputs['label_ids'] tensor_info:
      dtype: DT_INT32
      shape: (1)
      name: label_ids_1:0
  inputs['segment_ids'] tensor_info:
      dtype: DT_INT32
      shape: (1, 128)
      name: segment_ids_1:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['probabilities'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 2)
      name: loss/output:0
Method name is: tensorflow/serving/predict

```
Here the output is given as the tensor name "loss/output:0"; we will strip the ":0" suffix because freeze_graph expects the graph node name rather than a tensor name.
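
For scripted workflows, the same signature information can also be read programmatically; a minimal sketch (TF 1.x, using the export directory from above):

```
# Sketch: read the serving_default signature of the exported SavedModel
# programmatically instead of using saved_model_cli.
import tensorflow as tf

with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(
        sess, ["serve"], "saved_models/1564086017")
    signature = meta_graph.signature_def["serving_default"]
    for name, tensor_info in signature.outputs.items():
        print(name, tensor_info.name)  # e.g. probabilities loss/output:0
```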

We will use TensorFlow's freeze_graph utility script and the information gathered above to create the frozen protobuf file. The graph.pbtxt and checkpoint referenced below live in the training --output_dir (/tmp/mrpc_output/ above).

```
CKPT_NUM=1603
MODEL_ID=1564086017
OUT_NAME=loss/output

cd /path/to/tensorflow
python tensorflow/python/tools/freeze_graph.py \
  --input_graph=/tmp/mrpc_output/graph.pbtxt \
  --input_binary=false \
  --input_checkpoint=/tmp/mrpc_output/model.ckpt-${CKPT_NUM} \
  --input_saved_model_dir=/path/to/bert/saved_models/${MODEL_ID} \
  --output_graph=/tmp/frozen_bert.pb \
  --output_node_names=${OUT_NAME}
```

The final output, /tmp/frozen_bert.pb, is a frozen graph protobuf that can be used as input to MIGraphX.
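
Before handing the file to MIGraphX, a quick sanity check (a sketch in TF 1.x, using the path and node name from the commands above) can confirm that the frozen graph parses and still contains the requested output node:

```
# Sketch: load the frozen GraphDef and confirm the requested output node
# survived freezing (path and node name follow the example above).
import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile("/tmp/frozen_bert.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

node_names = {node.name for node in graph_def.node}
assert "loss/output" in node_names, "output node missing from frozen graph"
print("frozen graph contains %d nodes" % len(graph_def.node))
```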