## Model Preparation

### Clone the repository

```bash
git clone https://github.com/NVIDIA/DeepLearningExamples.git
```

Unless noted otherwise, the command blocks below are run from the top-level directory that contains the `DeepLearningExamples` clone. You will build the Conversational AI demo in the Tacotron2 notebooks folder:

```bash
cd DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/notebooks/conversationalai
```

### Download checkpoints

Download the PyTorch checkpoints from [NGC](https://ngc.nvidia.com/models):
* [Jasper](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16/files)

```bash
wget https://api.ngc.nvidia.com/v2/models/nvidia/jasperpyt_fp16/versions/1/files/jasper_fp16.pt
```


* [BERT](https://ngc.nvidia.com/catalog/models/nvidia:bert_large_pyt_amp_ckpt_squad_qa1_1/files?version=1)

```bash
wget https://api.ngc.nvidia.com/v2/models/nvidia/bert_large_pyt_amp_ckpt_squad_qa1_1/versions/1/files/bert_large_qa.pt
```


* [Tacotron 2](https://ngc.nvidia.com/catalog/models/nvidia:tacotron2_pyt_ckpt_amp/files?version=19.12.0)
```bash
wget https://api.ngc.nvidia.com/v2/models/nvidia/tacotron2_pyt_ckpt_amp/versions/19.12.0/files/nvidia_tacotron2pyt_fp16.pt
```


* [WaveGlow](https://ngc.nvidia.com/catalog/models/nvidia:waveglow_ckpt_amp_256/files?version=20.01.0)
```bash
wget https://api.ngc.nvidia.com/v2/models/nvidia/waveglow_ckpt_amp_256/versions/20.01.0/files/nvidia_waveglow256pyt_fp16.pt
```
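You should now have all four checkpoints in the current directory; a quick sanity check (filenames as produced by the downloads above):

```bash
ls -lh jasper_fp16.pt bert_large_qa.pt nvidia_tacotron2pyt_fp16.pt nvidia_waveglow256pyt_fp16.pt
```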


Keep the downloaded checkpoints in the current directory for now; the sections below move each one into its model's `checkpoints/` directory.

### Prepare Jasper

First, let's generate a TensorRT engine for Jasper using TensorRT version 7.

Move the Jasper checkpoint, downloaded earlier from [NGC](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_fp16/files), to the `Jasper/checkpoints/` directory:

```bash
mkdir -p DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/checkpoints
mv jasper_fp16.pt DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/checkpoints
```

Apply a patch to enable TensorRT 7 support:

```bash
cd DeepLearningExamples/
git apply --ignore-space-change --reject --whitespace=fix ../patch_jasper_trt7
```
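The `--reject` flag tells `git apply` to leave any hunks that fail to apply in `.rej` files rather than aborting, so it is worth verifying that none were produced:

```bash
find . -name "*.rej"
```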

Now, build a container for Jasper:

```bash
cd DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/
bash tensorrt/scripts/docker/build.sh
```

To run the container, type:

```bash
cd DeepLearningExamples/PyTorch/SpeechRecognition/Jasper
export JASPER_DIR=${PWD}
export DATA_DIR=$JASPER_DIR/data/
export CHECKPOINT_DIR=$JASPER_DIR/checkpoints/
export RESULT_DIR=$JASPER_DIR/results/
cd $JASPER_DIR
mkdir -p $DATA_DIR $CHECKPOINT_DIR $RESULT_DIR
bash tensorrt/scripts/docker/launch.sh $DATA_DIR $CHECKPOINT_DIR $RESULT_DIR
```

Inside the container, export the Jasper TensorRT engine by executing:

```bash
pip install --upgrade onnx
mkdir -p /results/onnxs/ /results/engines/
cd /jasper
python tensorrt/perf.py --batch_size 1 --engine_batch_size 1 \
    --model_toml configs/jasper10x5dr_nomask.toml \
    --ckpt_path /checkpoints/jasper_fp16.pt \
    --trt_fp16 --pyt_fp16 \
    --engine_path /results/engines/jasper_fp16.engine \
    --onnx_path /results/onnxs/fp32_DYNAMIC.onnx \
    --seq_len 3600 --make_onnx
```
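If the export succeeds, the engine appears under `/results/engines/` inside the container (mapped to the host's `results/` directory by the launch script):

```bash
ls -lh /results/engines/jasper_fp16.engine
```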

After a successful export, exit the container and copy the engine to the `model_repo` directory:

```bash
cd DeepLearningExamples/PyTorch
mkdir -p SpeechSynthesis/Tacotron2/notebooks/conversationalai/model_repo/jasper-trt/1
cp SpeechRecognition/Jasper/results/engines/jasper_fp16.engine SpeechSynthesis/Tacotron2/notebooks/conversationalai/model_repo/jasper-trt/1/
```

You will also need the Jasper feature extractor and decoder. Download them from [NGC](https://ngc.nvidia.com/catalog/models/nvidia:jasperpyt_jit_fp16/files) directly into the `model_repo`:

```bash
cd DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/notebooks/conversationalai/model_repo/
mkdir -p jasper-decoder/1 jasper-feature-extractor/1
wget -P jasper-decoder/ https://api.ngc.nvidia.com/v2/models/nvidia/jasperpyt_jit_fp16/versions/1/files/jasper-decoder/config.pbtxt
wget -P jasper-decoder/1/ https://api.ngc.nvidia.com/v2/models/nvidia/jasperpyt_jit_fp16/versions/1/files/jasper-decoder/1/jasper-decoder.pt
wget -P jasper-feature-extractor/ https://api.ngc.nvidia.com/v2/models/nvidia/jasperpyt_jit_fp16/versions/1/files/jasper-feature-extractor/config.pbtxt
wget -P jasper-feature-extractor/1/ https://api.ngc.nvidia.com/v2/models/nvidia/jasperpyt_jit_fp16/versions/1/files/jasper-feature-extractor/1/jasper-feature-extractor.pt
```
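At this point, the Jasper portion of the `model_repo` should look as follows (layout inferred from the steps above; any `config.pbtxt` files that already ship with the demo are omitted):

```
model_repo/
├── jasper-trt/
│   └── 1/
│       └── jasper_fp16.engine
├── jasper-decoder/
│   ├── config.pbtxt
│   └── 1/
│       └── jasper-decoder.pt
└── jasper-feature-extractor/
    ├── config.pbtxt
    └── 1/
        └── jasper-feature-extractor.pt
```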

### Prepare BERT

With the generated Jasper model, we can proceed to BERT.

Move the BERT checkpoint, downloaded earlier from [NGC](https://ngc.nvidia.com/catalog/models/nvidia:bert_large_pyt_amp_ckpt_squad_qa1_1/files), to the `BERT/checkpoints/` directory (note that the command below renames it to `bert_qa.pt`):

```bash
mkdir -p DeepLearningExamples/PyTorch/LanguageModeling/BERT/checkpoints/
mv bert_large_qa.pt DeepLearningExamples/PyTorch/LanguageModeling/BERT/checkpoints/bert_qa.pt
```

Now, build a container for BERT:

```bash
cd DeepLearningExamples/PyTorch/LanguageModeling/BERT/
bash scripts/docker/build.sh
```

Use the Triton export script to export the model `checkpoints/bert_qa.pt` to TorchScript:

```bash
bash triton/export_model.sh
```

The exported model will be saved in `results/triton_models/bertQA-ts-script`, together with the Triton configuration file. Copy the model directory to the `model_repo`:

```bash
cp -r DeepLearningExamples/PyTorch/LanguageModeling/BERT/results/triton_models/bertQA-ts-script DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/notebooks/conversationalai/model_repo/
```

### Prepare Tacotron 2 and WaveGlow

Now to the final part: the TTS system.

Move the [Tacotron 2](https://ngc.nvidia.com/catalog/models/nvidia:tacotron2_pyt_ckpt_amp/files?version=19.12.0) and [WaveGlow](https://ngc.nvidia.com/catalog/models/nvidia:waveglow_ckpt_amp_256/files?version=20.01.0) checkpoints, downloaded earlier from [NGC](https://ngc.nvidia.com/catalog/models/), to the `Tacotron2/checkpoints/` directory:

```bash
mkdir -p DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/checkpoints/
mv nvidia_tacotron2pyt_fp16.pt nvidia_waveglow256pyt_fp16.pt DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/checkpoints/
```

Build the Tacotron 2 container:

```bash
cd DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/
bash scripts/docker/build.sh
```

Run the container in interactive mode by typing:
```bash
bash scripts/docker/interactive.sh
```

Export Tacotron 2 to TorchScript:

```bash
cd /workspace/tacotron2/
mkdir -p output
python notebooks/conversationalai/export_tacotron2_ts.py --tacotron2 checkpoints/nvidia_tacotron2pyt_fp16.pt -o output/tacotron2_fp16.pt --fp16
```

Export WaveGlow to the ONNX intermediate representation:

```bash
python tensorrt/convert_waveglow2onnx.py --waveglow checkpoints/nvidia_waveglow256pyt_fp16.pt \
    --wn-channels 256 --fp16 -o output/ --config-file config.json
```

Use the exported ONNX IR to generate a TensorRT engine:

```bash
pip install pycuda
python tensorrt/convert_onnx2trt.py --waveglow output/waveglow.onnx -o output/ --fp16
```
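Before leaving the container, you can confirm that both artifacts were produced (filenames as used by the copy step below):

```bash
ls -lh output/tacotron2_fp16.pt output/waveglow_fp16.engine
```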

After successful export, exit the container and copy the Tacotron 2 model and the WaveGlow engine to `model_repo`:

```bash
cd DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/
mkdir -p notebooks/conversationalai/model_repo/tacotron2/1/ notebooks/conversationalai/model_repo/waveglow-trt/1/
cp output/tacotron2_fp16.pt notebooks/conversationalai/model_repo/tacotron2/1/
cp output/waveglow_fp16.engine notebooks/conversationalai/model_repo/waveglow-trt/1/
```
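With that, every model is in place. The complete `model_repo` should now contain the following six models (directory names as used in the copy steps above):

```
model_repo/
├── bertQA-ts-script/
├── jasper-trt/
├── jasper-decoder/
├── jasper-feature-extractor/
├── tacotron2/
└── waveglow-trt/
```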

## Deployment

With all models ready for deployment, go to the `conversationalai/client` folder and build the Triton client:

```bash
cd DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/notebooks/conversationalai/client
docker build -f Dockerfile --network=host -t speech_ai_client:demo .
```

From a terminal, start the Triton server:

```bash
cd DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2/notebooks/conversationalai
NV_GPU=1 nvidia-docker run --ipc=host --network=host --rm -p8000:8000 -p8001:8001 \
    -v ${PWD}/model_repo/:/models nvcr.io/nvidia/tritonserver:20.06-v1-py3 \
    tritonserver --model-store=/models --log-verbose 1
```
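The `tritonserver:20.06-v1-py3` image serves the legacy v1 HTTP API, so once the models have loaded you can check readiness from another shell (a quick sanity check, assuming the default port mapping above):

```bash
curl -sf localhost:8000/api/health/ready && echo "server ready"
```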

In another terminal, run the client:

```bash
docker run -it --rm --network=host --device /dev/snd:/dev/snd speech_ai_client:demo bash /workspace/speech_ai_demo/start_jupyter.sh
```
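The client container launches a Jupyter server for the demo notebook; open the URL and token printed by `start_jupyter.sh` in a browser (typically `http://localhost:8888`, assuming Jupyter's default port, which is reachable because the container shares the host network). The `--device /dev/snd` mapping gives the demo access to the host's microphone and speakers for recording and playback.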