"vscode:/vscode.git/clone" did not exist on "e327e9bad557e9617e7952b6c5ff7c4a23010c06"
README.md 1.24 KB
Newer Older
Sugon_ldc's avatar
Sugon_ldc committed
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
### Using Tensorrt for Triton ASR Server

```sh
# using docker image runtime/gpu/Dockerfile/Dockerfile.server
docker pull soar97/triton-wenet:22.12
docker run -it --rm --name "wenet_trt_test" --gpus all --shm-size 1g --net host soar97/triton-wenet:22.12
# inside the docker container
git clone https://github.com/wenet-e2e/wenet.git
cd wenet/runtime/gpu/tensorrt
pip3 install nvidia-pyindex
# Use pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple if you encounter network issue
pip3 install -r requirements.txt

bash run_streaming_small_model.sh
```

#### Performance of Small u2pp Model for Streaming ASR

Benchmark(small u2pp onnx) based on Aishell1 test set with server-A10 (16vCPU 60GB Memory)/client(4vCPU 16GB Memory), the total audio duration is 36108.919 seconds.

(Note: using non-simulate-streaming mode)
|concurrent-tasks | processing time(s) |
|----------|--------------------|
| 20 (onnx fp16)                | 123.796 |
| 40 (onnx fp16)                | 84.557  |
| 60 (onnx fp16)                | 73.232  |
| 80 (onnx fp16)                | 66.862  |
| 20 (trt fp16+layernorm plugin)| 90.582  |
| 40 (trt fp16+layernorm plugin)| 75.411  |
| 60 (trt fp16+layernorm plugin)| 69.602  |
| 80 (trt fp16+layernorm plugin)| 65.603  |