# Conformer-Transducer with auxiliary task (CTC weight = 0.5)

## Environments
- Same as RNN-Transducer (see below)

## Config files
- preprocess config: `conf/specaug.yaml`
- train config: `conf/tuning/transducer/train_conformer-rnn_transducer_aux_ngpu4.yaml`
- lm config: `-` (LM was not used)
- decode config: `conf/tuning/transducer/decode_default.yaml`
- ngpu: `4`

## Results (CER)
|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_dev_decode_default|14326|205341|95.8|4.0|0.2|0.1|4.3|33.6|
|decode_test_decode_default|7176|104765|95.3|4.4|0.2|0.1|4.8|36.3|


# Conformer-Transducer

## Environments
- Same as RNN-Transducer (see below)

## Config files
- preprocess config: `conf/specaug.yaml`
- train config: `conf/tuning/transducer/train_conformer-rnn_transducer.yaml`
- lm config: `-` (LM was not used)
- decode config: `conf/tuning/transducer/decode_default.yaml`

## Results (CER)
|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_dev_decode_default|14326|205341|95.6|4.2|0.2|0.1|4.5|34.0|
|decode_test_decode_default|7176|104765|95.0|4.7|0.3|0.1|5.0|37.1|


# RNN-Transducer with auxiliary task (CTC weight = 0.1)

## Environments
- Same as RNN-Transducer (see below)

## Config files
- preprocess config: `conf/specaug.yaml`
- train config: `conf/tuning/transducer/train_transducer_aux.yaml`
- lm config: `-` (LM was not used)
- decode config: `conf/tuning/transducer/decode_default.yaml`

## Results (CER)
|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_dev_decode_default|14326|205341|93.9|5.8|0.3|0.1|6.3|41.9|
|decode_test_decode_default|7176|104765|93.2|6.5|0.4|0.1|6.9|44.5|


# RNN-Transducer

## Environments
- date: `Thu May 20 05:29:03 UTC 2021`
- python version: `3.7.4 (default, Aug 13 2019, 20:35:49)  [GCC 7.3.0]`
- espnet version: `espnet 0.9.8`
- chainer version: `chainer 6.0.0`
- pytorch version: `pytorch 1.6.0`
- Git hash: `95b3008cdcc2247e781a048bc999243dc7f45fe7`
  - Commit date: `Sat Mar 6 00:48:29 2021 +0000`

## Config files
- preprocess config: `conf/specaug.yaml`
- train config: `conf/tuning/transducer/train_transducer.yaml`
- lm config: `-` (LM was not used)
- decode config: `conf/tuning/transducer/decode_default.yaml`

## Results (CER)
|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_dev_decode_default|14326|205341|93.8|5.9|0.3|0.1|6.3|42.0|
|decode_test_decode_default|7176|104765|92.9|6.7|0.3|0.1|7.2|45.9|


# Conformer (kernel size = 15) + SpecAugment + LM weight = 0.0 result

- training config file: `conf/tuning/train_pytorch_conformer_kernel15.yaml`
- preprocess config file: `conf/specaug.yaml`
- decoding config file: `conf/decode.yaml`, set `lm-weight = 0.0`
- model link: https://drive.google.com/file/d/1pOhwj6JFqVyt5quW7BKWfJ3vfPFRoxpQ/view?usp=sharing
```
exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/decode_dev_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   95.4        4.5        0.1        0.1        4.6       36.0   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel15_specaug/decode_test_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   95.0        4.9         0.1        0.1        5.1       38.6   |
```
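
The `result.txt` excerpts in this file all share the same `Sum/Avg` row layout, so the numbers can be pulled out programmatically when comparing runs. The following is a minimal sketch, not part of the recipe: the function name `parse_sum_avg` and the dictionary keys are hypothetical, and it assumes the row has exactly the eight numeric columns shown above.

```python
import re


def parse_sum_avg(line):
    """Parse one 'Sum/Avg' row of a result.txt excerpt into a dict.

    Hypothetical helper for illustration; assumes the eight numeric
    columns (# Snt, # Wrd, Corr, Sub, Del, Ins, Err, S.Err) shown above.
    """
    # Split on pipes and whitespace, then drop the empty pieces.
    fields = [f for f in re.split(r"[|\s]+", line.strip()) if f]
    if not fields or fields[0] != "Sum/Avg":
        raise ValueError("not a Sum/Avg row")
    names = ["snt", "wrd", "corr", "sub", "del", "ins", "err", "s_err"]
    values = fields[1:]
    if len(values) != len(names):
        raise ValueError("unexpected number of columns")
    row = dict(zip(names, (float(v) for v in values)))
    # The two count columns are integers, the rest are percentages.
    row["snt"] = int(row["snt"])
    row["wrd"] = int(row["wrd"])
    return row


# Test-set row copied from the kernel size = 15 result above.
line = "|   Sum/Avg  |   7176      104765   |   95.0        4.9         0.1        0.1        5.1       38.6   |"
row = parse_sum_avg(line)
print(row["err"], row["s_err"])
```

In these tables `Err` is the sum of `Sub`, `Del` and `Ins`, up to rounding of the one-decimal values, which gives a quick sanity check on a parsed row.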

# Conformer (kernel size = 31) + SpecAugment + LM weight = 0.0 result

- training config file: `conf/tuning/train_pytorch_conformer_kernel31.yaml`
- preprocess config file: `conf/specaug.yaml`
- decoding config file: `conf/decode.yaml`, set `lm-weight = 0.0`
```
exp/train_sp_pytorch_train_pytorch_conformer_kernel31_specaug/decode_dev_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   95.4        4.5        0.1        0.1        4.7       36.2   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel31_specaug/decode_test_decode_lm0.0/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   94.9        5.0         0.1        0.1        5.2       39.1   |
```

# Conformer (kernel size = 31) result

- training config file: `conf/tuning/train_pytorch_conformer_kernel31.yaml`
- decoding config file: `conf/decode.yaml`
```
exp/train_sp_pytorch_train_pytorch_conformer_kernel31/decode_dev_decode/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   94.9        5.0        0.1        0.1        5.2       38.3   |
exp/train_sp_pytorch_train_pytorch_conformer_kernel31/decode_test_decode/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   94.2        5.4         0.2        0.1        5.8       41.0   |
```

# Transformer result (default transformer with initial learning rate = 1.0 and epochs = 50)

  - Environments (obtained by `$ get_sys_info.sh`)
    - date: `Mon Jun 10 12:34:41 EDT 2019`
    - system information: `Linux b14 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux`
    - python version: `Python 3.7.3`
    - espnet version: `espnet 0.3.1`
    - chainer version: `chainer 6.0.0`
    - pytorch version: `pytorch 1.0.1.post2`
    - Git hash: `82e9b7eb7ccae61e11af28981734ea1c2b315a98`
  - Model files (archived to model.v1.tar.gz by `$ pack_model.sh`)
    - model link: https://drive.google.com/open?id=1BIQBpLRRy3XSMT5IRxnLcgLMirGzu8dg
    - training config file: `conf/train.yaml`
    - decoding config file: `conf/decode.yaml`
    - cmvn file: `data/train_sp/cmvn.ark`
    - e2e file: `exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/results/model.last10.avg.best`
    - e2e JSON file: `exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/results/model.json`
    - lm file: `exp/train_rnnlm_pytorch_lm/rnnlm.model.best`
    - lm JSON file: `exp/train_rnnlm_pytorch_lm/model.json`
  - Results (paste them yourself or obtain them with `$ pack_model.sh --results <results>`)
```
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |  14326      205341   |   94.1        5.7        0.2        0.1        6.0       42.0   |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub         Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176      104765   |   93.4        6.4         0.2        0.1        6.7       45.1   |
```

# First result (no tuning, but already very good; cf. Kaldi chain best 7.43% and nnet3 8.64%, while ESPnet gives 8.0%)
```
exp/train_sp_pytorch_no_patience/decode_dev_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.6_rnnlm0.3_2layer_unit650_sgd_bs64/result.txt:
|    SPKR       |     # Snt         # Wrd     |    Corr            Sub           Del           Ins            Err         S.Err     |
|    Sum/Avg    |    14326         205341     |    93.3            6.5           0.2           0.1            6.8          45.2     |
exp/train_sp_pytorch_no_patience/decode_test_beam20_emodel.acc.best_p0.0_len0.0-0.0_ctcw0.6_rnnlm0.3_2layer_unit650_sgd_bs64/result.txt:
|    SPKR       |     # Snt         # Wrd     |     Corr           Sub            Del           Ins            Err         S.Err     |
|    Sum/Avg    |     7176         104765     |     92.2           7.6            0.2           0.2            8.0          50.2     |
```
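
To compare systems listed in this file on a common scale, a relative error reduction can be computed from the `Err` columns. This is a small illustrative sketch; the function name `relative_reduction` is hypothetical, and the two inputs are the test-set `Err` values of the first result (8.0) and of the Conformer-Transducer with auxiliary task (4.8) as reported in this file.

```python
def relative_reduction(baseline_err, new_err):
    """Return the relative error reduction in percent."""
    if baseline_err <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline_err - new_err) / baseline_err


# Test-set Err values copied from the tables in this file.
first_result = 8.0
conformer_transducer_aux = 4.8

reduction = relative_reduction(first_result, conformer_transducer_aux)
print(f"{reduction:.1f}% relative reduction")
```

The same function can be applied to any other pair of rows, as long as both values come from the same evaluation set.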

# Ngram related
  - decoding with ngram and RNNLM
```
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm0.7_4gramfull0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   14326      205341  |   94.1        5.7        0.2        0.1        6.0      41.7    |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm0.7_4gramfull0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176       104765  |   93.5        6.3        0.2        0.1        6.6      44.6    |
```
```
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer_lm0.7_4grampart0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   14326      205341  |   94.1        5.7        0.2        0.1        6.0      41.7    |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer_lm0.7_4grampart0.3/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176       104765  |   93.5        6.3        0.2        0.1        6.6      44.6    |
```
  - only e2e model
```
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_dev_decode_pytorch_transformer/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   14326       205341 |   93.6        6.2        0.2        0.1        6.5      45.6    |
exp/train_sp_pytorch_train_pytorch_transformer_lr1.0/decode_test_decode_pytorch_transformer/result.txt
|   SPKR     |   # Snt      # Wrd   |   Corr        Sub        Del        Ins        Err      S.Err   |
|   Sum/Avg  |   7176       104765  |   92.7        7.1        0.2        0.1        7.4      49.8    |
```