OpenDAS / Torchaudio
Commit e108fe2a (unverified)
Authored Mar 06, 2020 by Vincent QB; committed by GitHub on Mar 06, 2020
Change default value of dither (#453)
* change default value of dither.
* update doc.
Parent: 4936c9eb
Showing 1 changed file with 10 additions and 10 deletions
torchaudio/compliance/kaldi.py (+10, -10)
@@ -194,7 +194,7 @@ def _subtract_column_mean(tensor, subtract_mean):
 def spectrogram(
-        waveform, blackman_coeff=0.42, channel=-1, dither=1.0, energy_floor=0.0,
+        waveform, blackman_coeff=0.42, channel=-1, dither=0.0, energy_floor=1.0,
         frame_length=25.0, frame_shift=10.0, min_duration=0.0,
         preemphasis_coefficient=0.97, raw_energy=True, remove_dc_offset=True,
         round_to_power_of_two=True, sample_frequency=16000.0, snip_edges=True,
@@ -207,10 +207,10 @@ def spectrogram(
         blackman_coeff (float): Constant coefficient for generalized Blackman window. (Default: ``0.42``)
         channel (int): Channel to extract (-1 -> expect mono, 0 -> left, 1 -> right) (Default: ``-1``)
         dither (float): Dithering constant (0.0 means no dither). If you turn this off, you should set
-            the energy_floor option, e.g. to 1.0 or 0.1 (Default: ``1.0``)
+            the energy_floor option, e.g. to 1.0 or 0.1 (Default: ``0.0``)
         energy_floor (float): Floor on energy (absolute, not relative) in Spectrogram computation. Caution:
             this floor is applied to the zeroth component, representing the total signal energy. The floor on the
-            individual spectrogram elements is fixed at std::numeric_limits<float>::epsilon(). (Default: ``0.0``)
+            individual spectrogram elements is fixed at std::numeric_limits<float>::epsilon(). (Default: ``1.0``)
         frame_length (float): Frame length in milliseconds (Default: ``25.0``)
         frame_shift (float): Frame shift in milliseconds (Default: ``10.0``)
         min_duration (float): Minimum duration of segments to process (in seconds). (Default: ``0.0``)
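The docstring above ties the two changed defaults together: with dither off, a frame of digital silence has zero energy, so a positive energy_floor is what keeps the log energy bounded. The following is a minimal sketch of that interaction in plain Python, not the torchaudio implementation. The function name `log_energy`, the Gaussian dither model, and the use of 2^-23 as the epsilon guard are assumptions made for illustration.

```python
import math
import random


def log_energy(frame, dither=0.0, energy_floor=1.0, rng=None):
    """Illustrative log energy of one frame with optional dither and floor."""
    rng = rng or random.Random(0)
    if dither != 0.0:
        # Assumed dither model: zero-mean Gaussian noise scaled by `dither`.
        frame = [x + dither * rng.gauss(0.0, 1.0) for x in frame]
    energy = sum(x * x for x in frame)
    # Guard against log(0); 2**-23 is the float32 machine epsilon (assumption
    # of this sketch, echoing the epsilon mentioned in the docstring).
    eps = 2.0 ** -23
    log_e = math.log(max(energy, eps))
    if energy_floor > 0.0:
        # Absolute floor on the total frame energy, applied in the log domain.
        log_e = max(log_e, math.log(energy_floor))
    return log_e


silence = [0.0] * 400  # 25 ms of samples at 16 kHz

print(log_energy(silence, dither=0.0, energy_floor=0.0))  # large negative value
print(log_energy(silence, dither=0.0, energy_floor=1.0))  # 0.0
print(log_energy(silence, dither=1.0, energy_floor=0.0))  # positive, noise-driven
```

With no dither and no floor, the result for silence is whatever the epsilon guard allows. Setting the floor to 1.0 clamps it to log(1.0), and dither alone raises it by injecting noise energy, at the cost of non-deterministic output.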
@@ -429,7 +429,7 @@ def get_mel_banks(num_bins, window_length_padded, sample_freq,
 def fbank(
-        waveform, blackman_coeff=0.42, channel=-1, dither=1.0, energy_floor=0.0,
+        waveform, blackman_coeff=0.42, channel=-1, dither=0.0, energy_floor=1.0,
         frame_length=25.0, frame_shift=10.0, high_freq=0.0, htk_compat=False, low_freq=20.0,
         min_duration=0.0, num_mel_bins=23, preemphasis_coefficient=0.97, raw_energy=True,
         remove_dc_offset=True, round_to_power_of_two=True, sample_frequency=16000.0,
@@ -443,10 +443,10 @@ def fbank(
         blackman_coeff (float): Constant coefficient for generalized Blackman window. (Default: ``0.42``)
         channel (int): Channel to extract (-1 -> expect mono, 0 -> left, 1 -> right) (Default: ``-1``)
         dither (float): Dithering constant (0.0 means no dither). If you turn this off, you should set
-            the energy_floor option, e.g. to 1.0 or 0.1 (Default: ``1.0``)
+            the energy_floor option, e.g. to 1.0 or 0.1 (Default: ``0.0``)
         energy_floor (float): Floor on energy (absolute, not relative) in Spectrogram computation. Caution:
             this floor is applied to the zeroth component, representing the total signal energy. The floor on the
-            individual spectrogram elements is fixed at std::numeric_limits<float>::epsilon(). (Default: ``0.0``)
+            individual spectrogram elements is fixed at std::numeric_limits<float>::epsilon(). (Default: ``1.0``)
         frame_length (float): Frame length in milliseconds (Default: ``25.0``)
         frame_shift (float): Frame shift in milliseconds (Default: ``10.0``)
         high_freq (float): High cutoff frequency for mel bins (if <= 0, offset from Nyquist) (Default: ``0.0``)
@@ -547,8 +547,8 @@ def _get_lifter_coeffs(num_ceps, cepstral_lifter):
 def mfcc(
-        waveform, blackman_coeff=0.42, cepstral_lifter=22.0, channel=-1, dither=1.0,
-        energy_floor=0.0, frame_length=25.0, frame_shift=10.0, high_freq=0.0, htk_compat=False,
+        waveform, blackman_coeff=0.42, cepstral_lifter=22.0, channel=-1, dither=0.0,
+        energy_floor=1.0, frame_length=25.0, frame_shift=10.0, high_freq=0.0, htk_compat=False,
         low_freq=20.0, num_ceps=13, min_duration=0.0, num_mel_bins=23, preemphasis_coefficient=0.97,
         raw_energy=True, remove_dc_offset=True, round_to_power_of_two=True,
         sample_frequency=16000.0, snip_edges=True, subtract_mean=False, use_energy=False,
@@ -562,10 +562,10 @@ def mfcc(
         cepstral_lifter (float): Constant that controls scaling of MFCCs (Default: ``22.0``)
         channel (int): Channel to extract (-1 -> expect mono, 0 -> left, 1 -> right) (Default: ``-1``)
         dither (float): Dithering constant (0.0 means no dither). If you turn this off, you should set
-            the energy_floor option, e.g. to 1.0 or 0.1 (Default: ``1.0``)
+            the energy_floor option, e.g. to 1.0 or 0.1 (Default: ``0.0``)
         energy_floor (float): Floor on energy (absolute, not relative) in Spectrogram computation. Caution:
             this floor is applied to the zeroth component, representing the total signal energy. The floor on the
-            individual spectrogram elements is fixed at std::numeric_limits<float>::epsilon(). (Default: ``0.0``)
+            individual spectrogram elements is fixed at std::numeric_limits<float>::epsilon(). (Default: ``1.0``)
         frame_length (float): Frame length in milliseconds (Default: ``25.0``)
         frame_shift (float): Frame shift in milliseconds (Default: ``10.0``)
         high_freq (float): High cutoff frequency for mel bins (if <= 0, offset from Nyquist) (Default: ``0.0``)
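Because this commit changes defaults in spectrogram, fbank, and mfcc, callers that relied on the earlier values (dither 1.0, energy_floor 0.0) may want to pin them explicitly. One way to do that is sketched below with `functools.partial`. The names `LEGACY_DEFAULTS`, `pin_legacy_defaults`, and `fake_fbank` are hypothetical and exist only for this illustration; `fake_fbank` is a stand-in that echoes its settings rather than computing features.

```python
from functools import partial

# Defaults before this commit, taken from the removed (-) lines of the diff.
LEGACY_DEFAULTS = {"dither": 1.0, "energy_floor": 0.0}


def pin_legacy_defaults(feature_fn):
    """Return feature_fn with the pre-commit dither/energy_floor values bound."""
    return partial(feature_fn, **LEGACY_DEFAULTS)


def fake_fbank(waveform, dither=0.0, energy_floor=1.0):
    """Hypothetical stand-in carrying the new defaults; it only echoes settings."""
    return {"dither": dither, "energy_floor": energy_floor}


legacy_fbank = pin_legacy_defaults(fake_fbank)

print(fake_fbank([0.0]))                # {'dither': 0.0, 'energy_floor': 1.0}
print(legacy_fbank([0.0]))              # {'dither': 1.0, 'energy_floor': 0.0}
print(legacy_fbank([0.0], dither=0.5))  # {'dither': 0.5, 'energy_floor': 0.0}
```

The same wrapper could presumably be applied to the functions touched in this diff, since all three accept dither and energy_floor as keyword arguments. An explicit keyword at the call site still overrides the pinned value.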