Switching to more robust pysoundfile for reading wav files

Commit dec81ac7 authored Nov 07, 2018 by Samuel Neugber Committed by Samuel Neugber Nov 08, 2018

Showing with 8 additions and 5 deletions

research/audioset/README.md +5 -3

research/audioset/vggish_input.py +3 -2

--- a/research/audioset/README.md
+++ b/research/audioset/README.md
@@ -49,14 +49,16 @@ VGGish depends on the following Python packages:
 * [`resampy`](http://resampy.readthedocs.io/en/latest/)
 * [`tensorflow`](http://www.tensorflow.org/)
 * [`six`](https://pythonhosted.org/six/)
+* [`pysoundfile`](https://pysoundfile.readthedocs.io/)
 These are all easily installable via, e.g., `pip install numpy` (as in the
 example command sequence below).
 Any reasonably recent version of these packages should work. TensorFlow should
-be at least version 1.0.  We have tested with Python 2.7.6 and 3.4.3 on an
+be at least version 1.0.  We have tested on an Ubuntu-like system with
-Ubuntu-like system with NumPy v1.13.1, SciPy v0.19.1, resampy v0.1.5, TensorFlow
+Python 2.7.6 and 3.4.3, NumPy v1.13.1, SciPy v0.19.1, resampy v0.1.5, TensorFlow
-v1.2.1, and Six v1.10.0.
+v1.2.1, Six v1.10.0 as well on Ubuntu and Windows 10 with Python 3.6.6, Numpy v1.15.4,
+SciPy v1.1.0, resampy v0.2.1, TensorFlow v1.3.0, Six v1.11.0 and PySoundFile 0.9.0. 
 VGGish also requires downloading two data files:

--- a/research/audioset/vggish_input.py
+++ b/research/audioset/vggish_input.py
@@ -17,11 +17,12 @@
 import numpy as np
 import resampy
-from scipy.io import wavfile
 import mel_features
 import vggish_params
+import soundfile as sf
 def waveform_to_examples(data, sample_rate):
  """Converts audio waveform into an array of examples for VGGish.
@@ -80,7 +81,7 @@ def wavfile_to_examples(wav_file):
  Returns:
    See waveform_to_examples.
  """
-  sr, wav_data = wavfile.read(wav_file)
+  wav_data, sr = sf.read(wav_file, dtype='int16')
  assert wav_data.dtype == np.int16, 'Bad sample type: %r' % wav_data.dtype
  samples = wav_data / 32768.0  # Convert to [-1.0, +1.0]
  return waveform_to_examples(samples, sr)
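
For readers adapting this change, the read-and-normalize step can be sketched in isolation as below. This is a minimal sketch, not part of the commit: the helper names `int16_to_float` and `read_wav_as_float` are hypothetical, and it assumes `numpy` and `soundfile` are installed. Note that `sf.read` returns `(data, samplerate)`, the reverse of the `(rate, data)` order returned by `scipy.io.wavfile.read`, which is why the tuple unpacking is swapped in the diff.

```python
import numpy as np


def int16_to_float(wav_data):
    """Scale int16 PCM samples to floats in [-1.0, +1.0)."""
    if wav_data.dtype != np.int16:
        raise ValueError('Bad sample type: %r' % wav_data.dtype)
    return wav_data / 32768.0


def read_wav_as_float(wav_file):
    """Hypothetical helper: read a WAV file, return (samples, sample_rate)."""
    # Imported lazily so int16_to_float stays usable without the dependency.
    import soundfile as sf
    # dtype='int16' asks soundfile to deliver int16 samples; the return
    # order is (data, samplerate), unlike scipy.io.wavfile.read.
    wav_data, sr = sf.read(wav_file, dtype='int16')
    return int16_to_float(wav_data), sr
```

With a WAV file on disk, `samples, sr = read_wav_as_float('example.wav')` (the file name is a placeholder) would yield values that can be passed to `waveform_to_examples(samples, sr)`.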