Commit 831281ce authored by Manoj Plakal, committed by Dan Ellis

Cleaned up dependences and install instructions for vggish and yamnet. (#8059)

- Made code work with either TF v1.x or TF v2.x, while explicitly
  enabling v1.x behavior.
- Pulled slim from tf_slim package instead of through tensorflow
  contrib. Note that tf_slim itself uses tensorflow contrib so
  it requires using TF v1.x for now (referenced a relevant PR
  which should remove this limitation once it gets merged).
- Removed all mention of scipy. Switched wav writing to soundfile.
- Switched package name to soundfile instead of pysoundfile. The
  former is the newer name.
- Updated installation instructions for both vggish and yamnet to
  reflect these changes.
- Tested new installation procedures. vggish works with TF v1.15,
  yamnet works with TF v1.15.0 as well as TF v2.1.0.
parent 855ed8bc
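The first bullet of the commit message describes a pattern that recurs in each Python file touched by this diff: import `tensorflow.compat.v1` under the name `tf`, then call `tf.disable_v2_behavior()` before any graph is built. A minimal sketch of that pattern follows. The wrapper function `import_tf_v1` is hypothetical and not part of the commit, which performs the import at module level instead.

```python
def import_tf_v1():
    """Return the TensorFlow v1 compatibility module with v2 behavior disabled.

    Hypothetical helper for illustration; the commit does the same two
    steps directly at the top of each module.
    """
    # The compat.v1 module is importable from both TF v1.15 and TF v2.x,
    # the versions the commit message reports testing.
    import tensorflow.compat.v1 as tf

    # Switch off eager execution and other v2 defaults so that graph-mode
    # code (sessions, placeholders) keeps working under TF v2.x.
    tf.disable_v2_behavior()
    return tf
```

Calling `tf = import_tf_v1()` once at start-up then lets the rest of a script use v1-style names such as `tf.Session` unchanged.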
...@@ -15,19 +15,18 @@ the released embedding features. ...@@ -15,19 +15,18 @@ the released embedding features.
VGGish depends on the following Python packages: VGGish depends on the following Python packages:
* [`numpy`](http://www.numpy.org/) * [`numpy`](http://www.numpy.org/)
* [`scipy`](http://www.scipy.org/)
* [`resampy`](http://resampy.readthedocs.io/en/latest/) * [`resampy`](http://resampy.readthedocs.io/en/latest/)
* [`tensorflow`](http://www.tensorflow.org/) * [`tensorflow`](http://www.tensorflow.org/) (currently, only TF v1.x)
* [`tf_slim`](https://github.com/google-research/tf-slim)
* [`six`](https://pythonhosted.org/six/) * [`six`](https://pythonhosted.org/six/)
* [`pysoundfile`](https://pysoundfile.readthedocs.io/) * [`soundfile`](https://pysoundfile.readthedocs.io/)
These are all easily installable via, e.g., `pip install numpy` (as in the These are all easily installable via, e.g., `pip install numpy` (as in the
example command sequence below). sample installation session below).
Any reasonably recent version of these packages should work. TensorFlow should Any reasonably recent version of these packages should work. Note that we currently only support
be at least version 1.0. We have tested that everything works on Ubuntu and TensorFlow v1.x due to a [`tf_slim` limitation](https://github.com/google-research/tf-slim/pull/1).
Windows 10 with Python 3.6.6, Numpy v1.15.4, SciPy v1.1.0, resampy v0.2.1, TensorFlow v1.15 (the latest version as of Jan 2020) has been tested to work.
TensorFlow v1.3.0, Six v1.11.0 and PySoundFile 0.9.0.
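One way to confirm that an environment matches the updated requirements is to query the installed package versions. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helper names `installed_versions` and `tensorflow_is_v1` are hypothetical, and the sketch assumes the distribution names match the names listed in the README.

```python
from importlib import metadata

# Package names as listed in the updated VGGish README (assumed to match
# the installed distribution names).
VGGISH_DEPENDENCIES = ["numpy", "resampy", "tensorflow", "tf_slim", "six", "soundfile"]


def installed_versions(packages):
    """Map each package name to its installed version string, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def tensorflow_is_v1(version):
    """Return True if a version string such as '1.15.0' has major version 1."""
    return version is not None and version.split(".")[0] == "1"
```

For example, `tensorflow_is_v1(installed_versions(VGGISH_DEPENDENCIES)["tensorflow"])` reports whether the installed TensorFlow satisfies the v1.x restriction that VGGish currently has.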
VGGish also requires downloading two data files: VGGish also requires downloading two data files:
...@@ -57,14 +56,11 @@ Here's a sample installation and test session: ...@@ -57,14 +56,11 @@ Here's a sample installation and test session:
# $ deactivate # $ deactivate
# Within the virtual environment, do not use 'sudo'. # Within the virtual environment, do not use 'sudo'.
# Upgrade pip first. # Upgrade pip first. Also make sure wheel is installed.
$ sudo python -m pip install --upgrade pip $ sudo python -m pip install --upgrade pip wheel
# Install dependences. Resampy needs to be installed after NumPy and SciPy # Install all dependences.
# are already installed. $ sudo pip install numpy resampy tensorflow==1.15 tf_slim six soundfile
$ sudo pip install numpy scipy soundfile
$ sudo pip install resampy six
$ sudo pip install tensorflow==1.14
# Clone TensorFlow models repo into a 'models' directory. # Clone TensorFlow models repo into a 'models' directory.
$ git clone https://github.com/tensorflow/models.git $ git clone https://github.com/tensorflow/models.git
......
...@@ -47,9 +47,10 @@ Usage: ...@@ -47,9 +47,10 @@ Usage:
from __future__ import print_function from __future__ import print_function
import numpy as np import numpy as np
from scipy.io import wavfile
import six import six
import tensorflow as tf import soundfile
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import vggish_input import vggish_input
import vggish_params import vggish_params
...@@ -93,7 +94,7 @@ def main(_): ...@@ -93,7 +94,7 @@ def main(_):
# Convert to signed 16-bit samples. # Convert to signed 16-bit samples.
samples = np.clip(x * 32768, -32768, 32767).astype(np.int16) samples = np.clip(x * 32768, -32768, 32767).astype(np.int16)
wav_file = six.BytesIO() wav_file = six.BytesIO()
wavfile.write(wav_file, sr, samples) soundfile.write(wav_file, samples, sr, format='WAV', subtype='PCM_16')
wav_file.seek(0) wav_file.seek(0)
examples_batch = vggish_input.wavfile_to_examples(wav_file) examples_batch = vggish_input.wavfile_to_examples(wav_file)
print(examples_batch) print(examples_batch)
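The hunk above scales floating-point samples to signed 16-bit integers and writes them as a WAV file into an in-memory buffer. The sketch below illustrates the same round trip, but it deliberately swaps in the standard-library `wave` module for `soundfile` and plain lists for NumPy arrays so that it runs without extra packages. The function names are hypothetical and the sketch handles mono audio only.

```python
import io
import struct
import wave


def float_to_int16(samples):
    """Scale floats in [-1.0, 1.0] to signed 16-bit ints, clipping overflow."""
    return [max(-32768, min(32767, int(x * 32768))) for x in samples]


def write_wav_bytes(samples, sample_rate):
    """Write mono float samples as 16-bit PCM WAV into a BytesIO buffer."""
    pcm = float_to_int16(samples)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)   # mono
        writer.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    buf.seek(0)  # rewind so a reader starts at the WAV header
    return buf


def read_wav_bytes(buf):
    """Read 16-bit mono PCM samples and the sample rate back from a buffer."""
    with wave.open(buf, "rb") as reader:
        count = reader.getnframes()
        frames = reader.readframes(count)
        return list(struct.unpack("<%dh" % count, frames)), reader.getframerate()
```

Rewinding the buffer with `seek(0)` before handing it on mirrors the `wav_file.seek(0)` call in the diff: without it, a reader would start at the end of the data and find nothing.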
......
...@@ -27,7 +27,7 @@ try: ...@@ -27,7 +27,7 @@ try:
def wav_read(wav_file): def wav_read(wav_file):
wav_data, sr = sf.read(wav_file, dtype='int16') wav_data, sr = sf.read(wav_file, dtype='int16')
return wav_data, sr return wav_data, sr
except ImportError: except ImportError:
def wav_read(wav_file): def wav_read(wav_file):
......
...@@ -31,10 +31,10 @@ https://github.com/tensorflow/models/blob/master/research/slim/nets/vgg.py ...@@ -31,10 +31,10 @@ https://github.com/tensorflow/models/blob/master/research/slim/nets/vgg.py
""" """
import tensorflow.compat.v1 as tf import tensorflow.compat.v1 as tf
from tensorflow.contrib import slim as contrib_slim tf.disable_v2_behavior()
import vggish_params as params import tf_slim as slim
slim = contrib_slim import vggish_params as params
def define_vggish_slim(training=False): def define_vggish_slim(training=False):
......
...@@ -32,7 +32,8 @@ Usage: ...@@ -32,7 +32,8 @@ Usage:
from __future__ import print_function from __future__ import print_function
import numpy as np import numpy as np
import tensorflow as tf import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import vggish_input import vggish_input
import vggish_params import vggish_params
......
...@@ -48,14 +48,15 @@ from __future__ import print_function ...@@ -48,14 +48,15 @@ from __future__ import print_function
from random import shuffle from random import shuffle
import numpy as np import numpy as np
import tensorflow as tf import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import tf_slim as slim
import vggish_input import vggish_input
import vggish_params import vggish_params
import vggish_slim import vggish_slim
flags = tf.app.flags flags = tf.app.flags
slim = tf.contrib.slim
flags.DEFINE_integer( flags.DEFINE_integer(
'num_batches', 30, 'num_batches', 30,
......
...@@ -13,7 +13,6 @@ for applying the model to input sound files. ...@@ -13,7 +13,6 @@ for applying the model to input sound files.
YAMNet depends on the following Python packages: YAMNet depends on the following Python packages:
* [`numpy`](http://www.numpy.org/) * [`numpy`](http://www.numpy.org/)
* [`scipy`](http://www.scipy.org/)
* [`resampy`](http://resampy.readthedocs.io/en/latest/) * [`resampy`](http://resampy.readthedocs.io/en/latest/)
* [`tensorflow`](http://www.tensorflow.org/) * [`tensorflow`](http://www.tensorflow.org/)
* [`pysoundfile`](https://pysoundfile.readthedocs.io/) * [`pysoundfile`](https://pysoundfile.readthedocs.io/)
...@@ -22,9 +21,9 @@ These are all easily installable via, e.g., `pip install numpy` (as in the ...@@ -22,9 +21,9 @@ These are all easily installable via, e.g., `pip install numpy` (as in the
example command sequence below). example command sequence below).
Any reasonably recent version of these packages should work. TensorFlow should Any reasonably recent version of these packages should work. TensorFlow should
be at least version 1.8 to ensure Keras support is included. We have tested be at least version 1.8 to ensure Keras support is included. Note that while
that everything works on Ubuntu and MacOS with Python 3.7.2, Numpy v1.15.4, the code works fine with TensorFlow v1.x or v2.x, we explicitly enable v1.x
SciPy v1.1.0, resampy v0.2.1, TensorFlow v1.14.0, and PySoundFile 0.9.0. behavior.
YAMNet also requires downloading the following data file: YAMNet also requires downloading the following data file:
...@@ -38,13 +37,11 @@ runs some synthetic signals through the model and checks the outputs. ...@@ -38,13 +37,11 @@ runs some synthetic signals through the model and checks the outputs.
Here's a sample installation and test session: Here's a sample installation and test session:
```shell ```shell
# Upgrade pip first. # Upgrade pip first. Also make sure wheel is installed.
python -m pip install --upgrade pip python -m pip install --upgrade pip wheel
# Install dependences. Resampy needs to be installed after NumPy and SciPy # Install dependences.
# are already installed. pip install numpy resampy tensorflow soundfile
pip install numpy scipy
pip install resampy tensorflow soundfile
# Clone TensorFlow models repo into a 'models' directory. # Clone TensorFlow models repo into a 'models' directory.
git clone https://github.com/tensorflow/models.git git clone https://github.com/tensorflow/models.git
......