Commit 831281ce authored by Manoj Plakal, committed by Dan Ellis

Cleaned up dependences and install instructions for vggish and yamnet. (#8059)

- Made code work with either TF v1.x or TF v2.x, while explicitly
  enabling v1.x behavior.
- Pulled slim from tf_slim package instead of through tensorflow
  contrib. Note that tf_slim itself uses tensorflow contrib so
  it requires using TF v1.x for now (referenced a relevant PR
  which should remove this limitation once it gets merged).
- Removed all mention of scipy. Switched wav writing to soundfile.
- Switched package name to soundfile instead of pysoundfile. The
  former is the newer name.
- Updated installation instructions for both vggish and yamnet to
  reflect these changes.
- Tested new installation procedures. vggish works with TF v1.15,
  yamnet works with TF v1.15.0 as well as TF v2.1.0.
parent 855ed8bc
......@@ -15,19 +15,18 @@ the released embedding features.
VGGish depends on the following Python packages:
* [`numpy`](http://www.numpy.org/)
* [`scipy`](http://www.scipy.org/)
* [`resampy`](http://resampy.readthedocs.io/en/latest/)
* [`tensorflow`](http://www.tensorflow.org/)
* [`tensorflow`](http://www.tensorflow.org/) (currently, only TF v1.x)
* [`tf_slim`](https://github.com/google-research/tf-slim)
* [`six`](https://pythonhosted.org/six/)
* [`pysoundfile`](https://pysoundfile.readthedocs.io/)
* [`soundfile`](https://pysoundfile.readthedocs.io/)
These are all easily installable via, e.g., `pip install numpy` (as in the
example command sequence below).
sample installation session below).
Any reasonably recent version of these packages should work. TensorFlow should
be at least version 1.0. We have tested that everything works on Ubuntu and
Windows 10 with Python 3.6.6, Numpy v1.15.4, SciPy v1.1.0, resampy v0.2.1,
TensorFlow v1.3.0, Six v1.11.0 and PySoundFile 0.9.0.
Any reasonably recent version of these packages should work. Note that we currently only support
TensorFlow v1.x due to a [`tf_slim` limitation](https://github.com/google-research/tf-slim/pull/1).
TensorFlow v1.15 (the latest version as of Jan 2020) has been tested to work.
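
One way to catch an unsupported install early is to check the TensorFlow version string before building the model. The following is a minimal sketch, not part of VGGish: the helper names `tf_major_version` and `require_tf1` are hypothetical, and the check assumes the usual dotted version format such as `1.15.0`.

```python
def tf_major_version(version_string):
    """Return the major version number from a string such as '1.15.0'."""
    return int(version_string.split(".")[0])


def require_tf1(version_string):
    """Raise RuntimeError unless version_string names a TensorFlow v1.x release."""
    major = tf_major_version(version_string)
    if major != 1:
        raise RuntimeError(
            "VGGish currently needs TensorFlow v1.x, found %s" % version_string)
    return version_string


# Typical use (commented out so this sketch runs without TensorFlow installed):
# import tensorflow as tf
# require_tf1(tf.__version__)
```

Failing fast this way gives a clearer message than an import error raised later from inside `tf_slim`.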
VGGish also requires downloading two data files:
......@@ -57,14 +56,11 @@ Here's a sample installation and test session:
# $ deactivate
# Within the virtual environment, do not use 'sudo'.
# Upgrade pip first.
$ sudo python -m pip install --upgrade pip
# Upgrade pip first. Also make sure wheel is installed.
$ sudo python -m pip install --upgrade pip wheel
# Install dependences. Resampy needs to be installed after NumPy and SciPy
# are already installed.
$ sudo pip install numpy scipy soundfile
$ sudo pip install resampy six
$ sudo pip install tensorflow==1.14
# Install all dependences.
$ sudo pip install numpy resampy tensorflow==1.15 tf_slim six soundfile
# Clone TensorFlow models repo into a 'models' directory.
$ git clone https://github.com/tensorflow/models.git
......
......@@ -47,9 +47,10 @@ Usage:
from __future__ import print_function
import numpy as np
from scipy.io import wavfile
import six
import tensorflow as tf
import soundfile
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import vggish_input
import vggish_params
......@@ -93,7 +94,7 @@ def main(_):
# Convert to signed 16-bit samples.
samples = np.clip(x * 32768, -32768, 32767).astype(np.int16)
wav_file = six.BytesIO()
wavfile.write(wav_file, sr, samples)
soundfile.write(wav_file, samples, sr, format='WAV', subtype='PCM_16')
wav_file.seek(0)
examples_batch = vggish_input.wavfile_to_examples(wav_file)
print(examples_batch)
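
The conversion in this hunk scales floating-point samples to signed 16-bit integers, clips them, and writes a WAV file into an in-memory buffer. The sketch below illustrates the same idea using only the Python standard library (`wave`, `struct`, `io`) in place of the numpy and soundfile calls the actual code uses. The function names `float_to_int16` and `write_wav_bytes` are hypothetical, and mono audio is assumed.

```python
import io
import struct
import wave


def float_to_int16(samples):
    """Scale floats in [-1.0, 1.0] to int16, clipping out-of-range values."""
    # 1.0 * 32768 would overflow int16, so clip to [-32768, 32767].
    return [max(-32768, min(32767, int(s * 32768))) for s in samples]


def write_wav_bytes(samples, sample_rate):
    """Return an in-memory mono 16-bit PCM WAV file holding `samples`."""
    pcm = float_to_int16(samples)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)       # mono (assumption for this sketch)
        writer.setsampwidth(2)       # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    buf.seek(0)                      # rewind so a reader starts at the header
    return buf
```

As in the diff, the buffer is rewound with `seek(0)` before being handed to a reader; without that, a consumer would start at the end of the data and see an empty file.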
......
......@@ -27,7 +27,7 @@ try:
def wav_read(wav_file):
wav_data, sr = sf.read(wav_file, dtype='int16')
return wav_data, sr
except ImportError:
def wav_read(wav_file):
......
......@@ -31,10 +31,10 @@ https://github.com/tensorflow/models/blob/master/research/slim/nets/vgg.py
"""
import tensorflow.compat.v1 as tf
from tensorflow.contrib import slim as contrib_slim
import vggish_params as params
tf.disable_v2_behavior()
import tf_slim as slim
slim = contrib_slim
import vggish_params as params
def define_vggish_slim(training=False):
......
......@@ -32,7 +32,8 @@ Usage:
from __future__ import print_function
import numpy as np
import tensorflow as tf
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import vggish_input
import vggish_params
......
......@@ -48,14 +48,15 @@ from __future__ import print_function
from random import shuffle
import numpy as np
import tensorflow as tf
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import tf_slim as slim
import vggish_input
import vggish_params
import vggish_slim
flags = tf.app.flags
slim = tf.contrib.slim
flags.DEFINE_integer(
'num_batches', 30,
......
......@@ -13,7 +13,6 @@ for applying the model to input sound files.
YAMNet depends on the following Python packages:
* [`numpy`](http://www.numpy.org/)
* [`scipy`](http://www.scipy.org/)
* [`resampy`](http://resampy.readthedocs.io/en/latest/)
* [`tensorflow`](http://www.tensorflow.org/)
* [`pysoundfile`](https://pysoundfile.readthedocs.io/)
......@@ -22,9 +21,9 @@ These are all easily installable via, e.g., `pip install numpy` (as in the
example command sequence below).
Any reasonably recent version of these packages should work. TensorFlow should
be at least version 1.8 to ensure Keras support is included. We have tested
that everything works on Ubuntu and MacOS with Python 3.7.2, Numpy v1.15.4,
SciPy v1.1.0, resampy v0.2.1, TensorFlow v1.14.0, and PySoundFile 0.9.0.
be at least version 1.8 to ensure Keras support is included. Note that while
the code works fine with TensorFlow v1.x or v2.x, we explicitly enable v1.x
behavior.
YAMNet also requires downloading the following data file:
......@@ -38,13 +37,11 @@ runs some synthetic signals through the model and checks the outputs.
Here's a sample installation and test session:
```shell
# Upgrade pip first.
python -m pip install --upgrade pip
# Upgrade pip first. Also make sure wheel is installed.
python -m pip install --upgrade pip wheel
# Install dependences. Resampy needs to be installed after NumPy and SciPy
# are already installed.
pip install numpy scipy
pip install resampy tensorflow soundfile
# Install dependences.
pip install numpy resampy tensorflow soundfile
# Clone TensorFlow models repo into a 'models' directory.
git clone https://github.com/tensorflow/models.git
......