OpenDAS / dgl · Commit d6dfaa9b

[Doc] Capsule network tutorial, edit pass (#1026)

Edited for grammar and style.

Authored Dec 01, 2019 by John Andrilla; committed by Minjie Wang, Dec 01, 2019.
Parent: 10b104dd

1 changed file, 32 additions and 34 deletions: tutorials/models/4_old_wines/2_capsule.py
"""
"""
.. _model-capsule:
.. _model-capsule:
Capsule
N
etwork
T
utorial
Capsule
n
etwork
t
utorial
===========================
===========================
**Author**: Jinjing Zhou, `Jake Zhao <https://cs.nyu.edu/~jakezhao/>`_, Zheng Zhang, Jinyang Li
**Author**: Jinjing Zhou, `Jake Zhao <https://cs.nyu.edu/~jakezhao/>`_, Zheng Zhang, Jinyang Li
It is perhaps a little surprising that some of the more classical models
In this tutorial, you learn how to describe one of the more classical models in terms of graphs. The approach
can also be described in terms of graphs, offering a different
offers a different perspective. The tutorial describes how to implement a Capsule model for the
perspective. This tutorial describes how this can be done for the
`capsule network <http://arxiv.org/abs/1710.09829>`__.
`capsule network <http://arxiv.org/abs/1710.09829>`__.
"""
"""
#######################################################################################
# Key ideas of Capsule
# --------------------
#
# The Capsule model offers two key ideas: richer representation and dynamic routing.
#
# **Richer representation** -- In classic convolutional networks, a scalar
# value represents the activation of a given feature. By contrast, a
# capsule outputs a vector. The vector's length represents the probability
# of a feature being present. The vector's orientation represents the
...
#
# |image0|
#
# **Dynamic routing** -- The output of a capsule is sent to
# certain parents in the layer above based on how well the capsule's
# prediction agrees with that of a parent. Such dynamic
# routing-by-agreement generalizes the static routing of max-pooling.
#
# During training, routing is accomplished iteratively. Each iteration adjusts
# routing weights between capsules based on their observed agreements,
# in a manner similar to a k-means algorithm or `competitive
# learning <https://en.wikipedia.org/wiki/Competitive_learning>`__.
#
# In this tutorial, you see how a capsule's dynamic routing algorithm can be
# naturally expressed as a graph algorithm. The implementation is adapted
# from `Cedric
# Chee <https://github.com/cedrickchee/capsule-net-pytorch>`__, replacing
# only the routing layer. This version achieves similar speed and accuracy.
#
# Model implementation
# ----------------------
# Step 1: Setup and graph initialization
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# The connectivity between two layers of capsules forms a directed,
...
# Step 2: Define message passing functions
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# This is the pseudocode for Capsule's routing algorithm.
#
# |image2|
#
# Implement pseudocode lines 4-7 in the class `DGLRoutingLayer` as the following steps:
#
# 1. Calculate coupling coefficients.
#
#    - Coefficients are the softmax over all out-edges of in-capsules.
#      :math:`\textbf{c}_{i,j} = \text{softmax}(\textbf{b}_{i,j})`.
#
# 2. Calculate weighted sum over all in-capsules.
#
#    - Output of a capsule is equal to the weighted sum of its in-capsules
#      :math:`s_j=\sum_i c_{ij}\hat{u}_{j|i}`
#
# 3. Squash outputs.
#
#    - Squash the length of a capsule's output vector to range (0,1), so it can represent the probability of some feature being present.
#    - :math:`v_j=\text{squash}(s_j)=\frac{||s_j||^2}{1+||s_j||^2}\frac{s_j}{||s_j||}`
#
# 4. Update weights by the amount of agreement.
#
#    - The scalar product :math:`\hat{u}_{j|i}\cdot v_j` can be considered as how well capsule :math:`i` agrees with :math:`j`. It is used to update
#      :math:`b_{ij}=b_{ij}+\hat{u}_{j|i}\cdot v_j`
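The four steps above can be sketched in plain Python. This is a toy illustration only, not the tutorial's actual `DGLRoutingLayer` (which uses DGL message passing and PyTorch tensors); the helper names `softmax`, `squash`, and `routing_iteration` and the tiny 2-in/2-out example are hypothetical.

```python
import math

def softmax(xs):
    # Step 1: coupling coefficients c_ij = softmax of logits b over a capsule's out-edges.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def squash(v):
    # Step 3: rescale a vector so its length lies in (0, 1).
    norm2 = sum(x * x for x in v)
    norm = math.sqrt(norm2)
    scale = norm2 / (1 + norm2) / norm if norm > 0 else 0.0
    return [scale * x for x in v]

def routing_iteration(b, u_hat):
    # b[i][j]: routing logit from in-capsule i to out-capsule j.
    # u_hat[i][j]: prediction vector of in-capsule i for out-capsule j.
    n_in, n_out = len(b), len(b[0])
    f = len(u_hat[0][0])
    c = [softmax(row) for row in b]                       # step 1
    v = []
    for j in range(n_out):
        # step 2: weighted sum s_j over all in-capsules.
        s_j = [sum(c[i][j] * u_hat[i][j][k] for i in range(n_in)) for k in range(f)]
        v.append(squash(s_j))                             # step 3
    for i in range(n_in):                                 # step 4: b_ij += u_hat . v_j
        for j in range(n_out):
            b[i][j] += sum(u * x for u, x in zip(u_hat[i][j], v[j]))
    return b, c, v

b = [[0.0, 0.0], [0.0, 0.0]]
u_hat = [[[1.0, 0.0], [0.0, 1.0]],
         [[1.0, 0.0], [0.0, 1.0]]]
b, c, v = routing_iteration(b, u_hat)
```

Each row of `c` sums to 1 (it is a softmax), and every squashed output has length below 1, matching the probability interpretation above.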
...
# Step 3: Testing
# ~~~~~~~~~~~~~~~
#
# Make a simple 20x10 capsule layer.
in_nodes = 20
out_nodes = 10
f_size = 4
...
routing = DGLRoutingLayer(in_nodes, out_nodes, f_size)
############################################################################################################
# You can visualize a capsule network's behavior by monitoring the entropy
# of coupling coefficients. They should start high and then drop, as the
# weights gradually concentrate on fewer edges.
entropy_list = []
dist_list = []
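The quantity being monitored is the Shannon entropy of each out-capsule's incoming coupling coefficients. A minimal sketch in plain Python (the `entropy` helper and the two example distributions are hypothetical, not the tutorial's actual plotting code):

```python
import math

def entropy(probs):
    # Shannon entropy of a distribution: high while coupling is spread out,
    # low once routing has concentrated on a few edges.
    return -sum(p * math.log(p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # early routing: undecided
peaked = [0.97, 0.01, 0.01, 0.01]    # late routing: concentrated

print(entropy(uniform))  # log(4), the maximum for 4 outcomes
print(entropy(peaked))   # much smaller
```

Plotting this value per routing iteration gives the falling curve shown below.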
...
############################################################################################################
# |image3|
#
# Alternatively, you can watch the evolution of the histograms.
import seaborn as sns
import matplotlib.animation as animation
...
############################################################################################################
# |image4|
#
# You can monitor how lower-level capsules gradually attach to one of the
# higher-level ones.
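The attachment can be read off the coupling coefficients directly: each lower-level capsule attaches to the parent with its largest coefficient. A plain-Python sketch (the matrix `c` below is a made-up example, not output of the tutorial's routing layer, and this replaces the networkx animation with a simple argmax):

```python
# c[i][j]: coupling coefficient from in-capsule i to out-capsule j.
c = [
    [0.1, 0.8, 0.1],   # capsule 0 mostly routes to parent 1
    [0.7, 0.2, 0.1],   # capsule 1 mostly routes to parent 0
    [0.3, 0.3, 0.4],   # capsule 2 is still fairly undecided
]

# The parent each lower-level capsule currently attaches to.
attachment = [max(range(len(row)), key=row.__getitem__) for row in c]
print(attachment)  # [1, 0, 2]
```

The bipartite animation below visualizes the same idea, with edge thickness in place of the argmax.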
import networkx as nx
from networkx.algorithms import bipartite
...
############################################################################################################
# |image5|
#
# The full code of this visualization is provided on
# `GitHub <https://github.com/dmlc/dgl/blob/master/examples/pytorch/capsule/simple_routing.py>`__. The complete
# code that trains on MNIST is also on `GitHub <https://github.com/dmlc/dgl/tree/tutorial/examples/pytorch/capsule>`__.
#
# .. |image0| image:: https://i.imgur.com/55Ovkdh.png
# .. |image1| image:: https://i.imgur.com/9tc6GLl.png
...