The degree encoder is a learnable embedding layer that encodes the degree of each node into a vector. It takes as input the batched input and output degrees of graph nodes, and outputs the degree embeddings of the nodes.

.. code:: python

    degree_encoder = dgl.nn.DegreeEncoder(
        max_degree=8,  # the maximum degree to cut off
        embedding_dim=512  # the dimension of the degree embedding
    )
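
As a minimal usage sketch (the tensor names ``in_degree``, ``out_degree`` and ``node_feat`` are illustrative, assuming the degrees have already been batched and padded to shape ``(num_graphs, max_num_nodes)``), the degree embedding is typically added to the node features before the first Graphormer layer:

.. code:: python

    import torch as th

    # stack in- and out-degrees so the encoder embeds both directions
    # (stacking convention assumed from the default direction="both")
    deg_emb = degree_encoder(th.stack((in_degree, out_degree)))

    # inject degree information into the input node features
    node_feat = node_feat + deg_emb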

Path Encoding
-------------

The path encoder encodes the edge features on the shortest path between two nodes to get attention bias for the self-attention module. It takes as input the batched edge features along the shortest paths and outputs the attention bias based on path encoding.

.. code:: python

    path_encoder = PathEncoder(
        max_len=5,  # the maximum length of the shortest path
        feat_dim=512,  # the dimension of the edge feature
        num_heads=8,  # the number of attention heads
    )
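
A rough usage sketch, assuming ``dist`` holds the batched shortest-path distances of shape ``(num_graphs, max_num_nodes, max_num_nodes)`` and ``path_data`` holds the edge features along those paths (both names are illustrative; producing them is covered in the data preparation section):

.. code:: python

    # returns a per-head attention bias derived from the edge features
    # on the shortest path between every pair of nodes
    path_bias = path_encoder(dist, path_data)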

Spatial Encoding
----------------

The spatial encoder encodes the shortest distance between two nodes to get attention bias for the self-attention module. It takes as input the shortest distance between two nodes and outputs the attention bias based on spatial encoding.

.. code:: python

    spatial_encoder = SpatialEncoder(
        max_dist=5,  # the maximum distance between two nodes
        num_heads=8,  # the number of attention heads
    )
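
Since both encoders produce a per-head bias over node pairs, a common pattern (sketched below with the same illustrative ``dist`` and ``path_bias`` tensors as above) is to sum the two encodings into a single attention bias for the Graphormer layers:

.. code:: python

    spatial_bias = spatial_encoder(dist)

    # combine spatial and path encodings into one attention bias of shape
    # (num_graphs, max_num_nodes, max_num_nodes, num_heads)
    attn_bias = path_bias + spatial_bias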

Graphormer Layers
-----------------

The Graphormer layer is like a Transformer encoder layer with the Multi-head Attention part replaced by ``dgl.nn.BiasedMHA``, which supports the attention bias computed above. We can stack multiple Graphormer layers as a list just like implementing a Transformer encoder in PyTorch.

.. code:: python

    layers = th.nn.ModuleList([
        GraphormerLayer(
            feat_size=512,  # the dimension of the input node features
            ...
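
For reference, a fuller sketch of the layer stack is below; ``hidden_size``, ``num_heads``, the dropout rate and the number of layers are illustrative hyperparameters, not values prescribed by this guide:

.. code:: python

    layers = th.nn.ModuleList([
        GraphormerLayer(
            feat_size=512,     # input/output node feature dimension
            hidden_size=1024,  # hidden dimension of the feed-forward network
            num_heads=8,       # number of attention heads
            dropout=0.1,       # dropout rate (illustrative)
        )
        for _ in range(6)      # number of stacked layers (illustrative)
    ])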

Model Forward
-------------

Grouping the modules above defines the primary components of the Graphormer model. We can then define the forward process as follows:

.. code:: python

    next(iter(dataloader))  # we will use the first batch as an example
    num_graphs, max_num_nodes, _ = node_feat.shape
    ...
        attn_bias=attn_bias,
    )
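
To make the omitted steps concrete, here is a minimal end-to-end sketch of the forward pass. It assumes the dataloader yields padded node features, in- and out-degrees, an attention mask, shortest-path edge features and shortest-path distances for each batch; the field names and the exact batching scheme are illustrative, and details such as the graph-level token are still left out:

.. code:: python

    import torch as th

    node_feat, in_degree, out_degree, attn_mask, path_data, dist = \
        next(iter(dataloader))  # illustrative batch layout
    num_graphs, max_num_nodes, _ = node_feat.shape

    # degree encoding is added to the input node features
    deg_emb = degree_encoder(th.stack((in_degree, out_degree)))
    x = node_feat + deg_emb

    # path and spatial encodings together form the attention bias
    attn_bias = path_encoder(dist, path_data) + spatial_encoder(dist)

    # run the stacked Graphormer layers
    for layer in layers:
        x = layer(
            x,
            attn_mask=attn_mask,
            attn_bias=attn_bias,
        )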

For simplicity, we omit some details in the forward process. For the complete implementation, please refer to the `Graphormer example <https://github.com/dmlc/dgl/tree/master/examples/core/Graphormer>`_.

You can also explore other `utility modules <https://docs.dgl.ai/api/python/nn-pytorch.html#utility-modules-for-graph-transformer>`_ to customize your own graph transformer model. In the next section, we will show how to prepare the data for training.