Unverified Commit 982f2028 authored by rudongyu, committed by GitHub

[Doc Fix] fix the format of gt doc (#6949)

parent dfff53bc
@@ -5,6 +5,7 @@ In this section, we will prepare the data for the Graphormer model introduced be

.. code:: python

    def collate(graphs):
        # compute shortest path features, can be done in advance
        for g in graphs:
......
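Below is a minimal, hedged sketch of what such a ``collate`` body might compute, assuming ``dgl.shortest_dist`` with ``return_paths=True`` is available for this purpose (as in recent DGL releases); the feature names ``"spd"`` and ``"path"`` are illustrative and not part of the original snippet.

.. code:: python

    import dgl

    def collate(graphs):
        # compute shortest path features, can be done in advance
        for g in graphs:
            # spd: (N, N) shortest distances between all node pairs;
            # path: (N, N, max_len) edge ids along each shortest path
            # (assumed return values of dgl.shortest_dist)
            spd, path = dgl.shortest_dist(g, root=None, return_paths=True)
            g.ndata["spd"] = spd
            g.ndata["path"] = path
        # padding node features, degrees, distances and edge features
        # into batched tensors is omitted in this sketch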
🆕 Tutorial: Graph Transformer
==============================

This tutorial introduces the **graph transformer** (:mod:`~dgl.nn.gt`) module,
which is a set of utility modules for building and training graph transformer models.

.. toctree::
    :maxdepth: 2
......
@@ -12,6 +12,7 @@ Degree Encoding

The degree encoder is a learnable embedding layer that encodes the degree of each node into a vector. It takes as input the batched input and output degrees of graph nodes, and outputs the degree embeddings of the nodes.

.. code:: python

    degree_encoder = dgl.nn.DegreeEncoder(
        max_degree=8,      # the maximum degree to cut off
        embedding_dim=512  # the dimension of the degree embedding
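The snippet above is truncated in this diff. The following is a hedged usage sketch; the layout of the degree tensor passed to ``forward`` (stacked in-/out-degrees) and the output shape are assumptions, so check the ``dgl.nn.DegreeEncoder`` API reference for the exact expected input.

.. code:: python

    import dgl
    import torch as th

    degree_encoder = dgl.nn.DegreeEncoder(
        max_degree=8,      # the maximum degree to cut off
        embedding_dim=512  # the dimension of the degree embedding
    )

    # batched, padded in-/out-degrees of shape (batch_size, max_num_nodes)
    in_degree = th.randint(0, 8, (16, 64))
    out_degree = th.randint(0, 8, (16, 64))

    # stacking the two degree tensors is an assumed input layout; the result
    # is one 512-dim embedding per node, which is added to the node features
    deg_emb = degree_encoder(th.stack((in_degree, out_degree)))
    node_feat = th.rand(16, 64, 512)  # padded node features (B, N, feat)
    node_feat = node_feat + deg_emb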
@@ -22,6 +23,7 @@ Path Encoding

The path encoder encodes the edge features on the shortest path between two nodes to get attention bias for the self-attention module. It takes as input the batched edge features along the shortest paths and outputs the attention bias based on path encoding.

.. code:: python

    path_encoder = PathEncoder(
        max_len=5,     # the maximum length of the shortest path
        feat_dim=512,  # the dimension of the edge feature
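Again a hedged sketch: the ``num_heads`` constructor argument, the ``forward(dist, path_data)`` call, and the tensor shapes in the comments are assumptions based on the description above, not a verbatim copy of the DGL API.

.. code:: python

    import dgl
    import torch as th

    path_encoder = dgl.nn.PathEncoder(
        max_len=5,     # the maximum length of the shortest path
        feat_dim=512,  # the dimension of the edge feature
        num_heads=8,   # the number of attention heads (assumed argument)
    )

    # dist: (B, N, N) shortest distances (-1 for unreachable pairs, assumed);
    # path_data: (B, N, N, max_len, feat_dim) edge features along the paths
    dist = th.randint(-1, 6, (16, 64, 64))
    path_data = th.rand(16, 64, 64, 5, 512)
    attn_bias = path_encoder(dist, path_data)  # assumed shape (B, N, N, num_heads)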
@@ -33,6 +35,7 @@ Spatial Encoding

The spatial encoder encodes the shortest distance between two nodes to get attention bias for the self-attention module. It takes as input the shortest distance between two nodes and outputs the attention bias based on spatial encoding.

.. code:: python

    spatial_encoder = SpatialEncoder(
        max_dist=5,   # the maximum distance between two nodes
        num_heads=8,  # the number of attention heads
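A hedged usage sketch follows; the assumption is that ``forward`` takes only the batched shortest-distance tensor and returns one bias value per head, as the description above suggests.

.. code:: python

    import dgl
    import torch as th

    spatial_encoder = dgl.nn.SpatialEncoder(
        max_dist=5,   # the maximum distance between two nodes
        num_heads=8,  # the number of attention heads
    )

    # dist: (B, N, N) shortest distances between node pairs (assumed shape)
    dist = th.randint(-1, 6, (16, 64, 64))
    attn_bias = spatial_encoder(dist)  # assumed shape (B, N, N, num_heads)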
@@ -46,6 +49,7 @@ The Graphormer layer is like a Transformer encoder layer with the Multi-head Att

We can stack multiple Graphormer layers as a list just like implementing a Transformer encoder in PyTorch.

.. code:: python

    layers = th.nn.ModuleList([
        GraphormerLayer(
            feat_size=512,  # the dimension of the input node features
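A hedged sketch of stacking and applying the layers; the extra constructor arguments (``hidden_size``, ``num_heads``) and the ``attn_bias`` keyword in the layer call are assumptions based on the DGL API reference rather than a copy of the original snippet.

.. code:: python

    import dgl
    import torch as th

    layers = th.nn.ModuleList([
        dgl.nn.GraphormerLayer(
            feat_size=512,     # the dimension of the input node features
            hidden_size=1024,  # the dimension of the feed-forward layer (assumed)
            num_heads=8,       # the number of attention heads (assumed)
        )
        for _ in range(6)      # stack 6 layers, an arbitrary choice here
    ])

    x = th.rand(16, 64, 512)            # padded node features (B, N, feat_size)
    attn_bias = th.rand(16, 64, 64, 8)  # attention bias (B, N, N, num_heads), assumed
    for layer in layers:
        x = layer(x, attn_bias=attn_bias)  # keyword name is an assumption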
@@ -63,6 +67,7 @@ Model Forward

Grouping the modules above defines the primary components of the Graphormer model. We can then define the forward process as follows:

.. code:: python

    node_feat, in_degree, out_degree, attn_mask, path_data, dist = \
        next(iter(dataloader))  # we will use the first batch as an example
    num_graphs, max_num_nodes, _ = node_feat.shape
@@ -84,6 +89,6 @@ Grouping the modules above defines the primary components of the Graphormer mode

        attn_bias=attn_bias,
    )
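Putting the pieces together, a condensed and hedged sketch of the forward process might look like the following; attribute names such as ``node_emb``, the stacked degree layout, and the way the two encodings are summed into one attention bias are assumptions for illustration only.

.. code:: python

    import torch as th

    def forward(self, node_feat, in_degree, out_degree, attn_mask, path_data, dist):
        num_graphs, max_num_nodes, _ = node_feat.shape

        # add degree embeddings to the (projected) node features;
        # self.node_emb and the stacked degree layout are assumptions
        x = self.node_emb(node_feat) + self.degree_encoder(
            th.stack((in_degree, out_degree))
        )

        # combine path encoding and spatial encoding into one attention bias
        attn_bias = self.path_encoder(dist, path_data) + self.spatial_encoder(dist)

        # run the stacked Graphormer layers
        for layer in self.layers:
            x = layer(x, attn_bias=attn_bias, attn_mask=attn_mask)
        return x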
For simplicity, we omit some details in the forward process. For the complete implementation, please refer to the `Graphormer example <https://github.com/dmlc/dgl/tree/master/examples/core/Graphormer>`_.

You can also explore other `utility modules <https://docs.dgl.ai/api/python/nn-pytorch.html#utility-modules-for-graph-transformer>`_ to customize your own graph transformer model. In the next section, we will show how to prepare the data for training.