Unverified commit 982f2028 authored by rudongyu, committed by GitHub

[Doc Fix] fix the format of gt doc (#6949)

parent dfff53bc
@@ -5,6 +5,7 @@ In this section, we will prepare the data for the Graphormer model introduced be
.. code:: python
def collate(graphs):
# compute shortest path features, can be done in advance
for g in graphs:
......
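The collate excerpt above is truncated by the diff view. Below is a minimal sketch of how the shortest-path features could be precomputed per graph, assuming DGL's ``dgl.shortest_dist`` utility; the field names ``spd`` and ``path`` and the batching step are illustrative, not part of the original excerpt.

.. code:: python

    import dgl
    import torch as th

    def collate(graphs):
        # compute shortest path features, can be done in advance
        for g in graphs:
            # pairwise shortest distances and the edge IDs along each path
            spd, path = dgl.shortest_dist(g, root=None, return_paths=True)
            g.ndata["spd"] = spd    # (N, N), -1 for unreachable pairs
            g.ndata["path"] = path  # (N, N, max_len), padded with -1
        # ... pad and batch node features, degrees, distances, and path data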
-🆕 Tutorial: GraphTransformer
+🆕 Tutorial: Graph Transformer
==========
-This tutorial introduces the **graphtransformer** module, which is a set of
-utility modules for building and training graph transformer models.
+This tutorial introduces the **graph transformer** (:mod:`~dgl.nn.gt`) module,
+which is a set of utility modules for building and training graph transformer models.
.. toctree::
:maxdepth: 2
......
@@ -12,6 +12,7 @@ Degree Encoding
The degree encoder is a learnable embedding layer that encodes the degree of each node into a vector. It takes as input the batched in-degrees and out-degrees of the graph nodes, and outputs the corresponding degree embeddings.
.. code:: python
degree_encoder = dgl.nn.DegreeEncoder(
max_degree=8, # the maximum degree to cut off
embedding_dim=512 # the dimension of the degree embedding
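The excerpt above is cut off by the diff view. A hedged, self-contained sketch of how the degree encoder might be called is shown below; the batch sizes and the stacked in-/out-degree layout of shape ``(batch, num_nodes, 2)`` are assumptions for illustration.

.. code:: python

    import torch as th
    import dgl

    degree_encoder = dgl.nn.DegreeEncoder(
        max_degree=8,       # the maximum degree to cut off
        embedding_dim=512,  # the dimension of the degree embedding
    )

    # hypothetical batch: 16 graphs padded to 64 nodes each, with in- and
    # out-degrees stacked along the last dimension
    degrees = th.randint(0, 9, (16, 64, 2))
    degree_embedding = degree_encoder(degrees)  # expected shape (16, 64, 512)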
@@ -22,6 +23,7 @@ Path Encoding
The path encoder encodes the edge features on the shortest path between two nodes to produce an attention bias for the self-attention module. It takes as input the batched edge features along the shortest paths and outputs the attention bias based on path encoding.
.. code:: python
path_encoder = PathEncoder(
max_len=5, # the maximum length of the shortest path
feat_dim=512, # the dimension of the edge feature
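The call above is also truncated. Below is a hedged sketch of a complete path-encoder call, assuming it takes the pairwise shortest distances together with the edge features gathered along each path; the argument order and input shapes follow the Graphormer example and are assumptions here.

.. code:: python

    import torch as th
    from dgl.nn import PathEncoder

    path_encoder = PathEncoder(
        max_len=5,     # the maximum length of the shortest path
        feat_dim=512,  # the dimension of the edge feature
        num_heads=8,   # the number of attention heads
    )

    # hypothetical inputs: pairwise shortest distances (-1 for unreachable
    # pairs) and edge features along the paths, padded to max_len hops
    dist = th.randint(-1, 6, (16, 64, 64))
    path_data = th.rand(16, 64, 64, 5, 512)
    path_bias = path_encoder(dist, path_data)  # expected shape (16, 64, 64, 8)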
@@ -33,6 +35,7 @@ Spatial Encoding
The spatial encoder encodes the shortest distance between two nodes to produce an attention bias for the self-attention module. It takes as input the batched shortest distances between node pairs and outputs the attention bias based on spatial encoding.
.. code:: python
spatial_encoder = SpatialEncoder(
max_dist=5, # the maximum distance between two nodes
num_heads=8, # the number of attention heads
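A hedged sketch of a complete spatial-encoder call on batched pairwise shortest distances follows; the batch and node dimensions are illustrative.

.. code:: python

    import torch as th
    from dgl.nn import SpatialEncoder

    spatial_encoder = SpatialEncoder(
        max_dist=5,   # the maximum distance between two nodes
        num_heads=8,  # the number of attention heads
    )

    # hypothetical batched pairwise shortest distances, -1 for unreachable pairs
    dist = th.randint(-1, 6, (16, 64, 64))
    spatial_bias = spatial_encoder(dist)  # expected shape (16, 64, 64, 8)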
@@ -46,6 +49,7 @@ The Graphormer layer is like a Transformer encoder layer with the Multi-head Att
We can stack multiple Graphormer layers in a list, just as when implementing a Transformer encoder in PyTorch; a complete sketch of the stack is shown after the excerpt below.
.. code:: python
layers = th.nn.ModuleList([
GraphormerLayer(
feat_size=512, # the dimension of the input node features
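A hedged sketch of the full layer stack; ``hidden_size``, ``dropout``, and the number of layers are illustrative values chosen here, not prescribed by the module.

.. code:: python

    import torch as th
    from dgl.nn import GraphormerLayer

    layers = th.nn.ModuleList([
        GraphormerLayer(
            feat_size=512,     # the dimension of the input node features
            hidden_size=1024,  # illustrative feed-forward hidden size
            num_heads=8,       # the number of attention heads
            dropout=0.1,       # illustrative dropout rate
        )
        for _ in range(6)      # illustrative number of stacked layers
    ])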
@@ -63,6 +67,7 @@ Model Forward
Grouping the modules above defines the primary components of the Graphormer model. We can then define the forward process as follows:
.. code:: python
node_feat, in_degree, out_degree, attn_mask, path_data, dist = \
next(iter(dataloader)) # we will use the first batch as an example
num_graphs, max_num_nodes, _ = node_feat.shape
@@ -84,6 +89,6 @@ Grouping the modules above defines the primary components of the Graphormer mode
attn_bias=attn_bias,
)
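Since the diff only shows fragments of the forward pass, here is a hedged sketch of how the encoders and layers above could be wired together. It assumes ``node_feat`` has already been projected to the model feature size, that the spatial and path biases can be summed into a single attention bias, and that the layer accepts ``attn_mask`` and ``attn_bias`` keyword arguments, as in the excerpt.

.. code:: python

    import torch as th

    # degree embeddings are added to the node features
    deg_emb = degree_encoder(th.stack((in_degree, out_degree), dim=-1))
    x = node_feat + deg_emb

    # spatial and path encodings are combined into one attention bias
    attn_bias = spatial_encoder(dist) + path_encoder(dist, path_data)

    for layer in layers:
        x = layer(
            x,
            attn_mask=attn_mask,
            attn_bias=attn_bias,
        )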
-For simplicity, we omit some details in the forward process. For the complete implementation, please refer to the `Graphormer example <https://github.com/dmlc/dgl/tree/master/examples/core/Graphormer`_.
+For simplicity, we omit some details in the forward process. For the complete implementation, please refer to the `Graphormer example <https://github.com/dmlc/dgl/tree/master/examples/core/Graphormer>`_.
You can also explore other `utility modules <https://docs.dgl.ai/api/python/nn-pytorch.html#utility-modules-for-graph-transformer>`_ to customize your own graph transformer model. In the next section, we will show how to prepare the data for training.