[examples] fix Graphormer as key in state_dict has changed (#6806)

Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

Commit d6bf0387, authored Dec 27, 2023 by Songqing Zhang, committed by GitHub Dec 27, 2023

Showing with 5 additions and 3 deletions

examples/core/Graphormer/README.md +1 -1

examples/core/Graphormer/model.py +4 -2

--- a/examples/core/Graphormer/README.md
+++ b/examples/core/Graphormer/README.md
@@ -24,7 +24,7 @@ How to run
 ----------

 ```bash
-accelerate launch --multi_gpu --mixed_precision=fp16 train.py
+accelerate launch --multi_gpu --mixed_precision=fp16 main.py
 ```
 > **_NOTE:_**  The script will automatically download weights pre-trained on PCQM4Mv2. To reproduce the same result, set the total batch size to 64.


--- a/examples/core/Graphormer/model.py
+++ b/examples/core/Graphormer/model.py
@@ -47,7 +47,7 @@ class Graphormer(nn.Module):
        self.spatial_encoder = SpatialEncoder(
            max_dist=num_spatial, num_heads=num_attention_heads
        )
-        self.graph_token_virtual_dist = nn.Embedding(1, num_attention_heads)
+        self.graph_token_virtual_distance = nn.Embedding(1, num_attention_heads)

        self.emb_layer_norm = nn.LayerNorm(self.embedding_dim)

@@ -112,7 +112,9 @@ class Graphormer(nn.Module):
        attn_bias[:, 1:, 1:, :] = path_encoding + spatial_encoding

        # spatial encoding of the virtual node
-        t = self.graph_token_virtual_dist.weight.reshape(1, 1, self.num_heads)
+        t = self.graph_token_virtual_distance.weight.reshape(
+            1, 1, self.num_heads
+        )
        # Since the virtual node comes first, the spatial encodings between it
        # and other nodes will fill the 1st row and 1st column (omit num_graphs
        # and num_heads dimensions) of attn_bias matrix by broadcasting.
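
Reviewer note: the rename above changes the parameter's `state_dict` key from `graph_token_virtual_dist.weight` to `graph_token_virtual_distance.weight`, so a checkpoint saved under the old attribute name would no longer match by key. A minimal sketch of one way to remap such a checkpoint follows. The helper `rename_state_dict_prefix` is hypothetical (it is not part of this commit, DGL, or PyTorch), and plain lists stand in for tensors so the snippet runs without any dependencies.

```python
from collections import OrderedDict


def rename_state_dict_prefix(state_dict, old_prefix, new_prefix):
    """Return a copy of state_dict with old_prefix swapped for new_prefix.

    Hypothetical helper for illustration only. A key matches when it equals
    old_prefix or starts with old_prefix + ".", so an already-renamed key
    such as "graph_token_virtual_distance.weight" is left untouched even
    though it shares leading characters with the old name.
    """
    renamed = OrderedDict()
    for key, value in state_dict.items():
        if key == old_prefix or key.startswith(old_prefix + "."):
            key = new_prefix + key[len(old_prefix):]
        renamed[key] = value
    return renamed


# Stand-in for a checkpoint saved before the rename (assumption: real
# checkpoints map these keys to tensors, not lists).
old_checkpoint = OrderedDict(
    [
        ("graph_token_virtual_dist.weight", [[0.0] * 8]),
        ("emb_layer_norm.weight", [1.0] * 4),
    ]
)

new_checkpoint = rename_state_dict_prefix(
    old_checkpoint,
    "graph_token_virtual_dist",
    "graph_token_virtual_distance",
)
print(list(new_checkpoint))
```

With a real model, the remapped mapping could then be passed to `model.load_state_dict(...)`. The helper returns a new mapping rather than editing in place, so the original checkpoint stays available for comparison.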