"Let's implement a simplified version of the Graph Attention Network (GAT) layer.\n",
"\n",
"A GAT layer has two inputs: the adjacency matrix $A$ and the node input features $X$. The idea of GAT layer is to update each node's representation with a weighted average of the node's own representation and its neighbors' representations. In particular, when computing the output for node $i$, the GAT layer does the following:\n",
"1. Compute the scores $S_{ij}$ representing the attention logit from neighbor $j$ to node $i$. $S_{ij}$ is a function of $i$ and $j$'s input features $X_i$ and $X_j$: $$S_{ij} = LeakyReLU(X_i^\\top v_1 + X_j^\\top v_2)$$, where $v_1$ and $v_2$ are trainable vectors.\n",
"2. Compute a softmax attention $R_{ij} = \\exp S_{ij} / \\left( \\sum_{j' \\in \\mathcal{N}_i} s_{ij'} \\right)$, where $\\mathcal{N}_j$ means the neighbors of $j$. This means that $R$ is a row-wise softmax attention of $S$.\n",
"3. Compute the weighted average $H_i = \\sum_{j' : j' \\in \\mathcal{N}_i} R_{j'} X_{j'} W$, where $W$ is a trainable matrix.\n",
"\n",
"The following code defined all the parameters you need but only completes step 1. Could you implement step 2 and step 3?"