Unverified Commit 852ddfb9 authored by Quan (Andy) Gan, committed by GitHub

Add gat exercise (#5635)


Co-authored-by: Hongzhi (Steve) Chen <chenhongzhi.nkcs@gmail.com>
parent d18d255f
@@ -1250,6 +1250,63 @@
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## Exercise\n",
"\n",
"Let's implement a simplified version of the Graph Attention Network (GAT) layer.\n",
"\n",
"A GAT layer has two inputs: the adjacency matrix $A$ and the node input features $X$. The idea of the GAT layer is to update each node's representation with a weighted average of the node's own representation and its neighbors' representations. In particular, when computing the output for node $i$, the GAT layer does the following:\n",
"1. Compute the scores $S_{ij}$ representing the attention logit from neighbor $j$ to node $i$. $S_{ij}$ is a function of $i$'s and $j$'s input features $X_i$ and $X_j$: $$S_{ij} = \\mathrm{LeakyReLU}(X_i^\\top v_1 + X_j^\\top v_2),$$ where $v_1$ and $v_2$ are trainable vectors.\n",
"2. Compute a softmax attention $R_{ij} = \\exp S_{ij} / \\left( \\sum_{j' \\in \\mathcal{N}_i} \\exp S_{ij'} \\right)$, where $\\mathcal{N}_i$ denotes the neighbors of node $i$. This means that $R$ is a row-wise softmax of $S$.\n",
"3. Compute the weighted average $H_i = \\sum_{j \\in \\mathcal{N}_i} R_{ij} X_j W$, where $W$ is a trainable matrix.\n",
"\n",
"The following code defines all the parameters you need but only completes step 1. Can you implement steps 2 and 3?"
],
"metadata": {
"id": "yfEVQBUuI-cE"
}
},
{
"cell_type": "code",
"source": [
"import torch\n",
"import torch.nn as nn\n",
"import torch.nn.functional as F\n",
"import dgl.sparse as dglsp\n",
"\n",
"class SimplifiedGAT(nn.Module):\n",
" def __init__(self, in_size, out_size):\n",
" super().__init__()\n",
"\n",
" self.W = nn.Parameter(torch.randn(in_size, out_size))\n",
" self.v1 = nn.Parameter(torch.randn(in_size))\n",
" self.v2 = nn.Parameter(torch.randn(in_size))\n",
"\n",
" def forward(self, A, X):\n",
"        # A: A sparse matrix with size (N, N). A[i, j] represents the edge from j to i.\n",
" # X: A dense matrix with size (N, D)\n",
" # Step 1: compute S[i, j]\n",
" Xv1 = X @ self.v1\n",
" Xv2 = X @ self.v2\n",
"        s = F.leaky_relu(Xv1[A.row] + Xv2[A.col])\n",
" S = dglsp.val_like(A, s)\n",
"\n",
"        # Step 2: compute R[i, j], the row-wise softmax of S over each node's neighbors.\n",
" # EXERCISE: replace the statement below.\n",
" R = S\n",
"\n",
" # Step 3: compute H.\n",
" # EXERCISE: replace the statement below.\n",
" H = X\n",
"\n",
" return H"
],
"metadata": {
"id": "pYrgSxq6La5c"
},
"execution_count": null,
"outputs": []
}
]
}
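For reference, the three steps above can be sketched in plain PyTorch with a dense 0/1 adjacency matrix standing in for the `dgl.sparse` one. This is only an illustration of one possible completion of the exercise, not the notebook's intended solution; the class name `DenseSimplifiedGAT` is hypothetical, and non-edges are masked with `-inf` so the softmax in step 2 only runs over each node's neighbors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseSimplifiedGAT(nn.Module):
    """Dense sketch of the simplified GAT layer (hypothetical name).

    Uses a dense (N, N) 0/1 adjacency matrix A instead of dgl.sparse,
    where A[i, j] != 0 means there is an edge from j to i.
    """

    def __init__(self, in_size, out_size):
        super().__init__()
        self.W = nn.Parameter(torch.randn(in_size, out_size))
        self.v1 = nn.Parameter(torch.randn(in_size))
        self.v2 = nn.Parameter(torch.randn(in_size))

    def forward(self, A, X):
        # Step 1: S[i, j] = LeakyReLU(X_i . v1 + X_j . v2), computed
        # densely via broadcasting; rows index i, columns index j.
        Xv1 = X @ self.v1                                   # shape (N,)
        Xv2 = X @ self.v2                                   # shape (N,)
        S = F.leaky_relu(Xv1[:, None] + Xv2[None, :])       # shape (N, N)

        # Step 2: row-wise softmax restricted to neighbors.
        # Masking non-edges with -inf makes exp() zero them out.
        S = S.masked_fill(A == 0, float("-inf"))
        R = torch.softmax(S, dim=1)                          # rows sum to 1

        # Step 3: weighted average of neighbor features, then project by W.
        H = R @ X @ self.W                                   # shape (N, out_size)
        return H

# Small usage example: 3 nodes (each with a self-loop), 4 input features.
A = torch.tensor([[1., 1., 0.],
                  [1., 1., 1.],
                  [0., 1., 1.]])
X = torch.randn(3, 4)
H = DenseSimplifiedGAT(4, 2)(A, X)
```

The dense version is quadratic in the number of nodes, which is why the notebook's cell uses `dgl.sparse` (`A.row`, `A.col`, `dglsp.val_like`) to compute scores only on existing edges.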