{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "toc_visible": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "gpuClass": "standard"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "# Hypergraph Neural Networks\n",
        "\n",
        "This tutorial illustrates what a hypergraph is and how to build a Hypergraph Neural Network using DGL's sparse matrix APIs.\n",
        "\n",
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dmlc/dgl/blob/master/notebooks/sparse/hgnn.ipynb) [![GitHub](https://img.shields.io/badge/-View%20on%20GitHub-181717?logo=github&logoColor=ffffff)](https://github.com/dmlc/dgl/blob/master/notebooks/sparse/hgnn.ipynb)"
      ],
      "metadata": {
        "id": "eiDu3XgReCt4"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Install required packages.\n",
        "import os\n",
        "import torch\n",
        "os.environ['TORCH'] = torch.__version__\n",
        "os.environ['DGLBACKEND'] = \"pytorch\"\n",
        "\n",
        "# Uncomment below to install required packages. If the CUDA version is not 11.6,\n",
        "# check the https://www.dgl.ai/pages/start.html to find the supported CUDA\n",
        "# version and corresponding command to install DGL.\n",
        "#!pip install dgl -f https://data.dgl.ai/wheels/cu116/repo.html > /dev/null\n",
        "#!pip install torchmetrics > /dev/null\n",
        "\n",
        "try:\n",
        "    import dgl\n",
        "    installed = True\n",
        "except ImportError:\n",
        "    installed = False\n",
        "print(\"DGL installed!\" if installed else \"Failed to install DGL!\")"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "__2tKqL0eaB0",
        "outputId": "5b5106f6-074b-42a5-fc4c-4936efd2cef8"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "DGL installed!\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Hypergraphs\n",
        "\n",
        "A [hypergraph](https://en.wikipedia.org/wiki/Hypergraph) consists of *nodes* and *hyperedges*.  Unlike an edge in a graph, a *hyperedge* can connect an arbitrary number of nodes.  For instance, the following figure shows a hypergraph with 11 nodes and 5 hyperedges drawn in different colors.\n",
        "![](https://data.dgl.ai/tutorial/img/hgnn/hypergraph4.PNG)\n",
        "\n",
        "Hypergraphs are particularly useful when the relationships between data points are not binary.  For instance, more than two products can be co-purchased together in an e-commerce system, so the co-purchase relationship is $n$-ary rather than binary, and therefore it is better described by a hypergraph than by a normal graph.\n",
        "\n",
        "A hypergraph is usually characterized by its *incidence matrix* $H$, whose rows represent nodes and columns represent hyperedges.  An entry $H_{ij}$ is 1 if hyperedge $j$ includes node $i$, or 0 otherwise.  For example, the hypergraph in the figure above can be characterized by an $11 \\times 5$ matrix as follows:\n",
        "\n",
        "$$\n",
        "H = \\begin{bmatrix}\n",
        "1 & 0 & 0 & 0 & 0 \\\\\n",
        "1 & 0 & 0 & 0 & 0 \\\\\n",
        "1 & 1 & 0 & 1 & 1 \\\\\n",
        "0 & 0 & 1 & 0 & 0 \\\\\n",
        "0 & 1 & 0 & 0 & 0 \\\\\n",
        "1 & 0 & 1 & 1 & 1 \\\\\n",
        "0 & 0 & 1 & 0 & 0 \\\\\n",
        "0 & 1 & 0 & 1 & 0 \\\\\n",
        "0 & 1 & 0 & 1 & 0 \\\\\n",
        "0 & 0 & 1 & 0 & 1 \\\\\n",
        "0 & 0 & 0 & 0 & 1 \\\\\n",
        "\\end{bmatrix}\n",
        "$$\n",
        "\n",
        "One can construct the hypergraph incidence matrix by specifying two tensors `nodes` and `hyperedges`, where node `nodes[i]` belongs to hyperedge `hyperedges[i]` for each `i`.  For the example above, the incidence matrix can be constructed as follows.\n"
      ],
      "metadata": {
        "id": "unL_mAj-TqC6"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import dgl.sparse as dglsp\n",
        "import torch\n",
        "\n",
        "H = dglsp.spmatrix(\n",
        "    torch.LongTensor([[0, 1, 2, 2, 2, 2, 3, 4, 5, 5, 5, 5, 6, 7, 7, 8, 8, 9, 9, 10],\n",
        "                      [0, 0, 0, 1, 3, 4, 2, 1, 0, 2, 3, 4, 2, 1, 3, 1, 3, 2, 4, 4]])\n",
        ")\n",
        "\n",
        "print(H.to_dense())"
      ],
      "metadata": {
        "id": "I_cExvtIJD1F",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "a1a576f6-1559-479c-9f3e-93e41a56833d"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "tensor([[1., 0., 0., 0., 0.],\n",
            "        [1., 0., 0., 0., 0.],\n",
            "        [1., 1., 0., 1., 1.],\n",
            "        [0., 0., 1., 0., 0.],\n",
            "        [0., 1., 0., 0., 0.],\n",
            "        [1., 0., 1., 1., 1.],\n",
            "        [0., 0., 1., 0., 0.],\n",
            "        [0., 1., 0., 1., 0.],\n",
            "        [0., 1., 0., 1., 0.],\n",
            "        [0., 0., 1., 0., 1.],\n",
            "        [0., 0., 0., 0., 1.]])\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "The degree of a node in a hypergraph is defined as the number of hyperedges that include the node.  Similarly, the degree of a hyperedge is defined as the number of nodes it includes.  In the example above, the node degrees can be computed as the row sums of the incidence matrix, while the hyperedge degrees can be computed as the column sums (all 4 in this case)."
      ],
      "metadata": {
        "id": "p-shCPQPHvBB"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "node_degrees = H.sum(1)\n",
        "print(\"Node degrees\", node_degrees)\n",
        "\n",
        "hyperedge_degrees = H.sum(0)\n",
        "print(\"Hyperedge degrees\", hyperedge_degrees)"
      ],
      "metadata": {
        "id": "wjKm9gkTOnU9",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "ffe2c441-8c2c-48a7-cef2-4ef6e96548ec"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Node degrees tensor([1., 1., 4., 1., 1., 4., 1., 2., 2., 2., 1.])\n",
            "Hyperedge degrees tensor([4., 4., 4., 4., 4.])\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "\n",
        "## Hypergraph Neural Network (HGNN) Layer\n",
        "\n",
        "The [HGNN layer](https://arxiv.org/pdf/1809.09401.pdf) is defined as:\n",
        "\n",
        "$$f(X^{(l)}, H; W^{(l)}) = \\sigma(L X^{(l)} W^{(l)})$$\n",
        "\n",
        "$$L = D_v^{-1/2} H B D_e^{-1} H^\\top D_v^{-1/2}$$\n",
        "\n",
        "where\n",
        "\n",
        "* $H \\in \\mathbb{R}^{N \\times M}$ is the incidence matrix of hypergraph with $N$ nodes and $M$ hyperedges.\n",
        "* $D_v \\in \\mathbb{R}^{N \\times N}$ is a diagonal matrix representing node degrees, whose $i$-th diagonal element is $\\sum_{j=1}^M H_{ij}$.\n",
        "* $D_e \\in \\mathbb{R}^{M \\times M}$ is a diagonal matrix representing hyperedge degrees, whose $j$-th diagonal element is $\\sum_{i=1}^N H_{ij}$.\n",
        "* $B \\in \\mathbb{R}^{M \\times M}$ is a diagonal matrix representing the hyperedge weights, whose $j$-th diagonal element is the weight of $j$-th hyperedge.  In our example, $B$ is an identity matrix.\n",
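        "\n",
        "As a quick sanity check of the Laplacian formula, we can evaluate it with plain dense tensors on the example incidence matrix from the beginning of this tutorial (a sketch for illustration only; the layer below uses DGL's sparse API instead, and `H_d` is just a hypothetical dense copy of $H$):\n",
        "\n",
        "```python\n",
        "import torch\n",
        "\n",
        "# The example 11 x 5 incidence matrix from earlier in this tutorial.\n",
        "H_d = torch.tensor([\n",
        "    [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 1, 0, 1, 1], [0, 0, 1, 0, 0],\n",
        "    [0, 1, 0, 0, 0], [1, 0, 1, 1, 1], [0, 0, 1, 0, 0], [0, 1, 0, 1, 0],\n",
        "    [0, 1, 0, 1, 0], [0, 0, 1, 0, 1], [0, 0, 0, 0, 1],\n",
        "], dtype=torch.float32)\n",
        "\n",
        "D_v_invsqrt = torch.diag(H_d.sum(1) ** -0.5)  # node degrees\n",
        "D_e_inv = torch.diag(H_d.sum(0) ** -1)        # hyperedge degrees\n",
        "B = torch.eye(H_d.shape[1])                   # identity hyperedge weights\n",
        "\n",
        "L = D_v_invsqrt @ H_d @ B @ D_e_inv @ H_d.T @ D_v_invsqrt\n",
        "# With B diagonal, L is an 11 x 11 symmetric matrix.\n",
        "print(L.shape, torch.allclose(L, L.T))\n",
        "```\n",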
        "\n",
        "The following code builds a two-layer HGNN."
      ],
      "metadata": {
        "id": "7kxrINkVHrAi"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import dgl.sparse as dglsp\n",
        "import torch\n",
        "import torch.nn as nn\n",
        "import torch.nn.functional as F\n",
        "import tqdm\n",
        "from dgl.data import CoraGraphDataset\n",
        "from torchmetrics.functional import accuracy\n",
        "\n",
        "\n",
        "class HGNN(nn.Module):\n",
        "    def __init__(self, H, in_size, out_size, hidden_dims=16):\n",
        "        super().__init__()\n",
        "\n",
        "        self.W1 = nn.Linear(in_size, hidden_dims)\n",
        "        self.W2 = nn.Linear(hidden_dims, out_size)\n",
        "        self.dropout = nn.Dropout(0.5)\n",
        "\n",
        "        ###########################################################\n",
        "        # (HIGHLIGHT) Compute the Laplacian with Sparse Matrix API\n",
        "        ###########################################################\n",
        "        # Compute node degree.\n",
        "        d_V = H.sum(1)\n",
        "        # Compute edge degree.\n",
        "        d_E = H.sum(0)\n",
        "        # Compute the inverse of the square root of the diagonal D_v.\n",
        "        D_v_invsqrt = dglsp.diag(d_V**-0.5)\n",
        "        # Compute the inverse of the diagonal D_e.\n",
        "        D_e_inv = dglsp.diag(d_E**-1)\n",
        "        # In our example, B is an identity matrix.\n",
        "        n_edges = d_E.shape[0]\n",
        "        B = dglsp.identity((n_edges, n_edges))\n",
        "        # Compute Laplacian from the equation above.\n",
        "        self.L = D_v_invsqrt @ H @ B @ D_e_inv @ H.T @ D_v_invsqrt\n",
        "\n",
        "    def forward(self, X):\n",
        "        X = self.L @ self.W1(self.dropout(X))\n",
        "        X = F.relu(X)\n",
        "        X = self.L @ self.W2(self.dropout(X))\n",
        "        return X"
      ],
      "metadata": {
        "id": "58WnPtPvT2mx"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Loading Data\n",
        "\n",
        "We use the Cora citation network in our example.  Instead of using the original \"cite\" relationship between papers, we consider the \"co-cite\" relationship: we build a hypergraph from the original citation network where, for each paper, we construct a hyperedge that includes all the other papers it cited, as well as the paper itself.\n",
        "\n",
        "![](https://data.dgl.ai/tutorial/img/hgnn/equiv.PNG)\n",
        "\n",
        "Note that a hypergraph constructed this way has an incidence matrix identical to the adjacency matrix of the original graph (plus an identity matrix for self-loops), because each hyperedge corresponds one-to-one to a paper.  So we can directly take the graph's adjacency matrix, add an identity matrix to it, and use the result as the hypergraph's incidence matrix."
      ],
      "metadata": {
        "id": "bPrOHVaGwUD0"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def load_data():\n",
        "    dataset = CoraGraphDataset()\n",
        "\n",
        "    graph = dataset[0]\n",
        "    indices = torch.stack(graph.edges())\n",
        "    H = dglsp.spmatrix(indices)\n",
        "    H = H + dglsp.identity(H.shape)\n",
        "\n",
        "    X = graph.ndata[\"feat\"]\n",
        "    Y = graph.ndata[\"label\"]\n",
        "    train_mask = graph.ndata[\"train_mask\"]\n",
        "    val_mask = graph.ndata[\"val_mask\"]\n",
        "    test_mask = graph.ndata[\"test_mask\"]\n",
        "    return H, X, Y, dataset.num_classes, train_mask, val_mask, test_mask"
      ],
      "metadata": {
        "id": "qI0j1J9pwTFg"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Training and Evaluation\n",
        "\n",
        "Now we can write the training and evaluation functions as follows."
      ],
      "metadata": {
        "id": "--rq1-r7wMST"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "def train(model, optimizer, X, Y, train_mask):\n",
        "    model.train()\n",
        "    Y_hat = model(X)\n",
        "    loss = F.cross_entropy(Y_hat[train_mask], Y[train_mask])\n",
        "    optimizer.zero_grad()\n",
        "    loss.backward()\n",
        "    optimizer.step()\n",
        "\n",
        "\n",
        "def evaluate(model, X, Y, val_mask, test_mask, num_classes):\n",
        "    model.eval()\n",
        "    Y_hat = model(X)\n",
        "    val_acc = accuracy(\n",
        "        Y_hat[val_mask], Y[val_mask], task=\"multiclass\", num_classes=num_classes\n",
        "    )\n",
        "    test_acc = accuracy(\n",
        "        Y_hat[test_mask],\n",
        "        Y[test_mask],\n",
        "        task=\"multiclass\",\n",
        "        num_classes=num_classes,\n",
        "    )\n",
        "    return val_acc, test_acc\n",
        "\n",
        "\n",
        "H, X, Y, num_classes, train_mask, val_mask, test_mask = load_data()\n",
        "model = HGNN(H, X.shape[1], num_classes)\n",
        "optimizer = torch.optim.Adam(model.parameters(), lr=0.001)\n",
        "\n",
        "with tqdm.trange(500) as tq:\n",
        "    for epoch in tq:\n",
        "        train(model, optimizer, X, Y, train_mask)\n",
        "        val_acc, test_acc = evaluate(\n",
        "            model, X, Y, val_mask, test_mask, num_classes\n",
        "        )\n",
        "        tq.set_postfix(\n",
        "            {\n",
        "                \"Val acc\": f\"{val_acc:.5f}\",\n",
        "                \"Test acc\": f\"{test_acc:.5f}\",\n",
        "            },\n",
        "            refresh=False,\n",
        "        )\n",
        "\n",
        "print(f\"Test acc: {test_acc:.3f}\")"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "IfEc6JRXwHPt",
        "outputId": "0172578a-6a1b-49eb-adcb-77ee1a949186"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Downloading /root/.dgl/cora_v2.zip from https://data.dgl.ai/dataset/cora_v2.zip...\n",
            "Extracting file to /root/.dgl/cora_v2\n",
            "Finished data loading and preprocessing.\n",
            "  NumNodes: 2708\n",
            "  NumEdges: 10556\n",
            "  NumFeats: 1433\n",
            "  NumClasses: 7\n",
            "  NumTrainingSamples: 140\n",
            "  NumValidationSamples: 500\n",
            "  NumTestSamples: 1000\n",
            "Done saving data into cached files.\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "100%|██████████| 500/500 [00:57<00:00,  8.70it/s, Val acc=0.77800, Test acc=0.78100]"
          ]
        },
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Test acc: 0.781\n"
          ]
        },
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "For the complete HGNN example, please refer to [this script](https://github.com/dmlc/dgl/blob/master/examples/sparse/hgnn.py)."
      ],
      "metadata": {
        "id": "59pCzjpBOyEW"
      }
    }
  ]
}