[doc] remove deprecated tutoriasl for minibatch training (#6625)

c63a926d · Rhett Ying · GitHub · 74684bbe · c63a926d · c63a926d
Unverified Commit c63a926d authored Nov 27, 2023 by Rhett Ying Committed by GitHub Nov 27, 2023
8 changed files
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -211,7 +211,6 @@ from sphinx_gallery.sorting import FileNameSortKey
 examples_dirs = [
    "../../tutorials/blitz",
-    "../../tutorials/large",
    "../../tutorials/dist",
    "../../tutorials/models",
    "../../tutorials/multi",
@@ -219,7 +218,6 @@ examples_dirs = [
 ]  # path to find sources
 gallery_dirs = [
    "tutorials/blitz/",
-    "tutorials/large/",
    "tutorials/dist/",
    "tutorials/models/",
    "tutorials/multi/",

--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -28,7 +28,6 @@ Welcome to Deep Graph Library Tutorials and Documentation
   guide_ko/index
   notebooks/sparse/index
   notebooks/stochastic_training/index
-   tutorials/large/index
   tutorials/cpu/index
   tutorials/multi/index
   tutorials/dist/index
@@ -100,7 +99,7 @@ For acquainted users who wish to learn more advanced usage,
 * `Learn DGL by examples <https://github.com/dmlc/dgl/tree/master/examples>`_.
 * Read the :doc:`User Guide<guide/index>` (:doc:`中文版链接<guide_cn/index>`), which explains the concepts
  and usage of DGL in much more details.
-* Go through the tutorials for :doc:`Stochastic Training of GNNs <tutorials/large/index>`,
+* Go through the tutorials for :doc:`Stochastic Training of GNNs <notebooks/stochastic_training/index>`,
  which covers the basic steps for training GNNs on large graphs in mini-batches.
 * :doc:`Study classical papers <tutorials/models/index>` on graph machine learning alongside DGL.
 * Search for the usage of a specific API in the :doc:`API reference manual <api/python/index>`,

--- a/docs/source/notebooks/stochastic_training/index.rst
+++ b/docs/source/notebooks/stochastic_training/index.rst
-GNN Stochastic Training
+Stochastic Training of GNNs
-=========================
+===========================
 This tutorial introduces how to train GNNs with stochastic training.
@@ -7,6 +7,6 @@ This tutorial introduces how to train GNNs with stochastic training.
  :maxdepth: 1
  :titlesonly:
+  neighbor_sampling_overview.nblink
  node_classification.nblink
  link_prediction.nblink
-  neighbor_sampling_overview.nblink
--- a/tutorials/large/.gitignore
+++ b/tutorials/large/.gitignore
-dataset
-model.pt
--- a/tutorials/large/L0_neighbor_sampling_overview.py
+++ b/tutorials/large/L0_neighbor_sampling_overview.py
-"""
-Introduction of Neighbor Sampling
-=================================
-In :doc:`previous tutorials <../blitz/1_introduction>` you have learned how to
-train GNNs by computing the representations of all nodes on a graph.
-However, sometimes your graph is too large to fit the computation of all
-nodes in a single GPU.
-By the end of this tutorial, you will be able to
-  Understand the pipeline of stochastic GNN training.
-  Understand what is neighbor sampling and why it yields a bipartite
-   graph for each GNN layer.
-"""
-######################################################################
-# Message Passing Review
-# ----------------------
-#
-# Recall that in `Gilmer et al. <https://arxiv.org/abs/1704.01212>`__
-# (also in :doc:`message passing tutorial <../blitz/3_message_passing>`), the
-# message passing formulation is as follows:
-#
-# .. math::
-#
-#
-#    m_{u\to v}^{(l)} = M^{(l)}\left(h_v^{(l-1)}, h_u^{(l-1)}, e_{u\to v}^{(l-1)}\right)
-#
-# .. math::
-#
-#
-#    m_{v}^{(l)} = \sum_{u\in\mathcal{N}(v)}m_{u\to v}^{(l)}
-#
-# .. math::
-#
-#
-#    h_v^{(l)} = U^{(l)}\left(h_v^{(l-1)}, m_v^{(l)}\right)
-#
-# where DGL calls :math:`M^{(l)}` the *message function*, :math:`\sum` the
-# *reduce function* and :math:`U^{(l)}` the *update function*. Note that
-# :math:`\sum` here can represent any function and is not necessarily a
-# summation.
-#
-# Essentially, the :math:`l`-th layer representation of a single node
-# depends on the :math:`(l-1)`-th layer representation of the same node,
-# as well as the :math:`(l-1)`-th layer representation of the neighboring
-# nodes. Those :math:`(l-1)`-th layer representations then depend on the
-# :math:`(l-2)`-th layer representation of those nodes, as well as their
-# neighbors.
-#
-# The following animation shows how a 2-layer GNN is supposed to compute
-# the output of node 5:
-#
-# |image1|
-#
-# You can see that to compute node 5 from the second layer, you will need
-# its direct neighbors’ first layer representations (colored in yellow),
-# which in turn needs their direct neighbors’ (i.e. node 5’s second-hop
-# neighbors’) representations (colored in green).
-#
-# .. |image1| image:: https://data.dgl.ai/tutorial/img/sampling.gif
-#
-######################################################################
-# Neighbor Sampling Overview
-# --------------------------
-#
-# You can also see from the previous example that computing representation
-# for a small number of nodes often requires input features of a
-# significantly larger number of nodes. Taking all neighbors for message
-# aggregation is often too costly since the nodes needed for input
-# features would easily cover a large portion of the graph, especially for
-# real-world graphs which are often
-# `scale-free <https://en.wikipedia.org/wiki/Scale-free_network>`__.
-#
-# Neighbor sampling addresses this issue by selecting a subset of the
-# neighbors to perform aggregation. For instance, to compute
-# :math:`\boldsymbol{h}_5^{(2)}`, you can choose two of the neighbors
-# instead of all of them to aggregate, as in the following animation:
-#
-# |image2|
-#
-# You can see that this method needs far fewer nodes for message
-# passing in a single minibatch.
-#
-# .. |image2| image:: https://data.dgl.ai/tutorial/img/bipartite.gif
-#
-######################################################################
-# You can also notice in the animation above that the computation
-# dependencies can be described as a series of
-# bipartite graphs.
-# The output nodes (called *destination nodes*) are on one side and all the
-# nodes necessary for inputs (called *source nodes*) are on the other side.
-# The arrows indicate how the sampled neighbors propagate messages to the nodes.
-# DGL calls such graphs *message flow graphs* (MFG).
-#
-# Note that some GNN modules, such as `SAGEConv`, need to use the destination
-# nodes' features on the previous layer to compute the outputs.  Without
-# loss of generality, DGL always includes the destination nodes themselves
-# in the source nodes.
-#
-######################################################################
-# What’s next?
-# ------------
-#
-# :doc:`Stochastic GNN Training for Node Classification in
-# DGL <L1_large_node_classification>`
-#
-# Thumbnail credits: Understanding graph embedding methods and their applications, Mengjia Xu
-# sphinx_gallery_thumbnail_path = '_static/large_L0_neighbor_sampling_overview.png'
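The removed overview describes neighbor sampling as producing one bipartite graph per GNN layer, with the destination nodes always included at the front of the source nodes. The following is a minimal plain-Python sketch of that idea, not DGL's sampler: the function name `sample_neighbors`, the `in_neighbors` dictionary layout, and the toy graph are all assumptions made for illustration.

```python
import random


def sample_neighbors(in_neighbors, seeds, fanouts, rng):
    """Return one (src_nodes, dst_nodes, edges) triple per layer, input layer first."""
    layers = []
    dst_nodes = list(seeds)
    # Walk backwards from the output layer, since dependencies grow outward.
    for fanout in reversed(fanouts):
        src_nodes = list(dst_nodes)  # destination nodes come first among the sources
        seen = set(src_nodes)
        edges = []
        for v in dst_nodes:
            candidates = in_neighbors.get(v, [])
            # Keep at most `fanout` neighbors of v, chosen without replacement.
            picked = rng.sample(candidates, min(fanout, len(candidates)))
            for u in picked:
                edges.append((u, v))
                if u not in seen:
                    seen.add(u)
                    src_nodes.append(u)
        layers.append((src_nodes, dst_nodes, edges))
        # The sources of this layer are the destinations of the layer before it.
        dst_nodes = src_nodes
    layers.reverse()
    return layers


# Hypothetical toy graph: node -> list of in-neighbors.
toy_graph = {
    5: [2, 3, 4, 6],
    2: [0, 1, 5],
    3: [1, 5, 7],
    4: [5, 8],
    6: [5, 8, 9],
}
layers = sample_neighbors(toy_graph, seeds=[5], fanouts=[2, 2], rng=random.Random(0))
for src, dst, edges in layers:
    print(f"src={src} dst={dst} edges={edges}")
```

Each printed triple plays the role of one message flow graph: the last one has node 5 as its only destination, and the sources of each layer become the destinations of the layer before it.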
--- a/tutorials/large/L1_large_node_classification.py
+++ b/tutorials/large/L1_large_node_classification.py
-"""
-Node Classification
-===========================================================
-This tutorial shows how to train a multi-layer GraphSAGE for node
-classification on ``ogbn-arxiv`` provided by `Open Graph
-Benchmark (OGB) <https://ogb.stanford.edu/>`__. The dataset contains around
-170 thousand nodes and 1 million edges.
-By the end of this tutorial, you will be able to
-  Train a GNN model for node classification on a single GPU with DGL's
-   neighbor sampling components.
-This tutorial assumes that you have read the :doc:`Introduction of Neighbor
-Sampling for GNN Training <L0_neighbor_sampling_overview>`.
-"""
-######################################################################
-# Loading Dataset
-# ---------------
-#
-# `ogbn-arxiv` is already prepared as ``BuiltinDataset`` in GraphBolt.
-#
-import os
-os.environ["DGLBACKEND"] = "pytorch"
-import dgl
-import dgl.graphbolt as gb
-import numpy as np
-import torch
-dataset = gb.BuiltinDataset("ogbn-arxiv").load()
-device = "cpu"  # change to 'cuda' for GPU
-######################################################################
-# Dataset consists of graph, feature and tasks. You can get the
-# training-validation-test set from the tasks. Seed nodes and corresponding
-# labels are already stored in each training-validation-test set. Other
-# metadata such as number of classes are also stored in the tasks. In this
-# dataset, there is only one task: `node classification`.
-#
-graph = dataset.graph
-feature = dataset.feature
-train_set = dataset.tasks[0].train_set
-valid_set = dataset.tasks[0].validation_set
-test_set = dataset.tasks[0].test_set
-task_name = dataset.tasks[0].metadata["name"]
-num_classes = dataset.tasks[0].metadata["num_classes"]
-print(f"Task: {task_name}. Number of classes: {num_classes}")
-######################################################################
-# How DGL Handles Computation Dependency
-# --------------------------------------
-#
-# In the :doc:`previous tutorial <L0_neighbor_sampling_overview>`, you
-# have seen that the computation dependency for message passing of a
-# single node can be described as a series of *message flow graphs* (MFG).
-#
-# |image1|
-#
-# .. |image1| image:: https://data.dgl.ai/tutorial/img/bipartite.gif
-#
-######################################################################
-# Defining Neighbor Sampler and Data Loader in DGL
-# ------------------------------------------------
-#
-# DGL provides tools to iterate over the dataset in minibatches
-# while generating the computation dependencies to compute their outputs
-# with the MFGs above. For node classification, you can use
-# ``dgl.graphbolt.MultiProcessDataLoader`` for iterating over the dataset.
-# It accepts a data pipe that generates minibatches of nodes and their
-# labels, samples neighbors for each node, and generates the computation
-# dependencies in the form of MFGs. Feature fetching, block creation and
-# copying to target device are also supported. All these operations are
-# split into separate stages in the data pipe, so that you can customize
-# the data pipeline by inserting your own operations.
-#
-# .. note::
-#
-#    To write your own neighbor sampler, please refer to :ref:`this user
-#    guide section <guide-minibatch-customizing-neighborhood-sampler>`.
-#
-#
-# Let’s say that each node will gather messages from 4 neighbors on each
-# layer. The code defining the data loader and neighbor sampler will look
-# like the following.
-#
-datapipe = gb.ItemSampler(train_set, batch_size=1024, shuffle=True)
-datapipe = datapipe.sample_neighbor(graph, [4, 4])
-datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
-datapipe = datapipe.to_dgl()
-datapipe = datapipe.copy_to(device)
-train_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
-######################################################################
-# .. note::
-#
-#    In this example, neighborhood sampling runs on CPU. If you are
-#    interested in running it on GPU, please refer to
-#    :ref:`guide-minibatch-gpu-sampling`.
-#
-######################################################################
-# Iterating over the data loader yields a ``DGLMiniBatch`` object
-# for each minibatch.
-#
-data = next(iter(train_dataloader))
-print(data)
-######################################################################
-# You can get the input node IDs from MFGs.
-#
-mfgs = data.blocks
-input_nodes = mfgs[0].srcdata[dgl.NID]
-print(f"Input nodes: {input_nodes}.")
-######################################################################
-# Defining Model
-# --------------
-#
-# Let’s consider training a 2-layer GraphSAGE with neighbor sampling. The
-# model can be written as follows:
-#
-import torch.nn as nn
-import torch.nn.functional as F
-from dgl.nn import SAGEConv
-class Model(nn.Module):
-    def __init__(self, in_feats, h_feats, num_classes):
-        super(Model, self).__init__()
-        self.conv1 = SAGEConv(in_feats, h_feats, aggregator_type="mean")
-        self.conv2 = SAGEConv(h_feats, num_classes, aggregator_type="mean")
-        self.h_feats = h_feats
-    def forward(self, mfgs, x):
-        # Lines that are changed are marked with an arrow: "<---"
-        h_dst = x[: mfgs[0].num_dst_nodes()]  # <---
-        h = self.conv1(mfgs[0], (x, h_dst))  # <---
-        h = F.relu(h)
-        h_dst = h[: mfgs[1].num_dst_nodes()]  # <---
-        h = self.conv2(mfgs[1], (h, h_dst))  # <---
-        return h
-in_size = feature.size("node", None, "feat")[0]
-model = Model(in_size, 64, num_classes).to(device)
-######################################################################
-# If you compare against the code in the
-# :doc:`introduction <../blitz/1_introduction>`, you will notice several
-# differences:
-#
-# -  **DGL GNN layers on MFGs**. Instead of computing on the
-#    full graph:
-#
-#    .. code:: python
-#
-#       h = self.conv1(g, x)
-#
-#    you only compute on the sampled MFG:
-#
-#    .. code:: python
-#
-#       h = self.conv1(mfgs[0], (x, h_dst))
-#
-#    All DGL’s GNN modules support message passing on MFGs,
-#    where you supply a pair of features, one for source nodes and another
-#    for destination nodes.
-#
-# -  **Feature slicing for self-dependency**. There are statements that
-#    perform slicing to obtain the previous-layer representation of the
-#     nodes:
-#
-#    .. code:: python
-#
-#       h_dst = x[:mfgs[0].num_dst_nodes()]
-#
-#    The ``num_dst_nodes`` method works with MFGs and returns the
-#    number of destination nodes.
-#
-#    Since the first few source nodes of the yielded MFG are
-#    always the same as the destination nodes, these statements obtain the
-#    representations of the destination nodes on the previous layer. They are
-#    then combined with neighbor aggregation in ``dgl.nn.SAGEConv`` layer.
-#
-# .. note::
-#
-#    See the :doc:`custom message passing
-#    tutorial <L4_message_passing>` for more details on how to
-#    manipulate MFGs produced in this way, such as the usage
-#    of ``num_dst_nodes``.
-#
-######################################################################
-# Defining Training Loop
-# ----------------------
-#
-# The following initializes the model and defines the optimizer.
-#
-opt = torch.optim.Adam(model.parameters())
-######################################################################
-# When computing the validation score for model selection, usually you can
-# also do neighbor sampling. To do that, you need to define another data
-# loader.
-#
-datapipe = gb.ItemSampler(valid_set, batch_size=1024, shuffle=False)
-datapipe = datapipe.sample_neighbor(graph, [4, 4])
-datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
-datapipe = datapipe.to_dgl()
-datapipe = datapipe.copy_to(device)
-valid_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
-import sklearn.metrics
-######################################################################
-# The following is a training loop that performs validation every epoch.
-# It also saves the model with the best validation accuracy into a file.
-#
-import tqdm
-best_accuracy = 0
-best_model_path = "model.pt"
-for epoch in range(10):
-    model.train()
-    with tqdm.tqdm(train_dataloader) as tq:
-        for step, data in enumerate(tq):
-            x = data.node_features["feat"]
-            labels = data.labels
-            predictions = model(data.blocks, x)
-            loss = F.cross_entropy(predictions, labels)
-            opt.zero_grad()
-            loss.backward()
-            opt.step()
-            accuracy = sklearn.metrics.accuracy_score(
-                labels.cpu().numpy(),
-                predictions.argmax(1).detach().cpu().numpy(),
-            )
-            tq.set_postfix(
-                {"loss": "%.03f" % loss.item(), "acc": "%.03f" % accuracy},
-                refresh=False,
-            )
-    model.eval()
-    predictions = []
-    labels = []
-    with tqdm.tqdm(valid_dataloader) as tq, torch.no_grad():
-        for data in tq:
-            x = data.node_features["feat"]
-            labels.append(data.labels.cpu().numpy())
-            predictions.append(model(data.blocks, x).argmax(1).cpu().numpy())
-        predictions = np.concatenate(predictions)
-        labels = np.concatenate(labels)
-        accuracy = sklearn.metrics.accuracy_score(labels, predictions)
-        print("Epoch {} Validation Accuracy {}".format(epoch, accuracy))
-        if best_accuracy < accuracy:
-            best_accuracy = accuracy
-            torch.save(model.state_dict(), best_model_path)
-        # Note that this tutorial does not train the whole model to the end.
-        break
-######################################################################
-# Conclusion
-# ----------
-#
-# In this tutorial, you have learned how to train a multi-layer GraphSAGE
-# with neighbor sampling.
-#
-# What’s next?
-# ------------
-#
-# -  :doc:`Stochastic training of GNN for link
-#    prediction <L2_large_link_prediction>`.
-# -  :doc:`Adapting your custom GNN module for stochastic
-#    training <L4_message_passing>`.
-# -  During inference you may wish to disable neighbor sampling. If so,
-#    please refer to the :ref:`user guide on exact offline
-#    inference <guide-minibatch-inference>`.
-#
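The "feature slicing for self-dependency" step in the removed tutorial relies on the first `num_dst_nodes` source rows being the destination nodes themselves. The sketch below illustrates that convention in plain Python with a deliberately simplified layer (self feature plus the mean of sampled neighbors, with no learned weights). It is not `dgl.nn.SAGEConv`; the function name `mean_aggregate_layer` and the toy inputs are assumptions.

```python
def mean_aggregate_layer(num_dst, edges, x):
    """Combine each destination node's own feature with the mean of its neighbors.

    x     : feature vectors of the source nodes; the first num_dst rows
            belong to the destination nodes (the MFG convention).
    edges : (src_index, dst_index) pairs using local indices.
    """
    h_dst = x[:num_dst]  # slice out the destination nodes' previous-layer features
    dim = len(x[0])
    sums = [[0.0] * dim for _ in range(num_dst)]
    counts = [0] * num_dst
    for u, v in edges:
        counts[v] += 1
        for k in range(dim):
            sums[v][k] += x[u][k]
    out = []
    for v in range(num_dst):
        mean = [s / counts[v] if counts[v] else 0.0 for s in sums[v]]
        # Simplification: add self and neighbor terms instead of applying weights.
        out.append([a + b for a, b in zip(h_dst[v], mean)])
    return out


# Hypothetical MFG with 4 source nodes, of which the first 2 are destinations.
x = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0], [2.0, 0.0]]
edges = [(2, 0), (3, 0), (0, 1)]
out = mean_aggregate_layer(num_dst=2, edges=edges, x=x)
print(out)  # [[4.0, 2.0], [1.0, 2.0]]
```

The output has one row per destination node, which is why the tutorial's second layer slices again before calling `conv2`.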
--- a/tutorials/large/L2_large_link_prediction.py
+++ b/tutorials/large/L2_large_link_prediction.py
-"""
-Link Prediction
-==============================================
-This tutorial will show how to train a multi-layer GraphSAGE for link
-prediction on `CoraGraphDataset <https://data.dgl.ai/dataset/cora_v2.zip>`__.
-The dataset contains 2708 nodes and 10556 edges.
-By the end of this tutorial, you will be able to
-  Train a GNN model for link prediction on target device with DGL's
-   neighbor sampling components.
-This tutorial assumes that you have read the :doc:`Introduction of Neighbor
-Sampling for GNN Training <L0_neighbor_sampling_overview>` and :doc:`Neighbor
-Sampling for Node Classification <L1_large_node_classification>`.
-"""
-######################################################################
-# Link Prediction Overview
-# ------------------------
-#
-# Unlike node classification, which predicts labels for nodes based on their
-# local neighborhoods, link prediction assesses the likelihood of an edge
-# existing between two nodes, necessitating different sampling strategies
-# that account for pairs of nodes and their joint neighborhoods.
-#
-######################################################################
-# Loading Dataset
-# ---------------
-#
-# `cora` is already prepared as ``BuiltinDataset`` in GraphBolt.
-#
-import os
-os.environ["DGLBACKEND"] = "pytorch"
-import dgl.graphbolt as gb
-import numpy as np
-import torch
-import tqdm
-dataset = gb.BuiltinDataset("cora").load()
-device = torch.device("cpu")  # change to 'cuda' for GPU
-######################################################################
-# Dataset consists of graph, feature and tasks. You can get the
-# training-validation-test set from the tasks. Seed nodes and corresponding
-# labels are already stored in each training-validation-test set. This
-# dataset contains 2 tasks, one for node classification and the other for
-# link prediction. We will use the link prediction task.
-#
-graph = dataset.graph
-feature = dataset.feature
-train_set = dataset.tasks[1].train_set
-test_set = dataset.tasks[1].test_set
-task_name = dataset.tasks[1].metadata["name"]
-print(f"Task: {task_name}.")
-######################################################################
-# Defining Neighbor Sampler and Data Loader in DGL
-# ------------------------------------------------
-#
-# Different from the :doc:`link prediction tutorial for full
-# graph <../blitz/4_link_predict>`, a common practice to train GNN on large graphs is
-# to iterate over the edges
-# in minibatches, since computing the probability of all edges is usually
-# impossible. For each minibatch of edges, you compute the output
-# representation of their incident nodes using neighbor sampling and GNN,
-# in a fashion similar to the one introduced in the :doc:`large-scale node classification
-# tutorial <L1_large_node_classification>`.
-#
-# To perform link prediction, you need to specify a negative sampler. DGL
-# provides builtin negative samplers such as
-# ``dgl.graphbolt.UniformNegativeSampler``.  Here this tutorial uniformly
-# draws 5 negative examples per positive example.
-#
-# Except for the negative sampler, the rest of the code is identical to
-# the :doc:`node classification tutorial <L1_large_node_classification>`.
-#
-datapipe = gb.ItemSampler(train_set, batch_size=256, shuffle=True)
-datapipe = datapipe.sample_uniform_negative(graph, 5)
-datapipe = datapipe.sample_neighbor(graph, [5, 5, 5])
-datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
-datapipe = datapipe.to_dgl()
-datapipe = datapipe.copy_to(device)
-train_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
-######################################################################
-# You can peek one minibatch from ``train_dataloader`` and see what it
-# will give you.
-#
-data = next(iter(train_dataloader))
-print(f"DGLMiniBatch: {data}")
-######################################################################
-# Defining Model for Node Representation
-# --------------------------------------
-#
-import dgl.nn as dglnn
-import torch.nn as nn
-import torch.nn.functional as F
-class SAGE(nn.Module):
-    def __init__(self, in_size, hidden_size):
-        super().__init__()
-        self.layers = nn.ModuleList()
-        self.layers.append(dglnn.SAGEConv(in_size, hidden_size, "mean"))
-        self.layers.append(dglnn.SAGEConv(hidden_size, hidden_size, "mean"))
-        self.hidden_size = hidden_size
-        self.predictor = nn.Sequential(
-            nn.Linear(hidden_size, hidden_size),
-            nn.ReLU(),
-            nn.Linear(hidden_size, 1),
-        )
-    def forward(self, blocks, x):
-        hidden_x = x
-        for layer_idx, (layer, block) in enumerate(zip(self.layers, blocks)):
-            hidden_x = layer(block, hidden_x)
-            is_last_layer = layer_idx == len(self.layers) - 1
-            if not is_last_layer:
-                hidden_x = F.relu(hidden_x)
-        return hidden_x
-######################################################################
-# Defining Training Loop
-# ----------------------
-#
-# The following initializes the model and defines the optimizer.
-#
-in_size = feature.size("node", None, "feat")[0]
-model = SAGE(in_size, 128).to(device)
-optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
-#####################################################################
-# Convert the minibatch to a training pair and a label tensor.
-#
-def to_binary_link_dgl_computing_pack(data: gb.DGLMiniBatch):
-    """Convert the minibatch to a training pair and a label tensor."""
-    pos_src, pos_dst = data.positive_node_pairs
-    neg_src, neg_dst = data.negative_node_pairs
-    node_pairs = (
-        torch.cat((pos_src, neg_src), dim=0),
-        torch.cat((pos_dst, neg_dst), dim=0),
-    )
-    pos_label = torch.ones_like(pos_src)
-    neg_label = torch.zeros_like(neg_src)
-    labels = torch.cat([pos_label, neg_label], dim=0)
-    return (node_pairs, labels.float())
-######################################################################
-# The following is the training loop for link prediction and
-# evaluation.
-#
-for epoch in range(10):
-    model.train()
-    total_loss = 0
-    for step, data in tqdm.tqdm(enumerate(train_dataloader)):
-        # Unpack MiniBatch.
-        compacted_pairs, labels = to_binary_link_dgl_computing_pack(data)
-        node_feature = data.node_features["feat"]
-        # Convert sampled subgraphs to DGL blocks.
-        blocks = data.blocks
-        # Get the embeddings of the input nodes.
-        y = model(blocks, node_feature)
-        logits = model.predictor(
-            y[compacted_pairs[0]] * y[compacted_pairs[1]]
-        ).squeeze()
-        # Compute loss.
-        loss = F.binary_cross_entropy_with_logits(logits, labels)
-        optimizer.zero_grad()
-        loss.backward()
-        optimizer.step()
-        total_loss += loss.item()
-    print(f"Epoch {epoch:03d} | Loss {total_loss / (step + 1):.3f}")
-######################################################################
-# Evaluating Performance with Link Prediction
-# -------------------------------------------
-#
-model.eval()
-datapipe = gb.ItemSampler(test_set, batch_size=256, shuffle=False)
-# Since we need to use all neighborhoods for evaluation, we set the fanout
-# to -1.
-datapipe = datapipe.sample_neighbor(graph, [-1, -1])
-datapipe = datapipe.fetch_feature(feature, node_feature_keys=["feat"])
-datapipe = datapipe.to_dgl()
-datapipe = datapipe.copy_to(device)
-eval_dataloader = gb.MultiProcessDataLoader(datapipe, num_workers=0)
-logits = []
-labels = []
-for step, data in enumerate(eval_dataloader):
-    # Unpack MiniBatch.
-    compacted_pairs, label = to_binary_link_dgl_computing_pack(data)
-    # The features of sampled nodes.
-    x = data.node_features["feat"]
-    # Forward.
-    y = model(data.blocks, x)
-    logit = (
-        model.predictor(y[compacted_pairs[0]] * y[compacted_pairs[1]])
-        .squeeze()
-        .detach()
-    )
-    logits.append(logit)
-    labels.append(label)
-logits = torch.cat(logits, dim=0)
-labels = torch.cat(labels, dim=0)
-# Compute the AUROC score.
-from sklearn.metrics import roc_auc_score
-auc = roc_auc_score(labels, logits)
-print("Link Prediction AUC:", auc)
-######################################################################
-# Conclusion
-# ----------
-#
-# In this tutorial, you have learned how to train a multi-layer GraphSAGE
-# for link prediction with neighbor sampling.
-#
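The removed link prediction tutorial concatenates positive and negative node pairs, labels them 1 and 0, scores each pair from the two node embeddings, and trains with binary cross-entropy on the logits. The plain-Python sketch below walks through the same arithmetic on toy data. It swaps the tutorial's MLP predictor over an elementwise product for a bare dot product, and the helper names and embeddings are assumptions made for illustration.

```python
import math


def build_pairs_and_labels(pos_pairs, neg_pairs):
    """Concatenate positive and negative pairs; label them 1.0 and 0.0."""
    pairs = list(pos_pairs) + list(neg_pairs)
    labels = [1.0] * len(pos_pairs) + [0.0] * len(neg_pairs)
    return pairs, labels


def dot_score(emb, u, v):
    """Score a candidate edge by the dot product of its endpoint embeddings."""
    return sum(a * b for a, b in zip(emb[u], emb[v]))


def bce_with_logits(logits, labels):
    """Mean binary cross-entropy on raw logits, in a numerically stable form."""
    total = 0.0
    for z, y in zip(logits, labels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# Hypothetical embeddings for three nodes.
emb = {0: [1.0, 0.0], 1: [1.0, 1.0], 2: [-1.0, 0.0]}
pairs, labels = build_pairs_and_labels(pos_pairs=[(0, 1)], neg_pairs=[(0, 2)])
logits = [dot_score(emb, u, v) for u, v in pairs]
loss = bce_with_logits(logits, labels)
print(pairs, labels, logits, round(loss, 4))
```

Here the positive pair gets logit 1.0 and the negative pair gets logit -1.0, so both contribute the same loss term, log(1 + e^-1).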
--- a/tutorials/large/README.txt
+++ b/tutorials/large/README.txt
-[Deprecated] Stochastic Training of GNNs
-========================================