Unverified Commit 2190c39d authored by Da Zheng, committed by GitHub

[Feature] Distributed graph store (#1383)

* initial version from distributed training.

This is copied from multiprocessing training.

* modify for distributed training.

* it's runnable now.

* measure time in neighbor sampling.

* simplify neighbor sampling.

* fix a bug in distributed neighbor sampling.

* allow single-machine training.

* fix a bug.

* fix a bug.

* fix openmp.

* make some improvement.

* fix.

* add prepare in the sampler.

* prepare nodeflow async.

* fix a bug.

* get id.

* simplify the code.

* improve.

* fix partition.py

* fix the example.

* add more features.

* fix the example.

* allow one partition

* use distributed kvstore.

* do g2l map manually.

* fix commandline.

* a temp script to save reddit.

* fix pull_handler.

* add pytorch version.

* estimate the time for copying data.

* delete unused code.

* fix a bug.

* print id.

* fix a bug

* fix a bug

* fix a bug.

* remove redundant code.

* revert modify in sampler.

* fix temp script.

* remove pytorch version.

* fix.

* distributed training with pytorch.

* add distributed graph store.

* fix.

* add metis_partition_assignment.

* fix a few bugs in distributed graph store.

* fix test.

* fix bugs in distributed graph store.

* fix tests.

* remove code of defining DistGraphStore.

* fix partition.

* fix example.

* update run.sh.

* only read necessary node data.

* batching data fetch of multiple NodeFlows.

* simplify gcn.

* remove unnecessary code.

* use the new copy_from_kvstore.

* update training script.

* print time in graphsage.

* make distributed training runnable.

* use val_nid.

* fix train_sampling.

* add distributed training.

* add run.sh

* add more timing.

* fix a bug.

* save graph metadata when partition.

* create ndata and edata in distributed graph store.

* add timing in minibatch training of GraphSage.

* use pytorch distributed.

* add checks.

* fix a bug in global vs. local ids.

* remove fast pull

* fix a compile error.

* update and add new APIs.

* implement more methods in DistGraphStore.

* update more APIs.

* rename it to DistGraph.

* rename to DistTensor

* remove some unnecessary API.

* remove unnecessary files.

* revert changes in sampler.

* Revert "simplify gcn."

This reverts commit 0ed3a34ca714203a5b45240af71555d4227ce452.

* Revert "simplify neighbor sampling."

This reverts commit 551c72d20f05a029360ba97f312c7a7a578aacec.

* Revert "measure time in neighbor sampling."

This reverts commit 63ae80c7b402bb626e24acbbc8fdfe9fffd0bc64.

* Revert "add timing in minibatch training of GraphSage."

This reverts commit e59dc8957a414c7df5c316f51d78bce822bdef5e.

* Revert "fix train_sampling."

This reverts commit ea6aea9a4aabb8ba0ff63070aa51e7ca81536ad9.

* fix lint.

* add comments and small update.

* add more comments.

* add more unit tests and fix bugs.

* check the existence of shared-mem graph index.

* use new partitioned graph storage.

* fix bugs.

* print error in fast pull.

* fix lint

* fix a compile error.

* save absolute path after partitioning.

* small fixes in the example

* Revert "[kvstore] support any data type for init_data() (#1465)"

This reverts commit 87b6997b.

* fix a bug.

* disable evaluation.

* Revert "Revert "[kvstore] support any data type for init_data() (#1465)""

This reverts commit f5b8039c6326eb73bad8287db3d30d93175e5bee.

* support set and init data.

* support set and init data.

* Revert "Revert "[kvstore] support any data type for init_data() (#1465)""

This reverts commit f5b8039c6326eb73bad8287db3d30d93175e5bee.

* fix bugs.

* fix unit test.

* move to dgl.distributed.

* fix lint.

* fix lint.

* remove local_nids.

* fix lint.

* fix test.

* remove train_dist.

* revert train_sampling.

* rename funcs.

* address comments.

* address comments.

Use NodeDataView/EdgeDataView to keep track of data.

* address comments.

* address comments.

* revert.

* save data with DGL serializer.

* use the right way of getting shape.

* fix lint.

* address comments.

* address comments.

* fix an error in mxnet.

* address comments.

* add edge_map.

* add more tests and fix bugs.
Co-authored-by: Zheng <dzzhen@186590dc80ff.ant.amazon.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-6-131.us-east-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-26-167.us-east-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-150.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-16-250.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-30-135.us-west-2.compute.internal>
parent 5fc334fc
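For orientation before the diff, here is a minimal end-to-end sketch of the workflow this commit enables, pieced together from the APIs added below (`partition_graph`, `DistGraphServer`, `DistGraph`) and the unit test at the end of the diff. The graph construction, server namebook, graph name, and file paths are illustrative assumptions, not part of the commit.

```
import multiprocessing as mp
import time

import numpy as np
import scipy.sparse as spsp

import dgl
from dgl import backend as F
from dgl.graph_index import create_graph_index
from dgl.distributed import partition_graph, DistGraphServer, DistGraph

# One machine, one main server, one client: [machine_id, IP, port, group_count].
server_namebook = {0: [0, '127.0.0.1', 30000, 1]}

def start_server():
    # Loads partition 0, copies it to shared memory and serves node/edge data.
    serv = DistGraphServer(0, server_namebook, 1, 'demo', '/tmp/demo.json')
    serv.start()

def start_client():
    time.sleep(3)  # crude wait for the server; the unit test uses a barrier instead
    dist_g = DistGraph(server_namebook, 'demo')
    nids = F.arange(0, 10)
    print(dist_g.number_of_nodes(), dist_g.ndata['features'][nids])
    dist_g.shut_down()

if __name__ == '__main__':
    # Offline step: partition the graph and write /tmp/demo.json plus part0/ files.
    arr = (spsp.random(1000, 1000, density=0.01, format='coo') != 0).astype(np.int64)
    g = dgl.DGLGraph(create_graph_index(arr, readonly=True))
    g.ndata['features'] = F.unsqueeze(F.arange(0, g.number_of_nodes()), 1)
    partition_graph(g, 'demo', num_parts=1, out_path='/tmp')

    server = mp.Process(target=start_server)
    client = mp.Process(target=start_client)
    server.start()
    client.start()
    client.join()
    server.join()
```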
......@@ -78,7 +78,6 @@ class SAGE(nn.Module):
Inference with the GraphSAGE model on full neighbors (i.e. without neighbor sampling).
g : the entire graph.
x : the input of entire node set.
The inference code is written so that it can handle any number of nodes and layers.
"""
......@@ -114,7 +113,6 @@ def prepare_mp(g):
Explicitly materialize the CSR, CSC and COO representation of the given graph
so that they could be shared via copy-on-write to sampler workers and GPU
trainers.
This is a workaround before full shared memory support on heterogeneous graphs.
"""
g.in_degree(0)
......
......@@ -58,13 +58,20 @@ class SharedMemory {
* \param size the size of the shared memory.
* \return the address of the shared memory
*/
void *create_new(size_t size);
void *CreateNew(size_t size);
/*
* \brief open shared memory that has already been created.
* \param size the size of the shared memory.
* \return the address of the shared memory
*/
void *open(size_t size);
void *Open(size_t size);
/*
* \brief check if the shared memory exists.
* \param name the name of the shared memory.
* \return a boolean value to indicate if the shared memory exists.
*/
static bool Exist(const std::string &name);
};
#endif // _WIN32
......
from . import sampling
from . import graph_store
from .dis_kvstore import KVClient, KVServer
from .dis_kvstore import read_ip_config
\ No newline at end of file
from .dis_kvstore import read_ip_config
......@@ -1381,4 +1381,4 @@ class KVClient(object):
self._data_store
"""
target[name][ID] = data
\ No newline at end of file
"""DGL distributed."""
from .dist_graph import DistGraphServer, DistGraph
from .partition import partition_graph, load_partition
"""Define distributed graph."""
import socket
from collections.abc import MutableMapping
import numpy as np
from ..graph import DGLGraph
from .. import backend as F
from ..base import NID, EID, DGLError
from ..contrib.dis_kvstore import KVServer, KVClient
from ..graph_index import from_shared_mem_graph_index
from .._ffi.ndarray import empty_shared_mem
from ..frame import infer_scheme
from .partition import load_partition
from .. import ndarray as nd
def _get_ndata_path(graph_name, ndata_name):
return "/" + graph_name + "_node_" + ndata_name
def _get_edata_path(graph_name, edata_name):
return "/" + graph_name + "_edge_" + edata_name
def _get_graph_path(graph_name):
return "/" + graph_name
DTYPE_DICT = F.data_type_dict
DTYPE_DICT = {DTYPE_DICT[key]:key for key in DTYPE_DICT}
def _move_data_to_shared_mem_array(arr, name):
dlpack = F.zerocopy_to_dlpack(arr)
dgl_tensor = nd.from_dlpack(dlpack)
new_arr = empty_shared_mem(name, True, F.shape(arr), DTYPE_DICT[F.dtype(arr)])
dgl_tensor.copyto(new_arr)
dlpack = new_arr.to_dlpack()
return F.zerocopy_from_dlpack(dlpack)
def _copy_graph_to_shared_mem(g, graph_name):
gidx = g._graph.copyto_shared_mem(_get_graph_path(graph_name))
new_g = DGLGraph(gidx)
# We should share the node/edge data to the client explicitly instead of putting them
# in the KVStore because some of the node/edge data may be duplicated.
local_node_path = _get_ndata_path(graph_name, 'local_node')
new_g.ndata['local_node'] = _move_data_to_shared_mem_array(g.ndata['local_node'],
local_node_path)
local_edge_path = _get_edata_path(graph_name, 'local_edge')
new_g.edata['local_edge'] = _move_data_to_shared_mem_array(g.edata['local_edge'],
local_edge_path)
new_g.ndata[NID] = _move_data_to_shared_mem_array(g.ndata[NID],
_get_ndata_path(graph_name, NID))
new_g.edata[EID] = _move_data_to_shared_mem_array(g.edata[EID],
_get_edata_path(graph_name, EID))
return new_g
FIELD_DICT = {'local_node': F.int64,
NID: F.int64,
EID: F.int64}
def _get_ndata_name(name):
''' This is to get the name of node data in the kvstore.
KVStore doesn't understand node data or edge data. We'll use a prefix to distinguish them.
'''
return 'node:' + name
def _get_edata_name(name):
''' This is to get the name of edge data in the kvstore.
KVStore doesn't understand node data or edge data. We'll use a prefix to distinguish them.
'''
return 'edge:' + name
def _is_ndata_name(name):
''' Is this node data in the kvstore '''
return name[:5] == 'node:'
def _is_edata_name(name):
''' Is this edge data in the kvstore '''
return name[:5] == 'edge:'
def _get_shared_mem_ndata(g, graph_name, name):
''' Get shared-memory node data from DistGraph server.
This is called by the DistGraph client to access the node data in the DistGraph server
with shared memory.
'''
shape = (g.number_of_nodes(),)
dtype = FIELD_DICT[name]
dtype = DTYPE_DICT[dtype]
data = empty_shared_mem(_get_ndata_path(graph_name, name), False, shape, dtype)
dlpack = data.to_dlpack()
return F.zerocopy_from_dlpack(dlpack)
def _get_shared_mem_edata(g, graph_name, name):
''' Get shared-memory edge data from DistGraph server.
This is called by the DistGraph client to access the edge data in the DistGraph server
with shared memory.
'''
shape = (g.number_of_edges(),)
dtype = FIELD_DICT[name]
dtype = DTYPE_DICT[dtype]
data = empty_shared_mem(_get_edata_path(graph_name, name), False, shape, dtype)
dlpack = data.to_dlpack()
return F.zerocopy_from_dlpack(dlpack)
def _get_graph_from_shared_mem(graph_name):
''' Get the graph from the DistGraph server.
The DistGraph server puts the graph structure of the local partition in the shared memory.
The client can access the graph structure and some metadata on nodes and edges directly
through shared memory to reduce the overhead of data access.
'''
gidx = from_shared_mem_graph_index(_get_graph_path(graph_name))
if gidx is None:
return gidx
g = DGLGraph(gidx)
g.ndata['local_node'] = _get_shared_mem_ndata(g, graph_name, 'local_node')
g.ndata[NID] = _get_shared_mem_ndata(g, graph_name, NID)
g.edata[EID] = _get_shared_mem_edata(g, graph_name, EID)
return g
def _get_shared_mem_metadata(graph_name):
''' Get the metadata of the graph through shared memory.
The metadata includes the number of nodes and the number of edges. In the future,
we can add more information, especially for heterograph.
'''
shape = (2,)
dtype = F.int64
dtype = DTYPE_DICT[dtype]
data = empty_shared_mem(_get_ndata_path(graph_name, 'meta'), False, shape, dtype)
dlpack = data.to_dlpack()
return F.zerocopy_from_dlpack(dlpack)
class DistTensor:
''' Distributed tensor.
This is a wrapper to access a tensor stored in multiple machines.
This wrapper provides an interface similar to the local tensor.
Parameters
----------
g : DistGraph
The distributed graph object.
name : string
The name of the tensor.
'''
def __init__(self, g, name):
self.kvstore = g._client
self.name = name
dtype, shape, _ = g._client.get_data_meta(name)
self._shape = shape
self._dtype = dtype
def __getitem__(self, idx):
return self.kvstore.pull(name=self.name, id_tensor=idx)
def __setitem__(self, idx, val):
# TODO(zhengda) how do we want to support broadcast (e.g., G.ndata['h'][idx] = 1).
self.kvstore.push(name=self.name, id_tensor=idx, data_tensor=val)
def __len__(self):
return self._shape[0]
@property
def shape(self):
''' Return the shape of the distributed tensor. '''
return self._shape
@property
def dtype(self):
''' Return the data type of the distributed tensor. '''
return self._dtype
class NodeDataView(MutableMapping):
"""The data view class when dist_graph.ndata[...].data is called.
"""
__slots__ = ['_graph', '_data']
def __init__(self, g):
self._graph = g
# When this is created, the server may already load node data. We need to
# initialize the node data in advance.
names = g._get_all_ndata_names()
self._data = {name: DistTensor(g, _get_ndata_name(name)) for name in names}
def _get_names(self):
return list(self._data.keys())
def _add(self, name):
self._data[name] = DistTensor(self._graph, _get_ndata_name(name))
def __getitem__(self, key):
return self._data[key]
def __setitem__(self, key, val):
raise DGLError("DGL doesn't support assignment. "
+ "Please call init_ndata to initialize new node data.")
def __delitem__(self, key):
#TODO(zhengda) how to delete data in the kvstore.
raise NotImplementedError("delete node data isn't supported yet")
def __len__(self):
# The number of node data may change. Let's count it every time we need them.
# It's not called frequently. It should be fine.
return len(self._data)
def __iter__(self):
return iter(self._data)
def __repr__(self):
reprs = {}
for name in self._data:
dtype = F.dtype(self._data[name])
shape = F.shape(self._data[name])
reprs[name] = 'DistTensor(shape={}, dtype={})'.format(str(shape), str(dtype))
return repr(reprs)
class EdgeDataView(MutableMapping):
"""The data view class when G.edges[...].data is called.
"""
__slots__ = ['_graph', '_data']
def __init__(self, g):
self._graph = g
# When this is created, the server may already load edge data. We need to
# initialize the edge data in advance.
names = g._get_all_edata_names()
self._data = {name: DistTensor(g, _get_edata_name(name)) for name in names}
def _get_names(self):
return list(self._data.keys())
def _add(self, name):
self._data[name] = DistTensor(self._graph, _get_edata_name(name))
def __getitem__(self, key):
return self._data[key]
def __setitem__(self, key, val):
raise DGLError("DGL doesn't support assignment. "
+ "Please call init_edata to initialize new edge data.")
def __delitem__(self, key):
#TODO(zhengda) how to delete data in the kvstore.
raise NotImplementedError("delete edge data isn't supported yet")
def __len__(self):
# The number of edge data may change. Let's count it every time we need them.
# It's not called frequently. It should be fine.
return len(self._data)
def __iter__(self):
return iter(self._data)
def __repr__(self):
reprs = {}
for name in self._data:
dtype = F.dtype(self._data[name])
shape = F.shape(self._data[name])
reprs[name] = 'DistTensor(shape={}, dtype={})'.format(str(shape), str(dtype))
return repr(reprs)
class DistGraphServer(KVServer):
''' The DistGraph server.
This DistGraph server loads the graph data and sets up a service so that clients can read data
of a graph partition (graph structure, node data and edge data) from remote machines.
A server is responsible for one graph partition.
Currently, each machine runs only one main server with a set of backup servers to handle
clients' requests. The main server and the backup servers all handle the requests for the same
graph partition. They all share the partition data (graph structure and node/edge data) with
shared memory.
In addition, the partition data is also shared with the DistGraph clients that run on
the same machine.
Parameters
----------
server_id : int
The server ID (start from 0).
server_namebook: dict
IP address namebook of KVServer, where key is the KVServer's ID
(starting from 0) and value is the server's machine_id, IP address and port, e.g.,
{0: [0, '172.31.40.143', 30050],
 1: [0, '172.31.40.143', 30051],
 2: [1, '172.31.36.140', 30050],
 3: [1, '172.31.36.140', 30051],
 4: [2, '172.31.47.147', 30050],
 5: [2, '172.31.47.147', 30051],
 6: [3, '172.31.30.180', 30050],
 7: [3, '172.31.30.180', 30051]}
num_client : int
Total number of client nodes.
graph_name : string
The name of the graph. The server and the client need to specify the same graph name.
conf_file : string
The path of the config file generated by the partition tool.
'''
def __init__(self, server_id, server_namebook, num_client, graph_name, conf_file):
super(DistGraphServer, self).__init__(server_id=server_id, server_namebook=server_namebook,
num_client=num_client)
host_name = socket.gethostname()
host_ip = socket.gethostbyname(host_name)
print('Server {}: host name: {}, ip: {}'.format(server_id, host_name, host_ip))
self.client_g, node_feats, edge_feats, self.meta = load_partition(conf_file, server_id)
num_nodes, num_edges, node_map, edge_map = self.meta
self.client_g = _copy_graph_to_shared_mem(self.client_g, graph_name)
self.meta = _move_data_to_shared_mem_array(F.tensor([num_nodes, num_edges]),
_get_ndata_path(graph_name, 'meta'))
# Create node global2local map.
node_g2l = F.zeros((num_nodes), dtype=F.int64, ctx=F.cpu()) - 1
# The nodes that belong to this partition.
local_nids = F.nonzero_1d(self.client_g.ndata['local_node'])
nids = self.client_g.ndata[NID][local_nids]
assert np.all(node_map[nids] == server_id), 'Load a wrong partition'
F.scatter_row_inplace(node_g2l, nids, F.arange(0, len(nids)))
# Create edge global2local map.
if len(edge_feats) > 0:
edge_g2l = F.zeros((num_edges), dtype=F.int64, ctx=F.cpu()) - 1
local_eids = F.nonzero_1d(self.client_g.edata['local_edge'])
eids = self.client_g.edata[EID][local_eids]
assert np.all(edge_map[eids] == server_id), 'Load a wrong partition'
F.scatter_row_inplace(edge_g2l, eids, F.arange(0, len(eids)))
node_map = F.zerocopy_from_numpy(node_map)
edge_map = F.zerocopy_from_numpy(edge_map)
if self.get_id() % self.get_group_count() == 0: # master server
for name in node_feats:
self.set_global2local(name=_get_ndata_name(name), global2local=node_g2l)
self.init_data(name=_get_ndata_name(name), data_tensor=node_feats[name])
self.set_partition_book(name=_get_ndata_name(name), partition_book=node_map)
for name in edge_feats:
self.set_global2local(name=_get_edata_name(name), global2local=edge_g2l)
self.init_data(name=_get_edata_name(name), data_tensor=edge_feats[name])
self.set_partition_book(name=_get_edata_name(name), partition_book=edge_map)
else:
for name in node_feats:
self.set_global2local(name=_get_ndata_name(name))
self.init_data(name=_get_ndata_name(name))
self.set_partition_book(name=_get_ndata_name(name), partition_book=node_map)
for name in edge_feats:
self.set_global2local(name=_get_edata_name(name))
self.init_data(name=_get_edata_name(name))
self.set_partition_book(name=_get_edata_name(name), partition_book=edge_map)
class DistGraph:
''' The DistGraph client.
This provides the graph interface to access the partitioned graph data for distributed GNN
training. All partition data is loaded by the DistGraph servers; the client doesn't need
to load any data.
Parameters
----------
server_namebook: dict
IP address namebook of KVServer, where key is the KVServer's ID
(starting from 0) and value is the server's machine_id, IP address, port,
and group count, e.g.,
{0: [0, '172.31.40.143', 30050, 2],
 1: [0, '172.31.40.143', 30051, 2],
 2: [1, '172.31.36.140', 30050, 2],
 3: [1, '172.31.36.140', 30051, 2],
 4: [2, '172.31.47.147', 30050, 2],
 5: [2, '172.31.47.147', 30051, 2],
 6: [3, '172.31.30.180', 30050, 2],
 7: [3, '172.31.30.180', 30051, 2]}
graph_name : str
The name of the graph. This name has to be the same as the one used in DistGraphServer.
'''
def __init__(self, server_namebook, graph_name):
self._client = KVClient(server_namebook=server_namebook)
self._client.connect()
self.g = _get_graph_from_shared_mem(graph_name)
self.graph_name = graph_name
self.meta = F.asnumpy(_get_shared_mem_metadata(graph_name))
self._client.barrier()
if self.g is not None:
self._local_nids = F.nonzero_1d(self.g.ndata['local_node'])
self._local_gnid = self.g.ndata[NID][self._local_nids]
else:
self._local_nids = None
self._local_gnid = None
self._ndata = NodeDataView(self)
self._edata = EdgeDataView(self)
def init_ndata(self, ndata_name, shape, dtype):
'''Initialize node data
This initializes the node data in the distributed graph storage.
Parameters
----------
ndata_name : string
The name of the node data.
shape : tuple
The shape of the node data.
dtype : dtype
The data type of the node data.
'''
assert shape[0] == self.number_of_nodes()
names = self._ndata._get_names()
# TODO we need to fix this. We should be able to init ndata even when there is no node data.
assert len(names) > 0
self._client.init_data(_get_ndata_name(ndata_name), shape, dtype, _get_ndata_name(names[0]))
self._ndata._add(ndata_name)
def init_edata(self, edata_name, shape, dtype):
'''Initialize edge data
This initializes the edge data in the distributed graph storage.
Parameters
----------
edata_name : string
The name of the edge data.
shape : tuple
The shape of the edge data.
dtype : dtype
The data type of the edge data.
'''
assert shape[0] == self.number_of_edges()
names = self._edata._get_names()
# TODO we need to fix this. We should be able to init edata even when there is no edge data.
assert len(names) > 0
self._client.init_data(_get_edata_name(edata_name), shape, dtype, _get_edata_name(names[0]))
self._edata._add(edata_name)
def init_node_emb(self, name, shape, dtype, initializer):
''' Initialize node embeddings.
This initializes the node embeddings in the distributed graph storage.
Parameters
----------
name : string
The name of the node embeddings.
shape : tuple
The shape of the node embeddings.
dtype : string
The data type of the node embeddings.
initializer : callable
The initializer.
'''
# TODO(zhengda)
raise NotImplementedError("init_node_emb isn't supported yet")
def get_node_embeddings(self):
''' Return node embeddings
Returns
-------
a dict of SparseEmbedding
All node embeddings in the graph store.
'''
# TODO(zhengda)
raise NotImplementedError("get_node_embeddings isn't supported yet")
@property
def ndata(self):
"""Return the data view of all the nodes.
Returns
-------
NodeDataView
The data view in the distributed graph storage.
"""
return self._ndata
@property
def edata(self):
"""Return the data view of all the edges.
Returns
-------
EdgeDataView
The data view in the distributed graph storage.
"""
return self._edata
def number_of_nodes(self):
"""Return the number of nodes"""
return self.meta[0]
def number_of_edges(self):
"""Return the number of edges"""
return self.meta[1]
def node_attr_schemes(self):
"""Return the node feature and embedding schemes."""
schemes = {}
for key in self.ndata:
schemes[key] = infer_scheme(self.ndata[key])
return schemes
def edge_attr_schemes(self):
"""Return the edge feature and embedding schemes."""
schemes = {}
for key in self.edata:
schemes[key] = infer_scheme(self.edata[key])
return schemes
def rank(self):
''' The rank of the distributed graph store.
Returns
-------
int
The rank of the current graph store.
'''
return self._client.get_id()
def shut_down(self):
"""Shut down all KVServer nodes.
We usually invoke this API by just one client (e.g., client_0).
"""
# We have to remove them. Otherwise, kvstore cannot shut down correctly.
self._ndata = None
self._edata = None
self._client.shut_down()
def _get_all_ndata_names(self):
''' Get the names of all node data.
'''
names = self._client.get_data_name_list()
ndata_names = []
for name in names:
if _is_ndata_name(name):
# Remove the prefix "node:"
ndata_names.append(name[5:])
return ndata_names
def _get_all_edata_names(self):
''' Get the names of all edge data.
'''
names = self._client.get_data_name_list()
edata_names = []
for name in names:
if _is_edata_name(name):
# Remove the prefix "edge:"
edata_names.append(name[5:])
return edata_names
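To make the read/write semantics of the new `DistTensor`/`NodeDataView` concrete, the sketch below shows how a client would touch node data, assuming a `DistGraph` named `dist_g` that is already connected to a running `DistGraphServer` (as in the unit test near the end of this diff); the tensor names and shapes are illustrative.

```
from dgl import backend as F

# `dist_g` is assumed to be a connected DistGraph with existing 'features' node data.
nids = F.arange(0, 128)

# Reads go through DistTensor.__getitem__, which pulls the requested rows from the
# servers that own them (KVClient.pull).
h = dist_g.ndata['features'][nids]

# Writes go through DistTensor.__setitem__, which pushes the rows back to the
# owning servers (KVClient.push).
dist_g.ndata['features'][nids] = h

# New node data cannot be created by plain assignment (NodeDataView raises);
# it has to be declared with init_ndata first and then filled through the DistTensor.
dist_g.init_ndata('h1', (dist_g.number_of_nodes(), 16), F.float32)
dist_g.ndata['h1'][nids] = F.zeros((128, 16), F.float32, F.cpu())
```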
"""Functions for partitions.
For distributed training, a graph is partitioned and partitions are stored in files
organized as follows:
```
data_root_dir/
|-- part_conf.json # partition configuration file in JSON
|-- node_map # partition id of each node stored in a numpy array
|-- edge_map # partition id of each edge stored in a numpy array
|-- part0/ # data for partition 0
|-- node_feats # node features stored in binary format
|-- edge_feats # edge features stored in binary format
|-- graph # graph structure of this partition stored in binary format
|-- part1/ # data for partition 1
|-- node_feats
|-- edge_feats
|-- graph
```
The partition configuration file stores the file locations. For the above example,
the configuration file will look like the following:
```
{
  "graph_name" : "test",
  "part_method" : "metis",
  "num_parts" : 2,
  "halo_hops" : 1,
  "node_map" : "data_root_dir/node_map.npy",
  "edge_map" : "data_root_dir/edge_map.npy",
  "num_nodes" : 1000000,
  "num_edges" : 52000000,
  "part-0" : {
    "node_feats" : "data_root_dir/part0/node_feats.dgl",
    "edge_feats" : "data_root_dir/part0/edge_feats.dgl",
    "part_graph" : "data_root_dir/part0/graph.dgl"
  },
  "part-1" : {
    "node_feats" : "data_root_dir/part1/node_feats.dgl",
    "edge_feats" : "data_root_dir/part1/edge_feats.dgl",
    "part_graph" : "data_root_dir/part1/graph.dgl"
  }
}
```
Here are the definitions of the fields in the partition configuration file:
* `graph_name` is the name of the graph given by a user.
* `part_method` is the method used to assign nodes to partitions.
Currently, it supports "random" and "metis".
* `num_parts` is the number of partitions.
* `halo_hops` is the number of HALO nodes we want to include in a partition.
* `node_map` is the node assignment map, which tells the partition Id a node is assigned to.
* `edge_map` is the edge assignment map, which tells the partition Id an edge is assigned to.
* `num_nodes` is the number of nodes in the global graph.
* `num_edges` is the number of edges in the global graph.
* `part-*` stores the data of a partition.
Nodes in each partition are *relabeled* to always start with zero. We call the node
ID in the original graph, *global ID*, while the relabeled ID in each partition,
*local ID*. Each partition graph has an integer node data tensor stored under name
`dgl.NID` and each value is the node's global ID. Similarly, edges are relabeled too
and the mapping from local ID to global ID is stored as an integer edge data tensor
under name `dgl.EID`.
Note that each partition can contain *HALO* nodes and edges: nodes and edges that belong to
other partitions but are included in this partition for integrity or efficiency reasons.
We call the nodes and edges that truly belong to a partition *local nodes/edges*, and
the rest *HALO nodes/edges*.
Node and edge features are split and stored together with each graph partition.
We do not store features of HALO nodes and edges.
Two useful functions in this module:
* :func:`~dgl.distributed.load_partition` loads one partition and the meta data into memory.
* :func:`~dgl.distributed.partition_graph` partitions a graph into files organized as above.
"""
import json
import os
import numpy as np
from .. import backend as F
from ..base import NID, EID
from ..data.utils import load_graphs, save_graphs, load_tensors, save_tensors
from ..transform import metis_partition_assignment, partition_graph_with_halo
def load_partition(conf_file, part_id):
''' Load data of a partition from the data path in the DistGraph server.
The data of a partition includes the graph structure of the partition, a dict of node tensors,
a dict of edge tensors and some metadata. The partition may contain HALO nodes, which are
nodes replicated from other partitions. However, the dict of node tensors only contains
the node data that belongs to the local partition; similarly, the dict of edge tensors only
contains the edge data that belongs to the local partition. The metadata describes the global
graph (not the local partition): the number of nodes, the number of edges, and the node and
edge assignments of the global graph.
The function currently loads data through the normal filesystem interface. In the future,
we need to support loading data from other storage such as S3 and HDFS.
Parameters
----------
conf_file : str
The path of the partition config file.
part_id : int
The partition Id.
Returns
-------
DGLGraph
The graph partition structure.
dict of tensors
All node features.
dict of tensors
All edge features.
(int, int, NumPy ndarray, NumPy ndarray)
The metadata of the global graph: number of nodes, number of edges, node map, edge map.
'''
with open(conf_file) as conf_f:
part_metadata = json.load(conf_f)
assert 'part-{}'.format(part_id) in part_metadata, "part-{} does not exist".format(part_id)
part_files = part_metadata['part-{}'.format(part_id)]
assert 'node_feats' in part_files, "the partition does not contain node features."
assert 'edge_feats' in part_files, "the partition does not contain edge features."
assert 'part_graph' in part_files, "the partition does not contain graph structure."
node_feats = load_tensors(part_files['node_feats'])
edge_feats = load_tensors(part_files['edge_feats'])
graph = load_graphs(part_files['part_graph'])[0][0]
assert 'num_nodes' in part_metadata, "cannot get the number of nodes of the global graph."
assert 'num_edges' in part_metadata, "cannot get the number of edges of the global graph."
assert 'node_map' in part_metadata, "cannot get the node map."
assert 'edge_map' in part_metadata, "cannot get the edge map."
node_map = np.load(part_metadata['node_map'])
edge_map = np.load(part_metadata['edge_map'])
meta = (part_metadata['num_nodes'], part_metadata['num_edges'], node_map, edge_map)
assert NID in graph.ndata, "the partition graph should contain node mapping to global node Id"
assert EID in graph.edata, "the partition graph should contain edge mapping to global edge Id"
# TODO we need to fix this. DGL backend doesn't support boolean or byte.
# int64 is unnecessary.
part_ids = F.zerocopy_from_numpy(node_map)[graph.ndata[NID]]
graph.ndata['local_node'] = F.astype(part_ids == part_id, F.int64)
part_ids = F.zerocopy_from_numpy(edge_map)[graph.edata[EID]]
graph.edata['local_edge'] = F.astype(part_ids == part_id, F.int64)
return graph, node_feats, edge_feats, meta
def partition_graph(g, graph_name, num_parts, out_path, num_hops=1, part_method="metis"):
''' Partition a graph for distributed training and store the partitions in files.
The partitioning occurs in three steps: 1) run a partition algorithm (e.g., Metis) to
assign nodes to partitions; 2) construct partition graph structure based on
the node assignment; 3) split the node features and edge features based on
the partition result.
The partitioned data is stored in multiple files.
First, the metadata of the original graph and of the partitioning is stored in a JSON file
named after `graph_name`. This JSON file contains the information of the original graph
as well as the file names that store each partition.
If node IDs are not reshuffled to ensure that all nodes in a partition fall into a contiguous
ID range, the node assignment is stored in a separate NumPy file.
All node features in a partition are stored in a file with DGL format. The node features are
stored in a dictionary, in which the key is the node data name and the value is a tensor.
All edge features in a partition are stored in a file with DGL format. The edge features are
stored in a dictionary, in which the key is the edge data name and the value is a tensor.
The graph structure of a partition is stored in a file with the DGLGraph format. The DGLGraph
contains the mapping of node/edge Ids to the Ids in the original graph.
Parameters
----------
g : DGLGraph
The input graph to partition
graph_name : str
The name of the graph.
num_parts : int
The number of partitions
num_hops : int
The number of hops of HALO nodes we construct on a partition graph structure.
part_method : str
The partition method. It supports "random" and "metis".
out_path : str
The path to store the files for all partitioned data.
'''
if num_parts == 1:
client_parts = {0: g}
node_parts = F.zeros((g.number_of_nodes(),), F.int64, F.cpu())
g.ndata[NID] = F.arange(0, g.number_of_nodes())
g.edata[EID] = F.arange(0, g.number_of_edges())
elif part_method == 'metis':
node_parts = metis_partition_assignment(g, num_parts)
client_parts = partition_graph_with_halo(g, node_parts, num_hops)
elif part_method == 'random':
node_parts = np.random.choice(num_parts, g.number_of_nodes())
client_parts = partition_graph_with_halo(g, node_parts, num_hops)
else:
raise Exception('Unknown partitioning method: ' + part_method)
# Let's calculate edge assignment.
# TODO(zhengda) we should replace int64 with int16. int16 should be sufficient.
if num_parts > 1:
edge_parts = np.zeros((g.number_of_edges(),), dtype=np.int64) - 1
num_edges = 0
lnodes_list = [] # The node ids of each partition
ledges_list = [] # The edge Ids of each partition
for part_id in range(num_parts):
part = client_parts[part_id]
local_nodes = F.boolean_mask(part.ndata[NID], part.ndata['inner_node'] == 1)
local_edges = F.asnumpy(g.in_edges(local_nodes, form='eid'))
edge_parts[local_edges] = part_id
num_edges += len(local_edges)
lnodes_list.append(local_nodes)
ledges_list.append(local_edges)
assert num_edges == g.number_of_edges()
else:
edge_parts = np.zeros((g.number_of_edges(),), dtype=np.int64)
os.makedirs(out_path, mode=0o775, exist_ok=True)
tot_num_inner_edges = 0
out_path = os.path.abspath(out_path)
node_part_file = os.path.join(out_path, "node_map")
edge_part_file = os.path.join(out_path, "edge_map")
np.save(node_part_file, F.asnumpy(node_parts), allow_pickle=False)
np.save(edge_part_file, edge_parts, allow_pickle=False)
part_metadata = {'graph_name': graph_name,
'num_nodes': g.number_of_nodes(),
'num_edges': g.number_of_edges(),
'part_method': part_method,
'num_parts': num_parts,
'halo_hops': num_hops,
'node_map': node_part_file + ".npy",
'edge_map': edge_part_file + ".npy"}
for part_id in range(num_parts):
part = client_parts[part_id]
# Get the node/edge features of each partition.
node_feats = {}
edge_feats = {}
if num_parts > 1:
local_nodes = lnodes_list[part_id]
local_edges = ledges_list[part_id]
print('part {} has {} nodes and {} edges.'.format(
part_id, part.number_of_nodes(), part.number_of_edges()))
print('{} nodes and {} edges are inside the partition'.format(
len(local_nodes), len(local_edges)))
tot_num_inner_edges += len(local_edges)
for name in g.ndata:
node_feats[name] = g.ndata[name][local_nodes]
for name in g.edata:
edge_feats[name] = g.edata[name][local_edges]
else:
for name in g.ndata:
node_feats[name] = g.ndata[name]
for name in g.edata:
edge_feats[name] = g.edata[name]
part_dir = os.path.join(out_path, "part" + str(part_id))
node_feat_file = os.path.join(part_dir, "node_feat.dgl")
edge_feat_file = os.path.join(part_dir, "edge_feat.dgl")
part_graph_file = os.path.join(part_dir, "graph.dgl")
part_metadata['part-{}'.format(part_id)] = {'node_feats': node_feat_file,
'edge_feats': edge_feat_file,
'part_graph': part_graph_file}
os.makedirs(part_dir, mode=0o775, exist_ok=True)
save_tensors(node_feat_file, node_feats)
save_tensors(edge_feat_file, edge_feats)
save_graphs(part_graph_file, [part])
with open('{}/{}.json'.format(out_path, graph_name), 'w') as outfile:
json.dump(part_metadata, outfile, sort_keys=True, indent=4)
num_cuts = g.number_of_edges() - tot_num_inner_edges
if num_parts == 1:
num_cuts = 0
print('There are {} edges in the graph and {} edge cuts for {} partitions.'.format(
g.number_of_edges(), num_cuts, num_parts))
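As a complement to `partition_graph` and `load_partition` above, here is a small sketch of the offline partitioning step followed by inspection of one partition, including the `local_node` mask that `load_partition` attaches; the random graph, feature name, and paths are illustrative.

```
import numpy as np
import scipy.sparse as spsp

import dgl
from dgl import backend as F
from dgl.graph_index import create_graph_index
from dgl.distributed import partition_graph, load_partition

arr = (spsp.random(10000, 10000, density=0.001, format='coo') != 0).astype(np.int64)
g = dgl.DGLGraph(create_graph_index(arr, readonly=True))
g.ndata['feat'] = F.tensor(np.random.randn(g.number_of_nodes(), 4))

# Writes /tmp/demo.json, /tmp/node_map.npy, /tmp/edge_map.npy and /tmp/part*/ files.
partition_graph(g, 'demo', num_parts=4, out_path='/tmp', num_hops=1, part_method='metis')

# Load partition 0 back: structure, per-partition features, and global metadata.
part_g, node_feats, edge_feats, meta = load_partition('/tmp/demo.json', 0)
num_nodes, num_edges, node_map, edge_map = meta
print(part_g.number_of_nodes(), 'nodes in partition 0, including HALO nodes')
print(int(F.asnumpy(part_g.ndata['local_node']).sum()), 'of them are local nodes')
print(int((node_map == 0).sum()), 'nodes assigned to partition 0 in the global graph')
```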
......@@ -1025,6 +1025,11 @@ def from_csr(indptr, indices, direction):
indices : Tensor
column index array in the CSR format
direction : str
the edge direction. Either "in" or "out".
Returns
-------
GraphIndex
The graph index
"""
indptr = utils.toindex(indptr)
......@@ -1042,6 +1047,11 @@ def from_shared_mem_graph_index(shared_mem_name):
----------
shared_mem_name : string
the name of shared memory
Returns
------
GraphIndex
The graph index
"""
return _CAPI_DGLGraphCSRCreateMMap(shared_mem_name)
......
......@@ -549,8 +549,7 @@ def remove_self_loop(g):
return new_g
def partition_graph_with_halo(g, node_part, num_hops):
'''
This is to partition a graph. Each partition contains HALO nodes
''' This is to partition a graph. Each partition contains HALO nodes
so that we can generate NodeFlow in each partition correctly.
Parameters
......@@ -586,9 +585,35 @@ def partition_graph_with_halo(g, node_part, num_hops):
subg_dict[i] = subg
return subg_dict
def metis_partition(g, k, extra_cached_hops=0):
def metis_partition_assignment(g, k):
''' This assigns nodes to different partitions with the METIS partitioning algorithm.
After the partition assignment, we construct partitions.
Parameters
----------
g : DGLGraph
The graph to be partitioned
k : int
The number of partitions.
Returns
-------
a 1-D tensor
A vector in which each element indicates the partition ID of a vertex.
'''
This is to partition a graph with Metis partitioning.
# METIS works only on symmetric graphs.
# METIS runs on the symmetric graph to generate the node assignment to partitions.
sym_g = to_bidirected(g, readonly=True)
node_part = _CAPI_DGLMetisPartition(sym_g._graph, k)
if len(node_part) == 0:
return None
else:
node_part = utils.toindex(node_part)
return node_part.tousertensor()
def metis_partition(g, k, extra_cached_hops=0):
''' This is to partition a graph with Metis partitioning.
Metis assigns vertices to partitions. This API constructs graphs with the vertices assigned
to the partitions and their incoming edges.
......
......@@ -517,6 +517,12 @@ ImmutableGraphPtr ImmutableGraph::CreateFromCSR(
}
ImmutableGraphPtr ImmutableGraph::CreateFromCSR(const std::string &name) {
// If the shared memory graph index doesn't exist, we return null directly.
#ifndef _WIN32
if (!SharedMemory::Exist(GetSharedMemName(name, "meta"))) {
return nullptr;
}
#endif // _WIN32
GraphIndexMetadata meta = DeserializeMetadata(GetSharedMemName(name, "meta"));
CSRPtr in_csr, out_csr;
if (meta.has_in_csr) {
......
......@@ -766,6 +766,9 @@ DGL_REGISTER_GLOBAL("network._CAPI_FastPull")
}
}
row_size *= (local_data->dtype.bits / 8);
size_t data_size = local_data.GetSize();
CHECK_GT(local_data_shape.size(), 0);
CHECK_EQ(row_size * local_data_shape[0], data_size);
// Get local id and remote id
if (str_flag.compare("has_g2l") == 0) {
NDArray g2l = args[11];
......@@ -775,9 +778,12 @@ DGL_REGISTER_GLOBAL("network._CAPI_FastPull")
int64_t part_id = pb_data[id];
if (part_id == local_machine_id) {
int64_t local_id = g2l_data[id];
CHECK_LT(local_id, local_data_shape[0]);
CHECK_GE(local_id, 0);
local_ids.push_back(local_id);
local_ids_orginal.push_back(i);
} else {
CHECK_LT(part_id, machine_count) << "invalid partition ID";
remote_ids[part_id].push_back(id);
remote_ids_original[part_id].push_back(i);
}
......@@ -787,6 +793,8 @@ DGL_REGISTER_GLOBAL("network._CAPI_FastPull")
int64_t id = ID_data[i];
int64_t part_id = pb_data[id];
if (part_id == local_machine_id) {
CHECK_LT(id, local_data_shape[0]);
CHECK_GE(id, 0);
local_ids.push_back(id);
local_ids_orginal.push_back(i);
} else {
......@@ -822,6 +830,9 @@ DGL_REGISTER_GLOBAL("network._CAPI_FastPull")
// Copy local data
#pragma omp parallel for
for (int64_t i = 0; i < local_ids.size(); ++i) {
CHECK_GE(ID_size*row_size, local_ids_orginal[i] * row_size + row_size);
CHECK_GE(data_size, local_ids[i] * row_size + row_size);
CHECK_GE(local_ids[i], 0);
memcpy(return_data + local_ids_orginal[i] * row_size,
local_data_char + local_ids[i] * row_size,
row_size);
......
......@@ -167,9 +167,9 @@ NDArray NDArray::EmptyShared(const std::string &name,
#ifndef _WIN32
auto mem = std::make_shared<SharedMemory>(name);
if (is_create) {
ret.data_->dl_tensor.data = mem->create_new(size);
ret.data_->dl_tensor.data = mem->CreateNew(size);
} else {
ret.data_->dl_tensor.data = mem->open(size);
ret.data_->dl_tensor.data = mem->Open(size);
}
ret.data_->mem = mem;
......
......@@ -34,7 +34,7 @@ SharedMemory::~SharedMemory() {
}
}
void *SharedMemory::create_new(size_t size) {
void *SharedMemory::CreateNew(size_t size) {
this->own = true;
int flag = O_RDWR|O_CREAT;
......@@ -49,7 +49,7 @@ void *SharedMemory::create_new(size_t size) {
return ptr;
}
void *SharedMemory::open(size_t size) {
void *SharedMemory::Open(size_t size) {
int flag = O_RDWR;
fd = shm_open(name.c_str(), flag, S_IRUSR | S_IWUSR);
CHECK_NE(fd, -1) << "fail to open " << name << ": " << strerror(errno);
......@@ -58,6 +58,16 @@ void *SharedMemory::open(size_t size) {
<< "Failed to map shared memory. mmap failed with error " << strerror(errno);
return ptr;
}
bool SharedMemory::Exist(const std::string &name) {
int fd = shm_open(name.c_str(), O_RDONLY, S_IRUSR | S_IWUSR);
if (fd >= 0) {
close(fd);
return true;
} else {
return false;
}
}
#endif // _WIN32
} // namespace runtime
......
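The renamed `SharedMemory::CreateNew`/`Open` and the new `SharedMemory::Exist` are what the Python side relies on through `empty_shared_mem`: the server creates a named region, clients attach to it, and `ImmutableGraph::CreateFromCSR` can now return null when no server has published the graph. Below is a minimal Python-side sketch of the create/attach pattern, assuming a Linux host (shm_open) and an illustrative region name.

```
from dgl import backend as F
from dgl._ffi.ndarray import empty_shared_mem

name = '/demo_shared_tensor'  # illustrative shared-memory name

# Server side: create the region (SharedMemory::CreateNew underneath).
created = empty_shared_mem(name, True, (16,), 'int64')
server_view = F.zerocopy_from_dlpack(created.to_dlpack())

# Client side: attach to the existing region (SharedMemory::Open underneath).
attached = empty_shared_mem(name, False, (16,), 'int64')
client_view = F.zerocopy_from_dlpack(attached.to_dlpack())

# Both views alias the same memory, which is how DistGraphServer shares
# dgl.NID/dgl.EID and the 'local_node'/'local_edge' masks with local clients.
print(F.shape(server_view), F.shape(client_view))  # both (16,)
```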
import os
os.environ['OMP_NUM_THREADS'] = '1'
import dgl
import sys
import numpy as np
import time
from scipy import sparse as spsp
from numpy.testing import assert_array_equal
from multiprocessing import Process, Manager, Condition, Value
import multiprocessing as mp
from dgl.graph_index import create_graph_index
from dgl.data.utils import load_graphs, save_graphs
from dgl.distributed import DistGraphServer, DistGraph
from dgl.distributed import partition_graph
import backend as F
import unittest
import pickle
server_namebook = {0: [0, '127.0.0.1', 30000, 1]}
def create_random_graph(n):
arr = (spsp.random(n, n, density=0.001, format='coo') != 0).astype(np.int64)
ig = create_graph_index(arr, readonly=True)
return dgl.DGLGraph(ig)
def run_server(graph_name, server_id, num_clients, barrier):
g = DistGraphServer(server_id, server_namebook, num_clients, graph_name,
'/tmp/{}.json'.format(graph_name))
barrier.wait()
print('start server', server_id)
g.start()
def run_client(graph_name, barrier, num_nodes, num_edges):
barrier.wait()
g = DistGraph(server_namebook, graph_name)
# Test API
assert g.number_of_nodes() == num_nodes
assert g.number_of_edges() == num_edges
# Test reading node data
nids = F.arange(0, int(g.number_of_nodes() / 2))
feats1 = g.ndata['features'][nids]
feats = F.squeeze(feats1, 1)
assert np.all(F.asnumpy(feats == nids))
# Test reading edge data
eids = F.arange(0, int(g.number_of_edges() / 2))
feats1 = g.edata['features'][eids]
feats = F.squeeze(feats1, 1)
assert np.all(F.asnumpy(feats == eids))
# Test init node data
new_shape = (g.number_of_nodes(), 2)
g.init_ndata('test1', new_shape, F.int32)
feats = g.ndata['test1'][nids]
assert np.all(F.asnumpy(feats) == 0)
# Test init edge data
new_shape = (g.number_of_edges(), 2)
g.init_edata('test1', new_shape, F.int32)
feats = g.edata['test1'][eids]
assert np.all(F.asnumpy(feats) == 0)
# Test write data
new_feats = F.ones((len(nids), 2), F.int32, F.cpu())
g.ndata['test1'][nids] = new_feats
feats = g.ndata['test1'][nids]
assert np.all(F.asnumpy(feats) == 1)
# Test metadata operations.
assert len(g.ndata['features']) == g.number_of_nodes()
assert g.ndata['features'].shape == (g.number_of_nodes(), 1)
assert g.ndata['features'].dtype == F.int64
assert g.node_attr_schemes()['features'].dtype == F.int64
assert g.node_attr_schemes()['test1'].dtype == F.int32
assert g.node_attr_schemes()['features'].shape == (1,)
g.shut_down()
print('end')
def run_server_client():
g = create_random_graph(10000)
# Partition the graph
num_parts = 1
graph_name = 'test'
g.ndata['features'] = F.unsqueeze(F.arange(0, g.number_of_nodes()), 1)
g.edata['features'] = F.unsqueeze(F.arange(0, g.number_of_edges()), 1)
partition_graph(g, graph_name, num_parts, '/tmp')
# let's just test on one partition for now.
# We cannot run multiple servers and clients on the same machine.
barrier = mp.Barrier(2)
serv_ps = []
for serv_id in range(1):
p = Process(target=run_server, args=(graph_name, serv_id, 1, barrier))
serv_ps.append(p)
p.start()
cli_ps = []
for cli_id in range(1):
print('start client', cli_id)
p = Process(target=run_client, args=(graph_name, barrier, g.number_of_nodes(),
g.number_of_edges()))
p.start()
cli_ps.append(p)
for p in cli_ps:
p.join()
print('clients have terminated')
if __name__ == '__main__':
run_server_client()
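The test above runs a single server and a single client on one machine. For a multi-machine deployment, the namebook would follow the format documented in the DistGraph docstring; the sketch below is a hypothetical namebook for two machines, each running one main server and one backup server (group_count = 2). IPs and ports are illustrative.

```
# Format per entry: server_id -> [machine_id, IP, port, group_count]
server_namebook = {
    0: [0, '172.31.40.143', 30050, 2],
    1: [0, '172.31.40.143', 30051, 2],
    2: [1, '172.31.36.140', 30050, 2],
    3: [1, '172.31.36.140', 30051, 2],
}
```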
import dgl
import sys
import numpy as np
from scipy import sparse as spsp
from numpy.testing import assert_array_equal
from dgl.graph_index import create_graph_index
from dgl.distributed import partition_graph, load_partition
import backend as F
import unittest
import pickle
def create_random_graph(n):
arr = (spsp.random(n, n, density=0.001, format='coo') != 0).astype(np.int64)
ig = create_graph_index(arr, readonly=True)
return dgl.DGLGraph(ig)
def test_partition():
g = create_random_graph(10000)
g.ndata['labels'] = F.arange(0, g.number_of_nodes())
g.ndata['feats'] = F.tensor(np.random.randn(g.number_of_nodes(), 10))
num_parts = 4
num_hops = 2
partition_graph(g, 'test', num_parts, '/tmp', num_hops=num_hops, part_method='metis')
for i in range(num_parts):
part_g, node_feats, edge_feats, meta = load_partition('/tmp/test.json', i)
num_nodes, num_edges, node_map, edge_map = meta
# Check the metadata
assert num_nodes == g.number_of_nodes()
assert num_edges == g.number_of_edges()
# Check the node map.
local_nodes = np.nonzero(node_map == i)[0]
part_ids = node_map[F.asnumpy(part_g.ndata[dgl.NID])]
local_nodes1 = F.asnumpy(part_g.ndata[dgl.NID])[part_ids == i]
assert np.all(local_nodes == local_nodes1)
# Check the edge map.
assert np.all(edge_map >= 0)
local_edges = np.nonzero(edge_map == i)[0]
part_ids = edge_map[F.asnumpy(part_g.edata[dgl.EID])]
local_edges1 = F.asnumpy(part_g.edata[dgl.EID])[part_ids == i]
assert np.all(local_edges == np.sort(local_edges1))
for name in ['labels', 'feats']:
assert name in node_feats
assert node_feats[name].shape[0] == len(local_nodes)
assert len(local_nodes) == len(node_feats[name])
assert np.all(F.asnumpy(g.ndata[name][local_nodes]) == F.asnumpy(node_feats[name]))
assert len(edge_feats) == 0
if __name__ == '__main__':
test_partition()
......@@ -4,11 +4,15 @@ import signal
import dgl
from dgl import backend as F
from dgl.data.utils import load_graphs, save_graphs
from dgl.contrib.dist_graph import partition_graph
import pickle
def main():
parser = argparse.ArgumentParser(description='Partition a graph')
parser.add_argument('--data', required=True, type=str,
help='The file path of the input graph in the DGL format.')
parser.add_argument('--graph-name', required=True, type=str,
help='The graph name')
parser.add_argument('-k', '--num-parts', required=True, type=int,
help='The number of partitions')
parser.add_argument('--num-hops', type=int, default=1,
......@@ -18,39 +22,11 @@ def main():
parser.add_argument('-o', '--output', required=True, type=str,
help='The output directory of the partitioned results')
args = parser.parse_args()
data_path = args.data
num_parts = args.num_parts
num_hops = args.num_hops
method = args.method
output = args.output
glist, _ = load_graphs(data_path)
glist, _ = load_graphs(args.data)
g = glist[0]
partition_graph(g, args.graph_name, args.num_parts, args.output,
num_hops=args.num_hops, part_method=args.method)
if args.method == 'metis':
part_dict = dgl.transform.metis_partition(g, num_parts, num_hops)
elif args.method == 'random':
node_parts = np.random.choice(num_parts, g.number_of_nodes())
part_dict = dgl.transform.partition_graph_with_halo(g, node_parts, num_hops)
else:
raise Exception('unknown partitioning method: ' + args.method)
tot_num_inner_edges = 0
for part_id in part_dict:
part = part_dict[part_id]
num_inner_nodes = len(np.nonzero(F.asnumpy(part.ndata['inner_node']))[0])
num_inner_edges = len(np.nonzero(F.asnumpy(part.edata['inner_edge']))[0])
print('part {} has {} nodes and {} edges. {} nodes and {} edges are inside the partition'.format(
part_id, part.number_of_nodes(), part.number_of_edges(),
num_inner_nodes, num_inner_edges))
tot_num_inner_edges += num_inner_edges
# TODO I duplicate some node features.
part.copy_from_parent()
save_graphs(output + '/' + str(part_id) + '.dgl', [part])
print('there are {} edges in the graph and {} edge cuts for {} partitions.'.format(
g.number_of_edges(), g.number_of_edges() - tot_num_inner_edges, len(part_dict)))
if __name__ == '__main__':
main()