Commit 44089c8b authored by Minjie Wang, committed by GitHub

[Refactor][Graph] Merge DGLGraph and DGLHeteroGraph (#1862)



* Merge

* [Graph][CUDA] Graph on GPU and many refactoring (#1791)

* change edge_ids behavior and C++ impl

* fix unittests; remove utils.Index in edge_id

* pass mx and th tests

* pass tf test

* add aten::Scatter_

* Add nonzero; impl CSRGetDataAndIndices/CSRSliceMatrix

* CSRGetData and CSRGetDataAndIndices passed tests

* CSRSliceMatrix basic tests

* fix bug in empty slice

* CUDA CSRHasDuplicate

* has_node; has_edge_between

* predecessors, successors

* deprecate send/recv; fix send_and_recv

* deprecate send/recv; fix send_and_recv

* in_edges; out_edges; all_edges; apply_edges

* in deg/out deg

* subgraph/edge_subgraph

* adj

* in_subgraph/out_subgraph

* sample neighbors

* set/get_n/e_repr

* wip: working on refactoring all idtypes

* pass ndata/edata tests on gpu

* fix

* stash

* workaround nonzero issue

* stash

* nx conversion

* test_hetero_basics except update routines

* test_update_routines

* test_hetero_basics for pytorch

* more fixes

* WIP: flatten graph

* wip: flatten

* test_flatten

* test_to_device

* fix bug in to_homo

* fix bug in CSRSliceMatrix

* pass subgraph test

* fix send_and_recv

* fix filter

* test_heterograph

* passed all pytorch tests

* fix mx unittest

* fix pytorch test_nn

* fix all unittests for PyTorch

* passed all mxnet tests

* lint

* fix tf nn test

* pass all tf tests

* lint

* lint

* change deprecation

* try fix compile

* lint

* update METIDS

* fix utest

* fix

* fix utests

* try debug

* revert

* small fix

* fix utests

* upd

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* trigger

* +1s

* [kernel] Use heterograph index instead of unitgraph index (#1813)

* upd

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* trigger

* +1s

* [Graph] Mutation for Heterograph (#1818)

* mutation add_nodes and add_edges

* Add support for remove_edges, remove_nodes, add_selfloop, remove_selfloop

* Fix
Co-authored-by: Ubuntu <ubuntu@ip-172-31-51-214.ec2.internal>

* upd

* upd

* upd

* fix

* [Transform] Mutable transform (#1833)

* add nodesy

* All three

* Fix

* lint

* Add some test case

* Fix

* Fix

* Fix

* Fix

* Fix

* Fix

* fix

* trigger

* Fix

* fix
Co-authored-by: Ubuntu <ubuntu@ip-172-31-51-214.ec2.internal>

* [Graph] Migrate Batch & Readout module to heterograph (#1836)

* dgl.batch

* unbatch

* fix to device

* reduce readout; segment reduce

* change batch_num_nodes|edges to function

* reduce readout/ softmax

* broadcast

* topk

* fix

* fix tf and mx

* fix some ci

* fix batch but unbatch differently

* new check

* upd

* upd

* upd

* idtype behavior; code reorg

* idtype behavior; code reorg

* wip: test_basics

* pass test_basics

* WIP: from nx/ to nx

* missing files

* upd

* pass test_basics:test_nx_conversion

* Fix test

* Fix inplace update

* WIP: fixing tests

* upd

* pass test_transform cpu

* pass gpu test_transform

* pass test_batched_graph

* GPU graph auto cast to int32

* missing file

* stash

* WIP: rgcn-hetero

* Fix two datasety

* upd

* weird

* Fix capsuley

* fuck you

* fuck matthias

* Fix dgmg

* fix bug in block degrees; pass rgcn-hetero

* rgcn

* gat and diffpool fix
also fix ppi and tu dataset

* Tree LSTM

* pointcloud

* rrn; wip: sgc

* resolve conflicts

* upd

* sgc and reddit dataset

* upd

* Fix deepwalk, gindt and gcn

* fix datasets and sign

* optimization

* optimization

* upd

* upd

* Fix GIN

* fix bug in add_nodes add_edges; tagcn

* adaptive sampling and gcmc

* upd

* upd

* fix geometric

* fix

* metapath2vec

* fix agnn

* fix pickling problem of block

* fix utests

* miss file

* linegraph

* upd

* upd

* upd

* graphsage

* stgcn_wave

* fix hgt

* on unittests

* Fix transformer

* Fix HAN

* passed pytorch unittests

* lint

* fix

* Fix cluster gcn

* cluster-gcn is ready

* on fixing block related codes

* 2nd order derivative

* Revert "2nd order derivative"

This reverts commit 523bf6c249bee61b51b1ad1babf42aad4167f206.

* passed torch utests again

* fix all mxnet unittests

* delete some useless tests

* pass all tf cpu tests

* disable

* disable distributed unittest

* fix

* fix

* lint

* fix

* fix

* fix script

* fix tutorial

* fix apply edges bug

* fix 2 basics

* fix tutorial
Co-authored-by: yzh119 <expye@outlook.com>
Co-authored-by: xiang song(charlie.song) <classicxsong@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-51-214.ec2.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-7-42.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-1-5.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-68-185.ec2.internal>
parent 015acfd2
......@@ -24,7 +24,7 @@ def random_walk(g, seeds, num_traces, num_hops):
Parameters
----------
g : DGLGraph
g : DGLGraphStale
The graph.
seeds : Tensor
The node ID tensor from which the random walk traces start.
......@@ -89,7 +89,7 @@ def random_walk_with_restart(
Parameters
----------
g : DGLGraph
g : DGLGraphStale
The graph.
seeds : Tensor
The node ID tensor from which the random walk traces start.
......@@ -143,7 +143,7 @@ def bipartite_single_sided_random_walk_with_restart(
Parameters
----------
g : DGLGraph
g : DGLGraphStale
The graph.
seeds : Tensor
The node ID tensor from which the random walk traces start.
......
......@@ -12,8 +12,8 @@ from ..._ffi.ndarray import empty
from ... import utils
from ...nodeflow import NodeFlow
from ... import backend as F
from ...graph import DGLGraph
from ...base import NID, EID
from ...graph import DGLGraph as DGLGraphStale
from ...base import NID, EID, dgl_warning
try:
import Queue as queue
......@@ -237,8 +237,8 @@ class NeighborSampler(NodeFlowSampler):
Parameters
----------
g : DGLGraph
The DGLGraph where we sample NodeFlows.
g : DGLGraphStale
The DGLGraphStale where we sample NodeFlows.
batch_size : int
The batch size (i.e., the number of nodes in the last layer)
expand_factor : int
......@@ -314,9 +314,11 @@ class NeighborSampler(NodeFlowSampler):
super(NeighborSampler, self).__init__(
g, batch_size, seed_nodes, shuffle, num_workers * 2 if prefetch else 0,
ThreadPrefetchingWrapper)
dgl_warning('dgl.contrib.sampling.NeighborSampler is deprecated starting from v0.5.'
' Please read our guide<link> for how to use the new sampling APIs.')
assert g.is_readonly, "NeighborSampler doesn't support mutable graphs. " + \
"Please turn it into an immutable graph with DGLGraph.readonly"
"Please turn it into an immutable graph with DGLGraphStale.readonly"
assert isinstance(expand_factor, Integral), 'non-int expand_factor not supported'
self._expand_factor = int(expand_factor)
......@@ -364,8 +366,8 @@ class LayerSampler(NodeFlowSampler):
Parameters
----------
g : DGLGraph
The DGLGraph where we sample NodeFlows.
g : DGLGraphStale
The DGLGraphStale where we sample NodeFlows.
batch_size : int
The batch size (i.e., the number of nodes in the last layer)
layer_size: int
......@@ -413,7 +415,7 @@ class LayerSampler(NodeFlowSampler):
ThreadPrefetchingWrapper)
assert g.is_readonly, "LayerSampler doesn't support mutable graphs. " + \
"Please turn it into an immutable graph with DGLGraph.readonly"
"Please turn it into an immutable graph with DGLGraphStale.readonly"
assert node_prob is None, 'non-uniform node probability not supported'
self._num_workers = int(num_workers)
......@@ -432,7 +434,7 @@ class LayerSampler(NodeFlowSampler):
nflows = [NodeFlow(self.g, obj) for obj in nfobjs]
return nflows
class EdgeSubgraph(DGLGraph):
class EdgeSubgraph(DGLGraphStale):
''' The subgraph sampled from an edge sampler.
A user can access the head nodes and tail nodes of the subgraph directly.
......@@ -551,8 +553,8 @@ class EdgeSampler(object):
Parameters
----------
g : DGLGraph
The DGLGraph where we sample edges.
g : DGLGraphStale
The DGLGraphStale where we sample edges.
batch_size : int
The batch size (i.e., the number of edges from the graph)
seed_edges : tensor, optional
......@@ -785,7 +787,7 @@ def create_full_nodeflow(g, num_layers, add_self_loop=False):
Parameters
----------
g : DGLGraph
g : DGLGraphStale
a DGL graph
num_layers : int
The number of layers
......
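The sampler changes above follow a common deprecation pattern: the legacy class stays importable under a new alias, and the old entry point warns when it is constructed. The following is a minimal sketch of that pattern only, not DGL's implementation. `LegacyGraph`, `LegacyGraphStale`, `LegacySampler`, and `warn_deprecated` are hypothetical names, and the standard `warnings` module stands in for DGL's `dgl_warning`.

```python
import warnings


class LegacyGraph:
    """Hypothetical stand-in for the old graph class."""

    def __init__(self, readonly=False):
        self.is_readonly = readonly


# Alias the legacy class, mirroring `DGLGraph as DGLGraphStale` in the diff.
LegacyGraphStale = LegacyGraph


def warn_deprecated(message):
    """Hypothetical helper; stands in for DGL's dgl_warning."""
    warnings.warn(message, DeprecationWarning, stacklevel=2)


class LegacySampler:
    """Hypothetical sampler: warns on construction, requires a read-only graph."""

    def __init__(self, g):
        warn_deprecated("LegacySampler is deprecated; use the new sampling APIs.")
        assert g.is_readonly, "LegacySampler doesn't support mutable graphs."
        self.g = g
```

Existing callers keep working but see a warning each time they build the sampler, which gives them a migration window before the class is removed.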
"""Module for converting graph from/to other object."""
# pylint: disable=dangerous-default-value
from collections import defaultdict
import numpy as np
import scipy as sp
import networkx as nx
from . import backend as F
......@@ -18,11 +18,20 @@ __all__ = [
'heterograph',
'to_hetero',
'to_homo',
'from_scipy',
'from_networkx',
'to_networkx',
]
def graph(data, ntype='_N', etype='_E', num_nodes=None, card=None, validate=True,
restrict_format='auto', index_dtype="int64", **kwargs):
def graph(data,
ntype='_N', etype='_E',
num_nodes=None,
validate=True,
formats=['coo', 'csr', 'csc'],
idtype=None,
device=None,
card=None,
**deprecated_kwargs):
"""Create a graph with one type of nodes and edges.
In the sparse matrix perspective, :func:`dgl.graph` creates a graph
......@@ -46,28 +55,21 @@ def graph(data, ntype='_N', etype='_E', num_nodes=None, card=None, validate=True
num_nodes : int, optional
Number of nodes in the graph. If None, infer from input data, i.e.
the largest node ID plus 1. (Default: None)
card : int, optional
Deprecated (see :attr:`num_nodes`). Cardinality (number of nodes in the graph).
If None, infer from input data, i.e. the largest node ID plus 1. (Default: None)
validate : bool, optional
If True, check if node ids are within cardinality, the check process may take
some time. (Default: True)
If False and card is not None, user would receive a warning.
restrict_format : 'any', 'coo', 'csr', 'csc', 'auto', optional
Force the storage format. Default: 'auto' (i.e. let DGL decide what to use).
index_dtype : 'int32', 'int64', optional
Force the index data type. Default: 'int64'.
kwargs : key-word arguments, optional
Other key word arguments. Only comes into effect when we are using a NetworkX
graph. It can consist of:
* edge_id_attr_name
``Str``, key name for edge ids in the NetworkX graph. If not found, we
will consider the graph not to have pre-specified edge ids.
* node_attrs
``List of str``, names for node features to retrieve from the NetworkX graph
* edge_attrs
``List of str``, names for edge features to retrieve from the NetworkX graph
formats : str or list of str
Storage formats to force. It can be ``'coo'``, ``'csr'``, ``'csc'``, or a sublist of them.
Default: ``['coo', 'csr', 'csc']``.
idtype : int32, int64, optional
Integer ID type. Valid options are int32 or int64. If None, try to infer it from
the given data.
device : Device context, optional
Device on which the graph is created. Default: infer from data.
card : int, optional
Deprecated (see :attr:`num_nodes`). Cardinality (number of nodes in the graph).
If None, infer from input data, i.e. the largest node ID plus 1. (Default: None)
Returns
-------
......@@ -124,35 +126,41 @@ def graph(data, ntype='_N', etype='_E', num_nodes=None, card=None, validate=True
ndata_schemes={}
edata_schemes={})
"""
if len(deprecated_kwargs) != 0:
raise DGLError("Keyword arguments {} have been removed from dgl.graph()."
" They are moved to dgl.from_scipy() and dgl.from_networkx()."
" Please refer to their API documents for more details.".format(
deprecated_kwargs.keys()))
if isinstance(data, DGLHeteroGraph):
return data.astype(idtype).to(device)
if card is not None:
dgl_warning("Argument 'card' will be deprecated. "
"Please use num_nodes={} instead.".format(card))
num_nodes = card
if num_nodes is not None:
u, v, urange, vrange = utils.graphdata2tensors(data, idtype)
if num_nodes is not None: # override the number of nodes
urange, vrange = num_nodes, num_nodes
else:
urange, vrange = None, None
if isinstance(data, tuple):
u, v = data
return create_from_edges(
u, v, ntype, etype, ntype, urange, vrange, validate,
restrict_format=restrict_format, index_dtype=index_dtype)
elif isinstance(data, list):
return create_from_edge_list(
data, ntype, etype, ntype, urange, vrange, validate,
restrict_format=restrict_format, index_dtype=index_dtype)
elif isinstance(data, sp.sparse.spmatrix):
return create_from_scipy(
data, ntype, etype, ntype, restrict_format=restrict_format, index_dtype=index_dtype)
elif isinstance(data, nx.Graph):
return create_from_networkx(
data, ntype, etype, restrict_format=restrict_format, index_dtype=index_dtype, **kwargs)
else:
raise DGLError('Unsupported graph data type:', type(data))
def bipartite(data, utype='_U', etype='_E', vtype='_V', num_nodes=None, card=None,
validate=True, restrict_format='auto', index_dtype='int64', **kwargs):
g = create_from_edges(u, v, ntype, etype, ntype, urange, vrange,
validate, formats=formats)
if device is None:
return utils.to_int32_graph_if_on_gpu(g)
else:
return g.to(device)
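The argument handling in the new `graph()` signature can be sketched in isolation: removed keyword arguments are caught by a `**deprecated_kwargs` catch-all and rejected, `card` is kept as a deprecated alias for `num_nodes`, and the node count is otherwise inferred as the largest node ID plus one. This is a simplified illustration under stated assumptions. `make_graph` and `GraphError` are hypothetical names, the input is assumed to be a pair of plain Python lists, and a dictionary stands in for the graph object.

```python
class GraphError(Exception):
    """Hypothetical error type standing in for DGLError."""


def make_graph(data, num_nodes=None, card=None, **deprecated_kwargs):
    """Hypothetical constructor illustrating the argument handling only."""
    # Any leftover keyword argument is one that has been removed.
    if deprecated_kwargs:
        raise GraphError(
            "Keyword arguments {} have been removed.".format(sorted(deprecated_kwargs)))
    # `card` is still accepted, but only as a deprecated alias for `num_nodes`.
    if card is not None:
        num_nodes = card
    src, dst = data
    if num_nodes is None:
        # Infer the node count as the largest node ID plus one.
        ids = list(src) + list(dst)
        num_nodes = max(ids) + 1 if ids else 0
    return {"src": list(src), "dst": list(dst), "num_nodes": num_nodes}
```

Failing loudly on removed keywords, rather than silently ignoring them, points callers to the replacement functions at the first call.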
def bipartite(data,
utype='_U', etype='_E', vtype='_V',
num_nodes=None,
validate=True,
formats=['coo', 'csr', 'csc'],
idtype=None,
device=None,
card=None,
**deprecated_kwargs):
"""Create a bipartite graph.
The result graph is directed and edges must be from ``utype`` nodes
......@@ -181,25 +189,22 @@ def bipartite(data, utype='_U', etype='_E', vtype='_V', num_nodes=None, card=Non
num_nodes : 2-tuple of int, optional
Number of nodes in the source and destination group. If None, infer from input data,
i.e. the largest node ID plus 1 for each type. (Default: None)
card : 2-tuple of int, optional
Deprecated (see :attr:`num_nodes`). Cardinality (number of nodes in the source and
destination group). If None, infer from input data, i.e. the largest node ID plus 1
for each type. (Default: None)
validate : bool, optional
If True, check if node ids are within cardinality, the check process may take
some time. (Default: True)
If False and card is not None, user would receive a warning.
restrict_format : 'any', 'coo', 'csr', 'csc', 'auto', optional
Force the storage format. Default: 'auto' (i.e. let DGL decide what to use).
index_dtype : 'int32', 'int64', optional
Force the index data type. Default: 'int64'.
kwargs : key-word arguments, optional
Other key word arguments. Only comes into effect when we are using a NetworkX
graph. It can consist of:
* edge_id_attr_name
``Str``, key name for edge ids in the NetworkX graph. If not found, we
will consider the graph not to have pre-specified edge ids.
formats : str or list of str
Storage formats to force. It can be ``'coo'``, ``'csr'``, ``'csc'``, or a sublist of them.
Default: ``['coo', 'csr', 'csc']``.
idtype : int32, int64, optional
Integer ID type. Valid options are int32 or int64. If None, try to infer it from
the given data.
device : Device context, optional
Device on which the graph is created. Default: infer from data.
card : 2-tuple of int, optional
Deprecated (see :attr:`num_nodes`). Cardinality (number of nodes in the source and
destination group). If None, infer from input data, i.e. the largest node ID plus 1
for each type. (Default: None)
Returns
-------
......@@ -273,34 +278,31 @@ def bipartite(data, utype='_U', etype='_E', vtype='_V', num_nodes=None, card=Non
num_edges={('_U', '_E', '_V'): 3},
metagraph=[('_U', '_V')])
"""
if len(deprecated_kwargs) != 0:
raise DGLError("Keyword arguments {} have been removed from dgl.bipartite()."
" They are moved to dgl.from_scipy() and dgl.from_networkx()."
" Please refer to their API documents for more details.".format(
deprecated_kwargs.keys()))
if utype == vtype:
raise DGLError('utype should not be equal to vtype. Use ``dgl.graph`` instead.')
if card is not None:
dgl_warning("Argument 'card' will be deprecated. "
"Please use num_nodes={} instead.".format(card))
num_nodes = card
if num_nodes is not None:
u, v, urange, vrange = utils.graphdata2tensors(data, idtype, bipartite=True)
if num_nodes is not None: # override the number of nodes
urange, vrange = num_nodes
g = create_from_edges(
u, v, utype, etype, vtype, urange, vrange, validate,
formats=formats)
if device is None:
return utils.to_int32_graph_if_on_gpu(g)
else:
urange, vrange = None, None
if isinstance(data, tuple):
u, v = data
return create_from_edges(
u, v, utype, etype, vtype, urange, vrange, validate, index_dtype=index_dtype,
restrict_format=restrict_format)
elif isinstance(data, list):
return create_from_edge_list(
data, utype, etype, vtype, urange, vrange, validate, index_dtype=index_dtype,
restrict_format=restrict_format)
elif isinstance(data, sp.sparse.spmatrix):
return create_from_scipy(
data, utype, etype, vtype, restrict_format=restrict_format, index_dtype=index_dtype)
elif isinstance(data, nx.Graph):
return create_from_networkx_bipartite(data, utype, etype, vtype,
restrict_format=restrict_format,
index_dtype=index_dtype, **kwargs)
else:
raise DGLError('Unsupported graph data type:', type(data))
return g.to(device)
def hetero_from_relations(rel_graphs, num_nodes_per_type=None):
"""Create a heterograph from graphs representing connections of each relation.
......@@ -370,6 +372,8 @@ def hetero_from_relations(rel_graphs, num_nodes_per_type=None):
('developer', 'develops', 'game'): 2},
metagraph=[('user', 'user'), ('user', 'game'), ('developer', 'game')])
"""
utils.check_all_same_idtype(rel_graphs, 'rel_graphs')
utils.check_all_same_device(rel_graphs, 'rel_graphs')
# TODO(minjie): this API can be generalized as a union operation of the input graphs
# TODO(minjie): handle node/edge data
# infer meta graph
......@@ -392,11 +396,7 @@ def hetero_from_relations(rel_graphs, num_nodes_per_type=None):
ntypes = list(sorted(num_nodes_per_type.keys()))
num_nodes_per_type = utils.toindex([num_nodes_per_type[ntype] for ntype in ntypes], "int64")
ntype_dict = {ntype: i for i, ntype in enumerate(ntypes)}
index_dtype = rel_graphs[0]._idtype_str
for rgrh in rel_graphs:
if rgrh._idtype_str != index_dtype:
raise Exception("Expect relation graphs to be {}, but got {}".format(
index_dtype, rgrh._idtype_str))
stype, etype, dtype = rgrh.canonical_etypes[0]
meta_edges_src.append(ntype_dict[stype])
meta_edges_dst.append(ntype_dict[dtype])
......@@ -414,7 +414,12 @@ def hetero_from_relations(rel_graphs, num_nodes_per_type=None):
retg._edge_frames[i].update(rgrh._edge_frames[0])
return retg
def heterograph(data_dict, num_nodes_dict=None, restrict_format='auto', index_dtype='int64'):
def heterograph(data_dict,
num_nodes_dict=None,
validate=True,
formats=['coo', 'csr', 'csc'],
idtype=None,
device=None):
"""Create a heterogeneous graph from a dictionary between edge types and edge lists.
Parameters
......@@ -432,11 +437,18 @@ def heterograph(data_dict, num_nodes_dict=None, restrict_format='auto', index_dt
By default DGL infers the number of nodes for each node type from ``data_dict``
by taking the maximum node ID plus one for each node type.
restrict_format : 'any', 'coo', 'csr', 'csc', 'auto', optional
Force the storage format. Default: 'auto' (i.e. let DGL decide what to use).
index_dtype : 'int32', 'int64', optional
Force the index data type. Default: 'int64'.
validate : bool, optional
If True, check if node ids are within cardinality, the check process may take
some time. (Default: True)
If False and num_nodes_dict is not None, user would receive a warning.
formats : str or list of str
Storage formats to force. It can be ``'coo'``, ``'csr'``, ``'csc'``, or a sublist of them.
Default: ``['coo', 'csr', 'csc']``.
idtype : int32, int64, optional
Integer ID type. Valid options are int32 or int64. If None, try to infer it from
the given data.
device : Device context, optional
Device on which the graph is created. Default: infer from data.
Returns
-------
......@@ -450,70 +462,42 @@ def heterograph(data_dict, num_nodes_dict=None, restrict_format='auto', index_dt
... ('developer', 'develops', 'game'): [(0, 0), (1, 1)],
... })
"""
rel_graphs = []
# Try infer idtype
if idtype is None:
for data in data_dict.values():
if isinstance(data, tuple) and len(data) == 2 and F.is_tensor(data[0]):
idtype = F.dtype(data[0])
break
# Convert all data to edge tensors first.
data_dict = {(sty, ety, dty) : utils.graphdata2tensors(data, idtype, bipartite=(sty != dty))
for (sty, ety, dty), data in data_dict.items()}
# infer number of nodes for each node type
if num_nodes_dict is None:
num_nodes_dict = defaultdict(int)
for (srctype, etype, dsttype), data in data_dict.items():
if isinstance(data, tuple):
src = utils.toindex(data[0]).tonumpy()
dst = utils.toindex(data[1]).tonumpy()
nsrc = (src.max() + 1) if len(src) > 0 else 0
ndst = (dst.max() + 1) if len(dst) > 0 else 0
elif isinstance(data, list):
if len(data) == 0:
nsrc = ndst = 0
else:
src, dst = zip(*data)
src = utils.toindex(src).tonumpy()
dst = utils.toindex(dst).tonumpy()
nsrc = src.max() + 1
ndst = dst.max() + 1
elif isinstance(data, sp.sparse.spmatrix):
nsrc = data.shape[0]
ndst = data.shape[1]
elif isinstance(data, nx.Graph):
if data.number_of_nodes() == 0:
nsrc = ndst = 0
elif srctype == dsttype:
nsrc = ndst = data.number_of_nodes()
else:
nsrc = len({n for n, d in data.nodes(data=True) if d['bipartite'] == 0})
ndst = data.number_of_nodes() - nsrc
elif isinstance(data, DGLHeteroGraph):
# original node type and edge type of ``data`` is ignored.
assert len(data.canonical_etypes) == 1, \
"Relational graphs must have only one edge type."
srctype, _, dsttype = data.canonical_etypes[0]
nsrc = data.number_of_nodes(srctype)
ndst = data.number_of_nodes(dsttype)
else:
raise DGLError('Unsupported graph data type %s for %s' % (
type(data), (srctype, etype, dsttype)))
if srctype == dsttype:
ndst = nsrc = max(nsrc, ndst)
_, _, nsrc, ndst = data
num_nodes_dict[srctype] = max(num_nodes_dict[srctype], nsrc)
num_nodes_dict[dsttype] = max(num_nodes_dict[dsttype], ndst)
rel_graphs = []
for (srctype, etype, dsttype), data in data_dict.items():
if isinstance(data, DGLHeteroGraph):
rel_graphs.append(data)
elif srctype == dsttype:
u, v, _, _ = data
if srctype == dsttype:
rel_graphs.append(graph(
data, srctype, etype,
(u, v), srctype, etype,
num_nodes=num_nodes_dict[srctype],
validate=False,
restrict_format=restrict_format,
index_dtype=index_dtype))
validate=validate,
formats=formats,
idtype=idtype, device=device))
else:
rel_graphs.append(bipartite(
data, srctype, etype, dsttype,
(u, v), srctype, etype, dsttype,
num_nodes=(num_nodes_dict[srctype], num_nodes_dict[dsttype]),
validate=False,
restrict_format=restrict_format,
index_dtype=index_dtype))
validate=validate,
formats=formats,
idtype=idtype, device=device))
return hetero_from_relations(rel_graphs, num_nodes_dict)
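The node-count inference used by `heterograph()` when `num_nodes_dict` is not given can be illustrated without DGL. The sketch below assumes each relation maps to a plain `(src, dst)` pair of Python lists, whereas the diff works on tensors returned by `utils.graphdata2tensors`. `infer_num_nodes` is a hypothetical helper name.

```python
from collections import defaultdict


def infer_num_nodes(data_dict):
    """Return {node type: count}, where count is the largest node ID plus one."""
    num_nodes = defaultdict(int)
    for (srctype, _etype, dsttype), (src, dst) in data_dict.items():
        nsrc = max(src) + 1 if len(src) > 0 else 0
        ndst = max(dst) + 1 if len(dst) > 0 else 0
        # A node type may appear in several relations; keep the largest count.
        num_nodes[srctype] = max(num_nodes[srctype], nsrc)
        num_nodes[dsttype] = max(num_nodes[dsttype], ndst)
    return dict(num_nodes)


edges = {
    ("user", "follows", "user"): ([0, 1], [1, 2]),
    ("user", "plays", "game"): ([0, 3], [0, 1]),
    ("developer", "develops", "game"): ([0, 1], [0, 1]),
}
counts = infer_num_nodes(edges)
```

Here `user` ends up with four nodes because the `plays` relation references user ID 3, even though `follows` only reaches ID 2.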
......@@ -615,7 +599,8 @@ def to_hetero(G, ntypes, etypes, ntype_field=NTYPE, etype_field=ETYPE,
' type of nodes and edges.')
num_ntypes = len(ntypes)
index_dtype = G._idtype_str
idtype = G.idtype
device = G.device
ntype_ids = F.asnumpy(G.ndata[ntype_field])
etype_ids = F.asnumpy(G.edata[etype_field])
......@@ -669,13 +654,14 @@ def to_hetero(G, ntypes, etypes, ntype_field=NTYPE, etype_field=ETYPE,
if stid == dtid:
rel_graph = graph(
(src_of_etype, dst_of_etype), ntypes[stid], etypes[etid],
num_nodes=ntype_count[stid], validate=False, index_dtype=index_dtype)
num_nodes=ntype_count[stid], validate=False,
idtype=idtype, device=device)
else:
rel_graph = bipartite(
(src_of_etype,
dst_of_etype), ntypes[stid], etypes[etid], ntypes[dtid],
num_nodes=(ntype_count[stid], ntype_count[dtid]),
validate=False, index_dtype=index_dtype)
validate=False, idtype=idtype, device=device)
rel_graphs.append(rel_graph)
hg = hetero_from_relations(rel_graphs,
......@@ -753,8 +739,9 @@ def to_homo(G):
for ntype_id, ntype in enumerate(G.ntypes):
num_nodes = G.number_of_nodes(ntype)
total_num_nodes += num_nodes
# Type ID is always in int64
ntype_ids.append(F.full_1d(num_nodes, ntype_id, F.int64, F.cpu()))
nids.append(F.arange(0, num_nodes, G._idtype_str))
nids.append(F.arange(0, num_nodes, G.idtype))
for etype_id, etype in enumerate(G.canonical_etypes):
srctype, _, dsttype = etype
......@@ -762,11 +749,12 @@ def to_homo(G):
num_edges = len(src)
srcs.append(src + int(offset_per_ntype[G.get_ntype_id(srctype)]))
dsts.append(dst + int(offset_per_ntype[G.get_ntype_id(dsttype)]))
# Type ID is always in int64
etype_ids.append(F.full_1d(num_edges, etype_id, F.int64, F.cpu()))
eids.append(F.arange(0, num_edges, G._idtype_str))
eids.append(F.arange(0, num_edges, G.idtype))
retg = graph((F.cat(srcs, 0), F.cat(dsts, 0)), num_nodes=total_num_nodes,
validate=False, index_dtype=G._idtype_str)
validate=False, idtype=G.idtype, device=G.device)
# copy features
comb_nf = combine_frames(G._node_frames, range(len(G.ntypes)))
......@@ -777,213 +765,90 @@ def to_homo(G):
retg.edata.update(comb_ef)
# assign node type and id mapping field.
retg.ndata[NTYPE] = F.cat(ntype_ids, 0)
retg.ndata[NID] = F.cat(nids, 0)
retg.edata[ETYPE] = F.cat(etype_ids, 0)
retg.edata[EID] = F.cat(eids, 0)
retg.ndata[NTYPE] = F.copy_to(F.cat(ntype_ids, 0), G.device)
retg.ndata[NID] = F.copy_to(F.cat(nids, 0), G.device)
retg.edata[ETYPE] = F.copy_to(F.cat(etype_ids, 0), G.device)
retg.edata[EID] = F.copy_to(F.cat(eids, 0), G.device)
return retg
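The ID remapping that `to_homo` performs, shifting each node type's IDs by a per-type offset so that all nodes share one contiguous ID space, can be sketched with plain lists. This is an illustration only. `flatten_node_ids` is a hypothetical name, and the real function operates on backend tensors and also copies features.

```python
def flatten_node_ids(num_nodes_per_type, edges_per_type):
    """Map per-type node IDs to one contiguous ID space.

    num_nodes_per_type: list of node counts, indexed by node type ID.
    edges_per_type: list of (src_type_id, dst_type_id, src_ids, dst_ids),
        indexed by edge type ID.
    """
    # offsets[t] is the first homogeneous ID assigned to node type t.
    offsets = [0]
    for n in num_nodes_per_type:
        offsets.append(offsets[-1] + n)

    srcs, dsts, etype_ids = [], [], []
    for etype_id, (stid, dtid, src, dst) in enumerate(edges_per_type):
        srcs.extend(s + offsets[stid] for s in src)
        dsts.extend(d + offsets[dtid] for d in dst)
        # Record which edge type each flattened edge came from.
        etype_ids.extend([etype_id] * len(src))
    return srcs, dsts, etype_ids, offsets[-1]
```

The recorded type IDs play the role of the `ETYPE` field in the diff: together with the offsets they allow the original heterogeneous graph to be reconstructed later.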
############################################################
# Internal APIs
############################################################
def create_from_edges(u, v, utype, etype, vtype, urange=None, vrange=None, validate=True,
restrict_format="auto", index_dtype='int64'):
"""Internal function to create a graph from incident nodes with types.
utype could be equal to vtype
def from_scipy(sp_mat,
ntype='_N', etype='_E',
eweight_name=None,
formats=['coo', 'csr', 'csc'],
idtype=None):
"""Create a DGLGraph from a SciPy sparse matrix.
Parameters
----------
u : iterable of int
List of source node IDs.
v : iterable of int
List of destination node IDs.
utype : str
Source node type name.
etype : str
Edge type name.
vtype : str
Destination node type name.
urange : int, optional
The source node ID range. If None, the value is the maximum
of the source node IDs in the edge list plus 1. (Default: None)
vrange : int, optional
The destination node ID range. If None, the value is the
maximum of the destination node IDs in the edge list plus 1. (Default: None)
validate : bool, optional
If True, checks if node IDs are within range.
restrict_format : 'any', 'coo', 'csr', 'csc', 'auto', optional
Force the storage format. Default: 'auto' (i.e. let DGL decide what to use).
index_dtype : 'int32', 'int64', optional
Force the index data type. Default: 'int64'.
Returns
-------
DGLHeteroGraph
"""
u = utils.toindex(u, index_dtype)
v = utils.toindex(v, index_dtype)
if validate:
if urange is not None and len(u) > 0 and \
urange <= int(F.asnumpy(F.max(u.tousertensor(), dim=0))):
raise DGLError('Invalid node id {} (should be less than cardinality {}).'.format(
urange, int(F.asnumpy(F.max(u.tousertensor(), dim=0)))))
if vrange is not None and len(v) > 0 and \
vrange <= int(F.asnumpy(F.max(v.tousertensor(), dim=0))):
raise DGLError('Invalid node id {} (should be less than cardinality {}).'.format(
vrange, int(F.asnumpy(F.max(v.tousertensor(), dim=0)))))
urange = urange or (
0 if len(u) == 0 else (int(F.asnumpy(F.max(u.tousertensor(), dim=0))) + 1))
vrange = vrange or (
0 if len(v) == 0 else (int(F.asnumpy(F.max(v.tousertensor(), dim=0))) + 1))
if utype == vtype:
urange = vrange = max(urange, vrange)
num_ntypes = 1
else:
num_ntypes = 2
hgidx = heterograph_index.create_unitgraph_from_coo(
num_ntypes, urange, vrange, u, v, restrict_format)
if utype == vtype:
return DGLHeteroGraph(hgidx, [utype], [etype])
else:
return DGLHeteroGraph(hgidx, [utype, vtype], [etype])
def create_from_edge_list(elist, utype, etype, vtype, urange=None, vrange=None,
validate=True, restrict_format='auto', index_dtype='int64'):
"""Internal function to create a heterograph from a list of edge tuples with types.
utype could be equal to vtype
Parameters
----------
elist : iterable of int pairs
List of (src, dst) node ID pairs.
utype : str
Source node type name.
etype : str
Edge type name.
vtype : str
Destination node type name.
urange : int, optional
The source node ID range. If None, the value is the maximum
of the source node IDs in the edge list plus 1. (Default: None)
vrange : int, optional
The destination node ID range. If None, the value is the
maximum of the destination node IDs in the edge list plus 1. (Default: None)
validate : bool, optional
If True, checks if node IDs are within range.
restrict_format : 'any', 'coo', 'csr', 'csc', 'auto', optional
Force the storage format. Default: 'auto' (i.e. let DGL decide what to use).
index_dtype : 'int32', 'int64', optional
Force the index data type. Default: 'int64'.
Returns
-------
DGLHeteroGraph
"""
if len(elist) == 0:
u, v = [], []
else:
u, v = zip(*elist)
u = list(u)
v = list(v)
return create_from_edges(u, v, utype, etype, vtype, urange, vrange,
validate, restrict_format, index_dtype=index_dtype)
def create_from_scipy(spmat, utype, etype, vtype, with_edge_id=False,
restrict_format='auto', index_dtype='int64'):
"""Internal function to create a heterograph from a scipy sparse matrix with types.
Parameters
----------
spmat : scipy.sparse.spmatrix
The adjacency matrix whose rows represent sources and columns
represent destinations.
utype : str
Source node type name.
sp_mat : SciPy sparse matrix
SciPy sparse matrix.
ntype : str
Type name for both source and destination nodes
etype : str
Edge type name.
vtype : str
Destination node type name.
with_edge_id : bool
If True, the entries in the sparse matrix are treated as edge IDs.
Otherwise, the entries are ignored and edges will be added in
(source, destination) order.
Note that this option only affects CSR matrices; COO matrices' rows and cols
are always assumed to be ordered by edge ID already.
validate : bool, optional
If True, checks if node IDs are within range.
restrict_format : 'any', 'coo', 'csr', 'csc', 'auto', optional
Force the storage format. Default: 'auto' (i.e. let DGL decide what to use).
index_dtype : 'int32', 'int64', optional
Force the index data type. Default: 'int64'.
Type name for edges
eweight_name : str, optional
If given, the edge weights in the matrix will be
stored in ``edata[eweight_name]``.
formats : str or list of str
Storage formats to force. It can be ``'coo'``, ``'csr'``, ``'csc'``, or a sublist of them.
Default: ``['coo', 'csr', 'csc']``.
idtype : int32, int64, optional
Integer ID type. Must be int32 or int64. Default: int64.
Returns
-------
DGLHeteroGraph
g : DGLGraph
"""
num_src, num_dst = spmat.shape
num_ntypes = 1 if utype == vtype else 2
if spmat.getformat() == 'coo':
row = utils.toindex(spmat.row.astype(index_dtype), index_dtype)
col = utils.toindex(spmat.col.astype(index_dtype), index_dtype)
hgidx = heterograph_index.create_unitgraph_from_coo(
num_ntypes, num_src, num_dst, row, col, restrict_format)
else:
spmat = spmat.tocsr()
indptr = utils.toindex(spmat.indptr.astype(index_dtype), index_dtype)
indices = utils.toindex(spmat.indices.astype(index_dtype), index_dtype)
# TODO(minjie): with_edge_id is only reasonable for csr matrix. How to fix?
data = utils.toindex(spmat.data if with_edge_id else list(range(len(indices))), index_dtype)
hgidx = heterograph_index.create_unitgraph_from_csr(
num_ntypes, num_src, num_dst, indptr, indices, data, restrict_format)
if num_ntypes == 1:
return DGLHeteroGraph(hgidx, [utype], [etype])
else:
return DGLHeteroGraph(hgidx, [utype, vtype], [etype])
u, v, urange, vrange = utils.graphdata2tensors(sp_mat, idtype)
g = create_from_edges(u, v, ntype, etype, ntype, urange, vrange,
validate=False, formats=formats)
if eweight_name is not None:
g.edata[eweight_name] = F.tensor(sp_mat.data)
return g
def create_from_networkx(nx_graph,
ntype, etype,
edge_id_attr_name='id',
node_attrs=None,
edge_attrs=None,
restrict_format='auto',
index_dtype='int64'):
"""Create a heterograph that has only one set of nodes and edges.
def from_networkx(nx_graph, *,
ntype='_N', etype='_E',
node_attrs=None,
edge_attrs=None,
edge_id_attr_name='id',
formats=['coo', 'csr', 'csc'],
idtype=None):
"""Create a DGLGraph from networkx.
Parameters
----------
nx_graph : NetworkX graph
nx_graph : networkx.Graph
NetworkX graph.
ntype : str
Type name for both source and destination nodes
etype : str
Type name for edges
edge_id_attr_name : str, optional
Key name for edge ids in the NetworkX graph. If not found, we
will consider the graph not to have pre-specified edge ids. (Default: 'id')
node_attrs : list of str
Names for node features to retrieve from the NetworkX graph (Default: None)
edge_attrs : list of str
Names for edge features to retrieve from the NetworkX graph (Default: None)
restrict_format : 'any', 'coo', 'csr', 'csc', 'auto', optional
Force the storage format. Default: 'auto' (i.e. let DGL decide what to use).
index_dtype : 'int32', 'int64', optional
Force the index data type. Default: 'int64'.
edge_id_attr_name : str, optional
Key name for edge ids in the NetworkX graph. If not found, we
will consider the graph not to have pre-specified edge ids. (Default: 'id')
formats : str or list of str
Storage formats to force. It can be ``'coo'``/``'csr'``/``'csc'`` or a sublist of them.
Default: ``['coo', 'csr', 'csc']``.
idtype : int32, int64, optional
Integer ID type. Must be int32 or int64. Default: int64.
Returns
-------
g : DGLHeteroGraph
g : DGLGraph
"""
# Relabel nodes using consecutive integers
nx_graph = nx.convert_node_labels_to_integers(nx_graph, ordering='sorted')
if not nx_graph.is_directed():
nx_graph = nx_graph.to_directed()
# Relabel nodes using consecutive integers
nx_graph = nx.convert_node_labels_to_integers(nx_graph, ordering='sorted')
g = graph(nx_graph, ntype, etype,
formats=formats,
idtype=idtype)
# nx_graph.edges(data=True) returns src, dst, attr_dict
if nx_graph.number_of_edges() > 0:
......@@ -991,26 +856,6 @@ def create_from_networkx(nx_graph,
else:
has_edge_id = False
if has_edge_id:
num_edges = nx_graph.number_of_edges()
src = np.zeros((num_edges,), dtype=getattr(np, index_dtype))
dst = np.zeros((num_edges,), dtype=getattr(np, index_dtype))
for u, v, attr in nx_graph.edges(data=True):
eid = attr[edge_id_attr_name]
src[eid] = u
dst[eid] = v
else:
src = []
dst = []
for e in nx_graph.edges:
src.append(e[0])
dst.append(e[1])
src = utils.toindex(src, index_dtype)
dst = utils.toindex(dst, index_dtype)
num_nodes = nx_graph.number_of_nodes()
g = create_from_edges(src, dst, ntype, etype, ntype, num_nodes, num_nodes,
validate=False, restrict_format=restrict_format, index_dtype=index_dtype)
# handle features
# copy attributes
def _batcher(lst):
......@@ -1025,7 +870,7 @@ def create_from_networkx(nx_graph,
for attr in node_attrs:
attr_dict[attr].append(nx_graph.nodes[nid][attr])
for attr in node_attrs:
g.ndata[attr] = _batcher(attr_dict[attr])
g.ndata[attr] = F.copy_to(_batcher(attr_dict[attr]), g.device)
if edge_attrs is not None:
# mapping from feature name to a list of tensors to be concatenated
......@@ -1041,7 +886,7 @@ def create_from_networkx(nx_graph,
' smaller than the number of edges --'
' {}, got {}.'.format(num_edges, attrs['id']))
for key in edge_attrs:
attr_dict[key][attrs['id']] = attrs[key]
attr_dict[key][attrs[edge_id_attr_name]] = attrs[key]
else:
# XXX: assuming networkx iteration order is deterministic
# so the order is the same as graph_index.from_networkx
......@@ -1052,90 +897,10 @@ def create_from_networkx(nx_graph,
for val in attr_dict[attr]:
if val is None:
raise DGLError('Not all edges have attribute {}.'.format(attr))
g.edata[attr] = _batcher(attr_dict[attr])
g.edata[attr] = F.copy_to(_batcher(attr_dict[attr]), g.device)
return g
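The edge-ordering rule used by the removed conversion code can be sketched without NetworkX or DGL. This is a minimal illustration, assuming edges are given as a plain list of ``(u, v, attr_dict)`` triples; ``order_edges`` is a hypothetical helper, not a DGL function. If the first edge carries the ID attribute, every edge is placed at the position named by its ID; otherwise iteration order is kept.

```python
def order_edges(edges, edge_id_attr_name='id'):
    """Return (src, dst) lists for a list of (u, v, attr_dict) edges.

    Hypothetical helper for illustration only; not part of DGL.
    """
    # Presence of the ID attribute is decided from the first edge only.
    has_edge_id = len(edges) > 0 and edge_id_attr_name in edges[0][-1]
    if has_edge_id:
        src = [0] * len(edges)
        dst = [0] * len(edges)
        for u, v, attr in edges:
            eid = attr[edge_id_attr_name]
            # The edge ID is the position of the edge in the output.
            src[eid] = u
            dst[eid] = v
    else:
        # No pre-specified IDs: keep the iteration order.
        src = [u for u, _, _ in edges]
        dst = [v for _, v, _ in edges]
    return src, dst
```

Because only the first edge is inspected, a graph in which only some edges carry the attribute would fail with a ``KeyError`` in this sketch.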
def create_from_networkx_bipartite(nx_graph,
utype, etype, vtype,
edge_id_attr_name='id',
node_attrs=None,
edge_attrs=None,
restrict_format='auto',
index_dtype='int64'):
"""Create a heterograph that has one set of source nodes, one set of
destination nodes and one set of edges.
Parameters
----------
nx_graph : NetworkX graph
The input graph must follow the bipartite graph convention of networkx.
Each node has an attribute ``bipartite`` with values 0 and 1 indicating
which set it belongs to. Only edges from node set 0 to node set 1 are
added to the returned graph.
utype : str
Source node type name.
etype : str
Edge type name.
vtype : str
Destination node type name.
edge_id_attr_name : str, optional
Key name for edge ids in the NetworkX graph. If not found, we
will consider the graph not to have pre-specified edge ids. (Default: 'id')
node_attrs : list of str
Names for node features to retrieve from the NetworkX graph (Default: None)
edge_attrs : list of str
Names for edge features to retrieve from the NetworkX graph (Default: None)
restrict_format : 'any', 'coo', 'csr', 'csc', 'auto' optional
Force the storage format. Default: 'auto' (i.e. let DGL decide what to use).
index_dtype : 'int32', 'int64', optional
Force the index data type. Default: 'int64'.
Returns
-------
g : DGLHeteroGraph
"""
if not nx_graph.is_directed():
nx_graph = nx_graph.to_directed()
top_nodes = {n for n, d in nx_graph.nodes(data=True) if d['bipartite'] == 0}
bottom_nodes = set(nx_graph) - top_nodes
top_nodes = sorted(top_nodes)
bottom_nodes = sorted(bottom_nodes)
top_map = {n : i for i, n in enumerate(top_nodes)}
bottom_map = {n : i for i, n in enumerate(bottom_nodes)}
if nx_graph.number_of_edges() > 0:
has_edge_id = edge_id_attr_name in next(iter(nx_graph.edges(data=True)))[-1]
else:
has_edge_id = False
if has_edge_id:
num_edges = nx_graph.number_of_edges()
src = np.zeros((num_edges,), dtype=getattr(np, index_dtype))
dst = np.zeros((num_edges,), dtype=getattr(np, index_dtype))
for u, v, attr in nx_graph.edges(data=True):
eid = attr[edge_id_attr_name]
src[eid] = top_map[u]
dst[eid] = bottom_map[v]
else:
src = []
dst = []
for e in nx_graph.edges:
if e[0] in top_map:
src.append(top_map[e[0]])
dst.append(bottom_map[e[1]])
src = utils.toindex(src, index_dtype)
dst = utils.toindex(dst, index_dtype)
g = create_from_edges(src, dst, utype, etype, vtype,
len(top_nodes), len(bottom_nodes), validate=False,
restrict_format=restrict_format, index_dtype=index_dtype)
# TODO attributes
assert node_attrs is None, 'Retrieval of node attributes is not supported yet.'
assert edge_attrs is None, 'Retrieval of edge attributes is not supported yet.'
return g
def to_networkx(g, node_attrs=None, edge_attrs=None):
"""Convert to networkx graph.
......@@ -1156,4 +921,91 @@ def to_networkx(g, node_attrs=None, edge_attrs=None):
networkx.DiGraph
The nx graph
"""
return g.to_networkx(node_attrs, edge_attrs)
if g.device != F.cpu():
raise DGLError('Cannot convert a CUDA graph to networkx. Call g.cpu() first.')
if not g.is_homogeneous():
raise DGLError('dgl.to_networkx only supports homogeneous graphs.')
src, dst = g.edges()
src = F.asnumpy(src)
dst = F.asnumpy(dst)
# xiangsx: Always treat graph as multigraph
nx_graph = nx.MultiDiGraph()
nx_graph.add_nodes_from(range(g.number_of_nodes()))
for eid, (u, v) in enumerate(zip(src, dst)):
nx_graph.add_edge(u, v, id=eid)
if node_attrs is not None:
for nid, attr in nx_graph.nodes(data=True):
feat_dict = g._get_n_repr(0, nid)
attr.update({key: F.squeeze(feat_dict[key], 0) for key in node_attrs})
if edge_attrs is not None:
for _, _, attr in nx_graph.edges(data=True):
eid = attr['id']
feat_dict = g._get_e_repr(0, eid)
attr.update({key: F.squeeze(feat_dict[key], 0) for key in edge_attrs})
return nx_graph
DGLHeteroGraph.to_networkx = to_networkx
############################################################
# Internal APIs
############################################################
def create_from_edges(u, v,
utype, etype, vtype,
urange, vrange,
validate=True,
formats=['coo', 'csr', 'csc']):
"""Internal function to create a graph from incident nodes with types.
utype could be equal to vtype
Parameters
----------
u : Tensor
Source node IDs.
v : Tensor
Dest node IDs.
utype : str
Source node type name.
etype : str
Edge type name.
vtype : str
Destination node type name.
urange : int, optional
The source node ID range. If None, the value is the maximum
of the source node IDs in the edge list plus 1. (Default: None)
vrange : int, optional
The destination node ID range. If None, the value is the
maximum of the destination node IDs in the edge list plus 1. (Default: None)
validate : bool, optional
If True, checks if node IDs are within range.
formats : str or list of str
Storage formats to force. It can be ``'coo'``/``'csr'``/``'csc'`` or a sublist of them.
Default: ``['coo', 'csr', 'csc']``.
Returns
-------
DGLHeteroGraph
"""
if validate:
if urange is not None and len(u) > 0 and \
urange <= F.as_scalar(F.max(u, dim=0)):
raise DGLError('Invalid node id {} (should be less than cardinality {}).'.format(
F.as_scalar(F.max(u, dim=0)), urange))
if vrange is not None and len(v) > 0 and \
vrange <= F.as_scalar(F.max(v, dim=0)):
raise DGLError('Invalid node id {} (should be less than cardinality {}).'.format(
F.as_scalar(F.max(v, dim=0)), vrange))
if utype == vtype:
num_ntypes = 1
else:
num_ntypes = 2
hgidx = heterograph_index.create_unitgraph_from_coo(
num_ntypes, urange, vrange, u, v, formats)
if utype == vtype:
return DGLHeteroGraph(hgidx, [utype], [etype])
else:
return DGLHeteroGraph(hgidx, [utype, vtype], [etype])
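The range check and the "maximum ID plus 1" default described in the docstring can be sketched in plain Python. This is an illustration under assumptions: IDs are ordinary Python lists rather than backend tensors, and ``check_node_ids`` and ``infer_range`` are hypothetical names, not DGL functions. The message reports the offending ID first and the range second.

```python
def check_node_ids(ids, id_range):
    """Raise ValueError if any ID in `ids` is not smaller than `id_range`.

    Hypothetical helper for illustration only; not part of DGL.
    """
    if id_range is not None and len(ids) > 0:
        max_id = max(ids)
        if id_range <= max_id:
            raise ValueError(
                'Invalid node id {} (should be less than cardinality {}).'.format(
                    max_id, id_range))


def infer_range(ids, id_range=None):
    """Return `id_range`, or the maximum ID plus one when it is None.

    An empty ID list with no explicit range yields 0 (an assumption of
    this sketch).
    """
    if id_range is not None:
        return id_range
    return max(ids) + 1 if len(ids) > 0 else 0
```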
......@@ -13,8 +13,9 @@ import os, sys
from .utils import download, extract_archive, get_download_dir, _get_dgl_url
from ..utils import retry_method_with_fix
from ..graph import DGLGraph
from ..graph import batch as graph_batch
from .. import convert
from .. import batch
from .. import backend as F
_urls = {
'cora_v2' : 'dataset/cora_v2.zip',
......@@ -133,12 +134,12 @@ class CitationGraphDataset(object):
def __getitem__(self, idx):
assert idx == 0, "This dataset has only one graph"
g = DGLGraph(self.graph)
g.ndata['train_mask'] = self.train_mask
g.ndata['val_mask'] = self.val_mask
g.ndata['test_mask'] = self.test_mask
g.ndata['label'] = self.labels
g.ndata['feat'] = self.features
g = convert.graph(self.graph)
g.ndata['train_mask'] = F.tensor(self.train_mask, F.bool)
g.ndata['val_mask'] = F.tensor(self.val_mask, F.bool)
g.ndata['test_mask'] = F.tensor(self.test_mask, F.bool)
g.ndata['label'] = F.tensor(self.labels, F.int64)
g.ndata['feat'] = F.tensor(self.features, F.float32)
return g
def __len__(self):
......@@ -327,13 +328,13 @@ class CoraBinary(object):
for line in f.readlines():
if line.startswith('graph'):
if len(elist) != 0:
self.graphs.append(DGLGraph(elist))
self.graphs.append(convert.graph(elist))
elist = []
else:
u, v = line.strip().split(' ')
elist.append((int(u), int(v)))
if len(elist) != 0:
self.graphs.append(DGLGraph(elist))
self.graphs.append(convert.graph(elist))
with open("{}/pmpds.pkl".format(root), 'rb') as f:
self.pmpds = _pickle_load(f)
self.labels = []
......@@ -359,9 +360,9 @@ class CoraBinary(object):
return (self.graphs[i], self.pmpds[i], self.labels[i])
@staticmethod
def collate_fn(batch):
graphs, pmpds, labels = zip(*batch)
batched_graphs = graph_batch(graphs)
def collate_fn(cur):
graphs, pmpds, labels = zip(*cur)
batched_graphs = batch.batch(graphs)
batched_pmpds = sp.block_diag(pmpds)
batched_labels = np.concatenate(labels, axis=0)
return batched_graphs, batched_pmpds, batched_labels
......
......@@ -14,7 +14,8 @@ from .. import backend as F
from .utils import download, extract_archive, get_download_dir, _get_dgl_url
from ..utils import retry_method_with_fix
from ..graph import DGLGraph
from ..convert import graph
from .. import backend as F
_url = 'https://raw.githubusercontent.com/weihua916/powerful-gnns/master/dataset.zip'
......@@ -23,10 +24,10 @@ class GINDataset(object):
"""Datasets for Graph Isomorphism Network (GIN)
Adapted from https://github.com/weihua916/powerful-gnns/blob/master/dataset.zip.
The dataset contains the compact format of popular graph kernel datasets, which includes:
The dataset contains the compact format of popular graph kernel datasets, which includes:
MUTAG, COLLAB, IMDBBINARY, IMDBMULTI, NCI1, PROTEINS, PTC, REDDITBINARY, REDDITMULTI5K
This dataset class processes all data sets listed above. For more graph kernel datasets,
This dataset class processes all data sets listed above. For more graph kernel datasets,
see :class:`TUDataset`
Parameters
......@@ -144,7 +145,7 @@ class GINDataset(object):
self.labels.append(self.glabel_dict[glabel])
g = DGLGraph()
g = graph([])
g.add_nodes(n_nodes)
nlabels = [] # node labels
......@@ -178,30 +179,30 @@ class GINDataset(object):
m_edges += nrow[1]
g.add_edges(j, nrow[2:])
# add self loop
if self.self_loop:
m_edges += 1
g.add_edge(j, j)
if (j + 1) % 10 == 0 and self.verbosity is True:
print(
'processing node {} of graph {}...'.format(
j + 1, i + 1))
print('this node has {} edges.'.format(
nrow[1]))
# Add self loops
if self.self_loop:
m_edges += n_nodes
g.add_edges(F.arange(0, n_nodes), F.arange(0, n_nodes))
if nattrs != []:
nattrs = np.stack(nattrs)
g.ndata['attr'] = nattrs
g.ndata['attr'] = F.tensor(nattrs)
self.nattrs_flag = True
else:
nattrs = None
g.ndata['label'] = np.asarray(nlabels)
g.ndata['label'] = F.tensor(np.asarray(nlabels))
if len(self.nlabel_dict) > 1:
self.nlabels_flag = True
assert len(g) == n_nodes
assert g.number_of_nodes() == n_nodes
# update statistics of graphs
self.n += n_nodes
......@@ -238,8 +239,8 @@ class GINDataset(object):
label2idx = self.nlabel_dict
for g in self.graphs:
g.ndata['attr'] = np.zeros((
g.number_of_nodes(), len(label2idx)))
g.ndata['attr'] = F.tensor(np.zeros((
g.number_of_nodes(), len(label2idx))))
g.ndata['attr'][range(g.number_of_nodes()), [label2idx[F.as_scalar(nl)] for nl in g.ndata['label']]] = 1
# after load, get the #classes and #dim
......
......@@ -3,7 +3,7 @@ import math
import networkx as nx
import numpy as np
from ..graph import DGLGraph
from .. import convert
__all__ = ['MiniGCDataset']
......@@ -77,7 +77,7 @@ class MiniGCDataset(object):
self._gen_circular_ladder(self.num_graphs - len(self.graphs))
# preprocess
for i in range(self.num_graphs):
self.graphs[i] = DGLGraph(self.graphs[i])
self.graphs[i] = convert.graph(self.graphs[i])
# add self edges
nodes = self.graphs[i].nodes()
self.graphs[i].add_edges(nodes, nodes)
......
......@@ -8,7 +8,7 @@ from networkx.readwrite import json_graph
from .utils import download, extract_archive, get_download_dir, _get_dgl_url
from ..utils import retry_method_with_fix
from ..graph import DGLGraph
from ..convert import from_networkx
_url = 'dataset/ppi.zip'
......@@ -54,13 +54,13 @@ class PPIDataset(object):
numpy.ndarray object, its shape is [n, v],
n is the number of nodes, v is the feature's dimension;
train/test/valid_labels.npy=> the labels of the input nodes, it
is a numpy ndarray, it's like [[0, 0, 1, ... 0],
is a numpy ndarray, it's like [[0, 0, 1, ... 0],
[0, 1, 1, 0 ...1]], shape of it is n*h, n is the number of nodes,
h is the label's dimension;
train/test/valid/_graph_id.npy => the element in it indicates which
graph the nodes belong to, it is a one dimensional numpy.ndarray
object and the length of it is equal to the number of nodes,
it's like [1, 1, 2, 1...20].
it's like [1, 1, 2, 1...20].
"""
print('Loading G...')
if self.mode == 'train':
......@@ -68,21 +68,21 @@ class PPIDataset(object):
g_data = json.load(jsonfile)
self.labels = np.load('{}/ppi/train_labels.npy'.format(self._dir))
self.features = np.load('{}/ppi/train_feats.npy'.format(self._dir))
self.graph = DGLGraph(nx.DiGraph(json_graph.node_link_graph(g_data)))
self.graph = from_networkx(nx.DiGraph(json_graph.node_link_graph(g_data)))
self.graph_id = np.load('{}/ppi/train_graph_id.npy'.format(self._dir))
if self.mode == 'valid':
with open('{}/ppi/valid_graph.json'.format(self._dir)) as jsonfile:
g_data = json.load(jsonfile)
self.labels = np.load('{}/ppi/valid_labels.npy'.format(self._dir))
self.features = np.load('{}/ppi/valid_feats.npy'.format(self._dir))
self.graph = DGLGraph(nx.DiGraph(json_graph.node_link_graph(g_data)))
self.graph = from_networkx(nx.DiGraph(json_graph.node_link_graph(g_data)))
self.graph_id = np.load('{}/ppi/valid_graph_id.npy'.format(self._dir))
if self.mode == 'test':
with open('{}/ppi/test_graph.json'.format(self._dir)) as jsonfile:
g_data = json.load(jsonfile)
self.labels = np.load('{}/ppi/test_labels.npy'.format(self._dir))
self.features = np.load('{}/ppi/test_feats.npy'.format(self._dir))
self.graph = DGLGraph(nx.DiGraph(json_graph.node_link_graph(g_data)))
self.graph = from_networkx(nx.DiGraph(json_graph.node_link_graph(g_data)))
self.graph_id = np.load('{}/ppi/test_graph_id.npy'.format(self._dir))
def _preprocess(self):
......
......@@ -5,8 +5,8 @@ import numpy as np
import os, sys
from .utils import download, extract_archive, get_download_dir, _get_dgl_url
from ..utils import retry_method_with_fix
from ..graph import DGLGraph
from .. import backend as F
from .. import convert
class RedditDataset(object):
def __init__(self, self_loop=False):
......@@ -31,14 +31,13 @@ class RedditDataset(object):
# graph
coo_adj = sp.load_npz(os.path.join(
self._extract_dir, "reddit{}_graph.npz".format(self._self_loop_str)))
self.graph = DGLGraph(coo_adj, readonly=True)
self.graph = convert.graph(coo_adj)
# features and labels
reddit_data = np.load(os.path.join(self._extract_dir, "reddit_data.npz"))
self.features = reddit_data["feature"]
self.labels = reddit_data["label"]
self.num_labels = 41
# train/val/test indices
node_ids = reddit_data["node_ids"]
node_types = reddit_data["node_types"]
self.train_mask = (node_types == 1)
self.val_mask = (node_types == 2)
......@@ -55,13 +54,12 @@ class RedditDataset(object):
def __getitem__(self, idx):
assert idx == 0, "Reddit Dataset only has one graph"
g = self.graph
g.ndata['train_mask'] = self.train_mask
g.ndata['val_mask'] = self.val_mask
g.ndata['test_mask'] = self.test_mask
g.ndata['feat'] = self.features
g.ndata['label'] = self.labels
return g
self.graph.ndata['train_mask'] = F.tensor(self.train_mask, dtype=F.bool)
self.graph.ndata['val_mask'] = F.tensor(self.val_mask, dtype=F.bool)
self.graph.ndata['test_mask'] = F.tensor(self.test_mask, dtype=F.bool)
self.graph.ndata['feat'] = F.tensor(self.features, dtype=F.float32)
self.graph.ndata['label'] = F.tensor(self.labels, dtype=F.int64)
return self.graph
def __len__(self):
return 1
......@@ -6,8 +6,8 @@ import numpy as np
import numpy.random as npr
import scipy as sp
from ..graph import DGLGraph, batch
from ..utils import Index
from .. import convert
from .. import batch
def sbm(n_blocks, block_size, p, q, rng=None):
""" (Symmetric) Stochastic Block Model
......@@ -77,7 +77,6 @@ class SBMMixture:
block_size = n_nodes // n_communities
self._k = k
self._avg_deg = avg_deg
self._gs = [DGLGraph() for i in range(n_graphs)]
if type(pq) is list:
assert len(pq) == n_graphs
elif type(pq) is str:
......@@ -85,14 +84,10 @@ class SBMMixture:
pq = [generator() for i in range(n_graphs)]
else:
raise RuntimeError()
adjs = [sbm(n_communities, block_size, *x) for x in pq]
for g, adj in zip(self._gs, adjs):
g.from_scipy_sparse_matrix(adj)
self._gs = [convert.graph(sbm(n_communities, block_size, *x)) for x in pq]
self._lgs = [g.line_graph(backtracking=False) for g in self._gs]
in_degrees = lambda g: g.in_degrees(
Index(np.arange(0, g.number_of_nodes()))).unsqueeze(1).float()
self._g_degs = [in_degrees(g) for g in self._gs]
self._lg_degs = [in_degrees(lg) for lg in self._lgs]
self._g_degs = [g.in_degrees().float() for g in self._gs]
self._lg_degs = [lg.in_degrees().float() for lg in self._lgs]
self._pm_pds = list(zip(*[g.edges() for g in self._gs]))[0]
def __len__(self):
......@@ -112,8 +107,8 @@ class SBMMixture:
def collate_fn(self, x):
g, lg, deg_g, deg_lg, pm_pd = zip(*x)
g_batch = batch(g)
lg_batch = batch(lg)
g_batch = batch.batch(g)
lg_batch = batch.batch(lg)
degg_batch = np.concatenate(deg_g, axis=0)
deglg_batch = np.concatenate(deg_lg, axis=0)
pm_pd_batch = np.concatenate([x + i * self._n_nodes for i, x in enumerate(pm_pd)], axis=0)
......
......@@ -12,7 +12,7 @@ import numpy as np
import os
from .. import backend as F
from ..graph import DGLGraph
from ..convert import from_networkx
from .utils import download, extract_archive, get_download_dir, _get_dgl_url
from ..utils import retry_method_with_fix
......@@ -121,8 +121,7 @@ class SST(object):
# add root
g.add_node(0, x=SST.PAD_WORD, y=int(root.label()), mask=0)
_rec_build(0, root)
ret = DGLGraph()
ret.from_networkx(g, node_attrs=['x', 'y', 'mask'])
ret = from_networkx(g, node_attrs=['x', 'y', 'mask'])
return ret
def __getitem__(self, idx):
......
......@@ -5,7 +5,8 @@ import random
from .utils import download, extract_archive, get_download_dir, loadtxt
from ..utils import retry_method_with_fix
from ..graph import DGLGraph
from ..convert import graph
from .. import backend as F
class LegacyTUDataset(object):
"""
......@@ -30,7 +31,7 @@ class LegacyTUDataset(object):
self.name = name
self.hidden_size = hidden_size
self.extract_dir = self._get_extract_dir()
self._load()
self._load(use_pandas, max_allow_node)
def _get_extract_dir(self):
download_dir = get_download_dir()
......@@ -42,11 +43,18 @@ class LegacyTUDataset(object):
return extract_dir
def _download(self):
download_dir = get_download_dir()
zip_file_path = os.path.join(
download_dir,
"tu_{}.zip".format(
self.name))
download(self._url.format(self.name), path=zip_file_path)
extract_dir = os.path.join(download_dir, "tu_{}".format(self.name))
extract_archive(zip_file_path, extract_dir)
@retry_method_with_fix(_download)
def _load(self):
def _load(self, use_pandas, max_allow_node):
self.data_mode = None
self.max_allow_node = max_allow_node
......@@ -63,7 +71,7 @@ class LegacyTUDataset(object):
DS_graph_labels = self._idx_from_zero(
np.genfromtxt(self._file_path("graph_labels"), dtype=int))
g = DGLGraph()
g = graph([])
g.add_nodes(int(DS_edge_list.max()) + 1)
g.add_edges(DS_edge_list[:, 0], DS_edge_list[:, 1])
......@@ -75,17 +83,17 @@ class LegacyTUDataset(object):
if len(node_idx[0]) > self.max_num_node:
self.max_num_node = len(node_idx[0])
self.graph_lists = g.subgraphs(node_idx_list)
self.graph_lists = [g.subgraph(node_idx) for node_idx in node_idx_list]
self.num_labels = max(DS_graph_labels) + 1
self.graph_labels = DS_graph_labels
try:
DS_node_labels = self._idx_from_zero(
np.loadtxt(self._file_path("node_labels"), dtype=int))
g.ndata['node_label'] = DS_node_labels
g.ndata['node_label'] = F.tensor(DS_node_labels)
one_hot_node_labels = self._to_onehot(DS_node_labels)
for idxs, g in zip(node_idx_list, self.graph_lists):
g.ndata['feat'] = one_hot_node_labels[idxs, :]
g.ndata['feat'] = F.tensor(one_hot_node_labels[idxs, :])
self.data_mode = "node_label"
except IOError:
print("No Node Label Data")
......@@ -96,7 +104,7 @@ class LegacyTUDataset(object):
if DS_node_attr.ndim == 1:
DS_node_attr = np.expand_dims(DS_node_attr, -1)
for idxs, g in zip(node_idx_list, self.graph_lists):
g.ndata['feat'] = DS_node_attr[idxs, :]
g.ndata['feat'] = F.tensor(DS_node_attr[idxs, :])
self.data_mode = "node_attr"
except IOError:
print("No Node Attribute Data")
......@@ -170,7 +178,7 @@ class TUDataset(object):
Graphs may have node labels, node attributes, edge labels, and edge attributes,
varying across datasets.
:param name: Dataset Name, such as `ENZYMES`, `DD`, `COLLAB`, `MUTAG`, can be the
:param name: Dataset Name, such as `ENZYMES`, `DD`, `COLLAB`, `MUTAG`, can be the
datasets name on https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets.
"""
......
......@@ -183,6 +183,9 @@ class Column(object):
else:
return Column(data)
def __repr__(self):
return repr(self.data)
class Frame(MutableMapping):
"""The columnar storage for node/edge features.
......@@ -493,6 +496,10 @@ class Frame(MutableMapping):
"""Return the keys."""
return self._columns.keys()
def values(self):
"""Return the values."""
return self._columns.values()
def clone(self):
"""Return a clone of this frame.
......@@ -636,6 +643,10 @@ class FrameRef(MutableMapping):
"""Return the keys."""
return self._frame.keys()
def values(self):
"""Return the values."""
return self._frame.values()
def __getitem__(self, key):
"""Get data from the frame.
......
"""Module for various graph generator functions."""
# pylint: disable= dangerous-default-value
from . import backend as F
from . import convert
......@@ -6,7 +7,8 @@ from . import random
__all__ = ['rand_graph', 'rand_bipartite']
def rand_graph(num_nodes, num_edges, restrict_format='any'):
def rand_graph(num_nodes, num_edges, idtype=F.int64, device=F.cpu(),
formats=['coo', 'csr', 'csc']):
"""Generate a random graph of the given number of nodes/edges.
It uniformly chooses ``num_edges`` from all pairs and forms a graph.
......@@ -19,8 +21,13 @@ def rand_graph(num_nodes, num_edges, restrict_format='any'):
The number of nodes
num_edges : int
The number of edges
restrict_format : 'any', 'coo', 'csr', 'csc', optional
Force the storage format. Default: 'any' (i.e. let DGL decide what to use).
idtype : int32, int64, optional
Integer ID type. Must be int32 or int64. Default: int64.
device : Device context, optional
Device on which the graph is created. Default: CPU.
formats : str or list of str
Storage formats to force. It can be ``'coo'``/``'csr'``/``'csc'`` or a sublist of them.
Default: ``['coo', 'csr', 'csc']``.
Returns
-------
......@@ -28,14 +35,17 @@ def rand_graph(num_nodes, num_edges, restrict_format='any'):
Generated random graph.
"""
eids = random.choice(num_nodes * num_nodes, num_edges, replace=False)
rows = F.astype(eids / num_nodes, F.dtype(eids))
cols = F.astype(eids % num_nodes, F.dtype(eids))
rows = F.copy_to(F.astype(eids / num_nodes, idtype), device)
cols = F.copy_to(F.astype(eids % num_nodes, idtype), device)
g = convert.graph((rows, cols),
num_nodes=num_nodes, validate=False,
restrict_format=restrict_format)
formats=formats,
idtype=idtype, device=device)
return g
def rand_bipartite(num_src_nodes, num_dst_nodes, num_edges, restrict_format='any'):
def rand_bipartite(num_src_nodes, num_dst_nodes, num_edges,
idtype=F.int64, device=F.cpu(),
formats=['csr', 'coo', 'csc']):
"""Generate a random bipartite graph of the given number of src/dst nodes and
number of edges.
......@@ -47,10 +57,15 @@ def rand_bipartite(num_src_nodes, num_dst_nodes, num_edges, restrict_format='any
The number of source nodes, the :math:`|U|` in :math:`G=(U,V,E)`.
num_dst_nodes : int
The number of destination nodes, the :math:`|V|` in :math:`G=(U,V,E)`.
num_edges : int
num_edges : int
The number of edges
restrict_format : 'any', 'coo', 'csr', 'csc', optional
Force the storage format. Default: 'any' (i.e. let DGL decide what to use).
idtype : int32, int64, optional
Integer ID type. Must be int32 or int64. Default: int64.
device : Device context, optional
Device on which the graph is created. Default: CPU.
formats : str or list of str
Storage formats to force. It can be ``'coo'``/``'csr'``/``'csc'`` or a sublist of them.
Default: ``['coo', 'csr', 'csc']``.
Returns
-------
......@@ -58,9 +73,10 @@ def rand_bipartite(num_src_nodes, num_dst_nodes, num_edges, restrict_format='any
Generated random bipartite graph.
"""
eids = random.choice(num_src_nodes * num_dst_nodes, num_edges, replace=False)
rows = F.astype(eids / num_dst_nodes, F.dtype(eids))
cols = F.astype(eids % num_dst_nodes, F.dtype(eids))
rows = F.copy_to(F.astype(eids / num_dst_nodes, idtype), device)
cols = F.copy_to(F.astype(eids % num_dst_nodes, idtype), device)
g = convert.bipartite((rows, cols),
num_nodes=(num_src_nodes, num_dst_nodes), validate=False,
restrict_format=restrict_format)
idtype=idtype, device=device,
formats=formats)
return g
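Both generators draw distinct flat indices from ``num_src * num_dst`` candidate pairs and decode each index into a (row, column) pair. A minimal sketch of that decoding using only the standard library follows; ``sample_edges`` is a hypothetical name, and the sketch uses floor division where the diff divides and then casts to ``idtype``.

```python
import random


def sample_edges(num_src, num_dst, num_edges, seed=None):
    """Sample `num_edges` distinct (src, dst) pairs uniformly at random.

    Hypothetical helper for illustration only; not part of DGL.
    """
    rng = random.Random(seed)
    # Each flat index in [0, num_src * num_dst) names exactly one pair,
    # so sampling without replacement yields no duplicate edges.
    eids = rng.sample(range(num_src * num_dst), num_edges)
    rows = [e // num_dst for e in eids]  # source node ID
    cols = [e % num_dst for e in eids]   # destination node ID
    return rows, cols
```

For a homogeneous graph the same decoding applies with ``num_src == num_dst == num_nodes``.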
......@@ -4038,7 +4038,7 @@ class DGLGraph(DGLBaseGraph):
self._node_frame = old_nframe
self._edge_frame = old_eframe
def is_homograph(self):
def is_homogeneous(self):
"""Return if the graph is homogeneous."""
return True
......
......@@ -4,6 +4,7 @@ from collections import defaultdict
from collections.abc import Mapping
from contextlib import contextmanager
import copy
import numbers
import networkx as nx
import numpy as np
......@@ -11,12 +12,10 @@ from . import graph_index
from . import heterograph_index
from . import utils
from . import backend as F
from . import init
from .runtime import ir, scheduler, Runtime, GraphAdapter
from .frame import Frame, FrameRef, frame_like
from .view import HeteroNodeView, HeteroNodeDataView, HeteroEdgeView, HeteroEdgeDataView
from .base import ALL, SLICE_FULL, NTYPE, NID, ETYPE, EID, is_all, DGLError, dgl_warning
from .udf import NodeBatch, EdgeBatch
from ._ffi.function import _init_api
__all__ = ['DGLHeteroGraph', 'combine_names']
......@@ -193,19 +192,33 @@ class DGLHeteroGraph(object):
"""
is_block = False
# pylint: disable=unused-argument
# pylint: disable=unused-argument, dangerous-default-value
def __init__(self,
gidx,
ntypes,
etypes,
gidx=[],
ntypes=['_U'],
etypes=['_V'],
node_frames=None,
edge_frames=None):
edge_frames=None,
**deprecate_kwargs):
if isinstance(gidx, DGLHeteroGraph):
raise DGLError('The input is already a DGLGraph. No need to create it again.')
if not isinstance(gidx, heterograph_index.HeteroGraphIndex):
dgl_warning('Recommend creating graphs by `dgl.graph(data)`'
' instead of `dgl.DGLGraph(data)`.')
u, v, num_src, num_dst = utils.graphdata2tensors(gidx)
gidx = heterograph_index.create_unitgraph_from_coo(
1, num_src, num_dst, u, v, ['coo', 'csr', 'csc'])
if len(deprecate_kwargs) != 0:
dgl_warning('Keyword arguments {} are deprecated in v0.5, and can be safely'
' removed in all cases.'.format(list(deprecate_kwargs.keys())))
self._init(gidx, ntypes, etypes, node_frames, edge_frames)
def _init(self, gidx, ntypes, etypes, node_frames, edge_frames):
"""Init internal states."""
self._graph = gidx
self._canonical_etypes = None
self._batch_num_nodes = None
self._batch_num_edges = None
# Handle node types
if isinstance(ntypes, tuple):
......@@ -273,46 +286,41 @@ class DGLHeteroGraph(object):
for i, frame in enumerate(edge_frames)]
self._edge_frames = edge_frames
# message indicators
self._msg_indices = [None] * len(self._etypes)
self._msg_frames = []
for i in range(len(self._etypes)):
frame = FrameRef(Frame(num_rows=self._graph.number_of_edges(i)))
frame.set_initializer(init.zero_initializer)
self._msg_frames.append(frame)
def __getstate__(self):
if self.is_block:
ntypes = (self.srctypes, self.dsttypes)
else:
ntypes = self._ntypes
return self._graph, ntypes, self._etypes, self._node_frames, self._edge_frames
metainfo = (self._ntypes, self._etypes, self._canonical_etypes,
self._srctypes_invmap, self._dsttypes_invmap,
self._is_unibipartite, self._etype2canonical, self._etypes_invmap)
return (self._graph, metainfo,
self._node_frames, self._edge_frames,
self._batch_num_nodes, self._batch_num_edges)
def __setstate__(self, state):
# Compatibility check
# TODO: version the storage
if isinstance(state, tuple) and len(state) == 5:
# DGL 0.4.3+
if isinstance(state, tuple) and len(state) == 6:
# DGL >= 0.5
#TODO(minjie): too many states in python; should clean up and lower to C
self._nx_metagraph = None
(self._graph, metainfo, self._node_frames, self._edge_frames,
self._batch_num_nodes, self._batch_num_edges) = state
(self._ntypes, self._etypes, self._canonical_etypes,
self._srctypes_invmap, self._dsttypes_invmap,
self._is_unibipartite, self._etype2canonical,
self._etypes_invmap) = metainfo
elif isinstance(state, tuple) and len(state) == 5:
# DGL == 0.4.3
dgl_warning("The object is pickled with DGL == 0.4.3. "
"Some of the original attributes are ignored.")
self._init(*state)
elif isinstance(state, dict):
# DGL 0.4.2-
dgl_warning("The object is pickled with DGL version 0.4.2-. "
# DGL <= 0.4.2
dgl_warning("The object is pickled with DGL <= 0.4.2. "
"Some of the original attributes are ignored.")
self._init(state['_graph'], state['_ntypes'], state['_etypes'], state['_node_frames'],
state['_edge_frames'])
else:
raise IOError("Unrecognized pickle format.")
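The compatibility logic above dispatches on the shape of the pickled state: a tuple of a known length for recent versions, a dict for older ones. A stripped-down sketch of the same pattern is shown here; the class ``Versioned`` and its fields are assumptions made for illustration and do not mirror the real attribute set.

```python
class Versioned:
    """Toy class whose pickled state changed from a dict to a tuple."""

    def __init__(self, graph, ntypes, etypes):
        self.graph = graph
        self.ntypes = ntypes
        self.etypes = etypes

    def __getstate__(self):
        # Current layout: (graph, metainfo).
        return (self.graph, (self.ntypes, self.etypes))

    def __setstate__(self, state):
        if isinstance(state, tuple) and len(state) == 2:
            # Current layout.
            self.graph, (self.ntypes, self.etypes) = state
        elif isinstance(state, dict):
            # Legacy layout: rebuild from the fields that still apply.
            self.__init__(state['graph'], state['ntypes'], state['etypes'])
        else:
            raise IOError("Unrecognized pickle format.")
```

Dispatching on tuple length works only while every layout has a distinct length, which is one reason the TODO above suggests versioning the storage explicitly.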
def _get_msg_index(self, etid):
"""Internal function for getting the message index array of the given edge type id."""
if self._msg_indices[etid] is None:
self._msg_indices[etid] = utils.zero_index(
size=self._graph.number_of_edges(etid))
return self._msg_indices[etid]
def _set_msg_index(self, etid, index):
self._msg_indices[etid] = index
def __repr__(self):
if len(self.ntypes) == 1 and len(self.etypes) == 1:
ret = ('Graph(num_nodes={node}, num_edges={edge},\n'
......@@ -332,33 +340,539 @@ class DGLHeteroGraph(object):
meta = str(self.metagraph.edges(keys=True))
return ret.format(node=nnode_dict, edge=nedge_dict, meta=meta)
def __copy__(self):
"""Shallow copy implementation."""
#TODO(minjie): too many states in python; should clean up and lower to C
cls = type(self)
obj = cls.__new__(cls)
obj._graph = self._graph
obj._batch_num_nodes = self._batch_num_nodes
obj._batch_num_edges = self._batch_num_edges
obj._ntypes = self._ntypes
obj._etypes = self._etypes
obj._canonical_etypes = self._canonical_etypes
obj._srctypes_invmap = self._srctypes_invmap
obj._dsttypes_invmap = self._dsttypes_invmap
obj._is_unibipartite = self._is_unibipartite
obj._etype2canonical = self._etype2canonical
obj._etypes_invmap = self._etypes_invmap
obj._nx_metagraph = self._nx_metagraph
obj._node_frames = self._node_frames
obj._edge_frames = self._edge_frames
return obj
#################################################################
# Mutation operations
#################################################################
def add_nodes(self, num, data=None, ntype=None):
"""Add multiple new nodes of the same node type
r"""Add new nodes of the same node type
Parameters
----------
num : int
Number of nodes to add.
data : dict, optional
Feature data of the added nodes.
ntype : str, optional
The type of the new nodes. Can be omitted if there is
only one node type in the graph.
Notes
-----
* Inplace update is applied to the current graph.
* If the keys of ``data`` do not include some existing feature fields,
those features for the new nodes will be created by initializers
defined with :func:`set_n_initializer` (default initializer fills zeros).
* If the keys of ``data`` include new feature fields, those features for
the old nodes will be created by initializers defined with
:func:`set_n_initializer` (default initializer fills zeros).
Examples
--------
The following example uses PyTorch backend.
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Node Type**
>>> g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])))
>>> g.num_nodes()
3
>>> g.add_nodes(2)
>>> g.num_nodes()
5
If the graph has some node features and new nodes are added without
features, their features will be created by initializers defined
with :func:`set_n_initializer`.
>>> g.ndata['h'] = torch.ones(5, 1)
>>> g.add_nodes(1)
>>> g.ndata['h']
tensor([[1.], [1.], [1.], [1.], [1.], [0.]])
We can also assign features to the new nodes when adding them.
>>> g.add_nodes(1, {'h': torch.ones(1, 1), 'w': torch.ones(1, 1)})
>>> g.ndata['h']
tensor([[1.], [1.], [1.], [1.], [1.], [0.], [1.]])
Since ``data`` contains new feature fields, the features for old nodes
will be created by initializers defined with :func:`set_n_initializer`.
>>> g.ndata['w']
tensor([[0.], [0.], [0.], [0.], [0.], [0.], [1.]])
**Heterogeneous Graphs with Multiple Node Types**
>>> g = dgl.heterograph({
>>> ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
>>> torch.tensor([0, 0, 1, 1])),
>>> ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
>>> torch.tensor([0, 1]))
>>> })
>>> g.add_nodes(2)
DGLError: Node type name must be specified
if there are more than one node types.
>>> g.num_nodes('user')
3
>>> g.add_nodes(2, ntype='user')
>>> g.num_nodes('user')
5
Currently not supported.
See Also
--------
remove_nodes
add_edges
remove_edges
"""
raise DGLError('Mutation is not supported in heterograph.')
# TODO(xiangsx): blocks do not support add_nodes
if ntype is None:
if self._graph.number_of_ntypes() != 1:
raise DGLError('Node type name must be specified if there are more than one '
'node types.')
# nothing to do
if num == 0:
return
assert num > 0, 'Number of new nodes should be larger than zero.'
ntid = self.get_ntype_id(ntype)
# update graph idx
metagraph = self._graph.metagraph
num_nodes_per_type = []
for c_ntype in self.ntypes:
if self.get_ntype_id(c_ntype) == ntid:
num_nodes_per_type.append(self.number_of_nodes(c_ntype) + num)
else:
num_nodes_per_type.append(self.number_of_nodes(c_ntype))
relation_graphs = []
for c_etype in self.canonical_etypes:
# src or dst == ntype, update the relation graph
if self.get_ntype_id(c_etype[0]) == ntid or self.get_ntype_id(c_etype[2]) == ntid:
u, v = self.edges(form='uv', order='eid', etype=c_etype)
hgidx = heterograph_index.create_unitgraph_from_coo(
1 if c_etype[0] == c_etype[2] else 2,
self.number_of_nodes(c_etype[0]) + \
(num if self.get_ntype_id(c_etype[0]) == ntid else 0),
self.number_of_nodes(c_etype[2]) + \
(num if self.get_ntype_id(c_etype[2]) == ntid else 0),
u,
v,
['coo', 'csr', 'csc'])
relation_graphs.append(hgidx)
else:
# do nothing
relation_graphs.append(self._graph.get_relation_graph(self.get_etype_id(c_etype)))
hgidx = heterograph_index.create_heterograph_from_relations(
metagraph, relation_graphs, utils.toindex(num_nodes_per_type, "int64"))
self._graph = hgidx
# update data frames
if data is None:
# Initialize feature with :func:`set_n_initializer`
self._node_frames[ntid].add_rows(num)
else:
self._node_frames[ntid].append(data)
self._reset_cached_info()
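Review note: the new `add_nodes` body does two things before touching the feature frames. It bumps the node count of one type, and it rebuilds only the relation graphs whose source or destination type is that type. A plain-Python sketch of that planning step, assuming node counts are given as a dict; `plan_add_nodes` is a hypothetical name and does not call into DGL.

```python
def plan_add_nodes(ntypes, num_nodes, canonical_etypes, ntype, num):
    """Return (new per-type node counts, relations that must be rebuilt).

    Hypothetical illustration of the bookkeeping in ``add_nodes``:
    ``ntypes`` is the ordered list of node types, ``num_nodes`` maps a
    node type to its current count, and ``canonical_etypes`` is a list
    of (srctype, etype, dsttype) triples.
    """
    # Only the target type grows; the other counts are carried over.
    new_counts = [num_nodes[t] + (num if t == ntype else 0) for t in ntypes]
    # A relation graph changes shape only if it touches the target type.
    touched = [c for c in canonical_etypes if ntype in (c[0], c[2])]
    return new_counts, touched
```

With the heterograph from the docstring, adding two `user` nodes leaves the `develops` relation untouched and rebuilds only `plays`.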
def add_edge(self, u, v, data=None, etype=None):
"""Add an edge of ``etype`` between u of the source node type, and v
of the destination node type..
"""Add one edge to the graph.
Currently not supported.
DEPRECATED: please use ``add_edges``.
"""
raise DGLError('Mutation is not supported in heterograph.')
dgl_warning("DGLGraph.add_edge is deprecated. Please use DGLGraph.add_edges")
self.add_edges(u, v, data, etype)
def add_edges(self, u, v, data=None, etype=None):
"""Add multiple edges of ``etype`` between list of source nodes ``u``
and list of destination nodes ``v`` of type ``vtype``. A single edge
is added between every pair of ``u[i]`` and ``v[i]``.
r"""Add multiple new edges for the specified edge type
The i-th new edge will be from ``u[i]`` to ``v[i]``.
Parameters
----------
u : int, tensor, numpy.ndarray, list
Source node IDs, ``u[i]`` gives the source node for the i-th new edge.
v : int, tensor, numpy.ndarray, list
Destination node IDs, ``v[i]`` gives the destination node for the i-th new edge.
data : dict, optional
Feature data of the added edges. The i-th row of the feature data
corresponds to the i-th new edge.
etype : str or tuple of str, optional
The type of the new edges. Can be omitted if there is
only one edge type in the graph.
Notes
-----
* Inplace update is applied to the current graph.
* If the end nodes of the new edges do not exist, add_nodes is invoked
to add new nodes. The node features of the new nodes will be created
by initializers defined with :func:`set_n_initializer` (default
initializer fills zeros). In certain cases, it is recommended to
call add_nodes first and then add_edges.
* If the keys of ``data`` do not include some existing feature fields,
those features for the new edges will be created by initializers
defined with :func:`set_e_initializer` (default initializer fills zeros).
* If the keys of ``data`` include new feature fields, those features for
the old edges will be created by initializers defined with
:func:`set_e_initializer` (default initializer fills zeros).
Examples
--------
The following example uses PyTorch backend.
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Edge Type**
>>> g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])))
>>> g.num_edges()
2
>>> g.add_edges(torch.tensor([1, 3]), torch.tensor([0, 1]))
>>> g.num_edges()
4
Since ``u`` or ``v`` contains a non-existing node ID, the nodes are
added implicitly.
>>> g.num_nodes()
4
If the graph has some edge features and new edges are added without
features, their features will be created by initializers defined
with :func:`set_e_initializer`.
>>> g.edata['h'] = torch.ones(4, 1)
>>> g.add_edges(torch.tensor([1]), torch.tensor([1]))
>>> g.edata['h']
tensor([[1.], [1.], [1.], [1.], [0.]])
We can also assign features to the new edges when adding them.
>>> g.add_edges(torch.tensor([0, 0]), torch.tensor([2, 2]),
>>> {'h': torch.tensor([[1.], [2.]]), 'w': torch.ones(2, 1)})
>>> g.edata['h']
tensor([[1.], [1.], [1.], [1.], [0.], [1.], [2.]])
Since ``data`` contains new feature fields, the features for old edges
will be created by initializers defined with :func:`set_e_initializer`.
>>> g.edata['w']
tensor([[0.], [0.], [0.], [0.], [0.], [1.], [1.]])
**Heterogeneous Graphs with Multiple Edge Types**
>>> g = dgl.heterograph({
>>> ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
>>> torch.tensor([0, 0, 1, 1])),
>>> ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
>>> torch.tensor([0, 1]))
>>> })
>>> g.add_edges(torch.tensor([3]), torch.tensor([3]))
DGLError: Edge type name must be specified
if there are more than one edge types.
>>> g.number_of_edges('plays')
4
>>> g.add_edges(torch.tensor([3]), torch.tensor([3]), etype='plays')
>>> g.number_of_edges('plays')
5
See Also
--------
add_nodes
remove_nodes
remove_edges
"""
# TODO(xiangsx): blocks do not support add_edges
u = utils.prepare_tensor(self, u, 'u')
v = utils.prepare_tensor(self, v, 'v')
if etype is None:
if self._graph.number_of_etypes() != 1:
raise DGLError('Edge type name must be specified if there are more than one '
'edge types.')
# nothing changed
if len(u) == 0 or len(v) == 0:
return
assert len(u) == len(v) or len(u) == 1 or len(v) == 1, \
'The number of source nodes and the number of destination nodes should be the same, ' \
'or either the number of source nodes or the number of destination nodes should be 1.'
if len(u) == 1 and len(v) > 1:
u = F.full_1d(len(v), F.as_scalar(u), dtype=F.dtype(u), ctx=F.context(u))
if len(v) == 1 and len(u) > 1:
v = F.full_1d(len(u), F.as_scalar(v), dtype=F.dtype(v), ctx=F.context(v))
u_type, e_type, v_type = self.to_canonical_etype(etype)
# If the end nodes of the new edges do not exist,
# use add_nodes to add them first.
num_of_u = self.number_of_nodes(u_type)
num_of_v = self.number_of_nodes(v_type)
u_max = F.as_scalar(F.max(u, dim=0)) + 1
v_max = F.as_scalar(F.max(v, dim=0)) + 1
if u_type == v_type:
num_nodes = max(u_max, v_max)
if num_nodes > num_of_u:
self.add_nodes(num_nodes - num_of_u, ntype=u_type)
else:
if u_max > num_of_u:
self.add_nodes(u_max - num_of_u, ntype=u_type)
if v_max > num_of_v:
self.add_nodes(v_max - num_of_v, ntype=v_type)
# metagraph is not changed
metagraph = self._graph.metagraph
num_nodes_per_type = []
for ntype in self.ntypes:
num_nodes_per_type.append(self.number_of_nodes(ntype))
# update graph idx
relation_graphs = []
for c_etype in self.canonical_etypes:
# the target edge type
if c_etype == (u_type, e_type, v_type):
old_u, old_v = self.edges(form='uv', order='eid', etype=c_etype)
hgidx = heterograph_index.create_unitgraph_from_coo(
1 if u_type == v_type else 2,
self.number_of_nodes(u_type),
self.number_of_nodes(v_type),
F.cat([old_u, u], dim=0),
F.cat([old_v, v], dim=0),
['coo', 'csr', 'csc'])
relation_graphs.append(hgidx)
else:
# do nothing
# Note: node range change has been handled in add_nodes()
relation_graphs.append(self._graph.get_relation_graph(self.get_etype_id(c_etype)))
hgidx = heterograph_index.create_heterograph_from_relations(
metagraph, relation_graphs, utils.toindex(num_nodes_per_type, "int64"))
self._graph = hgidx
# handle data
etid = self.get_etype_id(etype)
if data is None:
self._edge_frames[etid].add_rows(len(u))
else:
self._edge_frames[etid].append(data)
self._reset_cached_info()
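Review note: two rules in the new `add_edges` are easy to miss when reading the diff. A length-1 endpoint array is broadcast against the other one, and node IDs beyond the current range trigger an implicit `add_nodes`. The sketch below restates both rules on plain Python lists; `broadcast_endpoints` and `nodes_to_add` are hypothetical helpers and do not use the DGL backend.

```python
def broadcast_endpoints(u, v):
    """Repeat a length-1 endpoint list so that u and v have equal length."""
    if len(u) == 1 and len(v) > 1:
        u = u * len(v)
    if len(v) == 1 and len(u) > 1:
        v = v * len(u)
    if len(u) != len(v):
        raise ValueError("u and v must have the same length, or one must have length 1")
    return u, v


def nodes_to_add(u, v, src_type, dst_type, num_nodes):
    """Return {node type: number of nodes that must be added first}.

    ``num_nodes`` maps a node type to its current count. When the source
    and destination types coincide, the larger of the two maxima decides.
    """
    need = {src_type: 0, dst_type: 0}
    need[src_type] = max(need[src_type], max(u) + 1)
    need[dst_type] = max(need[dst_type], max(v) + 1)
    return {t: n - num_nodes[t] for t, n in need.items() if n > num_nodes[t]}
```

For the homogeneous docstring example (3 nodes, new edges `1->0` and `3->1`), `nodes_to_add` reports that one node has to be added, which matches the `num_nodes()` result of 4 shown there.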
def remove_edges(self, eids, etype=None):
r"""Remove multiple edges with the specified edge type
Nodes will not be removed. After removing edges, the remaining
edges will be re-indexed using consecutive integers from 0,
with their relative order preserved.
The features for the removed edges will be removed accordingly.
Parameters
----------
eids : int, tensor, numpy.ndarray, list
IDs for the edges to remove.
etype : str or tuple of str, optional
The type of the edges to remove. Can be omitted if there is
only one edge type in the graph.
Examples
--------
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Edge Type**
>>> g = dgl.graph((torch.tensor([0, 0, 2]), torch.tensor([0, 1, 2])))
>>> g.edata['he'] = torch.arange(3).float().reshape(-1, 1)
>>> g.remove_edges(torch.tensor([0, 1]))
>>> g
Graph(num_nodes=3, num_edges=1,
ndata_schemes={}
edata_schemes={'he': Scheme(shape=(1,), dtype=torch.float32)})
>>> g.edges('all')
(tensor([2]), tensor([2]), tensor([0]))
>>> g.edata['he']
tensor([[2.]])
**Heterogeneous Graphs with Multiple Edge Types**
>>> g = dgl.heterograph({
>>> ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
>>> torch.tensor([0, 0, 1, 1])),
>>> ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
>>> torch.tensor([0, 1]))
>>> })
>>> g.remove_edges(torch.tensor([0, 1]))
DGLError: Edge type name must be specified
if there are more than one edge types.
>>> g.remove_edges(torch.tensor([0, 1]), 'plays')
>>> g.edges('all', etype='plays')
(tensor([0, 1]), tensor([0, 0]), tensor([0, 1]))
See Also
--------
add_nodes
add_edges
remove_nodes
"""
# TODO(xiangsx): blocks do not support remove_edges
if etype is None:
if self._graph.number_of_etypes() != 1:
raise DGLError('Edge type name must be specified if there are more than one ' \
'edge types.')
eids = utils.prepare_tensor(self, eids, 'eids')
if len(eids) == 0:
# no edge to delete
return
assert self.number_of_edges(etype) > F.as_scalar(F.max(eids, dim=0)), \
'The input eid {} is out of the range [0:{})'.format(
F.as_scalar(F.max(eids, dim=0)), self.number_of_edges(etype))
# edge_subgraph
edges = {}
u_type, e_type, v_type = self.to_canonical_etype(etype)
for c_etype in self.canonical_etypes:
# the target edge type
if c_etype == (u_type, e_type, v_type):
origin_eids = self.edges(form='eid', order='eid', etype=c_etype)
edges[c_etype] = utils.compensate(eids, origin_eids)
else:
edges[c_etype] = self.edges(form='eid', order='eid', etype=c_etype)
sub_g = self.edge_subgraph(edges, preserve_nodes=True)
self._graph = sub_g._graph
self._node_frames = sub_g._node_frames
self._edge_frames = sub_g._edge_frames
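Review note: `remove_edges` is implemented as "keep everything else": it computes the edge IDs that survive and takes an edge subgraph over them. The sketch below assumes `utils.compensate(eids, origin_eids)` returns the IDs in `origin_eids` that are not in `eids`, in their original order; that behavior is inferred from how the result is used here, and `complement_ids` and `reindex` are hypothetical stand-ins.

```python
def complement_ids(removed, all_ids):
    """Return the IDs in ``all_ids`` that are not in ``removed``, order kept."""
    removed = set(removed)
    return [i for i in all_ids if i not in removed]


def reindex(kept):
    """Map each surviving old ID to its new consecutive ID starting at 0."""
    return {old: new for new, old in enumerate(kept)}
```

In the homogeneous docstring example, removing edges 0 and 1 out of three keeps old edge 2, which becomes edge 0 and carries the feature row `[2.]`.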
def remove_nodes(self, nids, ntype=None):
r"""Remove multiple nodes with the specified node type
Edges that connect to the nodes will be removed as well. After removing
nodes and edges, the remaining nodes and edges will be re-indexed using
consecutive integers from 0, with their relative order preserved.
The features for the removed nodes/edges will be removed accordingly.
Parameters
----------
nids : int, tensor, numpy.ndarray, list
Nodes to remove.
ntype : str, optional
The type of the nodes to remove. Can be omitted if there is
only one node type in the graph.
Examples
--------
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Node Type**
>>> g = dgl.graph((torch.tensor([0, 0, 2]), torch.tensor([0, 1, 2])))
>>> g.ndata['hv'] = torch.arange(3).float().reshape(-1, 1)
>>> g.edata['he'] = torch.arange(3).float().reshape(-1, 1)
>>> g.remove_nodes(torch.tensor([0, 1]))
>>> g
Graph(num_nodes=1, num_edges=1,
ndata_schemes={'hv': Scheme(shape=(1,), dtype=torch.float32)}
edata_schemes={'he': Scheme(shape=(1,), dtype=torch.float32)})
>>> g.ndata['hv']
tensor([[2.]])
>>> g.edata['he']
tensor([[2.]])
**Heterogeneous Graphs with Multiple Node Types**
>>> g = dgl.heterograph({
>>> ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
>>> torch.tensor([0, 0, 1, 1])),
>>> ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
>>> torch.tensor([0, 1]))
>>> })
>>> g.remove_nodes(torch.tensor([0, 1]))
DGLError: Node type name must be specified
if there are more than one node types.
>>> g.remove_nodes(torch.tensor([0, 1]), ntype='game')
>>> g.num_nodes('user')
3
>>> g.num_nodes('game')
0
>>> g.num_edges('plays')
0
See Also
--------
add_nodes
add_edges
remove_edges
"""
# TODO(xiangsx): blocks do not support remove_nodes
if ntype is None:
if self._graph.number_of_ntypes() != 1:
raise DGLError('Node type name must be specified if there are more than one ' \
'node types.')
Currently not supported.
nids = utils.prepare_tensor(self, nids, 'nids')
if len(nids) == 0:
# no node to delete
return
assert self.number_of_nodes(ntype) > F.as_scalar(F.max(nids, dim=0)), \
'The input nids {} is out of the range [0:{})'.format(
F.as_scalar(F.max(nids, dim=0)), self.number_of_nodes(ntype))
ntid = self.get_ntype_id(ntype)
nodes = {}
for c_ntype in self.ntypes:
if self.get_ntype_id(c_ntype) == ntid:
original_nids = self.nodes(c_ntype)
nodes[c_ntype] = utils.compensate(nids, original_nids)
else:
nodes[c_ntype] = self.nodes(c_ntype)
# node_subgraph
sub_g = self.subgraph(nodes)
self._graph = sub_g._graph
self._node_frames = sub_g._node_frames
self._edge_frames = sub_g._edge_frames
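Review note: `remove_nodes` takes a node subgraph over the surviving nodes, so incident edges disappear and both nodes and edges are relabeled. A plain-Python sketch for the homogeneous case only; `remove_nodes_sketch` is a hypothetical function that models a graph as a node count plus parallel `src`/`dst` lists.

```python
def remove_nodes_sketch(num_nodes, src, dst, nids):
    """Return (new node count, relabeled edge list) after removing ``nids``.

    Edges keep their relative order; an edge survives only if both of
    its end nodes survive.
    """
    removed = set(nids)
    kept = [n for n in range(num_nodes) if n not in removed]
    relabel = {old: new for new, old in enumerate(kept)}
    edges = [(relabel[s], relabel[d])
             for s, d in zip(src, dst)
             if s in relabel and d in relabel]
    return len(kept), edges
```

Applied to the docstring graph (edges `0->0`, `0->1`, `2->2`) with nodes 0 and 1 removed, one node and one self-loop remain, consistent with `Graph(num_nodes=1, num_edges=1, ...)`.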
def _reset_cached_info(self):
"""Some info like batch_num_nodes may be stale after mutation.
Clear the cached info.
"""
raise DGLError('Mutation is not supported in heterograph.')
self._batch_num_nodes = None
self._batch_num_edges = None
#################################################################
# Metagraph query
......@@ -533,6 +1047,11 @@ class DGLHeteroGraph(object):
DGLError: Edge type "follows" is ambiguous.
Please use canonical etype type in the form of (srctype, etype, dsttype)
"""
if etype is None:
if len(self.etypes) != 1:
raise DGLError('Edge type name must be specified if there are more than one '
'edge types.')
etype = self.etypes[0]
if isinstance(etype, tuple):
return etype
else:
......@@ -653,6 +1172,58 @@ class DGLHeteroGraph(object):
raise DGLError('Edge type "{}" does not exist.'.format(etype))
return etid
#################################################################
# Batching
#################################################################
@property
def batch_size(self):
"""TBD"""
return len(self.batch_num_nodes(self.ntypes[0]))
def batch_num_nodes(self, ntype=None):
"""TBD"""
if self._batch_num_nodes is None:
self._batch_num_nodes = {}
for ty in self.ntypes:
bnn = F.copy_to(F.tensor([self.number_of_nodes(ty)], F.int64), self.device)
self._batch_num_nodes[ty] = bnn
if ntype is None:
if len(self.ntypes) != 1:
raise DGLError('Node type name must be specified if there are more than one '
'node types.')
ntype = self.ntypes[0]
return self._batch_num_nodes[ntype]
def set_batch_num_nodes(self, val):
"""TBD"""
if not isinstance(val, Mapping):
if len(self.ntypes) != 1:
raise DGLError('Must provide a dictionary when there are multiple node types.')
val = {self.ntypes[0] : val}
self._batch_num_nodes = val
def batch_num_edges(self, etype=None):
"""TBD"""
if self._batch_num_edges is None:
self._batch_num_edges = {}
for ty in self.canonical_etypes:
bne = F.copy_to(F.tensor([self.number_of_edges(ty)], F.int64), self.device)
self._batch_num_edges[ty] = bne
if etype is None:
if len(self.etypes) != 1:
raise DGLError('Edge type name must be specified if there are more than one '
'edge types.')
etype = self.canonical_etypes[0]
return self._batch_num_edges[etype]
def set_batch_num_edges(self, val):
"""TBD"""
if not isinstance(val, Mapping):
if len(self.etypes) != 1:
raise DGLError('Must provide a dictionary when there are multiple edge types.')
val = {self.canonical_etypes[0] : val}
self._batch_num_edges = val
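Review note: both `set_batch_num_nodes` and `set_batch_num_edges` accept either a mapping keyed by type or a bare value when the graph has a single type. The normalization can be sketched as below; `as_type_dict` is a hypothetical helper, and it raises `ValueError` where the diff raises `DGLError`.

```python
from collections.abc import Mapping


def as_type_dict(val, types, kind="node"):
    """Wrap a bare value into ``{type: val}`` when there is exactly one type."""
    if not isinstance(val, Mapping):
        if len(types) != 1:
            raise ValueError(
                "Must provide a dictionary when there are multiple "
                "{} types.".format(kind))
        val = {types[0]: val}
    return val
```

For example, `as_type_dict([3, 2], ['_N'])` yields `{'_N': [3, 2]}`, while the same call with two node types is rejected.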
#################################################################
# View
#################################################################
......@@ -1249,6 +1820,13 @@ class DGLHeteroGraph(object):
"""
return self._graph.number_of_edges(self.get_etype_id(etype))
def __len__(self):
"""Deprecated: please directly call :func:`number_of_nodes`
"""
dgl_warning('DGLGraph.__len__ is deprecated. '
'Please directly call DGLGraph.number_of_nodes.')
return self.number_of_nodes()
@property
def is_multigraph(self):
"""Whether the graph is a multigraph
......@@ -1262,14 +1840,17 @@ class DGLHeteroGraph(object):
@property
def is_readonly(self):
"""Whether the graph is readonly
"""Deprecated: DGLGraph will always be mutable.
Returns
-------
bool
True if the graph is readonly, False otherwise.
"""
return self._graph.is_readonly()
dgl_warning('DGLGraph.is_readonly is deprecated in v0.5.\n'
'DGLGraph now always supports mutable operations like add_nodes'
' and add_edges.')
return False
@property
def idtype(self):
......@@ -1298,84 +1879,73 @@ class DGLHeteroGraph(object):
"""
return self._graph.dtype
def has_node(self, vid, ntype=None):
def __contains__(self, vid):
"""Deprecated: please directly call :func:`has_nodes`.
"""
dgl_warning('DGLGraph.__contains__ is deprecated.'
' Please directly call has_nodes.')
return self.has_nodes(vid)
def has_nodes(self, vid, ntype=None):
"""Whether the graph has a node with a particular id and type.
Parameters
----------
vid : int
The node ID.
vid : int, iterable, tensor
Node ID(s).
ntype : str, optional
The node type. Can be omitted if there is only one node type
in the graph. (Default: None)
Returns
-------
bool
True if the node exists, False otherwise
bool or bool Tensor
Each element is a bool flag, which is True if the node exists,
and is False otherwise.
Examples
--------
>>> g.has_node(0, 'user')
>>> g.has_nodes(0, 'user')
True
>>> g.has_node(4, 'user')
>>> g.has_nodes(4, 'user')
False
See Also
--------
has_nodes
>>> g.has_nodes([0, 1, 2, 3, 4], 'user')
tensor([True, True, True, False, False])
"""
return self._graph.has_node(self.get_ntype_id(ntype), vid)
def has_nodes(self, vids, ntype=None):
"""Whether the graph has nodes with ids and a particular type.
Parameters
----------
vid : list or tensor
The array of node IDs.
ntype : str, optional
The node type. Can be omitted if there is only one node type
in the graph.
Returns
-------
a : tensor
Binary tensor indicating the existence of nodes with the specified ids and type.
``a[i]=1`` if the graph contains node ``vids[i]`` of type ``ntype``, 0 otherwise.
Examples
--------
The following example uses PyTorch backend.
ret = self._graph.has_nodes(
self.get_ntype_id(ntype),
utils.prepare_tensor(self, vid, "vid"))
if isinstance(vid, numbers.Integral):
return bool(F.as_scalar(ret))
else:
return F.astype(ret, F.bool)
>>> g.has_nodes([0, 1, 2, 3, 4], 'user')
tensor([1, 1, 1, 0, 0])
def has_node(self, vid, ntype=None):
"""Whether the graph has a node with the given ID and type.
See Also
--------
has_node
DEPRECATED: see :func:`~DGLGraph.has_nodes`
"""
vids = utils.toindex(vids, self._idtype_str)
rst = self._graph.has_nodes(self.get_ntype_id(ntype), vids)
return rst.tousertensor()
dgl_warning("DGLGraph.has_node is deprecated. Please use DGLGraph.has_nodes")
return self.has_nodes(vid, ntype)
def has_edge_between(self, u, v, etype=None):
def has_edges_between(self, u, v, etype=None):
"""Whether the graph has an edge (u, v) of type ``etype``.
Parameters
----------
u : int
The node ID of source type.
v : int
The node ID of destination type.
u : int, iterable of int, Tensor
Source node ID(s).
v : int, iterable of int, Tensor
Destination node ID(s).
etype : str or tuple of str, optional
The edge type. Can be omitted if there is only one edge type
in the graph.
Returns
-------
bool
True if the edge is in the graph, False otherwise.
a : Tensor
Binary tensor indicating the existence of edges. ``a[i]=1`` if the graph
contains edge ``(u[i], v[i])`` of type ``etype``, 0 otherwise.
Examples
--------
......@@ -1384,47 +1954,26 @@ class DGLHeteroGraph(object):
True
>>> g.has_edge_between(0, 2, ('user', 'plays', 'game'))
False
See Also
--------
has_edges_between
>>> g.has_edges_between([0, 0], [1, 2], ('user', 'plays', 'game'))
tensor([True, False])
"""
return self._graph.has_edge_between(self.get_etype_id(etype), u, v)
ret = self._graph.has_edges_between(
self.get_etype_id(etype),
utils.prepare_tensor(self, u, 'u'),
utils.prepare_tensor(self, v, 'v'))
if isinstance(u, numbers.Integral) and isinstance(v, numbers.Integral):
return bool(F.as_scalar(ret))
else:
return F.astype(ret, F.bool)
def has_edges_between(self, u, v, etype=None):
def has_edge_between(self, u, v, etype=None):
"""Whether the graph has edges of type ``etype``.
Parameters
----------
u : list, tensor
The node ID array of source type.
v : list, tensor
The node ID array of destination type.
etype : str or tuple of str, optional
The edge type. Can be omitted if there is only one edge type
in the graph.
Returns
-------
a : tensor
Binary tensor indicating the existence of edges. ``a[i]=1`` if the graph
contains edge ``(u[i], v[i])`` of type ``etype``, 0 otherwise.
Examples
--------
The following example uses PyTorch backend.
>>> g.has_edges_between([0, 0], [1, 2], ('user', 'plays', 'game'))
tensor([1, 0])
See Also
--------
has_edge_between
DEPRECATED: please use :func:`~DGLGraph.has_edges_between`.
"""
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
rst = self._graph.has_edges_between(self.get_etype_id(etype), u, v)
return rst.tousertensor()
dgl_warning("DGLGraph.has_edge_between is deprecated. "
"Please use DGLGraph.has_edges_between")
return self.has_edges_between(u, v, etype)
def predecessors(self, v, etype=None):
"""Return the predecessors of node `v` in the graph with the specified
......@@ -1462,7 +2011,7 @@ class DGLHeteroGraph(object):
--------
successors
"""
return self._graph.predecessors(self.get_etype_id(etype), v).tousertensor()
return self._graph.predecessors(self.get_etype_id(etype), v)
def successors(self, v, etype=None):
"""Return the successors of node `v` in the graph with the specified edge
......@@ -1500,85 +2049,27 @@ class DGLHeteroGraph(object):
--------
predecessors
"""
check_same_dtype(self._idtype_str, v)
return self._graph.successors(self.get_etype_id(etype), v).tousertensor()
return self._graph.successors(self.get_etype_id(etype), v)
def edge_id(self, u, v, force_multi=None, return_array=False, etype=None):
def edge_id(self, u, v, force_multi=None, return_uv=False, etype=None):
"""Return the edge ID, or an array of edge IDs, between source node
`u` and destination node `v`, with the specified edge type
**DEPRECATED**: See edge_ids
"""
dgl_warning("DGLGraph.edge_id is deprecated. Please use DGLGraph.edge_ids.")
return self.edge_ids(u, v, force_multi=force_multi,
return_uv=return_uv, etype=etype)
def edge_ids(self, u, v, force_multi=None, return_uv=False, etype=None):
"""Return all edge IDs between source node array `u` and destination
node array `v` with the specified edge type.
Parameters
----------
u : int
The node ID of source type.
v : int
The node ID of destination type.
force_multi : bool, optional
Deprecated (Will be deleted in the future).
If False, will return a single edge ID.
If True, will always return an array. (Default: False)
return_array : bool, optional
If False, will return a single edge ID.
If True, will always return an array. (Default: False)
etype : str or tuple of str, optional
The edge type. Can be omitted if there is only one edge type
in the graph.
Returns
-------
int or tensor
The edge ID if ``return_array == False``.
The edge ID array otherwise.
Notes
-----
If multiply edges exist between `u` and `v` and return_array is False,
the result is undefined.
Examples
--------
The following example uses PyTorch backend.
Instantiate a heterograph.
>>> plays_g = dgl.bipartite(([0, 1, 1, 2], [0, 0, 2, 1]), 'user', 'plays', 'game')
>>> follows_g = dgl.graph(([0, 1, 1], [1, 2, 2]), 'user', 'follows')
>>> g = dgl.hetero_from_relations([plays_g, follows_g])
Query for edge id.
>>> plays_g.edge_id(1, 2, etype=('user', 'plays', 'game'))
2
>>> g.edge_id(1, 2, return_array=True, etype=('user', 'follows', 'user'))
tensor([1, 2])
See Also
--------
edge_ids
"""
idx = self._graph.edge_id(self.get_etype_id(etype), u, v)
if force_multi is not None:
dgl_warning("force_multi will be deprecated." \
"Please use return_array instead")
return_array = force_multi
if return_array:
return idx.tousertensor()
else:
assert len(idx) == 1, "For return_array=False, there should be one and " \
"only one edge between u and v, but get {} edges. " \
"Please use return_array=True instead".format(len(idx))
return idx[0]
def edge_ids(self, u, v, force_multi=None, return_uv=False, etype=None):
"""Return all edge IDs between source node array `u` and destination
node array `v` with the specified edge type.
Parameters
----------
u : list, tensor
u : int, list, tensor
The node ID array of source type.
v : list, tensor
v : int, list, tensor
The node ID array of destination type.
force_multi : bool, optional
Deprecated (Will be deleted in the future).
......@@ -1628,28 +2119,27 @@ class DGLHeteroGraph(object):
tensor([2])
>>> g.edge_ids([1], [2], return_uv=True, etype=('user', 'follows', 'user'))
(tensor([1, 1]), tensor([2, 2]), tensor([1, 2]))
See Also
--------
edge_id
"""
check_same_dtype(self._idtype_str, u)
check_same_dtype(self._idtype_str, v)
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
src, dst, eid = self._graph.edge_ids(self.get_etype_id(etype), u, v)
is_int = isinstance(u, numbers.Integral) and isinstance(v, numbers.Integral)
u = utils.prepare_tensor(self, u, 'u')
v = utils.prepare_tensor(self, v, 'v')
if force_multi is not None:
dgl_warning("force_multi will be deprecated. " \
"Please use return_uv instead")
return_uv = force_multi
if return_uv:
return src.tousertensor(), dst.tousertensor(), eid.tousertensor()
return self._graph.edge_ids_all(self.get_etype_id(etype), u, v)
else:
assert len(eid) == max(len(u), len(v)), "If return_uv=False, there should be one and " \
"only one edge between each u and v, expect {} edges but get {}. " \
"Please use return_uv=True instead".format(max(len(u), len(v)), len(eid))
return eid.tousertensor()
eid = self._graph.edge_ids_one(self.get_etype_id(etype), u, v)
is_neg_one = F.equal(eid, -1)
if F.as_scalar(F.sum(is_neg_one, 0)):
# Raise error since some (u, v) pair is not a valid edge.
idx = F.nonzero_1d(is_neg_one)
raise DGLError("Error: (%d, %d) does not form a valid edge." % (
F.as_scalar(F.gather_row(u, idx)),
F.as_scalar(F.gather_row(v, idx))))
return F.as_scalar(eid) if is_int else eid
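Review note: the new `edge_ids` has two modes. With `return_uv=True` it returns every matching edge for each queried pair; otherwise it returns one ID per pair and raises if a pair is not an edge. The sketch below models an edge type as a list of `(src, dst)` tuples whose list position is the edge ID. The function names are hypothetical, `KeyError` stands in for `DGLError`, and picking the first match for a multi-edge is an assumption, since the diff does not say which ID `edge_ids_one` returns.

```python
def edge_ids_all_sketch(edges, u, v):
    """Return (src, dst, eid) lists covering every edge matching each (u[i], v[i])."""
    src, dst, eids = [], [], []
    for qu, qv in zip(u, v):
        for eid, (s, d) in enumerate(edges):
            if (s, d) == (qu, qv):
                src.append(s)
                dst.append(d)
                eids.append(eid)
    return src, dst, eids


def edge_ids_one_sketch(edges, u, v):
    """Return one edge ID per (u[i], v[i]); raise if a pair is not an edge."""
    first = {}
    for eid, pair in enumerate(edges):
        first.setdefault(pair, eid)  # assumption: keep the first match
    out = []
    for pair in zip(u, v):
        if pair not in first:
            raise KeyError("(%d, %d) does not form a valid edge." % pair)
        out.append(first[pair])
    return out
```

With the `follows` edges from the docstring (`0->1`, `1->2`, `1->2`), querying the pair `(1, 2)` in the `return_uv=True` style yields edge IDs 1 and 2, matching the documented output.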
def find_edges(self, eid, etype=None):
"""Given an edge ID array with the specified type, return the source
......@@ -1681,25 +2171,17 @@ class DGLHeteroGraph(object):
>>> g.find_edges([0, 2])
(tensor([0, 1]), tensor([0, 2]))
"""
eid = utils.prepare_tensor(self, eid, 'eid')
if len(eid) == 0:
return F.tensor([], dtype=self.idtype), F.tensor([], dtype=self.idtype)
check_same_dtype(self._idtype_str, eid)
if F.is_tensor(eid):
max_eid = F.max(eid, dim=0)
else:
max_eid = np.max(eid, axis=0)
max_valid_eid = self.number_of_edges(etype) - 1
valid_ids = max_eid <= max_valid_eid
if etype is None:
assert valid_ids, \
'Expect edge ids to be in [0, ..., {:d}], got {}'.format(max_valid_eid, max_eid)
else:
assert valid_ids, 'Expect edge ids to be in [0, ..., {:d}]' \
' for type {}, got {}'.format(max_valid_eid, etype, max_eid)
eid = utils.toindex(eid, self._idtype_str)
empty = F.copy_to(F.tensor([], self.idtype), self.device)
return empty, empty
# sanity check
max_eid = F.as_scalar(F.max(eid, dim=0))
if max_eid >= self.number_of_edges(etype):
raise DGLError('Expect edge IDs to be smaller than number of edges ({}). '
'But got {}.'.format(self.number_of_edges(etype), max_eid))
src, dst, _ = self._graph.find_edges(self.get_etype_id(etype), eid)
return src.tousertensor(), dst.tousertensor()
return src, dst
def in_edges(self, v, form='uv', etype=None):
"""Return the inbound edges of the node(s) with the specified type.
......@@ -1743,17 +2225,16 @@ class DGLHeteroGraph(object):
>>> g.in_edges([0, 2], form='uv')
(tensor([0, 1]), tensor([0, 2]))
"""
check_same_dtype(self._idtype_str, v)
v = utils.toindex(v, self._idtype_str)
v = utils.prepare_tensor(self, v, 'v')
src, dst, eid = self._graph.in_edges(self.get_etype_id(etype), v)
if form == 'all':
return (src.tousertensor(), dst.tousertensor(), eid.tousertensor())
return src, dst, eid
elif form == 'uv':
return (src.tousertensor(), dst.tousertensor())
return src, dst
elif form == 'eid':
return eid.tousertensor()
return eid
else:
raise DGLError('Invalid form:', form)
raise DGLError('Invalid form: {}. Must be "all", "uv" or "eid".'.format(form))
def out_edges(self, u, form='uv', etype=None):
"""Return the outbound edges of the node(s) with the specified type.
......@@ -1795,17 +2276,16 @@ class DGLHeteroGraph(object):
>>> g.out_edges([0, 1], form='uv')
(tensor([0, 1, 1]), tensor([0, 1, 2]))
"""
check_same_dtype(self._idtype_str, u)
u = utils.toindex(u, self._idtype_str)
u = utils.prepare_tensor(self, u, 'u')
src, dst, eid = self._graph.out_edges(self.get_etype_id(etype), u)
if form == 'all':
return (src.tousertensor(), dst.tousertensor(), eid.tousertensor())
return src, dst, eid
elif form == 'uv':
return (src.tousertensor(), dst.tousertensor())
return src, dst
elif form == 'eid':
return eid.tousertensor()
return eid
else:
raise DGLError('Invalid form:', form)
raise DGLError('Invalid form: {}. Must be "all", "uv" or "eid".'.format(form))
def all_edges(self, form='uv', order=None, etype=None):
"""Return all edges with the specified type.
......@@ -1853,58 +2333,28 @@ class DGLHeteroGraph(object):
"""
src, dst, eid = self._graph.edges(self.get_etype_id(etype), order)
if form == 'all':
return (src.tousertensor(), dst.tousertensor(), eid.tousertensor())
return src, dst, eid
elif form == 'uv':
return (src.tousertensor(), dst.tousertensor())
return src, dst
elif form == 'eid':
return eid.tousertensor()
return eid
else:
raise DGLError('Invalid form:', form)
raise DGLError('Invalid form: {}. Must be "all", "uv" or "eid".'.format(form))
def in_degree(self, v, etype=None):
"""Return the in-degree of node ``v`` with edges of type ``etype``.
Parameters
----------
v : int
The node ID of destination type.
etype : str or tuple of str, optional
The edge type. Can be omitted if there is only one edge type
in the graph. (Default: None)
Returns
-------
int
The in-degree.
Examples
--------
Instantiate a heterograph.
>>> plays_g = dgl.bipartite(([0, 1, 1, 2], [0, 0, 2, 1]), 'user', 'plays', 'game')
>>> follows_g = dgl.graph(([0, 1, 1], [1, 2, 2]), 'user', 'follows')
>>> g = dgl.hetero_from_relations([plays_g, follows_g])
Query for node degree.
>>> g.in_degree(0, 'plays')
2
>>> g.in_degree(0, 'follows')
0
See Also
--------
in_degrees
DEPRECATED: Please use in_degrees
"""
return self._graph.in_degree(self.get_etype_id(etype), v)
dgl_warning("DGLGraph.in_degree is deprecated. Please use DGLGraph.in_degrees")
return self.in_degrees(v, etype)
def in_degrees(self, v=ALL, etype=None):
"""Return the in-degrees of nodes v with edges of type ``etype``.
Parameters
----------
v : list, tensor, optional.
v : int, iterable of int or tensor, optional.
The node ID array of the destination type. Default is to return the
degrees of all nodes.
etype : str or tuple of str or None, optional
......@@ -1913,9 +2363,10 @@ class DGLHeteroGraph(object):
Returns
-------
d : tensor
d : tensor or int
The in-degree array. ``d[i]`` gives the in-degree of node ``v[i]``
with edges of type ``etype``.
with edges of type ``etype``. If the argument is an integer, so will
be the return.
Examples
--------
......@@ -1930,60 +2381,27 @@ class DGLHeteroGraph(object):
Query for node degree.
>>> g.in_degrees(0, 'plays')
tensor([2])
2
>>> g.in_degrees(etype='follows')
tensor([0, 1, 2])
See Also
--------
in_degree
"""
check_same_dtype(self._idtype_str, v)
dsttype = self.to_canonical_etype(etype)[2]
etid = self.get_etype_id(etype)
_, dtid = self._graph.metagraph.find_edge(etid)
if is_all(v):
v = utils.toindex(slice(0, self._graph.number_of_nodes(dtid)), self._idtype_str)
v = self.dstnodes(dsttype)
deg = self._graph.in_degrees(etid, utils.prepare_tensor(self, v, 'v'))
if isinstance(v, numbers.Integral):
return F.as_scalar(deg)
else:
v = utils.toindex(v, self._idtype_str)
return self._graph.in_degrees(etid, v).tousertensor()
return deg
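The updated `in_degrees` returns a scalar when it is given a single integer and an array otherwise. The following sketch mirrors that contract with plain lists. The function name `in_degrees_sketch` and the use of `None` in place of DGL's `ALL` sentinel are assumptions for illustration only.

```python
import numbers


def in_degrees_sketch(dst, num_dst_nodes, v=None):
    """Count incoming edges per destination node.

    `dst` holds the destination node ID of every edge. `v=None` stands in
    for the ALL sentinel and returns the degree of every node.
    """
    counts = [0] * num_dst_nodes
    for d in dst:
        counts[d] += 1
    if v is None:
        return counts
    if isinstance(v, numbers.Integral):
        # Integer in, integer out, as described in the updated docstring.
        return counts[v]
    return [counts[i] for i in v]


# Destination IDs taken from the docstring example graphs.
plays_dst = [0, 0, 2, 1]
follows_dst = [1, 2, 2]
print(in_degrees_sketch(plays_dst, 3, 0))    # 2
print(in_degrees_sketch(follows_dst, 3))     # [0, 1, 2]
```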
def out_degree(self, u, etype=None):
"""Return the out-degree of node `u` with edges of type ``etype``.
Parameters
----------
u : int
The node ID of source type.
etype : str or tuple of str, optional
The edge type. Can be omitted if there is only one edge type
in the graph. (Default: None)
Returns
-------
int
The out-degree of node `u` with edges of type ``etype``.
Examples
--------
Instantiate a heterograph.
>>> plays_g = dgl.bipartite(([0, 1, 1, 2], [0, 0, 2, 1]), 'user', 'plays', 'game')
>>> follows_g = dgl.graph(([0, 1, 1], [1, 2, 2]), 'user', 'follows')
>>> g = dgl.hetero_from_relations([plays_g, follows_g])
Query for node degree.
>>> g.out_degree(0, 'plays')
1
>>> g.out_degree(1, 'follows')
2
See Also
--------
out_degrees
DEPRECATED: Please use out_degrees
"""
return self._graph.out_degree(self.get_etype_id(etype), u)
dgl_warning("DGLGraph.out_degree is deprecated. Please use DGLGraph.out_degrees")
return self.out_degrees(u, etype)
def out_degrees(self, u=ALL, etype=None):
"""Return the out-degrees of nodes u with edges of type ``etype``.
......@@ -2016,7 +2434,7 @@ class DGLHeteroGraph(object):
Query for node degree.
>>> g.out_degrees(0, 'plays')
tensor([1])
1
>>> g.out_degrees(etype='follows')
tensor([1, 2, 0])
......@@ -2024,35 +2442,40 @@ class DGLHeteroGraph(object):
--------
out_degree
"""
check_same_dtype(self._idtype_str, u)
srctype = self.to_canonical_etype(etype)[0]
etid = self.get_etype_id(etype)
stid, _ = self._graph.metagraph.find_edge(etid)
if is_all(u):
u = utils.toindex(slice(0, self._graph.number_of_nodes(stid)), self._idtype_str)
u = self.srcnodes(srctype)
deg = self._graph.out_degrees(etid, utils.prepare_tensor(self, u, 'u'))
if isinstance(u, numbers.Integral):
return F.as_scalar(deg)
else:
u = utils.toindex(u, self._idtype_str)
return self._graph.out_degrees(etid, u).tousertensor()
return deg
def _create_hetero_subgraph(self, sgi, induced_nodes, induced_edges):
"""Internal function to create a subgraph."""
node_frames = [
FrameRef(Frame(
self._node_frames[i][induced_nodes_of_ntype],
num_rows=len(induced_nodes_of_ntype)))
for i, induced_nodes_of_ntype in enumerate(induced_nodes)]
edge_frames = [
FrameRef(Frame(
self._edge_frames[i][induced_edges_of_etype],
num_rows=len(induced_edges_of_etype)))
for i, induced_edges_of_etype in enumerate(induced_edges)]
hsg = self.__class__(sgi.graph, self._ntypes, self._etypes, node_frames, edge_frames)
hsg.is_subgraph = True
node_frames = []
for i, ind_nodes in enumerate(induced_nodes):
subframe = self._node_frames[i][utils.toindex(ind_nodes, self._idtype_str)]
node_frames.append(FrameRef(Frame(subframe, num_rows=len(ind_nodes))))
edge_frames = []
for i, ind_edges in enumerate(induced_edges):
subframe = self._edge_frames[i][utils.toindex(ind_edges, self._idtype_str)]
edge_frames.append(FrameRef(Frame(subframe, num_rows=len(ind_edges))))
hsg = DGLHeteroGraph(sgi.graph, self._ntypes, self._etypes, node_frames, edge_frames)
for ntype, induced_nid in zip(self.ntypes, induced_nodes):
hsg.nodes[ntype].data[NID] = induced_nid.tousertensor()
ndata = hsg.nodes[ntype].data
orig_ndata = self.nodes[ntype].data
ndata[NID] = induced_nid
for key in orig_ndata:
ndata[key] = F.gather_row(orig_ndata[key], induced_nid)
for etype, induced_eid in zip(self.canonical_etypes, induced_edges):
hsg.edges[etype].data[EID] = induced_eid.tousertensor()
edata = hsg.edges[etype].data
orig_edata = self.edges[etype].data
edata[EID] = induced_eid
for key in orig_edata:
edata[key] = F.gather_row(orig_edata[key], induced_eid)
return hsg
def subgraph(self, nodes):
......@@ -2141,30 +2564,21 @@ class DGLHeteroGraph(object):
--------
edge_subgraph
"""
if self.is_block:
raise DGLError('Extracting subgraph from a block graph is not allowed.')
if not isinstance(nodes, Mapping):
assert len(self.ntypes) == 1, \
'need a dict of node type and IDs for graph with multiple node types'
nodes = {self.ntypes[0]: nodes}
for ntype, v in nodes.items():
if F.is_tensor(v):
# Check if the v is a bool tensor
if F.dtype(v) is F.data_type_dict['bool']:
assert len(F.shape(v)) == 1, \
"dgl.subgraph only supports a 1D tensor as the ID array"
nodes_idx = F.nonzero_1d(v)
nodes[ntype] = F.astype(nodes_idx,
ty=F.data_type_dict[self._idtype_str])
else:
check_same_dtype(self._idtype_str, v)
def _process_nodes(ntype, v):
if F.is_tensor(v) and F.dtype(v) == F.bool:
return F.astype(F.nonzero_1d(F.copy_to(v, self.device)), self.idtype)
else:
v = F.tensor(v, dtype=F.data_type_dict[self._idtype_str])
induced_nodes = [utils.toindex(nodes.get(ntype, []), self._idtype_str)
for ntype in self.ntypes]
return utils.prepare_tensor(self, v, 'nodes["{}"]'.format(ntype))
induced_nodes = [_process_nodes(ntype, nodes.get(ntype, [])) for ntype in self.ntypes]
sgi = self._graph.node_subgraph(induced_nodes)
induced_edges = sgi.induced_edges
return self._create_hetero_subgraph(sgi, induced_nodes, induced_edges)
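`subgraph` accepts either an ID array or a boolean mask per node type, and the `_process_nodes` helper above turns a mask into the IDs of its nonzero positions. Here is a minimal sketch of that normalization with plain Python lists. The name `to_id_list` is hypothetical, and detecting a mask by checking that every element is a `bool` is an assumption that replaces the tensor dtype check in the real code.

```python
def to_id_list(ids):
    """Normalize a node selection to a list of integer IDs.

    A non-empty list made only of bools is treated as a mask and converted
    to the positions holding True (the role F.nonzero_1d plays in the diff).
    Anything else is assumed to already be a list of IDs.
    """
    ids = list(ids)
    if ids and all(isinstance(x, bool) for x in ids):
        return [i for i, keep in enumerate(ids) if keep]
    return ids


print(to_id_list([True, False, True]))   # [0, 2]
print(to_id_list([2, 0]))                # [2, 0]
print(to_id_list([]))                    # []
```

A node type that is missing from the input dict falls back to an empty list, so it contributes no nodes to the subgraph.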
def edge_subgraph(self, edges, preserve_nodes=False):
......@@ -2261,29 +2675,23 @@ class DGLHeteroGraph(object):
--------
subgraph
"""
if self.is_block:
raise DGLError('Extracting subgraph from a block graph is not allowed.')
if not isinstance(edges, Mapping):
assert len(self.canonical_etypes) == 1, \
'need a dict of edge type and IDs for graph with multiple edge types'
edges = {self.canonical_etypes[0]: edges}
for etype, v in edges.items():
if F.is_tensor(v):
# Check if the v is a bool tensor
if F.dtype(v) is F.data_type_dict['bool']:
assert len(F.shape(v)) == 1, \
"dgl.edge_subgraph only supports a 1D tensor as the ID array"
edges_idx = F.nonzero_1d(v)
edges[etype] = F.astype(edges_idx,
ty=F.data_type_dict[self._idtype_str])
else:
check_same_dtype(self._idtype_str, v)
def _process_edges(etype, e):
if F.is_tensor(e) and F.dtype(e) == F.bool:
return F.astype(F.nonzero_1d(F.copy_to(e, self.device)), self.idtype)
else:
v = F.tensor(v, dtype=F.data_type_dict[self._idtype_str])
return utils.prepare_tensor(self, e, 'edges["{}"]'.format(etype))
edges = {self.to_canonical_etype(etype): e for etype, e in edges.items()}
induced_edges = [
utils.toindex(edges.get(canonical_etype, []), self._idtype_str)
for canonical_etype in self.canonical_etypes]
_process_edges(cetype, edges.get(cetype, []))
for cetype in self.canonical_etypes]
sgi = self._graph.edge_subgraph(induced_edges, preserve_nodes)
induced_nodes = sgi.induced_nodes
......@@ -2366,7 +2774,7 @@ class DGLHeteroGraph(object):
# num_nodes_per_type doesn't need to be int32
hgidx = heterograph_index.create_heterograph_from_relations(
metagraph, rel_graphs, utils.toindex(num_nodes_per_type, "int64"))
hg = self.__class__(hgidx, ntypes, induced_etypes,
hg = DGLHeteroGraph(hgidx, ntypes, induced_etypes,
node_frames, edge_frames)
return hg
......@@ -2443,7 +2851,7 @@ class DGLHeteroGraph(object):
# num_nodes_per_type should be int64
hgidx = heterograph_index.create_heterograph_from_relations(
metagraph, rel_graphs, utils.toindex(num_nodes_per_induced_type, "int64"))
hg = self.__class__(hgidx, induced_ntypes, induced_etypes, node_frames, edge_frames)
hg = DGLHeteroGraph(hgidx, induced_ntypes, induced_etypes, node_frames, edge_frames)
return hg
def adjacency_matrix(self, transpose=None, ctx=F.cpu(), scipy_fmt=None, etype=None):
......@@ -2499,7 +2907,7 @@ class DGLHeteroGraph(object):
if transpose is None:
dgl_warning(
"Currently adjacency_matrix() returns a matrix with destination as rows"
" by default. In 0.5 the result will have source as rows"
" by default.\n\tIn 0.5 the result will have source as rows"
" (i.e. transpose=True)")
transpose = False
......@@ -2512,6 +2920,15 @@ class DGLHeteroGraph(object):
# Alias of ``adjacency_matrix``
adj = adjacency_matrix
def adjacency_matrix_scipy(self, transpose=None, fmt='csr', return_edge_ids=None):
"""DEPRECATED: please use ``DGLGraph.adjacency_matrix(transpose, scipy_fmt=fmt)``.
"""
dgl_warning('DGLGraph.adjacency_matrix_scipy is deprecated. '
'Please replace it with:\n\n\t'
'DGLGraph.adjacency_matrix(transpose, scipy_fmt="{}").\n'.format(fmt))
return self.adjacency_matrix(transpose=transpose, scipy_fmt=fmt)
def incidence_matrix(self, typestr, ctx=F.cpu(), etype=None):
"""Return the incidence matrix representation of edges with the given
edge type.
......@@ -2743,18 +3160,23 @@ class DGLHeteroGraph(object):
if is_all(u):
num_nodes = self._graph.number_of_nodes(ntid)
else:
u = utils.toindex(u, self._idtype_str)
u = utils.prepare_tensor(self, u, 'u')
num_nodes = len(u)
for key, val in data.items():
nfeats = F.shape(val)[0]
if nfeats != num_nodes:
raise DGLError('Expect number of features to match number of nodes (len(u)).'
' Got %d and %d instead.' % (nfeats, num_nodes))
if F.context(val) != self.device:
raise DGLError('Cannot assign node feature "{}" on device {} to a graph on'
' device {}. Call DGLGraph.to() to copy the graph to the'
' same device.'.format(key, F.context(val), self.device))
if is_all(u):
for key, val in data.items():
self._node_frames[ntid][key] = val
else:
u = utils.toindex(u, self._idtype_str)
self._node_frames[ntid].update_rows(u, data, inplace=inplace)
def _get_n_repr(self, ntid, u):
......@@ -2777,6 +3199,7 @@ class DGLHeteroGraph(object):
if is_all(u):
return dict(self._node_frames[ntid])
else:
u = utils.prepare_tensor(self, u, 'u')
u = utils.toindex(u, self._idtype_str)
return self._node_frames[ntid].select_rows(u)
......@@ -2826,16 +3249,16 @@ class DGLHeteroGraph(object):
(Default: False)
"""
# parse argument
print('edges', edges)
if is_all(edges):
eid = ALL
elif isinstance(edges, tuple):
u, v = edges
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
# Rewrite u, v to handle edge broadcasting and multigraph.
_, _, eid = self._graph.edge_ids(etid, u, v)
u = utils.prepare_tensor(self, u, 'edges[0]')
v = utils.prepare_tensor(self, v, 'edges[1]')
eid = self.edge_ids(u, v, etype=self.canonical_etypes[etid])
else:
eid = utils.toindex(edges, self._idtype_str)
eid = utils.prepare_tensor(self, edges, 'edges')
# sanity check
if not utils.is_dict_like(data):
......@@ -2845,13 +3268,17 @@ class DGLHeteroGraph(object):
if is_all(eid):
num_edges = self._graph.number_of_edges(etid)
else:
eid = utils.toindex(eid, self._idtype_str)
num_edges = len(eid)
for key, val in data.items():
nfeats = F.shape(val)[0]
if nfeats != num_edges:
raise DGLError('Expect number of features to match number of edges.'
' Got %d and %d instead.' % (nfeats, num_edges))
if F.context(val) != self.device:
raise DGLError('Cannot assign edge feature "{}" on device {} to a graph on'
' device {}. Call DGLGraph.to() to copy the graph to the'
' same device.'.format(key, F.context(val), self.device))
# set
if is_all(eid):
# update column
......@@ -2859,6 +3286,7 @@ class DGLHeteroGraph(object):
self._edge_frames[etid][key] = val
else:
# update row
eid = utils.toindex(eid, self._idtype_str)
self._edge_frames[etid].update_rows(eid, data, inplace=inplace)
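Before writing edge features, the setter above now checks two things: the row count matches the number of selected edges, and the feature lives on the same device as the graph. A standalone sketch of those checks follows. `FakeTensor`, `check_edge_features` and the use of `ValueError` are hypothetical stand-ins introduced for this example; the real code uses backend tensors, `F.shape`, `F.context` and `DGLError`.

```python
from collections import namedtuple

# Hypothetical stand-in for a backend tensor: a list of rows plus a device tag.
FakeTensor = namedtuple('FakeTensor', ['rows', 'device'])


def check_edge_features(data, num_edges, graph_device):
    """Validate a dict of edge features against the edge count and graph device."""
    for key, val in data.items():
        nfeats = len(val.rows)
        if nfeats != num_edges:
            raise ValueError('Expect number of features to match number of edges.'
                             ' Got %d and %d instead.' % (nfeats, num_edges))
        if val.device != graph_device:
            raise ValueError('Cannot assign edge feature "{}" on device {} to a graph on'
                             ' device {}.'.format(key, val.device, graph_device))


# Passes silently: two rows for two edges, both on the CPU.
check_edge_features({'w': FakeTensor([[1.0], [2.0]], 'cpu')}, 2, 'cpu')
```

Failing early with the feature name and both devices in the message is more actionable than a backend error raised later during message passing.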
def _get_e_repr(self, etid, edges):
......@@ -2882,12 +3310,11 @@ class DGLHeteroGraph(object):
eid = ALL
elif isinstance(edges, tuple):
u, v = edges
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
# Rewrite u, v to handle edge broadcasting and multigraph.
_, _, eid = self._graph.edge_ids(etid, u, v)
u = utils.prepare_tensor(self, u, 'edges[0]')
v = utils.prepare_tensor(self, v, 'edges[1]')
eid = self.edge_ids(u, v, etype=self.canonical_etypes[etid])
else:
eid = utils.toindex(edges, self._idtype_str)
eid = utils.prepare_tensor(self, edges, 'edges')
if is_all(eid):
return dict(self._edge_frames[etid])
......@@ -2912,10 +3339,65 @@ class DGLHeteroGraph(object):
"""
self._edge_frames[etid].pop(key)
#################################################################
# DEPRECATED: from the old DGLGraph
#################################################################
def from_networkx(self, nx_graph, node_attrs=None, edge_attrs=None):
"""DEPRECATED: please use
``dgl.from_networkx(nx_graph, node_attrs, edge_attrs)``
which will return a new graph created from the networkx graph.
"""
raise DGLError('DGLGraph.from_networkx is deprecated. Please call the following\n\n'
'\t dgl.from_networkx(nx_graph, node_attrs, edge_attrs)\n\n'
', which creates a new DGLGraph from the networkx graph.')
def from_scipy_sparse_matrix(self, spmat, multigraph=None):
"""DEPRECATED: please use
``dgl.from_scipy(spmat)``
which will return a new graph created from the scipy matrix.
"""
raise DGLError('DGLGraph.from_scipy_sparse_matrix is deprecated. '
'Please call the following\n\n'
'\t dgl.from_scipy(spmat)\n\n'
', which creates a new DGLGraph from the scipy matrix.')
#################################################################
# Message passing
#################################################################
def register_apply_node_func(self, func):
"""Deprecated: please directly call :func:`apply_nodes` with ``func``
as argument.
"""
raise DGLError('DGLGraph.register_apply_node_func is deprecated.'
' Please directly call apply_nodes with func as the argument.')
def register_apply_edge_func(self, func):
"""Deprecated: please directly call :func:`apply_edges` with ``func``
as argument.
"""
raise DGLError('DGLGraph.register_apply_edge_func is deprecated.'
' Please directly call apply_edges with func as the argument.')
def register_message_func(self, func):
"""Deprecated: please directly call :func:`update_all` with ``func``
as argument.
"""
raise DGLError('DGLGraph.register_message_func is deprecated.'
' Please directly call update_all with func as the argument.')
def register_reduce_func(self, func):
"""Deprecated: please directly call :func:`update_all` with ``func``
as argument.
"""
raise DGLError('DGLGraph.register_reduce_func is deprecated.'
' Please directly call update_all with func as the argument.')
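The deprecations in this commit come in two styles: some methods (for example `in_degree`) warn and forward to the replacement, while others (for example the `register_*_func` methods) raise immediately. The sketch below shows both styles side by side. `GraphSketch` is a hypothetical class, and the standard `warnings` module and `RuntimeError` are assumptions standing in for `dgl_warning` and `DGLError`.

```python
import warnings


class GraphSketch:
    """Toy class illustrating soft and hard deprecation."""

    def __init__(self, degrees):
        self._degrees = degrees

    def in_degrees(self, v):
        return self._degrees[v]

    def in_degree(self, v):
        # Soft deprecation: warn, then forward to the replacement.
        warnings.warn('in_degree is deprecated. Please use in_degrees',
                      DeprecationWarning)
        return self.in_degrees(v)

    def register_message_func(self, func):
        # Hard deprecation: no behavior is preserved, so fail loudly.
        raise RuntimeError('register_message_func is deprecated.'
                           ' Please directly call update_all with func as the argument.')


g = GraphSketch([2, 1, 1])
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    print(g.in_degree(0))   # 2
```

Soft deprecation suits calls that have a drop-in replacement; raising suits APIs whose old semantics no longer exist, so silently forwarding would change behavior.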
def apply_nodes(self, func, v=ALL, ntype=None, inplace=False):
"""Apply the function on the nodes with the same type to update their
features.
......@@ -2950,13 +3432,13 @@ class DGLHeteroGraph(object):
--------
apply_edges
"""
check_same_dtype(self._idtype_str, v)
ntid = self.get_ntype_id(ntype)
if is_all(v):
v_ntype = utils.toindex(slice(0, self.number_of_nodes(ntype)), self._idtype_str)
v = F.arange(0, self.number_of_nodes(ntype), self.idtype)
else:
v_ntype = utils.toindex(v, self._idtype_str)
v = utils.prepare_tensor(self, v, 'v')
with ir.prog() as prog:
v_ntype = utils.toindex(v, self._idtype_str)
scheduler.schedule_apply_nodes(v_ntype, func, self._node_frames[ntid],
inplace=inplace, ntype=self._ntypes[ntid])
Runtime.run(prog)
......@@ -2998,23 +3480,25 @@ class DGLHeteroGraph(object):
apply_nodes
group_apply_edges
"""
check_same_dtype(self._idtype_str, edges)
etid = self.get_etype_id(etype)
stid, dtid = self._graph.metagraph.find_edge(etid)
if is_all(edges):
u, v, _ = self._graph.edges(etid, 'eid')
u, v, _ = self.edges(etype=etype, form='all')
# TODO(minjie): temporary hack
eid = utils.toindex(slice(0, self.number_of_edges(etype)), self._idtype_str)
elif isinstance(edges, tuple):
u, v = edges
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
# Rewrite u, v to handle edge broadcasting and multigraph.
u, v, eid = self._graph.edge_ids(etid, u, v)
u = utils.prepare_tensor(self, u, 'edges[0]')
v = utils.prepare_tensor(self, v, 'edges[1]')
eid = self.edge_ids(u, v, etype=etype)
else:
eid = utils.toindex(edges, self._idtype_str)
u, v, _ = self._graph.find_edges(etid, eid)
eid = utils.prepare_tensor(self, edges, 'edges')
u, v = self.find_edges(eid, etype=etype)
with ir.prog() as prog:
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
eid = utils.toindex(eid, self._idtype_str)
scheduler.schedule_apply_edges(
AdaptedHeteroGraph(self, stid, dtid, etid),
u, v, eid, func, inplace=inplace)
......@@ -3060,26 +3544,26 @@ class DGLHeteroGraph(object):
--------
apply_edges
"""
check_same_dtype(self._idtype_str, edges)
if group_by not in ('src', 'dst'):
raise DGLError("Group_by should be either src or dst")
etid = self.get_etype_id(etype)
stid, dtid = self._graph.metagraph.find_edge(etid)
if is_all(edges):
u, v, _ = self._graph.edges(etid, 'eid')
eid = utils.toindex(slice(0, self.number_of_edges(etype)), self._idtype_str)
u, v, eid = self.edges(etype=etype, form='all')
elif isinstance(edges, tuple):
u, v = edges
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
# Rewrite u, v to handle edge broadcasting and multigraph.
u, v, eid = self._graph.edge_ids(etid, u, v)
u = utils.prepare_tensor(self, u, 'edges[0]')
v = utils.prepare_tensor(self, v, 'edges[1]')
eid = self.edge_ids(u, v, etype=etype)
else:
eid = utils.toindex(edges, self._idtype_str)
u, v, _ = self._graph.find_edges(etid, eid)
eid = utils.prepare_tensor(self, edges, 'edges')
u, v = self.find_edges(eid, etype=etype)
with ir.prog() as prog:
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
eid = utils.toindex(eid, self._idtype_str)
scheduler.schedule_group_apply_edge(
AdaptedHeteroGraph(self, stid, dtid, etid),
u, v, eid,
......@@ -3090,90 +3574,12 @@ class DGLHeteroGraph(object):
def send(self, edges, message_func, etype=None):
"""Send messages along the given edges with the same edge type.
``edges`` can be any of the following types:
* ``int`` : Specify one edge using its edge id (of the given edge type).
* ``pair of int`` : Specify one edge using its endpoints (of source node type
and destination node type respectively).
* ``int iterable`` / ``tensor`` : Specify multiple edges using their edge ids.
* ``pair of int iterable`` / ``pair of tensors`` :
Specify multiple edges using their endpoints.
**Only works if the graph has one edge type.** For multiple types, use
.. code::
g['edgetype'].send(edges, message_func)
The UDF returns messages on the edges and can be later fetched in
the destination node's ``mailbox``. Receiving will consume the messages.
See :func:`recv` for example.
If multiple ``send`` calls are triggered on the same edge without ``recv``, messages
generated by the later ``send`` will overwrite previous messages.
Parameters
----------
edges : optional
Edges on which to apply ``message_func``.
message_func : callable
Message function on the edges. The function should be
an :mod:`Edge UDF <dgl.udf>`.
Notes
-----
On multigraphs, if :math:`u` and :math:`v` are specified, then the messages will be sent
along all edges between :math:`u` and :math:`v`.
Examples
--------
>>> import dgl.function as fn
>>> import torch
>>> g = dgl.graph(([0, 1], [1, 2]), 'user', 'follows')
>>> g.nodes['user'].data['h'] = torch.tensor([[0.], [1.], [2.]])
Different ways for sending messages.
>>> # Send the feature of source nodes along all edges
>>> g.send(g.edges(), fn.copy_src('h', 'm'))
>>> # Send the feature of source node along one edge specified by its id
>>> g.send(0, fn.copy_src('h', 'm'))
>>> # Send the feature of source node along one edge specified by its end points
>>> g.send((0, 1), fn.copy_src('h', 'm'))
>>> # Send the feature of source nodes along multiple edges specified by their ids
>>> g.send([0, 1], fn.copy_src('h', 'm'))
>>> # Send the feature of source nodes along multiple edges specified by their end points
>>> g.send(([0, 1], [1, 2]), fn.copy_src('h', 'm'))
DEPRECATED: please use send_and_recv, update_all.
"""
check_same_dtype(self._idtype_str, edges)
assert message_func is not None
etid = self.get_etype_id(etype)
stid, dtid = self._graph.metagraph.find_edge(etid)
if is_all(edges):
eid = utils.toindex(slice(0, self._graph.number_of_edges(etid)), self._idtype_str)
u, v, _ = self._graph.edges(etid, 'eid')
elif isinstance(edges, tuple):
u, v = edges
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
# Rewrite u, v to handle edge broadcasting and multigraph.
u, v, eid = self._graph.edge_ids(etid, u, v)
else:
eid = utils.toindex(edges, self._idtype_str)
u, v, _ = self._graph.find_edges(etid, eid)
if len(eid) == 0:
# no edge to be triggered
return
with ir.prog() as prog:
scheduler.schedule_send(
AdaptedHeteroGraph(self, stid, dtid, etid),
u, v, eid,
message_func)
Runtime.run(prog)
raise DGLError('DGLGraph.send is deprecated. As a replacement, use DGLGraph.apply_edges\n'
' API to compute messages as edge data. Then use DGLGraph.send_and_recv\n'
' and set the message function as dgl.function.copy_e to conduct message\n'
' aggregation.')
def recv(self,
v,
......@@ -3183,179 +3589,22 @@ class DGLHeteroGraph(object):
inplace=False):
r"""Receive and reduce incoming messages and update the features of node(s) :math:`v`.
It calculates:
.. math::
h_v^{new} = \sigma(f(\{m_{uv} | u\in\mathcal{N}_{t}(v)\}))
where :math:`\mathcal{N}_t(v)` defines the predecessors of node(s) :math:`v` connected by
edges of type :math:`t`, and :math:`m_{uv}` is the message on edge :math:`(u,v)`.
* ``reduce_func`` specifies :math:`f`, e.g. summation or average.
* ``apply_node_func`` specifies :math:`\sigma`, e.g. ReLU activation.
Other notes:
* `reduce_func` will be skipped for nodes with no incoming message.
* If all ``v`` have no incoming message, this will downgrade to an :func:`apply_nodes`.
* If some ``v`` have no incoming message, their new feature value will be calculated
by the column initializer (see :func:`set_n_initializer`). The feature shapes and
dtypes will be inferred.
* The node features will be updated by the result of the ``reduce_func``.
* Messages are consumed once received.
* The provided UDF may be called multiple times so it is recommended to provide
function with no side effect.
Parameters
----------
v : int, container or tensor
The node(s) to be updated.
reduce_func : callable
Reduce function on the node. The function should be
a :mod:`Node UDF <dgl.udf>`.
apply_node_func : callable
Apply function on the nodes. The function should be
a :mod:`Node UDF <dgl.udf>`. (Default: None)
etype : str, optional
The edge type. Can be omitted if there is only one edge type
in the graph. (Default: None)
inplace: bool, optional
If True, update will be done in place, but autograd will break.
(Default: False)
Examples
--------
>>> import dgl
>>> import dgl.function as fn
>>> import torch
Instantiate a heterograph.
>>> follows_g = dgl.graph([(0, 1), (1, 2)], 'user', 'follows')
>>> plays_g = dgl.bipartite([(0, 0), (1, 0), (1, 1), (2, 1)], 'user', 'plays', 'game')
>>> g = dgl.hetero_from_relations([follows_g, plays_g])
>>> g.nodes['user'].data['h'] = torch.tensor([[0.], [1.], [2.]])
Send and receive.
>>> g.send(g['follows'].edges(), fn.copy_src('h', 'm'), etype='follows')
>>> g.recv(g.nodes('user'), fn.sum('m', 'h'), etype='follows')
>>> g.nodes['user'].data['h']
tensor([[0.],
[0.],
[1.]])
DEPRECATED: please use send_and_recv, update_all.
"""
check_same_dtype(self._idtype_str, v)
etid = self.get_etype_id(etype)
stid, dtid = self._graph.metagraph.find_edge(etid)
if is_all(v):
v = F.arange(0, self.number_of_nodes(dtid), self._idtype_str)
elif isinstance(v, int):
v = [v]
v = utils.toindex(v, dtype=self._idtype_str)
if len(v) == 0:
# no vertex to be triggered.
return
with ir.prog() as prog:
scheduler.schedule_recv(AdaptedHeteroGraph(self, stid, dtid, etid),
v, reduce_func, apply_node_func,
inplace=inplace)
Runtime.run(prog)
raise DGLError('DGLGraph.recv is deprecated. As a replacement, use DGLGraph.apply_edges\n'
' API to compute messages as edge data. Then use DGLGraph.send_and_recv\n'
' and set the message function as dgl.function.copy_e to conduct message\n'
' aggregation.')
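The error messages for `send` and `recv` describe a two-step replacement: compute messages as edge data, then aggregate them at the destination nodes. The sketch below illustrates that flow in plain Python on the graph from the removed docstring example (edges 0 -> 1 and 1 -> 2). The function names are hypothetical, scalars replace feature tensors, and nodes with no incoming message are simply given 0.0, which is a simplification of how DGL handles them.

```python
def compute_messages(src, h):
    """Step 1: store a message on every edge (here, a copy of the source feature)."""
    return [h[u] for u in src]


def aggregate_sum(dst, messages, num_nodes):
    """Step 2: sum the edge messages at each destination node."""
    out = [0.0] * num_nodes
    for d, msg in zip(dst, messages):
        out[d] += msg
    return out


src, dst = [0, 1], [1, 2]
h = [0.0, 1.0, 2.0]
m = compute_messages(src, h)
print(aggregate_sum(dst, m, 3))   # [0.0, 0.0, 1.0]
```

Keeping messages as ordinary edge data removes the need for a separate mailbox that persists between two calls.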
def multi_recv(self, v, reducer_dict, cross_reducer, apply_node_func=None, inplace=False):
r"""Receive messages from multiple edge types and perform aggregation.
It calculates:
.. math::
\begin{align}
h_{v, t}^{new} &= f\left(\left\{m_{uv} | u\in\mathcal{N}_{t}(v)\right\}\right)\\
h_v^{new} &= \sigma\left(g\left(\left\{h_{v, t}^{new} | t\in T_e\right\}\right)\right)
\end{align}
* ``reducer_dict`` is a dictionary mapping edge type (str or tuple of str) to
reduce functions :math:`f` of each type.
* ``cross_reducer`` specifies :math:`g`.
* ``apply_node_func`` specifies :math:`\sigma`.
Parameters
----------
v : int, container or tensor
The node(s) to be updated.
reducer_dict : dict of callable
Mapping edge type (str or tuple of str) to reduce function (:mod:`Node UDF <dgl.udf>`).
cross_reducer : str
Cross type reducer. One of ``"sum"``, ``"min"``, ``"max"``, ``"mean"``, ``"stack"``.
apply_node_func : callable
Apply function on the nodes. The function should be
a :mod:`Node UDF <dgl.udf>`. (Default: None)
inplace: bool, optional
If True, update will be done in place, but autograd will break.
(Default: False)
Examples
--------
>>> import dgl
>>> import dgl.function as fn
>>> import torch
Instantiate a heterograph.
>>> g1 = dgl.graph(([0], [1]), 'user', 'follows')
>>> g2 = dgl.bipartite(([0], [1]), 'game', 'attracts', 'user')
>>> g = dgl.hetero_from_relations([g1, g2])
>>> g.nodes['user'].data['h'] = torch.tensor([[1.], [2.]])
>>> g.nodes['game'].data['h'] = torch.tensor([[1.]])
Send and receive.
>>> g.send(g['follows'].edges(), fn.copy_src('h', 'm'), etype='follows')
>>> g.send(g['attracts'].edges(), fn.copy_src('h', 'm'), etype='attracts')
>>> g.multi_recv(g.nodes('user'), {'follows': fn.sum('m', 'h'),
>>> 'attracts': fn.sum('m', 'h')}, "sum")
>>> g.nodes['user'].data['h']
tensor([[0.],
[2.]])
DEPRECATED: please use multi_send_and_recv, multi_update_all.
"""
check_same_dtype(self._idtype_str, v)
# infer receive node type
ntype = infer_ntype_from_dict(self, reducer_dict)
ntid = self.get_ntype_id_from_dst(ntype)
if is_all(v):
v = F.arange(0, self.number_of_nodes(ntid), self._idtype_str)
elif isinstance(v, int):
v = [v]
v = utils.toindex(v, self._idtype_str)
if len(v) == 0:
return
# TODO(minjie): currently loop over each edge type and reuse the old schedule.
# Should replace it with fused kernel.
all_out = []
merge_order = []
with ir.prog() as prog:
for ety, args in reducer_dict.items():
outframe = FrameRef(frame_like(self._node_frames[ntid]._frame))
args = pad_tuple(args, 2)
if args is None:
raise DGLError('Invalid per-type arguments. Should be either '
'(1) reduce_func or (2) (reduce_func, apply_node_func)')
rfunc, afunc = args
etid = self.get_etype_id(ety)
stid, dtid = self._graph.metagraph.find_edge(etid)
scheduler.schedule_recv(AdaptedHeteroGraph(self, stid, dtid, etid),
v, rfunc, afunc,
inplace=inplace, outframe=outframe)
all_out.append(outframe)
merge_order.append(etid) # use edge type id as merge order hint
Runtime.run(prog)
# merge by cross_reducer
self._node_frames[ntid].update(merge_frames(all_out, cross_reducer, merge_order))
# apply
if apply_node_func is not None:
self.apply_nodes(apply_node_func, v, ntype, inplace)
raise DGLError('DGLGraph.multi_recv is deprecated. As a replacement,\n'
' use DGLGraph.apply_edges API to compute messages as edge data.\n'
' Then use DGLGraph.multi_send_and_recv and set the message function\n'
' as dgl.function.copy_e to conduct message aggregation.')
def send_and_recv(self,
edges,
......@@ -3407,24 +3656,9 @@ class DGLHeteroGraph(object):
>>> import dgl.function as fn
>>> import torch
Instantiate a heterograph.
>>> follows_g = dgl.graph(([0, 1], [1, 2]), 'user', 'follows')
>>> plays_g = dgl.bipartite(([0, 1, 1, 2], [0, 0, 1, 1]), 'user', 'plays', 'game')
>>> g = dgl.hetero_from_relations([follows_g, plays_g])
Trigger "send" and "receive" separately.
>>> g.nodes['user'].data['h'] = torch.tensor([[0.], [1.], [2.]])
>>> g.send(g['follows'].edges(), fn.copy_src('h', 'm'), etype='follows')
>>> g.recv(g.nodes('user'), fn.sum('m', 'h'), etype='follows')
>>> g.nodes['user'].data['h']
tensor([[0.],
[0.],
[1.]])
Trigger "send" and "receive" in one call.
>>> g.nodes['user'].data['h'] = torch.tensor([[0.], [1.], [2.]])
>>> g.send_and_recv(g['follows'].edges(), fn.copy_src('h', 'm'),
>>> fn.sum('m', 'h'), etype='follows')
......@@ -3438,19 +3672,21 @@ class DGLHeteroGraph(object):
if isinstance(edges, tuple):
u, v = edges
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
# Rewrite u, v to handle edge broadcasting and multigraph.
u, v, eid = self._graph.edge_ids(etid, u, v)
u = utils.prepare_tensor(self, u, 'edges[0]')
v = utils.prepare_tensor(self, v, 'edges[1]')
eid = self.edge_ids(u, v, etype=etype)
else:
eid = utils.toindex(edges, self._idtype_str)
u, v, _ = self._graph.find_edges(etid, eid)
eid = utils.prepare_tensor(self, edges, 'edges')
u, v = self.find_edges(eid, etype=etype)
if len(u) == 0:
# no edges to be triggered
return
with ir.prog() as prog:
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
eid = utils.toindex(eid, self._idtype_str)
scheduler.schedule_snr(AdaptedHeteroGraph(self, stid, dtid, etid),
(u, v, eid),
message_func, reduce_func, apply_node_func,
......@@ -3551,17 +3787,19 @@ class DGLHeteroGraph(object):
edges, mfunc, rfunc, afunc = args
if isinstance(edges, tuple):
u, v = edges
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
# Rewrite u, v to handle edge broadcasting and multigraph.
u, v, eid = self._graph.edge_ids(etid, u, v)
u = utils.prepare_tensor(self, u, 'edges[0]')
v = utils.prepare_tensor(self, v, 'edges[1]')
eid = self.edge_ids(u, v, etype=etype)
else:
eid = utils.toindex(edges, self._idtype_str)
u, v, _ = self._graph.find_edges(etid, eid)
eid = utils.prepare_tensor(self, edges, 'edges')
u, v = self.find_edges(eid, etype=etype)
all_vs.append(v)
if len(u) == 0:
# no edges to be triggered
continue
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
eid = utils.toindex(eid, self._idtype_str)
scheduler.schedule_snr(AdaptedHeteroGraph(self, stid, dtid, etid),
(u, v, eid),
mfunc, rfunc, afunc,
......@@ -3573,7 +3811,7 @@ class DGLHeteroGraph(object):
self._node_frames[dtid].update(merge_frames(all_out, cross_reducer, merge_order))
# apply
if apply_node_func is not None:
dstnodes = F.unique(F.cat([x.tousertensor() for x in all_vs], 0))
dstnodes = F.unique(F.cat(all_vs, 0))
self.apply_nodes(apply_node_func, dstnodes, ntype, inplace)
def pull(self,
......@@ -3646,15 +3884,15 @@ class DGLHeteroGraph(object):
[1.],
[1.]])
"""
check_same_dtype(self._idtype_str, v)
# only one type of edges
etid = self.get_etype_id(etype)
stid, dtid = self._graph.metagraph.find_edge(etid)
v = utils.toindex(v, self._idtype_str)
v = utils.prepare_tensor(self, v, 'v')
if len(v) == 0:
return
with ir.prog() as prog:
v = utils.toindex(v, self._idtype_str)
scheduler.schedule_pull(AdaptedHeteroGraph(self, stid, dtid, etid),
v,
message_func, reduce_func, apply_node_func,
......@@ -3720,8 +3958,7 @@ class DGLHeteroGraph(object):
tensor([[0.],
[3.]])
"""
check_same_dtype(self._idtype_str, v)
v = utils.toindex(v, self._idtype_str)
v = utils.prepare_tensor(self, v, 'v')
if len(v) == 0:
return
# infer receive node type
......@@ -3741,6 +3978,7 @@ class DGLHeteroGraph(object):
raise DGLError('Invalid per-type arguments. Should be '
'(msg_func, reduce_func, [apply_node_func])')
mfunc, rfunc, afunc = args
v = utils.toindex(v, self._idtype_str)
scheduler.schedule_pull(AdaptedHeteroGraph(self, stid, dtid, etid),
v,
mfunc, rfunc, afunc,
......@@ -3813,15 +4051,15 @@ class DGLHeteroGraph(object):
[0.],
[0.]])
"""
check_same_dtype(self._idtype_str, u)
# only one type of edges
etid = self.get_etype_id(etype)
stid, dtid = self._graph.metagraph.find_edge(etid)
u = utils.toindex(u, self._idtype_str)
u = utils.prepare_tensor(self, u, 'u')
if len(u) == 0:
return
with ir.prog() as prog:
u = utils.toindex(u, self._idtype_str)
scheduler.schedule_push(AdaptedHeteroGraph(self, stid, dtid, etid),
u,
message_func, reduce_func, apply_node_func,
......@@ -4099,63 +4337,6 @@ class DGLHeteroGraph(object):
# Misc
#################################################################
def to_networkx(self, node_attrs=None, edge_attrs=None):
"""Convert this graph to networkx graph.
The edge id will be saved as the 'id' edge attribute.
Parameters
----------
node_attrs : iterable of str, optional
The node attributes to be copied.
edge_attrs : iterable of str, optional
The edge attributes to be copied.
Returns
-------
networkx.DiGraph
The nx graph
Examples
--------
.. note:: Here we use pytorch syntax for demo. The general idea applies
to other frameworks with minor syntax change (e.g. replace
``torch.tensor`` with ``mxnet.ndarray``).
>>> import torch as th
>>> g = DGLGraph()
>>> g.add_nodes(5, {'n1': th.randn(5, 10)})
>>> g.add_edges([0,1,3,4], [2,4,0,3], {'e1': th.randn(4, 6)})
>>> nxg = g.to_networkx(node_attrs=['n1'], edge_attrs=['e1'])
See Also
--------
dgl.to_networkx
"""
# TODO(minjie): multi-type support
assert len(self.ntypes) == 1
assert len(self.etypes) == 1
src, dst = self.edges()
src = F.asnumpy(src)
dst = F.asnumpy(dst)
# xiangsx: Always treat graph as multigraph
nx_graph = nx.MultiDiGraph()
nx_graph.add_nodes_from(range(self.number_of_nodes()))
for eid, (u, v) in enumerate(zip(src, dst)):
nx_graph.add_edge(u, v, id=eid)
if node_attrs is not None:
for nid, attr in nx_graph.nodes(data=True):
feat_dict = self._get_n_repr(0, nid)
attr.update({key: F.squeeze(feat_dict[key], 0) for key in node_attrs})
if edge_attrs is not None:
for _, _, attr in nx_graph.edges(data=True):
eid = attr['id']
feat_dict = self._get_e_repr(0, eid)
attr.update({key: F.squeeze(feat_dict[key], 0) for key in edge_attrs})
return nx_graph
def filter_nodes(self, predicate, nodes=ALL, ntype=None):
"""Return a tensor of node IDs with the given node type that satisfy
the given predicate.
......@@ -4189,22 +4370,15 @@ class DGLHeteroGraph(object):
>>> g.filter_nodes(lambda nodes: (nodes.data['h'] == 1.).squeeze(1), ntype='user')
tensor([1, 2])
"""
check_same_dtype(self._idtype_str, nodes)
ntid = self.get_ntype_id(ntype)
if is_all(nodes):
v = utils.toindex(slice(0, self._graph.number_of_nodes(ntid)), self._idtype_str)
else:
v = utils.toindex(nodes, self._idtype_str)
n_repr = self._get_n_repr(ntid, v)
nbatch = NodeBatch(v, n_repr, ntype=self.ntypes[ntid])
n_mask = F.copy_to(predicate(nbatch), F.cpu())
if is_all(nodes):
return F.nonzero_1d(n_mask)
else:
nodes = F.tensor(nodes)
return F.boolean_mask(nodes, n_mask)
with self.local_scope():
self.apply_nodes(lambda nbatch: {'_mask' : predicate(nbatch)}, nodes, ntype)
ntype = self.ntypes[0] if ntype is None else ntype
mask = self.nodes[ntype].data['_mask']
if is_all(nodes):
return F.nonzero_1d(mask)
else:
v = utils.prepare_tensor(self, nodes, 'nodes')
return F.boolean_mask(v, mask[v])
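The rewritten `filter_nodes` first stores the predicate result as a per-node mask, then either returns the nonzero positions (all nodes) or masks the requested IDs. A minimal plain-Python sketch of that selection step, using lists in place of backend tensors; `filter_ids` and its arguments are hypothetical names for illustration, not DGL API:

```python
def filter_ids(mask, ids=None):
    """Select the IDs whose mask entry is true.

    mask : list of bool, one entry per node (hypothetical stand-in for the
           '_mask' feature written by the predicate).
    ids  : optional list of node IDs to restrict to; None means all nodes.
    """
    if ids is None:
        # Same idea as F.nonzero_1d(mask): positions where the mask is set.
        return [i for i, keep in enumerate(mask) if keep]
    # Same idea as F.boolean_mask(v, mask[v]): keep requested IDs that pass.
    return [i for i in ids if mask[i]]


mask = [False, True, True, False]
all_hits = filter_ids(mask)
subset_hits = filter_ids(mask, [0, 2])
print(all_hits, subset_hits)
```

Looking up `mask[v]` rather than reusing the predicate output directly keeps the result aligned with the requested IDs even when they are given in arbitrary order.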
def filter_edges(self, predicate, edges=ALL, etype=None):
"""Return a tensor of edge IDs with the given edge type that satisfy
......@@ -4240,34 +4414,24 @@ class DGLHeteroGraph(object):
>>> g.filter_edges(lambda edges: (edges.data['h'] == 1.).squeeze(1), etype='follows')
tensor([1, 2])
"""
check_same_dtype(self._idtype_str, edges)
etid = self.get_etype_id(etype)
stid, dtid = self._graph.metagraph.find_edge(etid)
if is_all(edges):
u, v, _ = self._graph.edges(etid, 'eid')
eid = utils.toindex(slice(0, self._graph.number_of_edges(etid)), self._idtype_str)
elif isinstance(edges, tuple):
u, v = edges
u = utils.toindex(u, self._idtype_str)
v = utils.toindex(v, self._idtype_str)
# Rewrite u, v to handle edge broadcasting and multigraph.
u, v, eid = self._graph.edge_ids(etid, u, v)
else:
eid = utils.toindex(edges, self._idtype_str)
u, v, _ = self._graph.find_edges(etid, eid)
src_data = self._get_n_repr(stid, u)
edge_data = self._get_e_repr(etid, eid)
dst_data = self._get_n_repr(dtid, v)
ebatch = EdgeBatch((u, v, eid), src_data, edge_data, dst_data,
canonical_etype=self.canonical_etypes[etid])
e_mask = F.copy_to(predicate(ebatch), F.cpu())
with self.local_scope():
self.apply_edges(lambda ebatch: {'_mask' : predicate(ebatch)}, edges, etype)
etype = self.canonical_etypes[0] if etype is None else etype
mask = self.edges[etype].data['_mask']
if is_all(edges):
return F.nonzero_1d(mask)
else:
if isinstance(edges, tuple):
e = self.edge_ids(edges[0], edges[1], etype=etype)
else:
e = utils.prepare_tensor(self, edges, 'edges')
return F.boolean_mask(e, mask[e])
if is_all(edges):
return F.nonzero_1d(e_mask)
else:
edges = F.tensor(edges)
return F.boolean_mask(edges, e_mask)
def readonly(self, readonly_state=True):
"""Deprecated: DGLGraph will always be mutable."""
dgl_warning('DGLGraph.readonly is deprecated in v0.5.\n'
'DGLGraph now always supports mutable operations like add_nodes'
' and add_edges.')
@property
def device(self):
......@@ -4290,12 +4454,12 @@ class DGLHeteroGraph(object):
"""
return F.to_backend_ctx(self._graph.ctx)
def to(self, ctx, **kwargs): # pylint: disable=invalid-name
"""Move ndata, edata and graph structure to the targeted device context (cpu/gpu).
def to(self, device, **kwargs): # pylint: disable=invalid-name
"""Move ndata, edata and graph structure to the targeted device (cpu/gpu).
Parameters
----------
ctx : Framework-specific device context object
device : Framework-specific device context object
The context to move data to.
kwargs : Key-word arguments.
Key-word arguments fed to the framework copy function.
......@@ -4319,19 +4483,86 @@ class DGLHeteroGraph(object):
>>> print(g.device)
device(type='cpu')
"""
if device is None or self.device == device:
return utils.to_int32_graph_if_on_gpu(self)
ret = copy.copy(self)
# 1. Copy graph structure
ret._graph = self._graph.copy_to(utils.to_dgl_context(device))
# 2. Copy features
# TODO(minjie): handle initializer
new_nframes = []
for nframe in self._node_frames:
new_feats = {k : F.copy_to(feat, ctx) for k, feat in nframe.items()}
new_feats = {k : F.copy_to(feat, device, **kwargs) for k, feat in nframe.items()}
new_nframes.append(FrameRef(Frame(new_feats, num_rows=nframe.num_rows)))
ret._node_frames = new_nframes
new_eframes = []
for eframe in self._edge_frames:
new_feats = {k : F.copy_to(feat, ctx) for k, feat in eframe.items()}
new_feats = {k : F.copy_to(feat, device, **kwargs) for k, feat in eframe.items()}
new_eframes.append(FrameRef(Frame(new_feats, num_rows=eframe.num_rows)))
# TODO(minjie): replace the following line with the commented one to enable GPU graph.
new_gidx = self._graph
#new_gidx = self._graph.copy_to(utils.to_dgl_context(ctx))
return self.__class__(new_gidx, self.ntypes, self.etypes,
new_nframes, new_eframes)
ret._edge_frames = new_eframes
# 3. Copy misc info
if self._batch_num_nodes is not None:
new_bnn = {k : F.copy_to(num, device, **kwargs)
for k, num in self._batch_num_nodes.items()}
ret._batch_num_nodes = new_bnn
if self._batch_num_edges is not None:
new_bne = {k : F.copy_to(num, device, **kwargs)
for k, num in self._batch_num_edges.items()}
ret._batch_num_edges = new_bne
ret = utils.to_int32_graph_if_on_gpu(ret)
return ret
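The new `to` follows a "shallow copy, then replace what moved" pattern: `copy.copy(self)` keeps the metadata, and only the structure and feature containers are rebuilt for the target device. A rough sketch of that pattern with a toy class; `ToyGraph`, its fields, and the `(device, values)` feature encoding are assumptions made for illustration and do not mirror DGL internals:

```python
import copy


class ToyGraph:
    """Hypothetical stand-in for a graph with a structure and feature frames."""

    def __init__(self, device, structure, frames):
        self.device = device
        self.structure = structure
        self.frames = frames  # list of dicts: feature name -> (device, values)

    def to(self, device):
        if device is None or self.device == device:
            return self  # nothing to move
        ret = copy.copy(self)  # shallow copy: attributes are shared at first
        ret.device = device
        # Rebuild the frames so the copy stops sharing them with the original.
        ret.frames = [{k: (device, vals) for k, (_, vals) in fr.items()}
                      for fr in self.frames]
        return ret


g = ToyGraph('cpu', structure=[(0, 1)], frames=[{'h': ('cpu', [1.0, 2.0])}])
g2 = g.to('gpu')
print(g.frames[0]['h'][0], g2.frames[0]['h'][0])
```

Because the copy is shallow, any attribute that is not explicitly reassigned on `ret` still refers to the same object as the original, which is why the method reassigns each container it changes.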
def cpu(self):
"""Return a new copy of this graph on CPU.
Returns
-------
DGLHeteroGraph
Graph on CPU.
See Also
--------
to
"""
return self.to(F.cpu())
def clone(self):
"""Return a heterograph object that is a clone of current graph.
Returns
-------
DGLHeteroGraph
The graph object that is a clone of current graph.
"""
# XXX(minjie): Do a shallow copy first to clone some internal metagraph information.
# Not a beautiful solution though.
ret = copy.copy(self)
# Clone the graph structure
meta_edges = []
for s_ntype, _, d_ntype in self.canonical_etypes:
meta_edges.append((self.get_ntype_id(s_ntype), self.get_ntype_id(d_ntype)))
metagraph = graph_index.from_edge_list(meta_edges, True)
# rebuild graph idx
num_nodes_per_type = [self.number_of_nodes(c_ntype) for c_ntype in self.ntypes]
relation_graphs = [self._graph.get_relation_graph(self.get_etype_id(c_etype))
for c_etype in self.canonical_etypes]
ret._graph = heterograph_index.create_heterograph_from_relations(
metagraph, relation_graphs, utils.toindex(num_nodes_per_type, "int64"))
# Clone the frames
ret._node_frames = [fr.clone() for fr in self._node_frames]
ret._edge_frames = [fr.clone() for fr in self._edge_frames]
return ret
def local_var(self):
"""Return a heterograph object that can be used in a local function scope.
......@@ -4391,11 +4622,9 @@ class DGLHeteroGraph(object):
--------
local_var
"""
local_node_frames = [fr.clone() for fr in self._node_frames]
local_edge_frames = [fr.clone() for fr in self._edge_frames]
ret = copy.copy(self)
ret._node_frames = local_node_frames
ret._edge_frames = local_edge_frames
ret._node_frames = [fr.clone() for fr in self._node_frames]
ret._edge_frames = [fr.clone() for fr in self._edge_frames]
return ret
@contextmanager
......@@ -4451,215 +4680,158 @@ class DGLHeteroGraph(object):
self._node_frames = old_nframes
self._edge_frames = old_eframes
def is_homograph(self):
def is_homogeneous(self):
"""Return if the graph is homogeneous."""
return len(self.ntypes) == 1 and len(self.etypes) == 1
def format_in_use(self, etype=None):
"""Return the sparse formats in use of the given edge/relation type.
def formats(self, formats=None):
r"""Get a cloned graph with the specified sparse format(s) or query
for the usage status of sparse formats.
The API copies both the graph structure and the features.
If the input graph has multiple edge types, they will have the same
sparse format.
Parameters
----------
etype : str or tuple of str, optional
The edge type. Can be omitted if there is only one edge type
in the graph.
formats : str or list of str or None
* If formats is None, return the usage status of sparse formats
* Otherwise, it can be ``'coo'``/``'csr'``/``'csc'`` or a sublist of
them, specifying the sparse formats to use.
Returns
-------
list of str
Return all the formats currently in use (could be multiple).
dict or DGLGraph
* If formats is None, the result will be a dict recording the usage
status of sparse formats.
* Otherwise, a DGLGraph will be returned, which is a clone of the
original graph with the specified sparse format(s) ``formats``.
Examples
--------
For graph with only one edge type.
>>> g = dgl.graph(([0, 1], [1, 2]), 'user', 'follows', restrict_format='csr')
>>> g.format_in_use()
['csr']
For a graph with multiple types.
The following example uses PyTorch backend.
>>> g = dgl.heterograph({
... ('user', 'plays', 'game'): ([0, 1, 1, 2], [0, 0, 1, 1]),
... ('developer', 'develops', 'game'): ([0, 1], [0, 1]),
... }, restrict_format='any')
>>> g.format_in_use('develops')
['coo']
>>> spmat = g['develops'].adjacency_matrix(
... transpose=True, scipy_fmt='csr') // Create CSR representation.
>>> g.format_in_use('develops')
['coo', 'csr']
>>> import dgl
>>> import torch
which is equivalent to:
**Homographs or Heterographs with A Single Edge Type**
>>> g = dgl.graph([(0, 2), (0, 3), (1, 2)])
>>> g.ndata['h'] = torch.ones(4, 1)
>>> # Check status of format usage
>>> g.formats()
{'created': ['coo'], 'not created': ['csr', 'csc']}
>>> # Get a clone of the graph with 'csr' format
>>> csr_g = g.formats('csr')
>>> # Only allowed formats will be displayed in the status query
>>> csr_g.formats()
{'created': ['csr'], 'not created': []}
>>> # Features are copied as well
>>> csr_g.ndata['h']
tensor([[1.],
[1.],
[1.],
[1.]])
>>> g['develops'].restrict_format()
['coo', 'csr']
**Heterographs with Multiple Edge Types**
See Also
--------
restrict_format
request_format
to_format
>>> g = dgl.heterograph({
... ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
... torch.tensor([0, 0, 1, 1])),
... ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
... torch.tensor([0, 1]))
... })
>>> g.formats()
{'created': ['coo'], 'not created': ['csr', 'csc']}
>>> # Get a clone of the graph with 'csr' format
>>> csr_g = g.formats('csr')
>>> # Only allowed formats will be displayed in the status query
>>> csr_g.formats()
{'created': ['csr'], 'not created': []}
"""
return self._graph.format_in_use(self.get_etype_id(etype))
def restrict_format(self, etype=None):
"""Return the allowed sparse formats of the given edge/relation type.
if formats is None:
# Return the format information
return self._graph.formats()
else:
# Convert the graph to use another format
ret = copy.copy(self)
ret._graph = self._graph.formats(formats)
return ret
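`formats` doubles as a query (no argument) and a conversion (argument given). The sketch below imitates that two-mode behaviour with a toy index so the docstring outputs can be traced by hand. `ToyIndex` is hypothetical, and the rule that a restricted clone materializes its first allowed format when none survives is an assumption chosen to match the docstring example, not a statement about the C++ implementation:

```python
import copy

ALL_FORMATS = ('coo', 'csr', 'csc')


class ToyIndex:
    """Hypothetical stand-in for the graph index; not DGL code."""

    def __init__(self, allowed=ALL_FORMATS, created=('coo',)):
        self.allowed = tuple(allowed)
        self.created = tuple(created)

    def formats(self, formats=None):
        if formats is None:
            # Query mode: report which allowed formats exist so far.
            return {
                'created': [f for f in self.allowed if f in self.created],
                'not created': [f for f in self.allowed if f not in self.created],
            }
        # Conversion mode: accept a single name or a list of names.
        if isinstance(formats, str):
            formats = [formats]
        unknown = [f for f in formats if f not in ALL_FORMATS]
        if unknown:
            raise ValueError('unknown sparse format(s): {}'.format(unknown))
        ret = copy.copy(self)
        ret.allowed = tuple(f for f in ALL_FORMATS if f in formats)
        kept = tuple(f for f in ret.allowed if f in self.created)
        # Assumption: if no existing format survives, build the first allowed one.
        ret.created = kept or ret.allowed[:1]
        return ret


g = ToyIndex()
csr_g = g.formats('csr')
print(g.formats())
print(csr_g.formats())
```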
Parameters
----------
etype : str or tuple of str, optional
The edge type. Can be omitted if there is only one edge type
in the graph.
def create_format_(self):
r"""Create all sparse matrices allowed for the graph.
Returns
-------
str : ``'any'``, ``'coo'``, ``'csr'``, or ``'csc'``
``'any'`` indicates all sparse formats are allowed in .
By default, we create sparse matrices for a graph only when necessary.
In some cases we may want to create them immediately (e.g. in a
multi-process data loader), which can be achieved via this API.
Examples
--------
For graph with only one edge type.
>>> g = dgl.graph([(0, 1), (1, 2)], 'user', 'follows', restrict_format='csr')
>>> g.restrict_format()
'csr'
For a graph with multiple types.
>>> g = dgl.heterograph({
... ('user', 'plays', 'game'): ([0, 1, 1, 2], [0, 0, 1, 1]),
... ('developer', 'develops', 'game'): ([0, 1], [0, 1]),
... }, restrict_format='any')
>>> g.restrict_format('develops')
'any'
which is equivalent to:
>>> g['develops'].restrict_format()
'any'
See Also
--------
format_in_use
request_format
to_format
"""
return self._graph.restrict_format(self.get_etype_id(etype))
The following example uses PyTorch backend.
def request_format(self, sparse_format, etype=None):
"""Create a sparse matrix representation in given format immediately.
>>> import dgl
>>> import torch
When the restrict format of the given edge type is ``any``, all formats of
sparse matrix representation are created in demand. In some cases user may
want a sparse matrix representation to be created immediately (e.g. in a
multi-process data loader), this API is designed for such purpose.
**Homographs or Heterographs with A Single Edge Type**
Parameters
----------
sparse_format : str
``'coo'``, ``'csr'``, or ``'csc'``
etype : str or tuple of str, optional
The edge type. Can be omitted if there is only one edge type
in the graph.
Examples
--------
For graph with only one edge type.
>>> g = dgl.graph([(0, 1), (1, 2)], 'user', 'follows', restrict_format='any')
>>> g.format_in_use()
['coo']
>>> g.request_format('csr')
>>> g.format_in_use()
['coo', 'csr']
>>> g = dgl.graph([(0, 2), (0, 3), (1, 2)])
>>> g.formats()
{'created': ['coo'], 'not created': ['csr', 'csc']}
>>> g.create_format_()
>>> g.formats()
{'created': ['coo', 'csr', 'csc'], 'not created': []}
For a graph with multiple types.
**Heterographs with Multiple Edge Types**
>>> g = dgl.heterograph({
... ('user', 'plays', 'game'): ([0, 1, 1, 2], [0, 0, 1, 1]),
... ('developer', 'develops', 'game'): ([0, 1], [0, 1]),
... }, restrict_format='any')
>>> g.format_in_use('develops')
['coo']
>>> g.request_format('csc', etype='develops')
>>> g.format_in_use('develops')
['coo', 'csc']
Another way to request format for a given etype is:
>>> g['plays'].request_format('csr')
>>> g['plays'].format_in_use()
['coo', 'csr']
See Also
--------
format_in_use
restrict_format
to_format
... ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
... torch.tensor([0, 0, 1, 1])),
... ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
... torch.tensor([0, 1]))
... })
>>> g.formats()
{'created': ['coo'], 'not created': ['csr', 'csc']}
>>> g.create_format_()
>>> g.formats()
{'created': ['coo', 'csr', 'csc'], 'not created': []}
"""
if self.restrict_format(etype) != 'any':
raise KeyError("request_format is only available for "
"graph whose restrict_format is 'any'")
if not sparse_format in ['coo', 'csr', 'csc']:
raise KeyError("can only request coo/csr/csr.")
return self._graph.request_format(sparse_format, self.get_etype_id(etype))
return self._graph.create_format_()
def astype(self, idtype):
"""Cast this graph to use another ID type.
def to_format(self, restrict_format):
"""Return a cloned graph but stored in the given restrict format.
If ``'any'`` is given, the restrict formats of the returned graph is relaxed.
The returned graph share the same node/edge data of the original graph.
Features are copied (shallow copy) to the new graph.
Parameters
----------
restrict_format : str
Desired restrict format (``'any'``, ``'coo'``, ``'csr'``, ``'csc'``).
idtype : Data type object.
New ID type. Can only be int32 or int64.
Returns
-------
A new graph.
Examples
--------
For a graph with single edge type:
>>> g = dgl.graph([(0, 1), (1, 2)], 'user', 'follows', restrict_format='csr')
>>> g.ndata['h'] = th.ones(3, 3)
>>> g.restrict_format()
'csr'
>>> g1 = g.to_format('coo')
>>> g1.ndata
{'h': tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])}
>>> g1.restrict_format()
'coo'
For a graph with multiple edge types:
>>> g = dgl.heterograph({
... ('user', 'plays', 'game'): ([0, 1, 1, 2], [0, 0, 1, 1]),
... ('developer', 'develops', 'game'): ([0, 1], [0, 1]),
... }, restrict_format='coo')
>>> g.restrict_format('develops')
'coo'
>>> g1 = g.to_format('any')
>>> g1.restrict_format('plays')
'any'
See Also
--------
format_in_use
restrict_format
request_format
DGLHeteroGraph
Graph in the new ID type.
"""
return self.__class__(self._graph.to_format(restrict_format), self.ntypes, self.etypes,
self._node_frames,
self._edge_frames)
if idtype is None:
return self
if idtype not in (F.int32, F.int64):
raise DGLError("ID type must be int32 or int64, but got {}.".format(idtype))
if self.idtype == idtype:
return self
bits = 32 if idtype == F.int32 else 64
ret = copy.copy(self)
ret._graph = self._graph.asbits(bits)
return ret
def long(self):
"""Return a heterograph object use int64 as index dtype,
with the ndata and edata as the original object
"""Cast this graph to use int64 IDs.
Features are copied (shallow copy) to the new graph.
Returns
-------
......@@ -4670,17 +4842,16 @@ class DGLHeteroGraph(object):
--------
>>> g = dgl.bipartite(([0, 1, 1], [0, 0, 2]), 'user', 'plays', 'game',
>>> index_dtype='int32')
... idtype=torch.int32)
>>> g_long = g.long() # Convert g to int64 indexed, not changing the original `g`
See Also
--------
int
idtype
astype
"""
return self.__class__(self._graph.asbits(64), self.ntypes, self.etypes,
self._node_frames,
self._edge_frames)
return self.astype(F.int64)
def int(self):
"""Return a heterograph object use int32 as index dtype,
......@@ -4695,17 +4866,16 @@ class DGLHeteroGraph(object):
--------
>>> g = dgl.bipartite(([0, 1, 1], [0, 0, 2]), 'user', 'plays', 'game',
>>> index_dtype='int64')
... idtype=torch.int64)
>>> g_int = g.int() # Convert g to int32 indexed, not changing the original `g`
See Also
--------
long
idtype
astype
"""
return self.__class__(self._graph.asbits(32), self.ntypes, self.etypes,
self._node_frames,
self._edge_frames)
return self.astype(F.int32)
############################################################
......@@ -5012,13 +5182,24 @@ class AdaptedHeteroGraph(GraphAdapter):
self.graph._set_msg_index(self.etid, val)
def in_edges(self, nodes):
return self.graph._graph.in_edges(self.etid, nodes)
nodes = nodes.tousertensor(self.graph.device)
src, dst, eid = self.graph._graph.in_edges(self.etid, nodes)
return (utils.toindex(src, self.graph._graph.dtype),
utils.toindex(dst, self.graph._graph.dtype),
utils.toindex(eid, self.graph._graph.dtype))
def out_edges(self, nodes):
return self.graph._graph.out_edges(self.etid, nodes)
nodes = nodes.tousertensor(self.graph.device)
src, dst, eid = self.graph._graph.out_edges(self.etid, nodes)
return (utils.toindex(src, self.graph._graph.dtype),
utils.toindex(dst, self.graph._graph.dtype),
utils.toindex(eid, self.graph._graph.dtype))
def edges(self, form):
return self.graph._graph.edges(self.etid, form)
src, dst, eid = self.graph._graph.edges(self.etid, form)
return (utils.toindex(src, self.graph._graph.dtype),
utils.toindex(dst, self.graph._graph.dtype),
utils.toindex(eid, self.graph._graph.dtype))
def get_immutable_gidx(self, ctx):
return self.graph._graph.get_unitgraph(self.etid, ctx)
......
......@@ -53,7 +53,8 @@ class HeteroGraphIndex(ObjectBase):
num_dst = number_of_nodes[dst_ntype]
src_id, dst_id, _ = edges_per_type
rel_graphs.append(create_unitgraph_from_coo(
1 if src_ntype == dst_ntype else 2, num_src, num_dst, src_id, dst_id, "any"))
1 if src_ntype == dst_ntype else 2, num_src, num_dst, src_id, dst_id,
['coo', 'csr', 'csc']))
self.__init_handle_by_constructor__(
_CAPI_DGLHeteroCreateHeteroGraph, metagraph, rel_graphs)
......@@ -283,23 +284,6 @@ class HeteroGraphIndex(ObjectBase):
"""
return _CAPI_DGLHeteroNumEdges(self, int(etype))
def has_node(self, ntype, vid):
"""Return true if the node exists.
Parameters
----------
ntype : int
Node type
vid : int
The nodes
Returns
-------
bool
True if the node exists, False otherwise.
"""
return bool(_CAPI_DGLHeteroHasVertex(self, int(ntype), int(vid)))
def has_nodes(self, ntype, vids):
"""Return true if the nodes exist.
......@@ -307,35 +291,16 @@ class HeteroGraphIndex(ObjectBase):
----------
ntype : int
Node type
vid : utils.Index
The nodes
vids : Tensor
Node IDs
Returns
-------
utils.Index
Tensor
0-1 array indicating existence
"""
vid_array = vids.todgltensor()
return utils.toindex(_CAPI_DGLHeteroHasVertices(self, int(ntype), vid_array), self.dtype)
def has_edge_between(self, etype, u, v):
"""Return true if the edge exists.
Parameters
----------
etype : int
Edge type
u : int
The src node.
v : int
The dst node.
Returns
-------
bool
True if the edge exists, False otherwise
"""
return bool(_CAPI_DGLHeteroHasEdgeBetween(self, int(etype), int(u), int(v)))
return F.from_dgl_nd(_CAPI_DGLHeteroHasVertices(
self, int(ntype), F.to_dgl_nd(vids)))
def has_edges_between(self, etype, u, v):
"""Return true if the edge exists.
......@@ -344,20 +309,18 @@ class HeteroGraphIndex(ObjectBase):
----------
etype : int
Edge type
u : utils.Index
The src nodes.
v : utils.Index
The dst nodes.
u : Tensor
Src node IDs.
v : Tensor
Dst node IDs.
Returns
-------
utils.Index
Tensor
0-1 array indicating existence
"""
u_array = u.todgltensor()
v_array = v.todgltensor()
return utils.toindex(_CAPI_DGLHeteroHasEdgesBetween(
self, int(etype), u_array, v_array), self.dtype)
return F.from_dgl_nd(_CAPI_DGLHeteroHasEdgesBetween(
self, int(etype), F.to_dgl_nd(u), F.to_dgl_nd(v)))
def predecessors(self, etype, v):
"""Return the predecessors of the node.
......@@ -373,11 +336,11 @@ class HeteroGraphIndex(ObjectBase):
Returns
-------
utils.Index
Tensor
Array of predecessors
"""
return utils.toindex(_CAPI_DGLHeteroPredecessors(
self, int(etype), int(v)), self.dtype)
return F.from_dgl_nd(_CAPI_DGLHeteroPredecessors(
self, int(etype), int(v)))
def successors(self, etype, v):
"""Return the successors of the node.
......@@ -393,62 +356,62 @@ class HeteroGraphIndex(ObjectBase):
Returns
-------
utils.Index
Tensor
Array of successors
"""
return utils.toindex(_CAPI_DGLHeteroSuccessors(
self, int(etype), int(v)), self.dtype)
return F.from_dgl_nd(_CAPI_DGLHeteroSuccessors(
self, int(etype), int(v)))
def edge_id(self, etype, u, v):
"""Return the id array of all edges between u and v.
def edge_ids_all(self, etype, u, v):
"""Return a triplet of arrays that contains the edge IDs.
Parameters
----------
etype : int
Edge type
u : int
The src node.
v : int
The dst node.
u : Tensor
The src nodes.
v : Tensor
The dst nodes.
Returns
-------
utils.Index
The edge id array.
Tensor
The src nodes.
Tensor
The dst nodes.
Tensor
The edge ids.
"""
return utils.toindex(_CAPI_DGLHeteroEdgeId(
self, int(etype), int(u), int(v)), self.dtype)
edge_array = _CAPI_DGLHeteroEdgeIdsAll(
self, int(etype), F.to_dgl_nd(u), F.to_dgl_nd(v))
def edge_ids(self, etype, u, v):
"""Return a triplet of arrays that contains the edge IDs.
src = F.from_dgl_nd(edge_array(0))
dst = F.from_dgl_nd(edge_array(1))
eid = F.from_dgl_nd(edge_array(2))
return src, dst, eid
def edge_ids_one(self, etype, u, v):
"""Return an array of edge IDs.
Parameters
----------
etype : int
Edge type
u : utils.Index
u : Tensor
The src nodes.
v : utils.Index
v : Tensor
The dst nodes.
Returns
-------
utils.Index
The src nodes.
utils.Index
The dst nodes.
utils.Index
Tensor
The edge ids.
"""
u_array = u.todgltensor()
v_array = v.todgltensor()
edge_array = _CAPI_DGLHeteroEdgeIds(self, int(etype), u_array, v_array)
src = utils.toindex(edge_array(0), self.dtype)
dst = utils.toindex(edge_array(1), self.dtype)
eid = utils.toindex(edge_array(2), self.dtype)
return src, dst, eid
eid = F.from_dgl_nd(_CAPI_DGLHeteroEdgeIdsOne(
self, int(etype), F.to_dgl_nd(u), F.to_dgl_nd(v)))
return eid
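The split of `edge_ids` into `edge_ids_all` and `edge_ids_one` separates two lookups: one returns every edge between each (u, v) pair as a (src, dst, eid) triplet, the other returns a single edge ID per pair. A plain-Python sketch of the intended difference on a small multigraph; the function bodies, the choice of the first match, and the `-1` placeholder for a missing edge are all assumptions for illustration and are not taken from the C API:

```python
def edge_ids_all(edges, u, v):
    """Return (src, dst, eid) lists covering every edge between each pair.

    edges : list of (src, dst) tuples; the edge ID is the list position.
    """
    src, dst, eid = [], [], []
    for a, b in zip(u, v):
        for i, (s, d) in enumerate(edges):
            if (s, d) == (a, b):
                src.append(s)
                dst.append(d)
                eid.append(i)
    return src, dst, eid


def edge_ids_one(edges, u, v):
    """Return one edge ID per (u, v) pair (first match; -1 if none, by assumption)."""
    out = []
    for a, b in zip(u, v):
        matches = [i for i, e in enumerate(edges) if e == (a, b)]
        out.append(matches[0] if matches else -1)
    return out


edges = [(0, 1), (0, 1), (1, 2)]  # two parallel edges from node 0 to node 1
all_src, all_dst, all_eid = edge_ids_all(edges, [0, 1], [1, 2])
one_eid = edge_ids_one(edges, [0, 1], [1, 2])
print(all_eid, one_eid)
```

With parallel edges the "all" variant can return more entries than there are query pairs, which is why it also returns the matching src and dst arrays.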
def find_edges(self, etype, eid):
"""Return a triplet of arrays that contains the edge IDs.
......@@ -457,24 +420,24 @@ class HeteroGraphIndex(ObjectBase):
----------
etype : int
Edge type
eid : utils.Index
The edge ids.
eid : Tensor
Edge ids.
Returns
-------
utils.Index
Tensor
The src nodes.
utils.Index
Tensor
The dst nodes.
utils.Index
Tensor
The edge ids.
"""
eid_array = eid.todgltensor()
edge_array = _CAPI_DGLHeteroFindEdges(self, int(etype), eid_array)
edge_array = _CAPI_DGLHeteroFindEdges(
self, int(etype), F.to_dgl_nd(eid))
src = utils.toindex(edge_array(0), self.dtype)
dst = utils.toindex(edge_array(1), self.dtype)
eid = utils.toindex(edge_array(2), self.dtype)
src = F.from_dgl_nd(edge_array(0))
dst = F.from_dgl_nd(edge_array(1))
eid = F.from_dgl_nd(edge_array(2))
return src, dst, eid
......@@ -487,26 +450,22 @@ class HeteroGraphIndex(ObjectBase):
----------
etype : int
Edge type
v : utils.Index
The node(s).
v : Tensor
Node IDs.
Returns
-------
utils.Index
Tensor
The src nodes.
utils.Index
Tensor
The dst nodes.
utils.Index
Tensor
The edge ids.
"""
if len(v) == 1:
edge_array = _CAPI_DGLHeteroInEdges_1(self, int(etype), int(v[0]))
else:
v_array = v.todgltensor()
edge_array = _CAPI_DGLHeteroInEdges_2(self, int(etype), v_array)
src = utils.toindex(edge_array(0), self.dtype)
dst = utils.toindex(edge_array(1), self.dtype)
eid = utils.toindex(edge_array(2), self.dtype)
edge_array = _CAPI_DGLHeteroInEdges_2(self, int(etype), F.to_dgl_nd(v))
src = F.from_dgl_nd(edge_array(0))
dst = F.from_dgl_nd(edge_array(1))
eid = F.from_dgl_nd(edge_array(2))
return src, dst, eid
def out_edges(self, etype, v):
......@@ -518,26 +477,22 @@ class HeteroGraphIndex(ObjectBase):
----------
etype : int
Edge type
v : utils.Index
The node(s).
v : Tensor
Node IDs.
Returns
-------
utils.Index
Tensor
The src nodes.
utils.Index
Tensor
The dst nodes.
utils.Index
Tensor
The edge ids.
"""
if len(v) == 1:
edge_array = _CAPI_DGLHeteroOutEdges_1(self, int(etype), int(v[0]))
else:
v_array = v.todgltensor()
edge_array = _CAPI_DGLHeteroOutEdges_2(self, int(etype), v_array)
src = utils.toindex(edge_array(0), self.dtype)
dst = utils.toindex(edge_array(1), self.dtype)
eid = utils.toindex(edge_array(2), self.dtype)
edge_array = _CAPI_DGLHeteroOutEdges_2(self, int(etype), F.to_dgl_nd(v))
src = F.from_dgl_nd(edge_array(0))
dst = F.from_dgl_nd(edge_array(1))
eid = F.from_dgl_nd(edge_array(2))
return src, dst, eid
@utils.cached_member(cache='_cache', prefix='edges')
......@@ -557,43 +512,21 @@ class HeteroGraphIndex(ObjectBase):
Returns
-------
utils.Index
Tensor
The src nodes.
utils.Index
Tensor
The dst nodes.
utils.Index
Tensor
The edge ids.
"""
if order is None:
order = ""
edge_array = _CAPI_DGLHeteroEdges(self, int(etype), order)
src = edge_array(0)
dst = edge_array(1)
eid = edge_array(2)
src = utils.toindex(src, self.dtype)
dst = utils.toindex(dst, self.dtype)
eid = utils.toindex(eid, self.dtype)
src = F.from_dgl_nd(edge_array(0))
dst = F.from_dgl_nd(edge_array(1))
eid = F.from_dgl_nd(edge_array(2))
return src, dst, eid
def in_degree(self, etype, v):
"""Return the in degree of the node.
Assume that node_type(v) == dst_type(etype). Thus, the ntype argument is omitted.
Parameters
----------
etype : int
Edge type
v : int
The node.
Returns
-------
int
The in degree.
"""
return _CAPI_DGLHeteroInDegree(self, int(etype), int(v))
def in_degrees(self, etype, v):
"""Return the in degrees of the nodes.
......@@ -603,35 +536,16 @@ class HeteroGraphIndex(ObjectBase):
----------
etype : int
Edge type
v : utils.Index
v : Tensor
The nodes.
Returns
-------
int
Tensor
The in degree array.
"""
v_array = v.todgltensor()
return utils.toindex(_CAPI_DGLHeteroInDegrees(self, int(etype), v_array), self.dtype)
def out_degree(self, etype, v):
"""Return the out degree of the node.
Assume that node_type(v) == src_type(etype). Thus, the ntype argument is omitted.
Parameters
----------
etype : int
Edge type
v : int
The node.
Returns
-------
int
The out degree.
"""
return _CAPI_DGLHeteroOutDegree(self, int(etype), int(v))
return F.from_dgl_nd(_CAPI_DGLHeteroInDegrees(
self, int(etype), F.to_dgl_nd(v)))
def out_degrees(self, etype, v):
"""Return the out degrees of the nodes.
......@@ -642,16 +556,16 @@ class HeteroGraphIndex(ObjectBase):
----------
etype : int
Edge type
v : utils.Index
v : Tensor
The nodes.
Returns
-------
int
Tensor
The out degree array.
"""
v_array = v.todgltensor()
return utils.toindex(_CAPI_DGLHeteroOutDegrees(self, int(etype), v_array), self.dtype)
return F.from_dgl_nd(_CAPI_DGLHeteroOutDegrees(
self, int(etype), F.to_dgl_nd(v)))
def adjacency_matrix(self, etype, transpose, ctx):
"""Return the adjacency matrix representation of this graph.
......@@ -675,7 +589,7 @@ class HeteroGraphIndex(ObjectBase):
-------
SparseTensor
The adjacency matrix.
utils.Index
Tensor
An index for data shuffling due to sparse format change. Return None
if shuffle is not required.
"""
......@@ -690,20 +604,18 @@ class HeteroGraphIndex(ObjectBase):
ncols = self.number_of_nodes(dsttype) if transpose else self.number_of_nodes(srctype)
nnz = self.number_of_edges(etype)
if fmt == "csr":
indptr = F.copy_to(utils.toindex(rst(0), self.dtype).tousertensor(), ctx)
indices = F.copy_to(utils.toindex(rst(1), self.dtype).tousertensor(), ctx)
shuffle = utils.toindex(rst(2), self.dtype)
indptr = F.copy_to(F.from_dgl_nd(rst(0)), ctx)
indices = F.copy_to(F.from_dgl_nd(rst(1)), ctx)
shuffle = F.copy_to(F.from_dgl_nd(rst(2)), ctx)
dat = F.ones(nnz, dtype=F.float32, ctx=ctx) # FIXME(minjie): data type
spmat = F.sparse_matrix(dat, ('csr', indices, indptr), (nrows, ncols))[0]
return spmat, shuffle
elif fmt == "coo":
idx = F.copy_to(utils.toindex(rst(0), self.dtype).tousertensor(), ctx)
idx = F.copy_to(F.from_dgl_nd(rst(0)), ctx)
idx = F.reshape(idx, (2, nnz))
dat = F.ones((nnz,), dtype=F.float32, ctx=ctx)
adj, shuffle_idx = F.sparse_matrix(
dat, ('coo', idx), (nrows, ncols))
shuffle_idx = utils.toindex(
shuffle_idx, self.dtype) if shuffle_idx is not None else None
return adj, shuffle_idx
else:
raise Exception("unknown format")
......@@ -802,9 +714,6 @@ class HeteroGraphIndex(ObjectBase):
if shuffle is not required.
"""
src, dst, eid = self.edges(etype)
src = src.tousertensor(ctx) # the index of the ctx will be cached
dst = dst.tousertensor(ctx) # the index of the ctx will be cached
eid = eid.tousertensor(ctx) # the index of the ctx will be cached
srctype, dsttype = self.metagraph.find_edge(etype)
m = self.number_of_edges(etype)
......@@ -845,7 +754,6 @@ class HeteroGraphIndex(ObjectBase):
inc, shuffle_idx = F.sparse_matrix(dat, ('coo', idx), (n, m))
else:
raise DGLError('Invalid incidence matrix type: %s' % str(typestr))
shuffle_idx = utils.toindex(shuffle_idx) if shuffle_idx is not None else None
return inc, shuffle_idx
def node_subgraph(self, induced_nodes):
......@@ -862,7 +770,7 @@ class HeteroGraphIndex(ObjectBase):
SubgraphIndex
The subgraph index.
"""
vids = [nodes.todgltensor() for nodes in induced_nodes]
vids = [F.to_dgl_nd(nodes) for nodes in induced_nodes]
return _CAPI_DGLHeteroVertexSubgraph(self, vids)
def edge_subgraph(self, induced_edges, preserve_nodes):
......@@ -883,7 +791,7 @@ class HeteroGraphIndex(ObjectBase):
SubgraphIndex
The subgraph index.
"""
eids = [edges.todgltensor() for edges in induced_edges]
eids = [F.to_dgl_nd(edges) for edges in induced_edges]
return _CAPI_DGLHeteroEdgeSubgraph(self, eids, preserve_nodes)
@utils.cached_member(cache='_cache', prefix='unitgraph')
......@@ -928,73 +836,54 @@ class HeteroGraphIndex(ObjectBase):
rev_order = rev_csr(2)
return utils.toindex(order, self.dtype), utils.toindex(rev_order, self.dtype)
def format_in_use(self, etype):
"""Return the sparse formats in use of the given edge/relation type.
def formats(self, formats=None):
"""Get a graph index with the specified sparse format(s) or query
for the usage status of sparse formats
Parameters
----------
etype : int
The edge/relation type.
Returns
-------
list of string : return all the formats currently in use (could be multiple).
"""
format_code = _CAPI_DGLHeteroGetFormatInUse(self, etype)
ret = []
if format_code & 1:
ret.append('coo')
format_code >>= 1
if format_code & 1:
ret.append('csr')
format_code >>= 1
if format_code & 1:
ret.append('csc')
return ret
def restrict_format(self, etype):
"""Return restrict sparse format of the given edge/relation type.
If the graph has multiple edge types, they will have the same
sparse format.
Parameters
----------
etype : int
The edge/relation type.
formats : str or list of str or None
* If formats is None, return the usage status of sparse formats
* Otherwise, it can be ``'coo'``/``'csr'``/``'csc'`` or a sublist of
them, specifying the sparse formats to use.
Returns
-------
string : ``'any'``, ``'coo'``, ``'csr'``, or ``'csc'``
"""
ret = _CAPI_DGLHeteroGetRestrictFormat(self, etype)
return ret
dict or GraphIndex
def request_format(self, sparse_format, etype):
"""Create a sparse matrix representation in given format immediately.
* If formats is None, the result will be a dict recording the usage
status of sparse formats.
* Otherwise, a GraphIndex will be returned, which is a clone of the
original graph with the specified sparse format(s) ``formats``.
Parameters
----------
etype : int
The edge/relation type.
sparse_format : str
``'coo'``, ``'csr'``, or ``'csc'``
"""
_CAPI_DGLHeteroRequestFormat(self, sparse_format, etype)
def to_format(self, restrict_format):
"""Return a clone graph index but stored in the given sparse format.
If 'any' is given, the restrict formats of the returned graph index
is relaxed.
Parameters
----------
restrict_format : str
Desired restrict format (``'any'``, ``'coo'``, ``'csr'``, ``'csc'``).
formats_allowed = _CAPI_DGLHeteroGetAllowedFormats(self)
formats_created = _CAPI_DGLHeteroGetCreatedFormats(self)
created = []
not_created = []
if formats is None:
for fmt in ['coo', 'csr', 'csc']:
if fmt in formats_allowed:
if fmt in formats_created:
created.append(fmt)
else:
not_created.append(fmt)
return {
'created': created,
'not created': not_created
}
else:
if isinstance(formats, str):
formats = [formats]
return _CAPI_DGLHeteroGetFormatGraph(self, formats)
Returns
-------
A new graph index.
"""
return _CAPI_DGLHeteroGetFormatGraph(self, restrict_format)
def create_format_(self):
"""Create all sparse matrices allowed for the graph."""
return _CAPI_DGLHeteroCreateFormat(self)
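The query branch of the new `formats` method can be exercised without the C API. The following is a minimal sketch, assuming the allowed and created formats are plain Python lists of strings (in the diff they come from `_CAPI_DGLHeteroGetAllowedFormats` and `_CAPI_DGLHeteroGetCreatedFormats`). The names `format_status` and `normalize_formats` are hypothetical and used only for illustration; they are not part of DGL.

```python
def format_status(formats_allowed, formats_created):
    """Mirror the `formats is None` branch: split allowed formats by creation state."""
    created, not_created = [], []
    # Fixed iteration order keeps the result deterministic.
    for fmt in ['coo', 'csr', 'csc']:
        if fmt in formats_allowed:
            if fmt in formats_created:
                created.append(fmt)
            else:
                not_created.append(fmt)
    return {'created': created, 'not created': not_created}


def normalize_formats(formats):
    """Mirror the `isinstance(formats, str)` check: a single name becomes a one-item list."""
    if isinstance(formats, str):
        return [formats]
    return list(formats)


# Hypothetical inputs: all three formats allowed, only COO materialized so far.
status = format_status(['coo', 'csr', 'csc'], ['coo'])
print(status)
print(normalize_formats('csr'))
```

Formats that are not allowed are left out of both lists, so a graph restricted to `'csr'` would report at most one entry in total.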
@utils.cached_member(cache='_cache', prefix='reverse')
def reverse(self):
......@@ -1033,7 +922,7 @@ class HeteroSubgraphIndex(ObjectBase):
Induced nodes
"""
ret = _CAPI_DGLHeteroSubgraphGetInducedVertices(self)
return [utils.toindex(v, self.graph.dtype) for v in ret]
return [F.from_dgl_nd(v) for v in ret]
@property
def induced_edges(self):
......@@ -1046,7 +935,7 @@ class HeteroSubgraphIndex(ObjectBase):
Induced edges
"""
ret = _CAPI_DGLHeteroSubgraphGetInducedEdges(self)
return [utils.toindex(v, self.graph.dtype) for v in ret]
return [F.from_dgl_nd(v) for v in ret]
#################################################################
......@@ -1054,7 +943,7 @@ class HeteroSubgraphIndex(ObjectBase):
#################################################################
def create_unitgraph_from_coo(num_ntypes, num_src, num_dst, row, col,
restrict_format):
formats):
"""Create a unitgraph graph index from COO format
Parameters
......@@ -1069,19 +958,22 @@ def create_unitgraph_from_coo(num_ntypes, num_src, num_dst, row, col,
Row index.
col : utils.Index
Col index.
restrict_format : "any", "coo", "csr" or "csc"
Restrict the storage format of the unit graph.
formats : list of str.
Restrict the storage formats allowed for the unit graph.
Returns
-------
HeteroGraphIndex
"""
if isinstance(formats, str):
formats = [formats]
return _CAPI_DGLHeteroCreateUnitGraphFromCOO(
int(num_ntypes), int(num_src), int(num_dst), row.todgltensor(), col.todgltensor(),
restrict_format)
int(num_ntypes), int(num_src), int(num_dst),
F.to_dgl_nd(row), F.to_dgl_nd(col),
formats)
def create_unitgraph_from_csr(num_ntypes, num_src, num_dst, indptr, indices, edge_ids,
restrict_format):
formats):
"""Create a unitgraph graph index from CSR format
Parameters
......@@ -1098,17 +990,19 @@ def create_unitgraph_from_csr(num_ntypes, num_src, num_dst, indptr, indices, edg
CSR indices.
edge_ids : utils.Index
Edge shuffle id.
restrict_format : "any", "coo", "csr" or "csc"
Restrict the storage format of the unit graph.
formats : str
Restrict the storage formats allowed for the unit graph.
Returns
-------
HeteroGraphIndex
"""
if isinstance(formats, str):
formats = [formats]
return _CAPI_DGLHeteroCreateUnitGraphFromCSR(
int(num_ntypes), int(num_src), int(num_dst),
indptr.todgltensor(), indices.todgltensor(), edge_ids.todgltensor(),
restrict_format)
F.to_dgl_nd(indptr), F.to_dgl_nd(indices), F.to_dgl_nd(edge_ids),
formats)
def create_heterograph_from_relations(metagraph, rel_graphs, num_nodes_per_type):
"""Create a heterograph from metagraph and graphs of every relation.
......
......@@ -90,6 +90,26 @@ def zerocopy_from_numpy(np_data):
handle = ctypes.pointer(arr)
return NDArray(handle, is_view=True)
def cast_to_signed(arr):
"""Cast this NDArray from unsigned integer to signed one.
uint64 -> int64
uint32 -> int32
Useful for backends with poor unsigned integer support (e.g., TensorFlow).
Parameters
----------
arr : NDArray
Input array
Returns
-------
NDArray
Cast array
"""
return _CAPI_DGLArrayCastToSigned(arr)
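The diff does not show what `_CAPI_DGLArrayCastToSigned` does internally. As a rough illustration of the `uint64 -> int64` mapping in the docstring, the sketch below assumes a bit-for-bit reinterpretation of the same buffer, and uses only the Python standard library rather than DGL's `NDArray`. The helper name `reinterpret_uint64_as_int64` is hypothetical.

```python
from array import array


def reinterpret_uint64_as_int64(values):
    """Reinterpret unsigned 64-bit integers as signed ones without changing the bits.

    Assumption: this is how an unsigned-to-signed cast behaves; the actual
    DGL C API may differ.
    """
    unsigned = array('Q', values)  # 'Q' = unsigned long long
    # memoryview.cast can only change format via a byte view, hence the two steps.
    signed_view = memoryview(unsigned).cast('B').cast('q')  # 'q' = signed long long
    return signed_view.tolist()


# Values below 2**63 are unchanged; larger ones wrap around to negatives.
result = reinterpret_uint64_as_int64([0, 5, 2**64 - 1])
print(result)
```

Node and edge IDs are non-negative and in practice far below `2**63`, so under this assumption the reinterpretation leaves them unchanged.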
def exist_shared_mem_array(name):
""" Check the existence of shared-memory array.
......@@ -162,7 +182,6 @@ class SparseMatrix(ObjectBase):
"""
ret = [_CAPI_DGLSparseMatrixGetIndices(self, i) for i in range(3)]
return [F.zerocopy_from_dgl_ndarray(arr) for arr in ret]
#return [F.zerocopy_from_dgl_ndarray(v.data) for v in ret]
@property
def flags(self):
......@@ -172,7 +191,7 @@ class SparseMatrix(ObjectBase):
-------
list of boolean
"""
return [v for v in _CAPI_DGLSparseMatrixGetFlags(self)]
return _CAPI_DGLSparseMatrixGetFlags(self)
def __getstate__(self):
return self.format, self.num_rows, self.num_cols, self.indices, self.flags
......
......@@ -76,7 +76,7 @@ class GatedGraphConv(nn.Block):
is the output feature size.
"""
with graph.local_scope():
assert graph.is_homograph(), \
assert graph.is_homogeneous(), \
"not a homograph; convert it with to_homo and pass in the edge type as argument"
zero_pad = nd.zeros((feat.shape[0], self._out_feats - feat.shape[1]),
ctx=feat.context)
......@@ -86,7 +86,8 @@ class GatedGraphConv(nn.Block):
graph.ndata['h'] = feat
for i in range(self._n_etypes):
eids = (etypes.asnumpy() == i).nonzero()[0]
eids = nd.from_numpy(eids, zero_copy=True)
eids = nd.from_numpy(eids, zero_copy=True).as_in_context(
feat.context).astype(graph.idtype)
if len(eids) > 0:
graph.apply_edges(
lambda edges: {'W_e*h': self.linears[i](edges.src['h'])},
......
......@@ -181,7 +181,7 @@ class RelGraphConv(gluon.Block):
mx.ndarray.NDArray
New node features.
"""
assert g.is_homograph(), \
assert g.is_homogeneous(), \
"not a homograph; convert it with to_homo and pass in the edge type as argument"
with g.local_scope():
g.ndata['h'] = x
......
......@@ -77,7 +77,7 @@ class TAGConv(gluon.Block):
is size of output feature.
"""
with graph.local_scope():
assert graph.is_homograph(), 'Graph is not homogeneous'
assert graph.is_homogeneous(), 'Graph is not homogeneous'
degs = graph.in_degrees().astype('float32')
norm = mx.nd.power(mx.nd.clip(degs, a_min=1, a_max=float("inf")), -0.5)
......