Unverified Commit 44089c8b authored by Minjie Wang, committed by GitHub

[Refactor][Graph] Merge DGLGraph and DGLHeteroGraph (#1862)
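
After this merge there is a single graph class that covers both homogeneous and heterogeneous graphs, and graphs are typically built through the factory functions rather than the old mutable DGLGraph()/add_nodes()/add_edges() pattern. A minimal sketch of the unified construction path (a sketch only, assuming the post-merge dgl.graph and dgl.heterograph factories and the PyTorch backend):

>>> import dgl
>>> import torch as th
>>> g = dgl.graph(([0, 1], [1, 2]))                      # homogeneous graph from a (src, dst) pair
>>> g.ndata['h'] = th.ones(3, 4)                         # node features attach as before
>>> hg = dgl.heterograph({
...     ('user', 'follows', 'user'): ([0, 1], [1, 2])})  # typed graph, same underlying class
>>> isinstance(g, dgl.DGLGraph), isinstance(hg, dgl.DGLGraph)
(True, True)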



* Merge

* [Graph][CUDA] Graph on GPU and many refactoring (#1791)

* change edge_ids behavior and C++ impl

* fix unittests; remove utils.Index in edge_id

* pass mx and th tests

* pass tf test

* add aten::Scatter_

* Add nonzero; impl CSRGetDataAndIndices/CSRSliceMatrix

* CSRGetData and CSRGetDataAndIndices passed tests

* CSRSliceMatrix basic tests

* fix bug in empty slice

* CUDA CSRHasDuplicate

* has_node; has_edge_between

* predecessors, successors

* deprecate send/recv; fix send_and_recv

* deprecate send/recv; fix send_and_recv

* in_edges; out_edges; all_edges; apply_edges

* in deg/out deg

* subgraph/edge_subgraph

* adj

* in_subgraph/out_subgraph

* sample neighbors

* set/get_n/e_repr

* wip: working on refactoring all idtypes

* pass ndata/edata tests on gpu

* fix

* stash

* workaround nonzero issue

* stash

* nx conversion

* test_hetero_basics except update routines

* test_update_routines

* test_hetero_basics for pytorch

* more fixes

* WIP: flatten graph

* wip: flatten

* test_flatten

* test_to_device

* fix bug in to_homo

* fix bug in CSRSliceMatrix

* pass subgraph test

* fix send_and_recv

* fix filter

* test_heterograph

* passed all pytorch tests

* fix mx unittest

* fix pytorch test_nn

* fix all unittests for PyTorch

* passed all mxnet tests

* lint

* fix tf nn test

* pass all tf tests

* lint

* lint

* change deprecation

* try fix compile

* lint

* update METIS

* fix utest

* fix

* fix utests

* try debug

* revert

* small fix

* fix utests

* upd

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* trigger

* +1s

* [kernel] Use heterograph index instead of unitgraph index (#1813)

* upd

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* trigger

* +1s

* [Graph] Mutation for Heterograph (#1818)

* mutation add_nodes and add_edges

* Add support for remove_edges, remove_nodes, add_selfloop, remove_selfloop

* Fix
Co-authored-by: Ubuntu <ubuntu@ip-172-31-51-214.ec2.internal>

* upd

* upd

* upd

* fix

* [Transform] Mutable transform (#1833)

* add nodes

* All three

* Fix

* lint

* Add some test case

* Fix

* Fix

* Fix

* Fix

* Fix

* Fix

* fix

* trigger

* Fix

* fix
Co-authored-by: Ubuntu <ubuntu@ip-172-31-51-214.ec2.internal>

* [Graph] Migrate Batch & Readout module to heterograph (#1836)

* dgl.batch

* unbatch

* fix to device

* reduce readout; segment reduce

* change batch_num_nodes|edges to function

* reduce readout/ softmax

* broadcast

* topk

* fix

* fix tf and mx

* fix some ci

* fix batch but unbatch differently

* new check

* upd

* upd

* upd

* idtype behavior; code reorg

* idtype behavior; code reorg

* wip: test_basics

* pass test_basics

* WIP: from nx/ to nx

* missing files

* upd

* pass test_basics:test_nx_conversion

* Fix test

* Fix inplace update

* WIP: fixing tests

* upd

* pass test_transform cpu

* pass gpu test_transform

* pass test_batched_graph

* GPU graph auto cast to int32

* missing file

* stash

* WIP: rgcn-hetero

* Fix two datasets

* upd

* weird

* Fix capsule

* fix

* more fixes

* Fix dgmg

* fix bug in block degrees; pass rgcn-hetero

* rgcn

* gat and diffpool fix
also fix ppi and tu dataset

* Tree LSTM

* pointcloud

* rrn; wip: sgc

* resolve conflicts

* upd

* sgc and reddit dataset

* upd

* Fix deepwalk, gindt and gcn

* fix datasets and sign

* optimization

* optimization

* upd

* upd

* Fix GIN

* fix bug in add_nodes add_edges; tagcn

* adaptive sampling and gcmc

* upd

* upd

* fix geometric

* fix

* metapath2vec

* fix agnn

* fix pickling problem of block

* fix utests

* miss file

* linegraph

* upd

* upd

* upd

* graphsage

* stgcn_wave

* fix hgt

* on unittests

* Fix transformer

* Fix HAN

* passed pytorch unittests

* lint

* fix

* Fix cluster gcn

* cluster-gcn is ready

* on fixing block related codes

* 2nd order derivative

* Revert "2nd order derivative"

This reverts commit 523bf6c249bee61b51b1ad1babf42aad4167f206.

* passed torch utests again

* fix all mxnet unittests

* delete some useless tests

* pass all tf cpu tests

* disable

* disable distributed unittest

* fix

* fix

* lint

* fix

* fix

* fix script

* fix tutorial

* fix apply edges bug

* fix 2 basics

* fix tutorial
Co-authored-by: yzh119 <expye@outlook.com>
Co-authored-by: xiang song (charlie.song) <classicxsong@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-51-214.ec2.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-7-42.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-1-5.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-68-185.ec2.internal>
parent 015acfd2
...@@ -152,7 +152,7 @@ class SortPooling(nn.Block): ...@@ -152,7 +152,7 @@ class SortPooling(nn.Block):
feat = feat.sort(axis=-1) feat = feat.sort(axis=-1)
graph.ndata['h'] = feat graph.ndata['h'] = feat
# Sort nodes according to their last features. # Sort nodes according to their last features.
ret = topk_nodes(graph, 'h', self.k)[0].reshape( ret = topk_nodes(graph, 'h', self.k, sortby=-1)[0].reshape(
-1, self.k * feat.shape[-1]) -1, self.k * feat.shape[-1])
return ret return ret
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
# pylint: disable= no-member, arguments-differ, access-member-before-definition, unpacking-non-sequence # pylint: disable= no-member, arguments-differ, access-member-before-definition, unpacking-non-sequence
import mxnet as mx import mxnet as mx
from ... import function as fn from ... import backend as F
from ...base import ALL, is_all from ...base import ALL, is_all
__all__ = ['edge_softmax'] __all__ = ['edge_softmax']
...@@ -28,7 +28,7 @@ class EdgeSoftmax(mx.autograd.Function): ...@@ -28,7 +28,7 @@ class EdgeSoftmax(mx.autograd.Function):
def __init__(self, g, eids): def __init__(self, g, eids):
super(EdgeSoftmax, self).__init__() super(EdgeSoftmax, self).__init__()
if not is_all(eids): if not is_all(eids):
g = g.edge_subgraph(eids.astype('int64')) g = g.edge_subgraph(eids.astype(g.idtype), preserve_nodes=True)
self.g = g self.g = g
def forward(self, score): def forward(self, score):
...@@ -46,16 +46,12 @@ class EdgeSoftmax(mx.autograd.Function): ...@@ -46,16 +46,12 @@ class EdgeSoftmax(mx.autograd.Function):
return out.data return out.data
""" """
g = self.g g = self.g
with g.local_scope(): score_max = F.copy_e_max(g, score)
g.edata['s'] = score score = mx.nd.exp(F.e_sub_v(g, score, score_max))
g.update_all(fn.copy_e('s', 'm'), fn.max('m', 'smax')) score_sum = F.copy_e_sum(g, score)
g.apply_edges(fn.e_sub_v('s', 'smax', 'out')) out = F.e_div_v(g, score, score_sum)
g.edata['out'] = g.edata['out'].exp() self.save_for_backward(out)
g.update_all(fn.copy_e('out', 'm'), fn.sum('m', 'out_sum')) return out
g.apply_edges(fn.e_div_v('out', 'out_sum', 'out'))
out = g.edata['out']
self.save_for_backward(out)
return out
def backward(self, grad_out): def backward(self, grad_out):
"""Backward function. """Backward function.
...@@ -71,17 +67,13 @@ class EdgeSoftmax(mx.autograd.Function): ...@@ -71,17 +67,13 @@ class EdgeSoftmax(mx.autograd.Function):
sds_sum = sds.dst_sum() # type dgl.NData sds_sum = sds.dst_sum() # type dgl.NData
grad_score = sds - sds * sds_sum # multiple expressions grad_score = sds - sds * sds_sum # multiple expressions
""" """
out, = self.saved_tensors
g = self.g g = self.g
with g.local_scope(): sds = out * grad_out
out, = self.saved_tensors accum = F.copy_e_sum(g, sds)
# clear saved tensors explicitly grad_score = sds - F.e_mul_v(g, out, accum)
self.saved_tensors = None self.save_tensors = None
g.edata['out'] = out return grad_score
g.edata['grad_score'] = out * grad_out
g.update_all(fn.copy_e('grad_score', 'm'), fn.sum('m', 'accum'))
g.apply_edges(fn.e_mul_v('out', 'accum', 'out'))
grad_score = g.edata['grad_score'] - g.edata['out']
return grad_score
def edge_softmax(graph, logits, eids=ALL): def edge_softmax(graph, logits, eids=ALL):
r"""Compute edge softmax. r"""Compute edge softmax.
......
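
For reference, the removed MXNet implementation above spells out what edge softmax computes: a numerically stable softmax over the edges incident to each destination node. A rough equivalent sketch using the dgl.function builtins exactly as the old code did (names taken from the diff; this is not the new kernel-based path):

>>> import dgl.function as fn
>>> with g.local_scope():
...     g.edata['s'] = score
...     g.update_all(fn.copy_e('s', 'm'), fn.max('m', 'smax'))       # per-destination max, for stability
...     g.apply_edges(fn.e_sub_v('s', 'smax', 'out'))                # score minus that max
...     g.edata['out'] = g.edata['out'].exp()
...     g.update_all(fn.copy_e('out', 'm'), fn.sum('m', 'out_sum'))  # per-destination normalizer
...     g.apply_edges(fn.e_div_v('out', 'out_sum', 'out'))           # normalized attention weights
...     out = g.edata['out']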
...@@ -78,7 +78,7 @@ class GatedGraphConv(nn.Module): ...@@ -78,7 +78,7 @@ class GatedGraphConv(nn.Module):
is the output feature size. is the output feature size.
""" """
with graph.local_scope(): with graph.local_scope():
assert graph.is_homograph(), \ assert graph.is_homogeneous(), \
"not a homograph; convert it with to_homo and pass in the edge type as argument" "not a homograph; convert it with to_homo and pass in the edge type as argument"
zero_pad = feat.new_zeros((feat.shape[0], self._out_feats - feat.shape[1])) zero_pad = feat.new_zeros((feat.shape[0], self._out_feats - feat.shape[1]))
feat = th.cat([feat, zero_pad], -1) feat = th.cat([feat, zero_pad], -1)
...@@ -86,7 +86,7 @@ class GatedGraphConv(nn.Module): ...@@ -86,7 +86,7 @@ class GatedGraphConv(nn.Module):
for _ in range(self._n_steps): for _ in range(self._n_steps):
graph.ndata['h'] = feat graph.ndata['h'] = feat
for i in range(self._n_etypes): for i in range(self._n_etypes):
eids = (etypes == i).nonzero().view(-1) eids = (etypes == i).nonzero().view(-1).type(graph.idtype)
if len(eids) > 0: if len(eids) > 0:
graph.apply_edges( graph.apply_edges(
lambda edges: {'W_e*h': self.linears[i](edges.src['h'])}, lambda edges: {'W_e*h': self.linears[i](edges.src['h'])},
......
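
The cast to graph.idtype above reflects the new ID-type policy: every graph now carries an explicit integer ID type (int32 or int64), and index tensors passed to graph APIs should match it. A small sketch, assuming the PyTorch backend and the idtype keyword of dgl.graph:

>>> import dgl
>>> import torch as th
>>> g = dgl.graph(([0, 1], [1, 2]), idtype=th.int32)         # int32 ids, e.g. for GPU graphs
>>> g.idtype
torch.int32
>>> eids = (etypes == i).nonzero().view(-1).type(g.idtype)   # etypes and i are hypothetical, as in the layer above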
...@@ -138,7 +138,7 @@ class GraphConv(nn.Module): ...@@ -138,7 +138,7 @@ class GraphConv(nn.Module):
feat_src, feat_dst = expand_as_pair(feat, graph) feat_src, feat_dst = expand_as_pair(feat, graph)
if self._norm == 'both': if self._norm == 'both':
degs = graph.out_degrees().to(feat_src.device).float().clamp(min=1) degs = graph.out_degrees().float().clamp(min=1)
norm = th.pow(degs, -0.5) norm = th.pow(degs, -0.5)
shp = norm.shape + (1,) * (feat_src.dim() - 1) shp = norm.shape + (1,) * (feat_src.dim() - 1)
norm = th.reshape(norm, shp) norm = th.reshape(norm, shp)
...@@ -170,7 +170,7 @@ class GraphConv(nn.Module): ...@@ -170,7 +170,7 @@ class GraphConv(nn.Module):
rst = th.matmul(rst, weight) rst = th.matmul(rst, weight)
if self._norm != 'none': if self._norm != 'none':
degs = graph.in_degrees().to(feat_dst.device).float().clamp(min=1) degs = graph.in_degrees().float().clamp(min=1)
if self._norm == 'both': if self._norm == 'both':
norm = th.pow(degs, -0.5) norm = th.pow(degs, -0.5)
else: else:
......
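
The two hunks above only drop the explicit .to(device) calls, since degree tensors now already live on the graph's device; the normalization itself is unchanged and applies the symmetric factor D^{-1/2} on each side. A short sketch of the 'both' branch, mirroring the code above:

>>> degs = graph.out_degrees().float().clamp(min=1)   # clamp avoids dividing by zero on isolated nodes
>>> norm = th.pow(degs, -0.5)
>>> shp = norm.shape + (1,) * (feat_src.dim() - 1)    # reshape so norm broadcasts over feature dims
>>> feat_src = feat_src * th.reshape(norm, shp)       # left D^{-1/2}; in_degrees() gives the right factor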
...@@ -74,7 +74,7 @@ class TAGConv(nn.Module): ...@@ -74,7 +74,7 @@ class TAGConv(nn.Module):
is size of output feature. is size of output feature.
""" """
with graph.local_scope(): with graph.local_scope():
assert graph.is_homograph(), 'Graph is not homogeneous' assert graph.is_homogeneous(), 'Graph is not homogeneous'
norm = th.pow(graph.in_degrees().float().clamp(min=1), -0.5) norm = th.pow(graph.in_degrees().float().clamp(min=1), -0.5)
shp = norm.shape + (1,) * (feat.dim() - 1) shp = norm.shape + (1,) * (feat.dim() - 1)
......
...@@ -145,7 +145,7 @@ class SortPooling(nn.Module): ...@@ -145,7 +145,7 @@ class SortPooling(nn.Module):
feat, _ = feat.sort(dim=-1) feat, _ = feat.sort(dim=-1)
graph.ndata['h'] = feat graph.ndata['h'] = feat
# Sort nodes according to their last features. # Sort nodes according to their last features.
ret = topk_nodes(graph, 'h', self.k, idx=-1)[0].view( ret = topk_nodes(graph, 'h', self.k, sortby=-1)[0].view(
-1, self.k * feat.shape[-1]) -1, self.k * feat.shape[-1])
return ret return ret
...@@ -564,7 +564,7 @@ class SetTransformerEncoder(nn.Module): ...@@ -564,7 +564,7 @@ class SetTransformerEncoder(nn.Module):
torch.Tensor torch.Tensor
The output feature with shape :math:`(N, D)`. The output feature with shape :math:`(N, D)`.
""" """
lengths = graph.batch_num_nodes lengths = graph.batch_num_nodes()
for layer in self.layers: for layer in self.layers:
feat = layer(feat, lengths) feat = layer(feat, lengths)
return feat return feat
...@@ -626,7 +626,7 @@ class SetTransformerDecoder(nn.Module): ...@@ -626,7 +626,7 @@ class SetTransformerDecoder(nn.Module):
The output feature with shape :math:`(B, D)`, where The output feature with shape :math:`(B, D)`, where
:math:`B` refers to the batch size. :math:`B` refers to the batch size.
""" """
len_pma = graph.batch_num_nodes len_pma = graph.batch_num_nodes()
len_sab = [self.k] * graph.batch_size len_sab = [self.k] * graph.batch_size
feat = self.pma(feat, len_pma) feat = self.pma(feat, len_pma)
for layer in self.layers: for layer in self.layers:
......
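
batch_num_nodes and batch_num_edges change from list-valued properties to methods returning a tensor, which is why the two call sites above gain parentheses. A hedged example:

>>> g1 = dgl.graph(([0, 1], [1, 0]))
>>> g2 = dgl.graph(([0, 1], [1, 2]))
>>> bg = dgl.batch([g1, g2])
>>> bg.batch_num_nodes()    # now a method; returns a tensor rather than a Python list
tensor([2, 3])
>>> bg.batch_num_edges()
tensor([2, 2])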
...@@ -43,13 +43,11 @@ class EdgeSoftmax(th.autograd.Function): ...@@ -43,13 +43,11 @@ class EdgeSoftmax(th.autograd.Function):
# remember to save the graph to backward cache before making it # remember to save the graph to backward cache before making it
# a local variable # a local variable
if not is_all(eids): if not is_all(eids):
g = g.edge_subgraph(eids.long()) g = g.edge_subgraph(eids.type(g.idtype), preserve_nodes=True)
score_max = F.copy_e_max(g, score) score_max = F.copy_e_max(g, score)
score = th.exp(F.e_sub_v(g, score, score_max)) score = th.exp(F.e_sub_v(g, score, score_max))
score_sum = F.copy_e_sum(g, score) score_sum = F.copy_e_sum(g, score)
out = F.e_div_v(g, score, score_sum) out = F.e_div_v(g, score, score_sum)
ctx.backward_cache = g ctx.backward_cache = g
ctx.save_for_backward(out) ctx.save_for_backward(out)
return out return out
......
...@@ -214,7 +214,7 @@ class RelGraphConv(layers.Layer): ...@@ -214,7 +214,7 @@ class RelGraphConv(layers.Layer):
tf.Tensor tf.Tensor
New node features. New node features.
""" """
assert g.is_homograph(), \ assert g.is_homogeneous(), \
"not a homograph; convert it with to_homo and pass in the edge type as argument" "not a homograph; convert it with to_homo and pass in the edge type as argument"
with g.local_scope(): with g.local_scope():
g.ndata['h'] = x g.ndata['h'] = x
......
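
is_homograph() is renamed to is_homogeneous(), and the assertion message points to to_homo for flattening typed graphs. A sketch of that conversion (assuming dgl.to_homo keeps its pre-merge name and that dgl.NTYPE / dgl.ETYPE hold the type labels):

>>> hg = dgl.heterograph({
...     ('user', 'follows', 'user'): ([0, 1], [1, 2]),
...     ('user', 'plays', 'game'): ([0, 1], [0, 1])})
>>> assert not hg.is_homogeneous()
>>> g = dgl.to_homo(hg)         # node/edge types kept in g.ndata[dgl.NTYPE] / g.edata[dgl.ETYPE]
>>> assert g.is_homogeneous()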
...@@ -148,7 +148,7 @@ class SortPooling(layers.Layer): ...@@ -148,7 +148,7 @@ class SortPooling(layers.Layer):
feat = tf.sort(feat, -1) feat = tf.sort(feat, -1)
graph.ndata['h'] = feat graph.ndata['h'] = feat
# Sort nodes according to their last features. # Sort nodes according to their last features.
ret = tf.reshape(topk_nodes(graph, 'h', self.k, idx=-1)[0], ( ret = tf.reshape(topk_nodes(graph, 'h', self.k, sortby=-1)[0], (
-1, self.k * feat.shape[-1])) -1, self.k * feat.shape[-1]))
return ret return ret
......
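
Across all three backends the topk_nodes keyword idx is renamed to sortby; sortby=-1 ranks nodes by their last feature column, which is what SortPooling needs. A hedged example with the PyTorch backend:

>>> g = dgl.graph(([0, 1, 2], [1, 2, 3]))
>>> g.ndata['h'] = th.rand(4, 5)
>>> vals, idx = dgl.topk_nodes(g, 'h', 3, sortby=-1)   # top-3 nodes ranked by the last column of 'h'
>>> vals.shape                                         # (batch_size, k, feature_size)
torch.Size([1, 3, 5])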
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
# pylint: disable= no-member, arguments-differ # pylint: disable= no-member, arguments-differ
import tensorflow as tf import tensorflow as tf
from ... import function as fn from ...sparse import _gspmm, _gsddmm
from ...base import ALL, is_all from ...base import ALL, is_all
__all__ = ['edge_softmax'] __all__ = ['edge_softmax']
...@@ -11,25 +11,18 @@ __all__ = ['edge_softmax'] ...@@ -11,25 +11,18 @@ __all__ = ['edge_softmax']
def edge_softmax_real(graph, score, eids=ALL): def edge_softmax_real(graph, score, eids=ALL):
"""Edge Softmax function""" """Edge Softmax function"""
if not is_all(eids): if not is_all(eids):
graph = graph.edge_subgraph(tf.cast(eids, tf.int64)) graph = graph.edge_subgraph(tf.cast(eids, graph.idtype))
with graph.local_scope(): gidx = graph._graph
graph.edata['s'] = score score_max = _gspmm(gidx, 'copy_rhs', 'max', None, score)[0]
graph.update_all(fn.copy_e('s', 'm'), fn.max('m', 'smax')) score = tf.math.exp(_gsddmm(gidx, 'sub', score, score_max, 'e', 'v'))
graph.apply_edges(fn.e_sub_v('s', 'smax', 'out')) score_sum = _gspmm(gidx, 'copy_rhs', 'sum', None, score)[0]
graph.edata['out'] = tf.math.exp(graph.edata['out']) out = _gsddmm(gidx, 'div', score, score_sum, 'e', 'v')
graph.update_all(fn.copy_e('out', 'm'), fn.sum('m', 'out_sum'))
graph.apply_edges(fn.e_div_v('out', 'out_sum', 'out'))
out = graph.edata['out']
def edge_softmax_backward(grad_out): def edge_softmax_backward(grad_out):
with graph.local_scope(): sds = out * grad_out
# clear backward cache explicitly accum = _gspmm(gidx, 'copy_rhs', 'sum', None, sds)[0]
graph.edata['out'] = out grad_score = sds - _gsddmm(gidx, 'mul', out, accum, 'e', 'v')
graph.edata['grad_s'] = out * grad_out return grad_score
graph.update_all(fn.copy_e('grad_s', 'm'), fn.sum('m', 'accum'))
graph.apply_edges(fn.e_mul_v('out', 'accum', 'out'))
grad_score = graph.edata['grad_s'] - graph.edata['out']
return grad_score
return out, edge_softmax_backward return out, edge_softmax_backward
......
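
The backward closure above (and the MXNet and PyTorch versions earlier in this diff) all compute the same softmax gradient. In the notation of the forward docstring, with :math:`z_e` the normalized score of edge :math:`e` and the sum taken over edges sharing its destination node, the rule being implemented is:

.. math::
   \frac{\partial L}{\partial x_e}
   = z_e \frac{\partial L}{\partial z_e}
   - z_e \sum_{e' \rightarrow v(e)} z_{e'} \frac{\partial L}{\partial z_{e'}}

that is, sds = out * grad_out followed by sds - e_mul_v(out, copy_e_sum(sds)), which is exactly the two-line backward body above.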
...@@ -3,7 +3,7 @@ from __future__ import absolute_import ...@@ -3,7 +3,7 @@ from __future__ import absolute_import
from ._ffi.object import register_object, ObjectBase from ._ffi.object import register_object, ObjectBase
from ._ffi.function import _init_api from ._ffi.function import _init_api
from .base import ALL, is_all, DGLError from .base import ALL, is_all, DGLError, dgl_warning
from . import backend as F from . import backend as F
from .frame import Frame, FrameRef from .frame import Frame, FrameRef
from .graph import DGLBaseGraph from .graph import DGLBaseGraph
...@@ -89,13 +89,15 @@ class NodeFlow(DGLBaseGraph): ...@@ -89,13 +89,15 @@ class NodeFlow(DGLBaseGraph):
Parameters Parameters
---------- ----------
parent : DGLGraph parent : DGLGraphStale
The parent graph. The parent graph.
nfobj : NodeFlowObject nfobj : NodeFlowObject
The nodeflow object The nodeflow object
""" """
def __init__(self, parent, nfobj): def __init__(self, parent, nfobj):
super(NodeFlow, self).__init__(nfobj.graph) super(NodeFlow, self).__init__(nfobj.graph)
dgl_warning('NodeFlow APIs are deprecated starting from v0.5. Please read our'
' guide<link> for how to use the new sampling APIs.')
self._parent = parent self._parent = parent
self._node_mapping = utils.toindex(nfobj.node_mapping) self._node_mapping = utils.toindex(nfobj.node_mapping)
self._edge_mapping = utils.toindex(nfobj.edge_mapping) self._edge_mapping = utils.toindex(nfobj.edge_mapping)
...@@ -891,7 +893,7 @@ class NodeFlow(DGLBaseGraph): ...@@ -891,7 +893,7 @@ class NodeFlow(DGLBaseGraph):
def block_compute(self, block_id, message_func="default", reduce_func="default", def block_compute(self, block_id, message_func="default", reduce_func="default",
apply_node_func="default", v=ALL, inplace=False): apply_node_func="default", v=ALL, inplace=False):
"""Perform the computation on the specified block. It's similar to `pull` """Perform the computation on the specified block. It's similar to `pull`
in DGLGraph. in DGLGraphStale.
On the given block i, it runs `pull` on nodes in layer i+1, which generates On the given block i, it runs `pull` on nodes in layer i+1, which generates
messages on edges in block i, runs the reduce function and node update messages on edges in block i, runs the reduce function and node update
function on nodes in layer i+1. function on nodes in layer i+1.
......
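
The deprecation warning added above points users at the sampling APIs introduced elsewhere in this refactor (see the "sample neighbors" and "in_subgraph/out_subgraph" items in the commit list). A rough sketch of the replacement workflow, assuming dgl.sampling.sample_neighbors is the intended entry point:

>>> import dgl
>>> import torch as th
>>> g = dgl.graph(([0, 1, 2, 3], [1, 2, 3, 0]))
>>> seeds = th.tensor([0, 1])
>>> frontier = dgl.sampling.sample_neighbors(g, seeds, 2)   # sample up to 2 in-edges per seed node
>>> frontier.edata[dgl.EID]                                 # original edge ids of the sampled edges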
...@@ -112,8 +112,8 @@ def partition_graph_with_halo(g, node_part, extra_cached_hops, reshuffle=False): ...@@ -112,8 +112,8 @@ def partition_graph_with_halo(g, node_part, extra_cached_hops, reshuffle=False):
# This creates a subgraph from subgraphs returned from the CAPI above. # This creates a subgraph from subgraphs returned from the CAPI above.
def create_subgraph(subg, induced_nodes, induced_edges): def create_subgraph(subg, induced_nodes, induced_edges):
subg1 = DGLHeteroGraph(gidx=subg.graph, ntypes=['_N'], etypes=['_E']) subg1 = DGLHeteroGraph(gidx=subg.graph, ntypes=['_N'], etypes=['_E'])
subg1.ndata[NID] = induced_nodes[0].tousertensor() subg1.ndata[NID] = induced_nodes[0]
subg1.edata[EID] = induced_edges[0].tousertensor() subg1.edata[EID] = induced_edges[0]
return subg1 return subg1
for i, subg in enumerate(subgs): for i, subg in enumerate(subgs):
......
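
With utils.Index gone from this path, the induced node and edge IDs come back as plain tensors and are stored directly under dgl.NID / dgl.EID, matching the general subgraph convention. A hedged example:

>>> g = dgl.graph(([0, 1, 2], [1, 2, 3]))
>>> sg = g.subgraph([1, 2, 3])
>>> sg.ndata[dgl.NID]    # original ids of the nodes kept in the subgraph
tensor([1, 2, 3])
>>> sg.edata[dgl.EID]    # original ids of the induced edges
tensor([1, 2])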
"""Module for message propagation.""" """Module for message propagation."""
from __future__ import absolute_import from __future__ import absolute_import
from . import backend as F
from . import traversal as trv from . import traversal as trv
from .heterograph import DGLHeteroGraph from .heterograph import DGLHeteroGraph
...@@ -86,7 +87,11 @@ def prop_nodes_bfs(graph, ...@@ -86,7 +87,11 @@ def prop_nodes_bfs(graph,
'DGLGraph is deprecated, Please use DGLHeteroGraph' 'DGLGraph is deprecated, Please use DGLHeteroGraph'
assert len(graph.canonical_etypes) == 1, \ assert len(graph.canonical_etypes) == 1, \
'prop_nodes_bfs only support homogeneous graph' 'prop_nodes_bfs only support homogeneous graph'
nodes_gen = trv.bfs_nodes_generator(graph, source, reverse) # TODO(murphy): Graph traversal currently is only supported on
# CPU graphs. Move graph to CPU as a workaround,
# which should be fixed in the future.
nodes_gen = trv.bfs_nodes_generator(graph.cpu(), source, reverse)
nodes_gen = [F.copy_to(frontier, graph.device) for frontier in nodes_gen]
prop_nodes(graph, nodes_gen, message_func, reduce_func, apply_node_func) prop_nodes(graph, nodes_gen, message_func, reduce_func, apply_node_func)
def prop_nodes_topo(graph, def prop_nodes_topo(graph,
...@@ -117,7 +122,11 @@ def prop_nodes_topo(graph, ...@@ -117,7 +122,11 @@ def prop_nodes_topo(graph,
'DGLGraph is deprecated, Please use DGLHeteroGraph' 'DGLGraph is deprecated, Please use DGLHeteroGraph'
assert len(graph.canonical_etypes) == 1, \ assert len(graph.canonical_etypes) == 1, \
'prop_nodes_topo only support homogeneous graph' 'prop_nodes_topo only support homogeneous graph'
nodes_gen = trv.topological_nodes_generator(graph, reverse) # TODO(murphy): Graph traversal currently is only supported on
# CPU graphs. Move graph to CPU as a workaround,
# which should be fixed in the future.
nodes_gen = trv.topological_nodes_generator(graph.cpu(), reverse)
nodes_gen = [F.copy_to(frontier, graph.device) for frontier in nodes_gen]
prop_nodes(graph, nodes_gen, message_func, reduce_func, apply_node_func) prop_nodes(graph, nodes_gen, message_func, reduce_func, apply_node_func)
def prop_edges_dfs(graph, def prop_edges_dfs(graph,
...@@ -157,7 +166,11 @@ def prop_edges_dfs(graph, ...@@ -157,7 +166,11 @@ def prop_edges_dfs(graph,
'DGLGraph is deprecated, Please use DGLHeteroGraph' 'DGLGraph is deprecated, Please use DGLHeteroGraph'
assert len(graph.canonical_etypes) == 1, \ assert len(graph.canonical_etypes) == 1, \
'prop_edges_dfs only support homogeneous graph' 'prop_edges_dfs only support homogeneous graph'
# TODO(murphy): Graph traversal currently is only supported on
# CPU graphs. Move graph to CPU as a workaround,
# which should be fixed in the future.
edges_gen = trv.dfs_labeled_edges_generator( edges_gen = trv.dfs_labeled_edges_generator(
graph, source, reverse, has_reverse_edge, has_nontree_edge, graph.cpu(), source, reverse, has_reverse_edge, has_nontree_edge,
return_labels=False) return_labels=False)
edges_gen = [F.copy_to(frontier, graph.device) for frontier in edges_gen]
prop_edges(graph, edges_gen, message_func, reduce_func, apply_node_func) prop_edges(graph, edges_gen, message_func, reduce_func, apply_node_func)
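
All three propagation helpers now run the traversal itself on CPU and copy each frontier back to the graph's device, so they keep working for GPU graphs. A small usage sketch of prop_nodes_bfs on a homogeneous graph (the feature names and the use of fn.copy_u here are illustrative assumptions):

>>> import dgl
>>> import dgl.function as fn
>>> import torch as th
>>> g = dgl.graph(([0, 0, 1, 2], [1, 2, 3, 3]))
>>> g.ndata['h'] = th.ones(4, 1)
>>> dgl.prop_nodes_bfs(g, [0],
...                    message_func=fn.copy_u('h', 'm'),
...                    reduce_func=fn.sum('m', 'h'))   # pull-style update, layer by layer in BFS order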
"""Classes and functions for batching multiple graphs together.""" """Classes and functions for batching multiple graphs together."""
from __future__ import absolute_import from __future__ import absolute_import
import numpy as np
from .base import DGLError from .base import DGLError
from . import backend as F from . import backend as F
from . import segment
__all__ = ['sum_nodes', 'sum_edges', 'mean_nodes', 'mean_edges', __all__ = ['readout_nodes', 'readout_edges',
'sum_nodes', 'sum_edges', 'mean_nodes', 'mean_edges',
'max_nodes', 'max_edges', 'softmax_nodes', 'softmax_edges', 'max_nodes', 'max_edges', 'softmax_nodes', 'softmax_edges',
'broadcast_nodes', 'broadcast_edges', 'topk_nodes', 'topk_edges'] 'broadcast_nodes', 'broadcast_edges', 'topk_nodes', 'topk_edges']
READOUT_ON_ATTRS = { def readout_nodes(graph, feat, weight=None, *, op='sum', ntype=None):
'nodes': ('ndata', 'batch_num_nodes', 'number_of_nodes'), """Generate a graph-level representation by aggregating node features
'edges': ('edata', 'batch_num_edges', 'number_of_edges'), :attr:`feat`.
}
def _sum_on(graph, typestr, feat, weight): The function is commonly used as a *readout* function on a batch of graphs
"""Internal function to sum node or edge features. to generate graph-level representation. Thus, the result tensor shape
depends on the batch size of the input graph. Given a graph of batch size
Parameters :math:`B`, and a feature size of :math:`D`, the result shape will be
---------- :math:`(B, D)`, with each row being the aggregated node features of each
graph : DGLGraph graph.
The graph.
typestr : str
'nodes' or 'edges'
feat : str
The feature field name.
weight : str
The weight field name.
Returns
-------
tensor
The (weighted) summed node or edge features.
"""
data_attr, batch_num_objs_attr, _ = READOUT_ON_ATTRS[typestr]
data = getattr(graph, data_attr)
feat = data[feat]
if weight is not None:
weight = data[weight]
weight = F.reshape(weight, (-1,) + (1,) * (F.ndim(feat) - 1))
feat = weight * feat
n_graphs = graph.batch_size
batch_num_objs = getattr(graph, batch_num_objs_attr)
seg_id = F.zerocopy_from_numpy(np.arange(n_graphs, dtype='int64').repeat(batch_num_objs))
seg_id = F.copy_to(seg_id, F.context(feat))
y = F.unsorted_1d_segment_sum(feat, seg_id, n_graphs, 0)
return y
def sum_nodes(graph, feat, weight=None):
"""Sums all the values of node field :attr:`feat` in :attr:`graph`, optionally
multiplies the field by a scalar node field :attr:`weight`.
Parameters Parameters
---------- ----------
graph : DGLGraph. graph : DGLGraph.
The graph. Input graph.
feat : str feat : str
The feature field. Node feature name.
weight : str, optional weight : str, optional
The weight field. If None, no weighting will be performed, Node weight name. If None, no weighting will be performed,
otherwise, weight each node feature with field :attr:`feat`. otherwise, weight each node feature with field :attr:`feat`.
for summation. The weight feature associated in the :attr:`graph` for aggregation. The weight feature shape must be compatible with
should be a tensor of shape ``[graph.number_of_nodes(), 1]``. an element-wise multiplication with the feature tensor.
op : str, optional
Readout operator. Can be 'sum', 'max', 'min', 'mean'.
ntype : str, optional
Node type. Can be omitted if there is only one node type in the graph.
Returns Returns
------- -------
tensor tensor
The summed tensor. Result tensor.
Notes
-----
Return a stacked tensor with an extra first dimension whose size equals
batch size of the input graph.
The i-th row of the stacked tensor contains the readout result of the
i-th graph in the batched graph. If a graph has no nodes,
a zero tensor with the same shape is returned at the corresponding row.
Examples Examples
-------- --------
...@@ -88,177 +51,73 @@ def sum_nodes(graph, feat, weight=None): ...@@ -88,177 +51,73 @@ def sum_nodes(graph, feat, weight=None):
Create two :class:`~dgl.DGLGraph` objects and initialize their Create two :class:`~dgl.DGLGraph` objects and initialize their
node features. node features.
>>> g1 = dgl.DGLGraph() # Graph 1 >>> g1 = dgl.graph(([0, 1], [1, 0])) # Graph 1
>>> g1.add_nodes(2) >>> g1.ndata['h'] = th.tensor([1., 2.])
>>> g1.ndata['h'] = th.tensor([[1.], [2.]]) >>> g2 = dgl.graph(([0, 1], [1, 2])) # Graph 2
>>> g1.ndata['w'] = th.tensor([[3.], [6.]]) >>> g2.ndata['h'] = th.tensor([1., 2., 3.])
>>> g2 = dgl.DGLGraph() # Graph 2
>>> g2.add_nodes(3)
>>> g2.ndata['h'] = th.tensor([[1.], [2.], [3.]])
Sum over node attribute :attr:`h` without weighting for each graph in a
batched graph.
>>> bg = dgl.batch([g1, g2], node_attrs='h')
>>> dgl.sum_nodes(bg, 'h')
tensor([[3.], # 1 + 2
[6.]]) # 1 + 2 + 3
Sum node attribute :attr:`h` with weight from node attribute :attr:`w`
for a single graph.
>>> dgl.sum_nodes(g1, 'h', 'w')
tensor([[15.]]) # 1 * 3 + 2 * 6
See Also
--------
mean_nodes
sum_edges
mean_edges
"""
return _sum_on(graph, 'nodes', feat, weight)
def sum_edges(graph, feat, weight=None):
"""Sums all the values of edge field :attr:`feat` in :attr:`graph`,
optionally multiplies the field by a scalar edge field :attr:`weight`.
Parameters
----------
graph : DGLGraph
The graph.
feat : str
The feature field.
weight : str, optional
The weight field. If None, no weighting will be performed,
otherwise, weight each edge feature with field :attr:`feat`.
for summation. The weight feature associated in the :attr:`graph`
should be a tensor of shape ``[graph.number_of_edges(), 1]``.
Returns
-------
tensor
The summed tensor.
Notes
-----
Return a stacked tensor with an extra first dimension whose size equals
batch size of the input graph.
The i-th row of the stacked tensor contains the readout result of the
i-th graph in the batched graph. If a graph has no edges,
a zero tensor with the same shape is returned at the corresponding row.
Examples
--------
>>> import dgl Sum over one graph:
>>> import torch as th
Create two :class:`~dgl.DGLGraph` objects and initialize their >>> dgl.readout_nodes(g1, 'h')
edge features. tensor([3.]) # 1 + 2
>>> g1 = dgl.DGLGraph() # Graph 1 Sum over a batched graph:
>>> g1.add_nodes(2)
>>> g1.add_edges([0, 1], [1, 0])
>>> g1.edata['h'] = th.tensor([[1.], [2.]])
>>> g1.edata['w'] = th.tensor([[3.], [6.]])
>>> g2 = dgl.DGLGraph() # Graph 2 >>> bg = dgl.batch([g1, g2])
>>> g2.add_nodes(3) >>> dgl.readout_nodes(bg, 'h')
>>> g2.add_edges([0, 1, 2], [1, 2, 0]) tensor([3., 6.]) # [1 + 2, 1 + 2 + 3]
>>> g2.edata['h'] = th.tensor([[1.], [2.], [3.]])
Sum over edge attribute :attr:`h` without weighting for each graph in a Weighted sum:
batched graph.
>>> bg = dgl.batch([g1, g2], edge_attrs='h') >>> bg.ndata['w'] = th.tensor([.1, .2, .1, .5, .2])
>>> dgl.sum_edges(bg, 'h') >>> dgl.readout_nodes(bg, 'h', 'w')
tensor([[3.], # 1 + 2 tensor([.5, 1.7])
[6.]]) # 1 + 2 + 3
Sum edge attribute :attr:`h` with weight from edge attribute :attr:`w` Readout by max:
for a single graph.
>>> dgl.sum_edges(g1, 'h', 'w') >>> dgl.readout_nodes(bg, 'h', op='max')
tensor([[15.]]) # 1 * 3 + 2 * 6 tensor([2., 3.])
See Also See Also
-------- --------
sum_nodes readout_edges
mean_nodes
mean_edges
""" """
return _sum_on(graph, 'edges', feat, weight) x = graph.nodes[ntype].data[feat]
def _mean_on(graph, typestr, feat, weight):
"""Internal function to sum node or edge features.
Parameters
----------
graph : DGLGraph
The graph.
typestr : str
'nodes' or 'edges'
feat : str
The feature field name.
weight : str
The weight field name.
Returns
-------
tensor
The (weighted) summed node or edge features.
"""
data_attr, batch_num_objs_attr, _ = READOUT_ON_ATTRS[typestr]
data = getattr(graph, data_attr)
feat = data[feat]
if weight is not None: if weight is not None:
weight = data[weight] x = x * graph.nodes[ntype].data[weight]
weight = F.reshape(weight, (-1,) + (1,) * (F.ndim(feat) - 1)) return segment.segment_reduce(graph.batch_num_nodes(ntype), x, reducer=op)
feat = weight * feat
n_graphs = graph.batch_size
batch_num_objs = getattr(graph, batch_num_objs_attr)
seg_id = F.zerocopy_from_numpy(np.arange(n_graphs, dtype='int64').repeat(batch_num_objs))
seg_id = F.copy_to(seg_id, F.context(feat))
if weight is not None:
w = F.unsorted_1d_segment_sum(weight, seg_id, n_graphs, 0)
y = F.unsorted_1d_segment_sum(feat, seg_id, n_graphs, 0)
y = y / w
else:
y = F.unsorted_1d_segment_mean(feat, seg_id, n_graphs, 0)
return y
def mean_nodes(graph, feat, weight=None): def readout_edges(graph, feat, weight=None, *, op='sum', etype=None):
"""Averages all the values of node field :attr:`feat` in :attr:`graph`, """Sum the edge feature :attr:`feat` in :attr:`graph`, optionally
optionally multiplies the field by a scalar node field :attr:`weight`. multiplies it by an edge :attr:`weight`.
The function is commonly used as a *readout* function on a batch of graphs
to generate graph-level representation. Thus, the result tensor shape
depends on the batch size of the input graph. Given a graph of batch size
:math:`B`, and a feature size of :math:`D`, the result shape will be
:math:`(B, D)`, with each row being the aggregated edge features of each
graph.
Parameters Parameters
---------- ----------
graph : DGLGraph graph : DGLGraph.
The graph. Input graph.
feat : str feat : str
The feature field. Edge feature name.
weight : str, optional weight : str, optional
The weight field. If None, no weighting will be performed, Edge weight name. If None, no weighting will be performed,
otherwise, weight each node feature with field :attr:`feat`. otherwise, weight each edge feature with field :attr:`feat`.
for calculating mean. The weight feature associated in the :attr:`graph` for summation. The weight feature shape must be compatible with
should be a tensor of shape ``[graph.number_of_nodes(), 1]``. an element-wise multiplication with the feature tensor.
op : str, optional
Readout operator. Can be 'sum', 'max', 'min', 'mean'.
etype : str, tuple of str, optional
Edge type. Can be omitted if there is only one edge type in the graph.
Returns Returns
------- -------
tensor tensor
The averaged tensor. Result tensor.
Notes
-----
Return a stacked tensor with an extra first dimension whose size equals
batch size of the input graph.
The i-th row of the stacked tensor contains the readout result of
the i-th graph in the batch. If a graph has no nodes,
a zero tensor with the same shape is returned at the corresponding row.
Examples Examples
-------- --------
...@@ -267,302 +126,100 @@ def mean_nodes(graph, feat, weight=None): ...@@ -267,302 +126,100 @@ def mean_nodes(graph, feat, weight=None):
>>> import torch as th >>> import torch as th
Create two :class:`~dgl.DGLGraph` objects and initialize their Create two :class:`~dgl.DGLGraph` objects and initialize their
node features. edge features.
>>> g1 = dgl.DGLGraph() # Graph 1
>>> g1.add_nodes(2)
>>> g1.ndata['h'] = th.tensor([[1.], [2.]])
>>> g1.ndata['w'] = th.tensor([[3.], [6.]])
>>> g2 = dgl.DGLGraph() # Graph 2
>>> g2.add_nodes(3)
>>> g2.ndata['h'] = th.tensor([[1.], [2.], [3.]])
Average over node attribute :attr:`h` without weighting for each graph in a
batched graph.
>>> bg = dgl.batch([g1, g2], node_attrs='h')
>>> dgl.mean_nodes(bg, 'h')
tensor([[1.5000], # (1 + 2) / 2
[2.0000]]) # (1 + 2 + 3) / 3
Sum node attribute :attr:`h` with normalized weight from node attribute :attr:`w`
for a single graph.
>>> dgl.mean_nodes(g1, 'h', 'w') # h1 * (w1 / (w1 + w2)) + h2 * (w2 / (w1 + w2))
tensor([[1.6667]]) # 1 * (3 / (3 + 6)) + 2 * (6 / (3 + 6))
See Also
--------
sum_nodes
sum_edges
mean_edges
"""
return _mean_on(graph, 'nodes', feat, weight)
def mean_edges(graph, feat, weight=None):
"""Averages all the values of edge field :attr:`feat` in :attr:`graph`,
optionally multiplies the field by a scalar edge field :attr:`weight`.
Parameters
----------
graph : DGLGraph
The graph.
feat : str
The feature field.
weight : optional, str
The weight field. If None, no weighting will be performed,
otherwise, weight each edge feature with field :attr:`feat`.
for calculating mean. The weight feature associated in the :attr:`graph`
should be a tensor of shape ``[graph.number_of_edges(), 1]``.
Returns >>> g1 = dgl.graph(([0, 1], [1, 0])) # Graph 1
------- >>> g1.edata['h'] = th.tensor([1., 2.])
tensor >>> g2 = dgl.graph(([0, 1], [1, 2])) # Graph 2
The averaged tensor. >>> g2.edata['h'] = th.tensor([2., 3.])
Notes Sum over one graph:
-----
Return a stacked tensor with an extra first dimension whose size equals
batch size of the input graph.
The i-th row of the stacked tensor contains the readout result of
the i-th graph in the batched graph. If a graph has no edges,
a zero tensor with the same shape is returned at the corresponding row.
Examples >>> dgl.readout_edges(g1, 'h')
-------- tensor([3.]) # 1 + 2
>>> import dgl Sum over a batched graph:
>>> import torch as th
Create two :class:`~dgl.DGLGraph` objects and initialize their
edge features.
>>> g1 = dgl.DGLGraph() # Graph 1 >>> bg = dgl.batch([g1, g2])
>>> g1.add_nodes(2) >>> dgl.readout_edges(bg, 'h')
>>> g1.add_edges([0, 1], [1, 0]) tensor([3., 5.]) # [1 + 2, 2 + 3]
>>> g1.edata['h'] = th.tensor([[1.], [2.]])
>>> g1.edata['w'] = th.tensor([[3.], [6.]])
>>> g2 = dgl.DGLGraph() # Graph 2
>>> g2.add_nodes(3)
>>> g2.add_edges([0, 1, 2], [1, 2, 0])
>>> g2.edata['h'] = th.tensor([[1.], [2.], [3.]])
Average over edge attribute :attr:`h` without weighting for each graph in a Weighted sum:
batched graph.
>>> bg = dgl.batch([g1, g2], edge_attrs='h') >>> bg.edata['w'] = th.tensor([.1, .2, .1, .5])
>>> dgl.mean_edges(bg, 'h') >>> dgl.readout_edges(bg, 'h', 'w')
tensor([[1.5000], # (1 + 2) / 2 tensor([.5, 1.7])
[2.0000]]) # (1 + 2 + 3) / 3
Sum edge attribute :attr:`h` with normalized weight from edge attribute :attr:`w` Readout by max:
for a single graph.
>>> dgl.mean_edges(g1, 'h', 'w') # h1 * (w1 / (w1 + w2)) + h2 * (w2 / (w1 + w2)) >>> dgl.readout_edges(bg, 'w', op='max')
tensor([[1.6667]]) # 1 * (3 / (3 + 6)) + 2 * (6 / (3 + 6)) tensor([2., 3.])
See Also See Also
-------- --------
sum_nodes readout_nodes
mean_nodes
sum_edges
""" """
return _mean_on(graph, 'edges', feat, weight) x = graph.edges[etype].data[feat]
if weight is not None:
def _max_on(graph, typestr, feat): x = x * graph.edges[etype].data[weight]
"""Internal function to take elementwise maximum return segment.segment_reduce(graph.batch_num_edges(etype), x, reducer=op)
over node or edge features.
Parameters
----------
graph : DGLGraph
The graph.
typestr : str
'nodes' or 'edges'
feat : str
The feature field name.
Returns def sum_nodes(graph, feat, weight=None, *, ntype=None):
------- """Syntax sugar for ``dgl.readout_nodes(graph, feat, weight, ntype=ntype, op='sum')``.
tensor
The (weighted) summed node or edge features.
""" """
data_attr, batch_num_objs_attr, _ = READOUT_ON_ATTRS[typestr] return readout_nodes(graph, feat, weight, ntype=ntype, op='sum')
data = getattr(graph, data_attr)
feat = data[feat]
# TODO: the current solution pads the different graph sizes to the same,
# a more efficient way is to use segment max, we need to implement it in
# the future.
batch_num_objs = getattr(graph, batch_num_objs_attr)
feat = F.pad_packed_tensor(feat, batch_num_objs, -float('inf'))
return F.max(feat, 1)
def _softmax_on(graph, typestr, feat):
"""Internal function of applying batch-wise graph-level softmax
over node or edge features of a given field.
Parameters def sum_edges(graph, feat, weight=None, *, etype=None):
---------- """Syntax sugar for ``dgl.readout_edges(graph, feat, weight, etype=etype, op='sum')``.
graph : DGLGraph
The graph
typestr : str
'nodes' or 'edges'
feat : str
The feature field name.
Returns
-------
tensor
The obtained tensor.
""" """
data_attr, batch_num_objs_attr, _ = READOUT_ON_ATTRS[typestr] return readout_edges(graph, feat, weight, etype=etype, op='sum')
data = getattr(graph, data_attr)
feat = data[feat]
# TODO: the current solution pads the different graph sizes to the same,
# a more efficient way is to use segment sum/max, we need to implement
# it in the future.
batch_num_objs = getattr(graph, batch_num_objs_attr)
feat = F.pad_packed_tensor(feat, batch_num_objs, -float('inf'))
feat = F.softmax(feat, 1)
return F.pack_padded_tensor(feat, batch_num_objs)
def _broadcast_on(graph, typestr, feat_data):
"""Internal function of broadcasting features to all nodes/edges.
Parameters
----------
graph : DGLGraph
The graph
typestr : str
'nodes' or 'edges'
feat_data : tensor
The feature to broadcast. Tensor shape is :math:`(*)` for single graph,
and :math:`(B, *)` for batched graph.
Returns def mean_nodes(graph, feat, weight=None, *, ntype=None):
------- """Syntax sugar for ``dgl.readout_nodes(graph, feat, weight, ntype=ntype, op='mean')``.
tensor
The node/edge features tensor with shape :math:`(N, *)`.
""" """
_, batch_num_objs_attr, _ = READOUT_ON_ATTRS[typestr] return readout_nodes(graph, feat, weight, ntype=ntype, op='mean')
batch_num_objs = getattr(graph, batch_num_objs_attr)
index = []
for i, num_obj in enumerate(batch_num_objs):
index.extend([i] * num_obj)
ctx = F.context(feat_data)
index = F.copy_to(F.tensor(index), ctx)
return F.gather_row(feat_data, index)
def _topk_on(graph, typestr, feat, k, descending=True, idx=None):
"""Internal function to take graph-wise top-k node/edge features of
field :attr:`feat` in :attr:`graph` ranked by keys at given
index :attr:`idx`. If :attr:`descending` is set to False, return the
k smallest elements instead.
If idx is set to None, the function would return top-k value of all
indices, which is equivalent to calling `th.topk(graph.ndata[feat], dim=0)`
for each single graph of the input batched-graph.
Parameters
---------
graph : DGLGraph
The graph
typestr : str
'nodes' or 'edges'
feat : str
The feature field name.
k : int
The :math:`k` in "top-:math`k`".
descending : bool
Controls whether to return the largest or smallest elements,
defaults to True.
idx : int or None, defaults to None
The key index we sort :attr:`feat` on, if set to None, we sort
the whole :attr:`feat`.
Returns
-------
tuple of tensors:
The first tensor returns top-k features of each single graph of
the input graph:
a tensor with shape :math:`(B, K, D)` would be returned, where
:math:`B` is the batch size of the input graph.
The second tensor returns the top-k indices of each single graph
of the input graph:
a tensor with shape :math:`(B, K)`(:math:`(B, K, D)` if` idx
is set to None) would be returned, where
:math:`B` is the batch size of the input graph.
Notes def mean_edges(graph, feat, weight=None, *, etype=None):
----- """Syntax sugar for ``dgl.readout_edges(graph, feat, weight, etype=etype, op='mean')``.
If an example has :math:`n` nodes/edges and :math:`n<k`, in the first
returned tensor the :math:`n+1` to :math:`k`th rows would be padded
with all zero; in the second returned tensor, the behavior of :math:`n+1`
to :math:`k`th elements is not defined.
""" """
data_attr, batch_num_objs_attr, _ = READOUT_ON_ATTRS[typestr] return readout_edges(graph, feat, weight, etype=etype, op='mean')
data = getattr(graph, data_attr)
if F.ndim(data[feat]) > 2:
raise DGLError('The {} feature `{}` should have dimension less than or'
' equal to 2'.format(typestr, feat))
feat = data[feat] def max_nodes(graph, feat, weight=None, *, ntype=None):
hidden_size = F.shape(feat)[-1] """Syntax sugar for ``dgl.readout_nodes(graph, feat, weight, ntype=ntype, op='max')``.
batch_num_objs = getattr(graph, batch_num_objs_attr) """
batch_size = len(batch_num_objs) return readout_nodes(graph, feat, weight, ntype=ntype, op='max')
length = max(max(batch_num_objs), k)
fill_val = -float('inf') if descending else float('inf')
feat_ = F.pad_packed_tensor(feat, batch_num_objs, fill_val, l_min=k)
if idx is not None:
keys = F.squeeze(F.slice_axis(feat_, -1, idx, idx+1), -1)
order = F.argsort(keys, -1, descending=descending)
else:
order = F.argsort(feat_, 1, descending=descending)
topk_indices = F.slice_axis(order, 1, 0, k) def max_edges(graph, feat, weight=None, *, etype=None):
"""Syntax sugar for ``dgl.readout_edges(graph, feat, weight, etype=etype, op='max')``.
"""
return readout_edges(graph, feat, weight, etype=etype, op='max')
# zero padding def softmax_nodes(graph, feat, *, ntype=None):
feat_ = F.pad_packed_tensor(feat, batch_num_objs, 0, l_min=k) r"""Perform graph-wise softmax on the node features.
if idx is not None: For each node :math:`v\in\mathcal{V}` and its feature :math:`x_v`,
feat_ = F.reshape(feat_, (batch_size * length, -1)) calculate its normalized feature as follows:
shift = F.repeat(F.arange(0, batch_size) * length, k, -1)
shift = F.copy_to(shift, F.context(feat))
topk_indices_ = F.reshape(topk_indices, (-1,)) + shift
else:
feat_ = F.reshape(feat_, (-1,))
shift = F.repeat(F.arange(0, batch_size), k * hidden_size, -1) * length * hidden_size +\
F.cat([F.arange(0, hidden_size)] * batch_size * k, -1)
shift = F.copy_to(shift, F.context(feat))
topk_indices_ = F.reshape(topk_indices, (-1,)) * hidden_size + shift
return F.reshape(F.gather_row(feat_, topk_indices_), (batch_size, k, -1)),\ .. math::
topk_indices z_v = \frac{\exp(x_v)}{\sum_{u\in\mathcal{V}}\exp(x_u)}
If the graph is a batch of multiple graphs, each graph computes softmax
def max_nodes(graph, feat): independently. The result tensor has the same shape as the original node
"""Take elementwise maximum over all the values of node field feature.
:attr:`feat` in :attr:`graph`
Parameters Parameters
---------- ----------
graph : DGLGraph graph : DGLGraph.
The graph. Input graph.
feat : str feat : str
The feature field. Node feature name.
ntype : str, optional
Node type. Can be omitted if there is only one node type in the graph.
Returns Returns
------- -------
tensor tensor
The tensor obtained. Result tensor.
Examples Examples
-------- --------
...@@ -573,167 +230,55 @@ def max_nodes(graph, feat): ...@@ -573,167 +230,55 @@ def max_nodes(graph, feat):
Create two :class:`~dgl.DGLGraph` objects and initialize their Create two :class:`~dgl.DGLGraph` objects and initialize their
node features. node features.
>>> g1 = dgl.DGLGraph() # Graph 1 >>> g1 = dgl.graph(([0, 1], [1, 0])) # Graph 1
>>> g1.add_nodes(2) >>> g1.ndata['h'] = th.tensor([1., 1.])
>>> g1.ndata['h'] = th.tensor([[1.], [2.]]) >>> g2 = dgl.graph(([0, 1], [1, 2])) # Graph 2
>>> g2.ndata['h'] = th.tensor([1., 1., 1.])
>>> g2 = dgl.DGLGraph() # Graph 2
>>> g2.add_nodes(3)
>>> g2.ndata['h'] = th.tensor([[1.], [2.], [3.]])
Max over node attribute :attr:`h` in a batched graph.
>>> bg = dgl.batch([g1, g2], node_attrs='h')
>>> dgl.max_nodes(bg, 'h')
tensor([[2.], # max(1, 2)
[3.]]) # max(1, 2, 3)
Max over node attribute :attr:`h` in a single graph.
>>> dgl.max_nodes(g1, 'h') Softmax over one graph:
tensor([[2.]])
Notes >>> dgl.softmax_nodes(g1, 'h')
----- tensor([.5000, .5000])
Return a stacked tensor with an extra first dimension whose size equals
batch size of the input graph.
The i-th row of the stacked tensor contains the readout result of
the i-th graph in the batched graph. If a graph has no nodes,
a tensor filed with -inf of the same shape is returned at the
corresponding row.
"""
return _max_on(graph, 'nodes', feat)
def max_edges(graph, feat):
"""Take elementwise maximum over all the values of edge field
:attr:`feat` in :attr:`graph`
Parameters Softmax over a batched graph:
----------
graph : DGLGraph
The graph.
feat : str
The feature field.
Returns >>> bg = dgl.batch([g1, g2])
------- >>> dgl.softmax_nodes(bg, 'h')
tensor tensor([.5000, .5000, .3333, .3333, .3333])
The tensor obtained.
Examples See Also
-------- --------
softmax_edges
>>> import dgl
>>> import torch as th
Create two :class:`~dgl.DGLGraph` objects and initialize their
edge features.
>>> g1 = dgl.DGLGraph() # Graph 1
>>> g1.add_nodes(2)
>>> g1.add_edges([0, 1], [1, 0])
>>> g1.edata['h'] = th.tensor([[1.], [2.]])
>>> g2 = dgl.DGLGraph() # Graph 2
>>> g2.add_nodes(3)
>>> g2.add_edges([0, 1, 2], [1, 2, 0])
>>> g2.edata['h'] = th.tensor([[1.], [2.], [3.]])
Max over edge attribute :attr:`h` in a batched graph.
>>> bg = dgl.batch([g1, g2], edge_attrs='h')
>>> dgl.max_edges(bg, 'h')
tensor([[2.], # max(1, 2)
[3.]]) # max(1, 2, 3)
Max over edge attribute :attr:`h` in a single graph.
>>> dgl.max_edges(g1, 'h')
tensor([[2.]])
Notes
-----
Return a stacked tensor with an extra first dimension whose size equals
batch size of the input graph.
The i-th row of the stacked tensor contains the readout result of
the i-th graph in the batched graph. If a graph has no edges,
a tensor filled with -inf of the same shape is returned at the
corresponding row.
""" """
return _max_on(graph, 'edges', feat) x = graph.nodes[ntype].data[feat]
return segment.segment_softmax(graph.batch_num_nodes(ntype), x)
def softmax_nodes(graph, feat):
"""Apply batch-wise graph-level softmax over all the values of node field
:attr:`feat` in :attr:`graph`.
Parameters
----------
graph : DGLGraph
The graph.
feat : str
The feature field.
Returns
-------
tensor
The tensor obtained.
Examples
--------
>>> import dgl
>>> import torch as th
Create two :class:`~dgl.DGLGraph` objects and initialize their
node features.
>>> g1 = dgl.DGLGraph() # Graph 1 def softmax_edges(graph, feat, *, etype=None):
>>> g1.add_nodes(2) r"""Perform graph-wise softmax on the edge features.
>>> g1.ndata['h'] = th.tensor([[1., 0.], [2., 0.]])
>>> g2 = dgl.DGLGraph() # Graph 2 For each edge :math:`e\in\mathcal{E}` and its feature :math:`x_e`,
>>> g2.add_nodes(3) calculate its normalized feature as follows:
>>> g2.ndata['h'] = th.tensor([[1., 0.], [2., 0.], [3., 0.]])
Softmax over node attribute :attr:`h` in a batched graph. .. math::
z_e = \frac{\exp(x_e)}{\sum_{e'\in\mathcal{E}}\exp(x_{e'})}
>>> bg = dgl.batch([g1, g2], node_attrs='h')
>>> dgl.softmax_nodes(bg, 'h')
tensor([[0.2689, 0.5000], # [0.2689, 0.7311] = softmax([1., 2.])
[0.7311, 0.5000], # [0.5000, 0.5000] = softmax([0., 0.])
[0.0900, 0.3333], # [0.0900, 0.2447, 0.6652] = softmax([1., 2., 3.])
[0.2447, 0.3333], # [0.3333, 0.3333, 0.3333] = softmax([0., 0., 0.])
[0.6652, 0.3333]])
Softmax over node attribute :attr:`h` in a single graph.
>>> dgl.softmax_nodes(g1, 'h')
tensor([[0.2689, 0.5000], # [0.2689, 0.7311] = softmax([1., 2.])
[0.7311, 0.5000]]), # [0.5000, 0.5000] = softmax([0., 0.])
Notes
-----
If the input graph has batch size greater then one, the softmax is applied at
each single graph in the batched graph.
"""
return _softmax_on(graph, 'nodes', feat)
If the graph is a batch of multiple graphs, each graph computes softmax
def softmax_edges(graph, feat): independently. The result tensor has the same shape as the original edge
"""Apply batch-wise graph-level softmax over all the values of edge field feature.
:attr:`feat` in :attr:`graph`.
Parameters Parameters
---------- ----------
graph : DGLGraph graph : DGLGraph.
The graph. Input graph.
feat : str feat : str
The feature field. Edge feature name.
etype : str, tuple of str, optional
Edge type. Can be omitted if there is only one edge type in the graph.
Returns Returns
------- -------
tensor tensor
The tensor obtained. Result tensor.
Examples Examples
-------- --------
...@@ -744,55 +289,56 @@ def softmax_edges(graph, feat): ...@@ -744,55 +289,56 @@ def softmax_edges(graph, feat):
Create two :class:`~dgl.DGLGraph` objects and initialize their Create two :class:`~dgl.DGLGraph` objects and initialize their
edge features. edge features.
>>> g1 = dgl.DGLGraph() # Graph 1 >>> g1 = dgl.graph(([0, 1], [1, 0])) # Graph 1
>>> g1.add_nodes(2) >>> g1.edata['h'] = th.tensor([1., 1.])
>>> g1.add_edges([0, 1], [1, 0]) >>> g2 = dgl.graph(([0, 1, 0], [1, 2, 2])) # Graph 2
>>> g1.edata['h'] = th.tensor([[1., 0.], [2., 0.]]) >>> g2.edata['h'] = th.tensor([1., 1., 1.])
>>> g2 = dgl.DGLGraph() # Graph 2 Softmax over one graph:
>>> g2.add_nodes(3)
>>> g2.add_edges([0, 1, 2], [1, 2, 0])
>>> g2.edata['h'] = th.tensor([[1., 0.], [2., 0.], [3., 0.]])
Softmax over edge attribute :attr:`h` in a batched graph. >>> dgl.softmax_edges(g1, 'h')
tensor([.5000, .5000])
Softmax over a batched graph:
>>> bg = dgl.batch([g1, g2], edge_attrs='h') >>> bg = dgl.batch([g1, g2])
>>> dgl.softmax_edges(bg, 'h') >>> dgl.softmax_edges(bg, 'h')
tensor([[0.2689, 0.5000], # [0.2689, 0.7311] = softmax([1., 2.]) tensor([.5000, .5000, .3333, .3333, .3333])
[0.7311, 0.5000], # [0.5000, 0.5000] = softmax([0., 0.])
[0.0900, 0.3333], # [0.0900, 0.2447, 0.6652] = softmax([1., 2., 3.])
[0.2447, 0.3333], # [0.3333, 0.3333, 0.3333] = softmax([0., 0., 0.])
[0.6652, 0.3333]])
Softmax over edge attribute :attr:`h` in a single graph. See Also
--------
softmax_nodes
"""
x = graph.edges[etype].data[feat]
return segment.segment_softmax(graph.batch_num_edges(etype), x)
>>> dgl.softmax_edges(g1, 'h') def broadcast_nodes(graph, graph_feat, *, ntype=None):
tensor([[0.2689, 0.5000], # [0.2689, 0.7311] = softmax([1., 2.]) """Generate a node feature equal to the graph-level feature :attr:`graph_feat`.
[0.7311, 0.5000]]), # [0.5000, 0.5000] = softmax([0., 0.])
Notes The operation is similar to ``numpy.repeat`` (or ``torch.repeat_interleave``).
----- It is commonly used to normalize node features by a global vector. For example,
If the input graph has batch size greater then one, the softmax is applied at each to normalize node features across graph to range :math:`[0~1)`:
example in the batch.
"""
return _softmax_on(graph, 'edges', feat)
def broadcast_nodes(graph, feat_data): >>> g = dgl.batch([...]) # batch multiple graphs
"""Broadcast :attr:`feat_data` to all nodes in :attr:`graph`, and return a >>> g.ndata['h'] = ... # some node features
tensor of node features. >>> h_sum = dgl.broadcast_nodes(g, dgl.sum_nodes(g, 'h'))
>>> g.ndata['h'] /= h_sum # normalize by summation
Parameters Parameters
---------- ----------
graph : DGLGraph graph : DGLGraph
The graph. The graph.
feat_data : tensor graph_feat : tensor
The feature to broadcast. Tensor shape is :math:`(*)` for single graph, and The feature to broadcast. Tensor shape is :math:`(*)` for single graph, and
:math:`(B, *)` for batched graph. :math:`(B, *)` for batched graph.
ntype : str, optional
Node type. Can be omitted if there is only one node type.
Returns Returns
------- -------
tensor Tensor
The node features tensor with shape :math:`(N, *)`. The node features tensor with shape :math:`(N, *)`, where :math:`N` is the
number of nodes.
Examples Examples
-------- --------
...@@ -803,12 +349,8 @@ def broadcast_nodes(graph, feat_data): ...@@ -803,12 +349,8 @@ def broadcast_nodes(graph, feat_data):
Create two :class:`~dgl.DGLGraph` objects and initialize their Create two :class:`~dgl.DGLGraph` objects and initialize their
node features. node features.
>>> g1 = dgl.DGLGraph() # Graph 1 >>> g1 = dgl.graph(([0], [1])) # Graph 1
>>> g1.add_nodes(2) >>> g2 = dgl.graph(([0, 1], [1, 2])) # Graph 2
>>> g2 = dgl.DGLGraph() # Graph 2
>>> g2.add_nodes(3)
>>> bg = dgl.batch([g1, g2]) >>> bg = dgl.batch([g1, g2])
>>> feat = th.rand(2, 5) >>> feat = th.rand(2, 5)
>>> feat >>> feat
...@@ -825,34 +367,45 @@ def broadcast_nodes(graph, feat_data): ...@@ -825,34 +367,45 @@ def broadcast_nodes(graph, feat_data):
[0.2721, 0.4629, 0.7269, 0.0724, 0.1014], [0.2721, 0.4629, 0.7269, 0.0724, 0.1014],
[0.2721, 0.4629, 0.7269, 0.0724, 0.1014]]) [0.2721, 0.4629, 0.7269, 0.0724, 0.1014]])
Broadcast feature to all nodes in the batched graph. Broadcast feature to all nodes in the single graph.
>>> dgl.broadcast_nodes(g1, feat[0]) >>> dgl.broadcast_nodes(g1, feat[0])
tensor([[0.4325, 0.7710, 0.5541, 0.0544, 0.9368], tensor([[0.4325, 0.7710, 0.5541, 0.0544, 0.9368],
[0.4325, 0.7710, 0.5541, 0.0544, 0.9368]]) [0.4325, 0.7710, 0.5541, 0.0544, 0.9368]])
Notes See Also
----- --------
feat[i] is broadcast to the nodes in i-th graph in the batched graph. broadcast_edges
""" """
return _broadcast_on(graph, 'nodes', feat_data) return F.repeat(graph_feat, graph.batch_num_nodes(ntype), dim=0)
def broadcast_edges(graph, feat_data): def broadcast_edges(graph, graph_feat, *, etype=None):
"""Broadcast :attr:`feat_data` to all edges in :attr:`graph`, and return a """Generate an edge feature equal to the graph-level feature :attr:`graph_feat`.
tensor of edge features.
The operation is similar to ``numpy.repeat`` (or ``torch.repeat_interleave``).
It is commonly used to normalize edge features by a global vector. For example,
to normalize edge features across graph to range :math:`[0~1)`:
>>> g = dgl.batch([...]) # batch multiple graphs
>>> g.edata['h'] = ... # some edge features
>>> h_sum = dgl.broadcast_edges(g, dgl.sum_edges(g, 'h'))
>>> g.edata['h'] /= h_sum # normalize by summation
Parameters Parameters
---------- ----------
graph : DGLGraph graph : DGLGraph
The graph. The graph.
feat_data : tensor graph_feat : tensor
The feature to broadcast. Tensor shape is :math:`(*)` for single The feature to broadcast. Tensor shape is :math:`(*)` for single graph, and
graph, and :math:`(B, *)` for batched graph. :math:`(B, *)` for batched graph.
etype : str, tuple of str, optional
Edge type. Can be omitted if there is only one edge type in the graph.
Returns Returns
------- -------
tensor Tensor
The edge features tensor with shape :math:`(E, *)` The edge features tensor with shape :math:`(M, *)`, where :math:`M` is the
number of edges.
Examples Examples
-------- --------
...@@ -863,14 +416,8 @@ def broadcast_edges(graph, feat_data): ...@@ -863,14 +416,8 @@ def broadcast_edges(graph, feat_data):
Create two :class:`~dgl.DGLGraph` objects and initialize their Create two :class:`~dgl.DGLGraph` objects and initialize their
edge features. edge features.
>>> g1 = dgl.DGLGraph() # Graph 1 >>> g1 = dgl.graph(([0], [1])) # Graph 1
>>> g1.add_nodes(2) >>> g2 = dgl.graph(([0, 1], [1, 2])) # Graph 2
>>> g1.add_edges([0, 1], [1, 0])
>>> g2 = dgl.DGLGraph() # Graph 2
>>> g2.add_nodes(3)
>>> g2.add_edges([0, 1, 2], [1, 2, 0])
>>> bg = dgl.batch([g1, g2]) >>> bg = dgl.batch([g1, g2])
>>> feat = th.rand(2, 5) >>> feat = th.rand(2, 5)
>>> feat >>> feat
...@@ -882,32 +429,119 @@ def broadcast_edges(graph, feat_data): ...@@ -882,32 +429,119 @@ def broadcast_edges(graph, feat_data):
>>> dgl.broadcast_edges(bg, feat) >>> dgl.broadcast_edges(bg, feat)
tensor([[0.4325, 0.7710, 0.5541, 0.0544, 0.9368], tensor([[0.4325, 0.7710, 0.5541, 0.0544, 0.9368],
[0.4325, 0.7710, 0.5541, 0.0544, 0.9368],
[0.2721, 0.4629, 0.7269, 0.0724, 0.1014],
[0.2721, 0.4629, 0.7269, 0.0724, 0.1014], [0.2721, 0.4629, 0.7269, 0.0724, 0.1014],
[0.2721, 0.4629, 0.7269, 0.0724, 0.1014]]) [0.2721, 0.4629, 0.7269, 0.0724, 0.1014]])
Broadcast feature to all edges in the batched graph. Broadcast feature to all edges in the single graph.
>>> dgl.broadcast_edges(g2, feat[1])
tensor([[0.2721, 0.4629, 0.7269, 0.0724, 0.1014],
[0.2721, 0.4629, 0.7269, 0.0724, 0.1014]])
See Also
--------
broadcast_nodes
"""
return F.repeat(graph_feat, graph.batch_num_edges(etype), dim=0)
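For intuition, both broadcast readouts are a segment-wise repeat: row ``i`` of the graph-level feature is copied once per node (or edge) of graph ``i``. A minimal PyTorch-only sketch of the same idea (an illustrative reference with a made-up helper name, not the backend-dispatched implementation above):

```python
import torch

def broadcast_nodes_ref(batch_num_nodes, graph_feat):
    # graph_feat: (B, D) graph-level features; batch_num_nodes: (B,) node counts per graph.
    # Row i is repeated batch_num_nodes[i] times, yielding one row per node in the batch.
    return torch.repeat_interleave(graph_feat, batch_num_nodes, dim=0)

feat = torch.rand(2, 5)
out = broadcast_nodes_ref(torch.tensor([2, 3]), feat)
print(out.shape)  # torch.Size([5, 5]) -- 2 + 3 nodes
```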
READOUT_ON_ATTRS = {
'nodes': ('ndata', 'batch_num_nodes', 'number_of_nodes'),
'edges': ('edata', 'batch_num_edges', 'number_of_edges'),
}
def _topk_on(graph, typestr, feat, k, descending, sortby, ntype_or_etype):
"""Internal function to take graph-wise top-k node/edge features of
field :attr:`feat` in :attr:`graph` ranked by keys at given
index :attr:`sortby`. If :attr:`descending` is set to False, return the
k smallest elements instead.
Parameters
----------
graph : DGLGraph
The graph
typestr : str
'nodes' or 'edges'
feat : str
The feature field name.
k : int
The :math:`k` in "top-:math:`k`".
descending : bool
Controls whether to return the largest or smallest elements,
defaults to True.
sortby : int
The key index we sort :attr:`feat` on, if set to None, we sort
the whole :attr:`feat`.
ntype_or_etype : str, tuple of str
Node/edge type.
Returns
-------
sorted_feat : Tensor
A tensor with shape :math:`(B, K, D)`, where
:math:`B` is the batch size of the input graph.
sorted_idx : Tensor
A tensor with shape :math:`(B, K)` (:math:`(B, K, D)` if sortby
is set to None), where
:math:`B` is the batch size of the input graph, :math:`D`
is the feature size.
Notes
-----
If an example has :math:`n` nodes/edges and :math:`n<k`, in the first
returned tensor the :math:`n+1` to :math:`k`-th rows would be padded
with zeros; in the second returned tensor, the behavior of the :math:`n+1`
to :math:`k`-th elements is not defined.
"""
_, batch_num_objs_attr, _ = READOUT_ON_ATTRS[typestr]
data = getattr(graph, typestr)[ntype_or_etype].data
if F.ndim(data[feat]) > 2:
raise DGLError('Only support {} feature `{}` with dimension less than or'
' equal to 2'.format(typestr, feat))
feat = data[feat]
hidden_size = F.shape(feat)[-1]
batch_num_objs = getattr(graph, batch_num_objs_attr)(ntype_or_etype)
batch_size = len(batch_num_objs)
length = max(max(F.asnumpy(batch_num_objs)), k)
fill_val = -float('inf') if descending else float('inf')
feat_ = F.pad_packed_tensor(feat, batch_num_objs, fill_val, l_min=k)
if sortby is not None:
keys = F.squeeze(F.slice_axis(feat_, -1, sortby, sortby+1), -1)
order = F.argsort(keys, -1, descending=descending)
else:
order = F.argsort(feat_, 1, descending=descending)
topk_indices = F.slice_axis(order, 1, 0, k)
# zero padding
feat_ = F.pad_packed_tensor(feat, batch_num_objs, 0, l_min=k)
if sortby is not None:
feat_ = F.reshape(feat_, (batch_size * length, -1))
shift = F.repeat(F.arange(0, batch_size) * length, k, -1)
shift = F.copy_to(shift, F.context(feat))
topk_indices_ = F.reshape(topk_indices, (-1,)) + shift
else:
feat_ = F.reshape(feat_, (-1,))
shift = F.repeat(F.arange(0, batch_size), k * hidden_size, -1) * length * hidden_size +\
F.cat([F.arange(0, hidden_size)] * batch_size * k, -1)
shift = F.copy_to(shift, F.context(feat))
topk_indices_ = F.reshape(topk_indices, (-1,)) * hidden_size + shift
return F.reshape(F.gather_row(feat_, topk_indices_), (batch_size, k, -1)),\
topk_indices
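The index arithmetic above packs every graph to a common length, sorts, and gathers rows from the flattened buffer. A loop-based PyTorch sketch of the same graph-wise top-k, covering only the ``sortby is not None`` branch (``topk_on_ref`` is an illustrative name, not part of DGL):

```python
import torch

def topk_on_ref(feat, batch_num_objs, k, sortby, descending=True):
    # feat: (N, D) packed node/edge features; batch_num_objs: per-graph counts (list of ints).
    fill = float('-inf') if descending else float('inf')
    outs, idxs = [], []
    for chunk in torch.split(feat, batch_num_objs, dim=0):
        n, d = chunk.shape
        if n < k:  # pad short graphs so every graph yields exactly k rows
            chunk = torch.cat([chunk, torch.full((k - n, d), fill)], dim=0)
        order = torch.argsort(chunk[:, sortby], descending=descending)[:k]
        top = chunk[order]
        outs.append(torch.where(torch.isinf(top), torch.zeros_like(top), top))  # zero out padding
        idxs.append(order)
    return torch.stack(outs), torch.stack(idxs)  # (B, k, D) and (B, k)
```

As in the notes above, the rows gathered from padding are zeroed in the feature output, while the corresponding index entries are not meaningful.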
def topk_nodes(graph, feat, k, *, descending=True, sortby=None, ntype=None):
"""Perform a graph-wise top-k on node features :attr:`feat` in
:attr:`graph` by feature at index :attr:`sortby`. If :attr:
`descending` is set to False, return the k smallest elements instead.
If :attr:`sortby` is set to None, the function would perform top-k on
all dimensions independently, equivalent to calling
:code:`torch.topk(graph.ndata[feat], dim=0)`.
Parameters
----------
@@ -919,22 +553,21 @@ def topk_nodes(graph, feat, k, descending=True, idx=None):
The k in "top-k"
descending : bool
Controls whether to return the largest or smallest elements.
sortby : int, optional
Sort according to which feature. If None, all features are sorted independently.
ntype : str, optional
Node type. Can be omitted if there is only one node type in the graph.
Returns
-------
sorted_feat : Tensor
A tensor with shape :math:`(B, K, D)`, where
:math:`B` is the batch size of the input graph.
sorted_idx : Tensor
A tensor with shape :math:`(B, K)` (:math:`(B, K, D)` if sortby
is set to None), where
:math:`B` is the batch size of the input graph, :math:`D`
is the feature size.
Examples
--------
@@ -945,8 +578,7 @@ def topk_nodes(graph, feat, k, descending=True, idx=None):
Create two :class:`~dgl.DGLGraph` objects and initialize their
node features.
>>> g1 = dgl.graph(([0, 1], [2, 3]))  # Graph 1
>>> g1.ndata['h'] = th.rand(4, 5)
>>> g1.ndata['h']
tensor([[0.0297, 0.8307, 0.9140, 0.6702, 0.3346],
@@ -954,8 +586,7 @@ def topk_nodes(graph, feat, k, descending=True, idx=None):
[0.0880, 0.6515, 0.4451, 0.7507, 0.5297],
[0.5171, 0.6379, 0.2695, 0.8954, 0.5197]])
>>> g2 = dgl.graph(([0, 1, 2], [2, 3, 4]))  # Graph 2
>>> g2.ndata['h'] = th.rand(5, 5)
>>> g2.ndata['h']
tensor([[0.3168, 0.3174, 0.5303, 0.0804, 0.3808],
@@ -966,63 +597,58 @@ def topk_nodes(graph, feat, k, descending=True, idx=None):
Top-k over node attribute :attr:`h` in a batched graph.
>>> bg = dgl.batch([g1, g2], ndata=['h'])
>>> dgl.topk_nodes(bg, 'h', 3)
(tensor([[[0.5901, 0.8307, 0.9280, 0.8954, 0.7997],
[0.5171, 0.6515, 0.9140, 0.7507, 0.5297],
[0.0880, 0.6379, 0.4451, 0.6893, 0.5197]],
[[0.5065, 0.9105, 0.5692, 0.8489, 0.3872],
[0.3168, 0.5182, 0.5418, 0.6114, 0.3808],
[0.1931, 0.4954, 0.5303, 0.3934, 0.1458]]]), tensor([[[1, 0, 1, 3, 1],
[3, 2, 0, 2, 2],
[2, 3, 2, 1, 3]],
[[4, 2, 2, 2, 4],
[0, 4, 4, 1, 0],
[3, 3, 0, 3, 1]]]))
Top-k over node attribute :attr:`h` along the last dimension in a batched graph
(used in SortPooling).
>>> dgl.topk_nodes(bg, 'h', 3, sortby=-1)
(tensor([[[0.5901, 0.3030, 0.9280, 0.6893, 0.7997],
[0.0880, 0.6515, 0.4451, 0.7507, 0.5297],
[0.5171, 0.6379, 0.2695, 0.8954, 0.5197]],
[[0.5065, 0.5182, 0.5418, 0.1520, 0.3872],
[0.3168, 0.3174, 0.5303, 0.0804, 0.3808],
[0.1323, 0.2766, 0.4318, 0.6114, 0.1458]]]), tensor([[1, 2, 3],
[4, 0, 1]]))
Top-k over node attribute :attr:`h` in a single graph.
>>> dgl.topk_nodes(g1, 'h', 3)
(tensor([[[0.5901, 0.8307, 0.9280, 0.8954, 0.7997],
[0.5171, 0.6515, 0.9140, 0.7507, 0.5297],
[0.0880, 0.6379, 0.4451, 0.6893, 0.5197]]]), tensor([[[1, 0, 1, 3, 1],
[3, 2, 0, 2, 2],
[2, 3, 2, 1, 3]]]))
Notes
-----
If an example has :math:`n` nodes and :math:`n<k`, the ``sorted_feat``
tensor will pad the :math:`n+1` to :math:`k`-th rows with zeros;
the corresponding entries of ``sorted_idx`` are not defined.
"""
return _topk_on(graph, 'nodes', feat, k,
descending=descending, sortby=sortby,
ntype_or_etype=ntype)
def topk_edges(graph, feat, k, *, descending=True, sortby=None, etype=None):
"""Perform a graph-wise top-k on edge features :attr:`feat` in
:attr:`graph` by feature at index :attr:`sortby`. If :attr:`descending`
is set to False, return the k smallest elements instead.
If :attr:`sortby` is set to None, the function would perform top-k on
all dimensions independently, equivalent to calling
:code:`torch.topk(graph.edata[feat], dim=0)`.
Parameters
----------
@@ -1031,25 +657,24 @@ def topk_edges(graph, feat, k, descending=True, idx=None):
feat : str
The feature field.
k : int
The k in "top-k"
descending : bool
Controls whether to return the largest or smallest elements.
sortby : int, optional
Sort according to which feature. If None, all features are sorted independently.
etype : str, tuple of str, optional
Edge type. Can be omitted if there is only one edge type in the graph.
Returns
-------
sorted_feat : Tensor
A tensor with shape :math:`(B, K, D)`, where
:math:`B` is the batch size of the input graph.
sorted_idx : Tensor
A tensor with shape :math:`(B, K)` (:math:`(B, K, D)` if sortby
is set to None), where
:math:`B` is the batch size of the input graph, :math:`D`
is the feature size.
Examples
--------
@@ -1060,9 +685,7 @@ def topk_edges(graph, feat, k, descending=True, idx=None):
Create two :class:`~dgl.DGLGraph` objects and initialize their
edge features.
>>> g1 = dgl.graph(([0, 1, 2, 3], [1, 2, 3, 0]))  # Graph 1
>>> g1.edata['h'] = th.rand(4, 5)
>>> g1.edata['h']
tensor([[0.0297, 0.8307, 0.9140, 0.6702, 0.3346],
@@ -1070,9 +693,7 @@ def topk_edges(graph, feat, k, descending=True, idx=None):
[0.0880, 0.6515, 0.4451, 0.7507, 0.5297],
[0.5171, 0.6379, 0.2695, 0.8954, 0.5197]])
>>> g2 = dgl.graph(([0, 1, 2, 3, 4], [1, 2, 3, 4, 0]))  # Graph 2
>>> g2.edata['h'] = th.rand(5, 5)
>>> g2.edata['h']
tensor([[0.3168, 0.3174, 0.5303, 0.0804, 0.3808],
@@ -1083,49 +704,46 @@ def topk_edges(graph, feat, k, descending=True, idx=None):
Top-k over edge attribute :attr:`h` in a batched graph.
>>> bg = dgl.batch([g1, g2], edata=['h'])
>>> dgl.topk_edges(bg, 'h', 3)
(tensor([[[0.5901, 0.8307, 0.9280, 0.8954, 0.7997],
[0.5171, 0.6515, 0.9140, 0.7507, 0.5297],
[0.0880, 0.6379, 0.4451, 0.6893, 0.5197]],
[[0.5065, 0.9105, 0.5692, 0.8489, 0.3872],
[0.3168, 0.5182, 0.5418, 0.6114, 0.3808],
[0.1931, 0.4954, 0.5303, 0.3934, 0.1458]]]), tensor([[[1, 0, 1, 3, 1],
[3, 2, 0, 2, 2],
[2, 3, 2, 1, 3]],
[[4, 2, 2, 2, 4],
[0, 4, 4, 1, 0],
[3, 3, 0, 3, 1]]]))
Top-k over edge attribute :attr:`h` along index -1 in a batched graph
(used in SortPooling).
>>> dgl.topk_edges(bg, 'h', 3, sortby=-1)
(tensor([[[0.5901, 0.3030, 0.9280, 0.6893, 0.7997],
[0.0880, 0.6515, 0.4451, 0.7507, 0.5297],
[0.5171, 0.6379, 0.2695, 0.8954, 0.5197]],
[[0.5065, 0.5182, 0.5418, 0.1520, 0.3872],
[0.3168, 0.3174, 0.5303, 0.0804, 0.3808],
[0.1323, 0.2766, 0.4318, 0.6114, 0.1458]]]), tensor([[1, 2, 3],
[4, 0, 1]]))
Top-k over edge attribute :attr:`h` in a single graph.
>>> dgl.topk_edges(g1, 'h', 3)
(tensor([[[0.5901, 0.8307, 0.9280, 0.8954, 0.7997],
[0.5171, 0.6515, 0.9140, 0.7507, 0.5297],
[0.0880, 0.6379, 0.4451, 0.6893, 0.5197]]]), tensor([[[1, 0, 1, 3, 1],
[3, 2, 0, 2, 2],
[2, 3, 2, 1, 3]]]))
Notes
-----
If an example has :math:`n` edges and :math:`n<k`, the ``sorted_feat``
tensor will pad the :math:`n+1` to :math:`k`-th rows with zeros;
the corresponding entries of ``sorted_idx`` are not defined.
"""
return _topk_on(graph, 'edges', feat, k,
descending=descending, sortby=sortby,
ntype_or_etype=etype)
@@ -168,7 +168,8 @@ def build_gidx_and_mapping_uv(edge_tuples, num_src, num_dst):
Number of ints needed to represent the graph
"""
u, v, eid = edge_tuples
gidx = create_unitgraph_from_coo(2, num_src, num_dst,
u.tousertensor(), v.tousertensor(), ['coo', 'csr', 'csc'])
forward, backward = gidx.get_csr_shuffle_order(0)
eid = eid.tousertensor()
nbits = gidx.bits_needed(0)
...
@@ -59,10 +59,11 @@ def sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False):
if len(g.ntypes) > 1:
raise DGLError("Must specify node type when the graph is not homogeneous.")
nodes = {g.ntypes[0] : nodes}
nodes = utils.prepare_tensor_dict(g, nodes, 'nodes')
nodes_all_types = []
for ntype in g.ntypes:
if ntype in nodes:
nodes_all_types.append(F.to_dgl_nd(nodes[ntype]))
else:
nodes_all_types.append(nd.array([], ctx=nd.cpu()))
@@ -75,7 +76,7 @@ def sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False):
fanout_array = [None] * len(g.etypes)
for etype, value in fanout.items():
fanout_array[g.get_etype_id(etype)] = value
fanout_array = F.to_dgl_nd(F.tensor(fanout_array, dtype=F.int64))
if prob is None:
prob_arrays = [nd.array([], ctx=nd.cpu())] * len(g.etypes)
@@ -83,7 +84,7 @@ def sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False):
prob_arrays = []
for etype in g.canonical_etypes:
if prob in g.edges[etype].data:
prob_arrays.append(F.to_dgl_nd(g.edges[etype].data[prob]))
else:
prob_arrays.append(nd.array([], ctx=nd.cpu()))
@@ -92,7 +93,7 @@ def sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False):
induced_edges = subgidx.induced_edges
ret = DGLHeteroGraph(subgidx.graph, g.ntypes, g.etypes)
for i, etype in enumerate(ret.canonical_etypes):
ret.edges[etype].data[EID] = induced_edges[i]
return ret
def select_topk(g, k, weight, nodes=None, edge_dir='in', ascending=False):
@@ -140,10 +141,11 @@ def select_topk(g, k, weight, nodes=None, edge_dir='in', ascending=False):
nodes = {g.ntypes[0] : nodes}
# Parse nodes into a list of NDArrays.
nodes = utils.prepare_tensor_dict(g, nodes, 'nodes')
nodes_all_types = []
for ntype in g.ntypes:
if ntype in nodes:
nodes_all_types.append(F.to_dgl_nd(nodes[ntype]))
else:
nodes_all_types.append(nd.array([], ctx=nd.cpu()))
@@ -156,12 +158,12 @@ def select_topk(g, k, weight, nodes=None, edge_dir='in', ascending=False):
k_array = [None] * len(g.etypes)
for etype, value in k.items():
k_array[g.get_etype_id(etype)] = value
k_array = F.to_dgl_nd(F.tensor(k_array, dtype=F.int64))
weight_arrays = []
for etype in g.canonical_etypes:
if weight in g.edges[etype].data:
weight_arrays.append(F.to_dgl_nd(g.edges[etype].data[weight]))
else:
raise DGLError('Edge weights "{}" do not exist for relation graph "{}".'.format(
weight, etype))
@@ -171,7 +173,7 @@ def select_topk(g, k, weight, nodes=None, edge_dir='in', ascending=False):
induced_edges = subgidx.induced_edges
ret = DGLHeteroGraph(subgidx.graph, g.ntypes, g.etypes)
for i, etype in enumerate(ret.canonical_etypes):
ret.edges[etype].data[EID] = induced_edges[i]
return ret
...
"""Segment aggregation operators implemented using DGL graph."""
from .base import DGLError
from . import backend as F
from . import convert
from . import function as fn
def segment_reduce(seglen, value, reducer='sum'):
"""Segment reduction operator.
It aggregates the value tensor along the first dimension by segments.
The first argument ``seglen`` stores the length of each segment. Its
summation must be equal to the first dimension of the ``value`` tensor.
Zero-length segments are allowed.
Parameters
----------
seglen : Tensor
Segment lengths.
value : Tensor
Value to aggregate.
reducer : str, optional
Aggregation method. Can be 'sum', 'max', 'min', 'mean'.
Returns
-------
Tensor
Aggregated tensor of shape ``(len(seglen), value.shape[1:])``.
Examples
--------
>>> import dgl
>>> import torch as th
>>> val = th.ones(10, 3)
>>> seg = th.tensor([1, 0, 5, 4]) # 4 segments
>>> dgl.segment_reduce(seg, val)
tensor([[1., 1., 1.],
[0., 0., 0.],
[5., 5., 5.],
[4., 4., 4.]])
"""
ctx = F.context(seglen)
# TODO(minjie): a more efficient implementation is to create a graph
# directly from a CSR structure.
u = F.copy_to(F.arange(0, F.shape(value)[0], F.int32), ctx)
v = F.repeat(F.copy_to(F.arange(0, len(seglen), F.int32), ctx),
seglen, dim=0)
if len(u) != len(v):
raise DGLError("Invalid seglen array:", seglen,
". Its summation must be equal to value.shape[0].")
g = convert.bipartite((u, v))
g.srcdata['h'] = value
g.update_all(fn.copy_u('h', 'm'), getattr(fn, reducer)('m', 'h'))
return g.dstdata['h']
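For the ``'sum'`` reducer, the same aggregation can be written directly with ``index_add_``. A small PyTorch reference (``segment_sum_ref`` is a hypothetical helper, not the graph-based implementation above):

```python
import torch

def segment_sum_ref(seglen, value):
    # Bucket i receives the sum of seglen[i] consecutive rows of `value`.
    seg_id = torch.repeat_interleave(torch.arange(len(seglen)), seglen)
    out = torch.zeros(len(seglen), *value.shape[1:], dtype=value.dtype)
    return out.index_add_(0, seg_id, value)

print(segment_sum_ref(torch.tensor([1, 0, 5, 4]), torch.ones(10, 3)))
# tensor([[1., 1., 1.], [0., 0., 0.], [5., 5., 5.], [4., 4., 4.]])
```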
def segment_softmax(seglen, value):
"""Performa softmax on each segment.
The first argument ``seglen`` stores the length of each segment. Its
summation must be equal to the first dimension of the ``value`` tensor.
Zero-length segments are allowed.
Parameters
----------
seglen : Tensor
Segment lengths.
value : Tensor
Values to compute the per-segment softmax over.
Returns
-------
Tensor
Result tensor of the same shape as the ``value`` tensor.
Examples
--------
>>> import dgl
>>> import torch as th
>>> val = th.ones(10, 3)
>>> seg = th.tensor([1, 0, 5, 4]) # 4 segments
>>> dgl.segment_softmax(seg, val)
tensor([[1.0000, 1.0000, 1.0000],
[0.2000, 0.2000, 0.2000],
[0.2000, 0.2000, 0.2000],
[0.2000, 0.2000, 0.2000],
[0.2000, 0.2000, 0.2000],
[0.2000, 0.2000, 0.2000],
[0.2500, 0.2500, 0.2500],
[0.2500, 0.2500, 0.2500],
[0.2500, 0.2500, 0.2500],
[0.2500, 0.2500, 0.2500]])
"""
value_max = segment_reduce(seglen, value, reducer='max')
value = F.exp(value - F.repeat(value_max, seglen, dim=0))
value_sum = segment_reduce(seglen, value, reducer='sum')
return value / F.repeat(value_sum, seglen, dim=0)
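Subtracting the per-segment max before exponentiating is the standard numerical-stability trick; the result is an ordinary softmax applied within each segment. A loop-based PyTorch check (``segment_softmax_ref`` is illustrative only):

```python
import torch

def segment_softmax_ref(seglen, value):
    # Apply an ordinary softmax within each segment of `value`.
    chunks = torch.split(value, seglen.tolist(), dim=0)
    return torch.cat([torch.softmax(c, dim=0) for c in chunks], dim=0)

print(segment_softmax_ref(torch.tensor([1, 0, 5, 4]), torch.ones(10, 3)))
```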
@@ -5,7 +5,6 @@ from __future__ import absolute_import
import dgl.ndarray as nd
from ._ffi.function import _init_api
from .base import DGLError
from . import backend as F
def infer_broadcast_shape(op, shp1, shp2):
@@ -115,43 +114,44 @@ def _gspmm(gidx, op, reduce_op, u, e):
(90,) to (90, 1) for a graph with 90 nodes/edges).
"""
if gidx.number_of_etypes() != 1:
raise DGLError("We only support gspmm on graph with one edge type")
use_u = op != 'copy_rhs'
use_e = op != 'copy_lhs'
# deal with scalar features.
expand_u, expand_e = False, False
if use_u:
if F.ndim(u) == 1:
u = F.unsqueeze(u, -1)
expand_u = True
if use_e:
if F.ndim(e) == 1:
e = F.unsqueeze(e, -1)
expand_e = True
ctx = F.context(u) if use_u else F.context(e)
dtype = F.dtype(u) if use_u else F.dtype(e)
u_shp = F.shape(u) if use_u else (0,)
e_shp = F.shape(e) if use_e else (0,)
_, dsttype = gidx.metagraph.find_edge(0)
v_shp = (gidx.number_of_nodes(dsttype), ) +\
infer_broadcast_shape(op, u_shp[1:], e_shp[1:])
v = F.zeros(v_shp, dtype, ctx)
use_cmp = reduce_op in ['max', 'min']
arg_u, arg_e = None, None
idtype = getattr(F, gidx.dtype)
if use_cmp:
if use_u:
arg_u = F.zeros(v_shp, idtype, ctx)
if use_e:
arg_e = F.zeros(v_shp, idtype, ctx)
if gidx.number_of_edges(0) > 0:
_CAPI_DGLKernelSpMM(gidx, op, reduce_op,
to_dgl_nd(u if use_u else None),
to_dgl_nd(e if use_e else None),
to_dgl_nd_for_write(v),
to_dgl_nd_for_write(arg_u),
to_dgl_nd_for_write(arg_e))
if (expand_u or not use_u) and (expand_e or not use_e):
v = F.squeeze(v, -1)
return v, (arg_u, arg_e)
def _gsddmm(gidx, op, lhs, rhs, lhs_target='u', rhs_target='v'):
@@ -200,13 +200,16 @@ def _gsddmm(gidx, op, lhs, rhs, lhs_target='u', rhs_target='v'):
raise DGLError("We only support gsddmm on graph with one edge type")
use_lhs = op != 'copy_rhs'
use_rhs = op != 'copy_lhs'
# deal with scalar features.
expand_lhs, expand_rhs = False, False
if use_lhs:
if F.ndim(lhs) == 1:
lhs = F.unsqueeze(lhs, -1)
expand_lhs = True
if use_rhs:
if F.ndim(rhs) == 1:
rhs = F.unsqueeze(rhs, -1)
expand_rhs = True
lhs_target = target_mapping[lhs_target]
rhs_target = target_mapping[rhs_target]
ctx = F.context(lhs) if use_lhs else F.context(rhs)
@@ -217,12 +220,13 @@ def _gsddmm(gidx, op, lhs, rhs, lhs_target='u', rhs_target='v'):
infer_broadcast_shape(op, lhs_shp[1:], rhs_shp[1:])
out = F.zeros(out_shp, dtype, ctx)
if gidx.number_of_edges(0) > 0:
_CAPI_DGLKernelSDDMM(gidx, op,
to_dgl_nd(lhs if use_lhs else None),
to_dgl_nd(rhs if use_rhs else None),
to_dgl_nd_for_write(out),
lhs_target, rhs_target)
if (expand_lhs or not use_lhs) and (expand_rhs or not use_rhs):
out = F.squeeze(out, -1)
return out
_init_api("dgl.sparse")
@@ -7,43 +7,43 @@ import numpy as np
from scipy import sparse
from ._ffi.function import _init_api
from .base import EID, NID, dgl_warning, DGLError, is_internal_column
from . import convert
from .heterograph import DGLHeteroGraph, DGLBlock
from . import ndarray as nd
from . import backend as F
from . import utils, batch
from .partition import metis_partition_assignment as hetero_metis_partition_assignment
from .partition import partition_graph_with_halo as hetero_partition_graph_with_halo
from .partition import metis_partition as hetero_metis_partition
# TO BE DEPRECATED
from .graph import DGLGraph as DGLGraphStale
from .graph_index import _get_halo_subgraph_inner_node
__all__ = [
'line_graph',
'khop_adj',
'khop_graph',
'reverse',
'to_bidirected',
'to_bidirected_stale',
'laplacian_lambda_max',
'knn_graph',
'segmented_knn_graph',
'add_edges',
'add_nodes',
'remove_edges',
'remove_nodes',
'add_self_loop',
'remove_self_loop',
'metapath_reachable_graph',
'compact_graphs',
'to_block',
'to_simple',
'to_simple_graph',
'in_subgraph',
'out_subgraph',
'as_immutable_graph',
'as_heterograph']
@@ -102,8 +102,7 @@ def knn_graph(x, k):
(F.asnumpy(F.zeros_like(dst) + 1), (F.asnumpy(dst), F.asnumpy(src))),
shape=(n_samples * n_points, n_samples * n_points))
return convert.graph(adj)
#pylint: disable=invalid-name
def segmented_knn_graph(x, k, segs):
@@ -145,7 +144,7 @@ def segmented_knn_graph(x, k, segs):
src = F.reshape(src, (-1,))
adj = sparse.csr_matrix((F.asnumpy(F.zeros_like(dst) + 1), (F.asnumpy(dst), F.asnumpy(src))))
g = convert.graph(adj)
return g
def to_bidirected(g, readonly=None, copy_ndata=True,
@@ -262,7 +261,7 @@ def to_bidirected(g, readonly=None, copy_ndata=True,
u, v = g.edges(form='uv', order='eid', etype=c_etype)
subgs[c_etype] = (F.cat([u, v], dim=0), F.cat([v, u], dim=0))
new_g = convert.heterograph(subgs)
else:
subgs = {}
for c_etype in canonical_etypes:
@@ -273,7 +272,7 @@ def to_bidirected(g, readonly=None, copy_ndata=True,
u, v = g.edges(form='uv', order='eid', etype=c_etype)
subgs[c_etype] = (F.cat([u, v], dim=0), F.cat([v, u], dim=0))
new_g = convert.heterograph(subgs)
# handle features
if copy_ndata:
@@ -299,27 +298,6 @@ def to_bidirected(g, readonly=None, copy_ndata=True,
def line_graph(g, backtracking=True, shared=False):
"""Return the line graph of this graph.
Parameters
----------
g : dgl.DGLGraph
The input graph.
backtracking : bool, optional
Whether the returned line graph is backtracking.
shared : bool, optional
Whether the returned line graph shares representations with `self`.
Returns
-------
DGLGraph
The line graph of this graph.
"""
graph_data = g._graph.line_graph(backtracking)
node_frame = g._edge_frame if shared else None
return DGLGraph(graph_data, node_frame)
def line_heterograph(g, backtracking=True):
"""Return the line graph of this graph.
The graph should be a directed homogeneous graph. Other types of graphs
are not supported right now.
@@ -327,8 +305,13 @@ def line_heterograph(g, backtracking=True):
Parameters
----------
g : DGLGraph
Input graph.
backtracking : bool, optional
Whether the pair of (v, u), (u, v) edges are treated as linked. Default: True.
shared : bool, optional
Whether to copy the edge features of the original graph as the node features
of the resulting line graph.
Returns
-------
@@ -357,12 +340,15 @@ def line_heterograph(g, backtracking=True):
... (tensor([0, 1, 2, 4]), tensor([4, 0, 3, 1]))
"""
assert g.is_homogeneous(), \
'line_heterograph only support directed homogeneous graph right now'
lg = DGLHeteroGraph(_CAPI_DGLHeteroLineGraph(g._graph, backtracking))
if shared:
# copy edge features
lg.ndata.update(g.edata)
return lg
DGLHeteroGraph.line_graph = line_graph
def khop_adj(g, k):
"""Return the matrix of :math:`A^k` where :math:`A` is the adjacency matrix of :math:`g`,
@@ -456,82 +442,9 @@ def khop_graph(g, k):
col = np.repeat(adj_k.col, multiplicity)
# TODO(zihao): we should support creating multi-graph from scipy sparse matrix
# in the future.
return convert.graph((row, col), num_nodes=n)
def reverse(g, copy_ndata=False, copy_edata=False):
"""Return the reverse of a graph
The reverse (also called converse, transpose) of a directed graph is another directed
graph on the same nodes with edges reversed in terms of direction.
Given a :class:`dgl.DGLGraph` object, we return another :class:`dgl.DGLGraph` object
representing its reverse.
Parameters
----------
g : dgl.DGLGraph
The input graph.
copy_ndata: bool, optional
If True, node attributes are copied from the original graph to the reversed graph.
Otherwise the reversed graph will not be initialized with node attributes.
copy_edata: bool, optional
If True, edge attributes are copied from the original graph to the reversed graph.
Otherwise the reversed graph will not have edge attributes.
Return
------
dgl.DGLGraph
The reversed graph.
Notes
-----
* We do not dynamically update the topology of a graph once that of its reverse changes.
This can be particularly problematic when the node/edge attrs are shared. For example,
if the topology of both the original graph and its reverse get changed independently,
you can get a mismatched node/edge feature.
Examples
--------
Create a graph to reverse.
>>> import dgl
>>> import torch as th
>>> g = dgl.DGLGraph()
>>> g.add_nodes(3)
>>> g.add_edges([0, 1, 2], [1, 2, 0])
>>> g.ndata['h'] = th.tensor([[0.], [1.], [2.]])
>>> g.edata['h'] = th.tensor([[3.], [4.], [5.]])
Reverse the graph and examine its structure.
>>> rg = g.reverse(copy_ndata=True, copy_edata=True)
>>> print(rg)
DGLGraph with 3 nodes and 3 edges.
Node data: {'h': Scheme(shape=(1,), dtype=torch.float32)}
Edge data: {'h': Scheme(shape=(1,), dtype=torch.float32)}
The edges are reversed now.
>>> rg.has_edges_between([1, 2, 0], [0, 1, 2])
tensor([1, 1, 1])
Reversed edges have the same feature as the original ones.
>>> g.edges[[0, 2], [1, 0]].data['h'] == rg.edges[[1, 0], [0, 2]].data['h']
tensor([[1],
[1]], dtype=torch.uint8)
The node/edge features of the reversed graph share memory with the original
graph, which is helpful for both forward computation and back propagation.
>>> g.ndata['h'] = g.ndata['h'] + 1
>>> rg.ndata['h']
tensor([[1.],
[2.],
[3.]])
"""
g_reversed = DGLGraph()
g_reversed.add_nodes(g.number_of_nodes())
g_edges = g.all_edges(order='eid')
g_reversed.add_edges(g_edges[1], g_edges[0])
g_reversed._batch_num_nodes = g._batch_num_nodes
g_reversed._batch_num_edges = g._batch_num_edges
if copy_ndata:
g_reversed._node_frame = g._node_frame
if copy_edata:
g_reversed._edge_frame = g._edge_frame
return g_reversed
def reverse(g, copy_ndata=True, copy_edata=False, *, share_ndata=None, share_edata=None):
r"""Return the reverse of a graph.
The reverse (also called converse, transpose) of a graph with edges
@@ -649,6 +562,14 @@ def reverse_heterograph(g, copy_ndata=True, copy_edata=False):
>>> rg.edges['plays'].data
{}
"""
if share_ndata is not None:
dgl_warning('share_ndata argument has been renamed to copy_ndata.')
copy_ndata = share_ndata
if share_edata is not None:
dgl_warning('share_edata argument has been renamed to copy_edata.')
copy_edata = share_edata
if g.is_block:
raise DGLError('Reversing a block graph is not allowed.')
# TODO(0.5 release, xiangsx) need to handle BLOCK
# currently reversing a block results in undefined behavior
gidx = g._graph.reverse()
@@ -672,12 +593,12 @@ def reverse_heterograph(g, copy_ndata=True, copy_edata=False):
return new_g
DGLHeteroGraph.reverse = reverse
def to_simple_graph(g):
"""Convert the graph to a simple graph with no multi-edge.
DEPRECATED: renamed to dgl.to_simple
Parameters
----------
@@ -689,8 +610,8 @@ def to_simple_graph(g):
DGLGraph
A simple graph.
"""
dgl_warning('dgl.to_simple_graph is renamed to dgl.to_simple in v0.5.')
return to_simple(g)
def to_bidirected_stale(g, readonly=True):
"""Convert the graph to a bidirected graph.
@@ -733,7 +654,7 @@ def to_bidirected_stale(g, readonly=True):
newgidx = _CAPI_DGLToBidirectedImmutableGraph(g._graph)
else:
newgidx = _CAPI_DGLToBidirectedMutableGraph(g._graph)
return DGLGraphStale(newgidx)
def laplacian_lambda_max(g):
"""Return the largest eigenvalue of the normalized symmetric laplacian of g.
@@ -762,7 +683,7 @@ def laplacian_lambda_max(g):
>>> dgl.laplacian_lambda_max(g)
[1.809016994374948]
"""
g_arr = batch.unbatch(g)
rst = []
for g_i in g_arr:
n = g_i.number_of_nodes()
@@ -803,7 +724,6 @@ def metapath_reachable_graph(g, metapath):
A homogeneous or bipartite graph.
"""
adj = 1
for etype in metapath:
adj = adj * g.adj(etype=etype, scipy_fmt='csr', transpose=True)
@@ -812,83 +732,490 @@ def metapath_reachable_graph(g, metapath):
dsttype = g.to_canonical_etype(metapath[-1])[2]
if srctype == dsttype:
assert adj.shape[0] == adj.shape[1]
new_g = convert.graph(adj, ntype=srctype, idtype=g.idtype, device=g.device)
else:
new_g = convert.bipartite(adj, utype=srctype, vtype=dsttype,
idtype=g.idtype, device=g.device)
# copy srcnode features
for key, value in g.nodes[srctype].data.items():
new_g.nodes[srctype].data[key] = value
# copy dstnode features
if srctype != dsttype:
for key, value in g.nodes[dsttype].data.items():
new_g.nodes[dsttype].data[key] = value
return new_g
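The loop above is a chain of sparse matrix products: multiplying the per-relation adjacency matrices along the metapath gives the reachability matrix between the metapath's end node types. A tiny scipy illustration with made-up relations (not tied to any particular dataset):

```python
import numpy as np
from scipy import sparse

# metapath ('user plays game', 'game played-by user') as two sparse adjacencies
plays = sparse.csr_matrix(np.array([[1, 0], [1, 1]]))      # 2 users x 2 games
played_by = sparse.csr_matrix(np.array([[0, 1], [1, 0]]))  # 2 games x 2 users
reach = plays * played_by                                   # user -> user reachability
print(reach.toarray())
```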
def add_nodes(g, num, data=None, ntype=None):
r"""Add new nodes of the same node type.
A new graph with newly added nodes is returned.
Parameters
----------
num : int
Number of nodes to add.
data : dict, optional
Feature data of the added nodes.
ntype : str, optional
The type of the new nodes. Can be omitted if there is
only one node type in the graph.
Return
------
DGLHeteroGraph
The graph with newly added nodes.
Notes
-----
* If the key of ``data`` does not contain some existing feature fields,
those features for the new nodes will be filled with zeros.
* If the key of ``data`` contains new feature fields, those features for
the old nodes will be filled with zeros.
Examples
--------
The following example uses PyTorch backend.
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Node Type**
>>> g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])))
>>> g.num_nodes()
3
>>> g = dgl.add_nodes(g, 2)
>>> g.num_nodes()
5
If the graph has some node features and new nodes are added without
features, their features will be created by initializers defined
with :func:`set_n_initializer`.
>>> g.ndata['h'] = torch.ones(5, 1)
>>> g = dgl.add_nodes(g, 1)
>>> g.ndata['h']
tensor([[1.], [1.], [1.], [1.], [1.], [0.]])
We can also assign features for the new nodes in adding new nodes.
>>> g = dgl.add_nodes(g, 1, {'h': torch.ones(1, 1), 'w': torch.ones(1, 1)})
>>> g.ndata['h']
tensor([[1.], [1.], [1.], [1.], [1.], [0.], [1.]])
Since ``data`` contains new feature fields, the features for old nodes
will be created by initializers defined with :func:`set_n_initializer`.
>>> g.ndata['w']
tensor([[0.], [0.], [0.], [0.], [0.], [0.], [1.]])
**Heterogeneous Graphs with Multiple Node Types**
>>> g = dgl.heterograph({
>>> ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
>>> torch.tensor([0, 0, 1, 1])),
>>> ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
>>> torch.tensor([0, 1]))
>>> })
>>> g = dgl.add_nodes(g, 2)
DGLError: Node type name must be specified
if there are more than one node types.
>>> g.num_nodes('user')
3
>>> g = dgl.add_nodes(g, 2, ntype='user')
>>> g.num_nodes('user')
5
See Also
--------
remove_nodes
add_edges
remove_edges
"""
g = g.clone()
g.add_nodes(num, data=data, ntype=ntype)
return g
def add_edges(g, u, v, data=None, etype=None):
r"""Add multiple new edges for the specified edge type.
A new graph with newly added edges is returned.
The i-th new edge will be from ``u[i]`` to ``v[i]``.
Parameters
----------
u : int, tensor, numpy.ndarray, list
Source node IDs, ``u[i]`` gives the source node for the i-th new edge.
v : int, tensor, numpy.ndarray, list
Destination node IDs, ``v[i]`` gives the destination node for the i-th new edge.
data : dict, optional
Feature data of the added edges. The i-th row of the feature data
corresponds to the i-th new edge.
etype : str or tuple of str, optional
The type of the new edges. Can be omitted if there is
only one edge type in the graph.
Return
------
DGLHeteroGraph
The graph with newly added edges.
Notes
-----
* If the end nodes of the added edges do not exist, add_nodes is invoked
to add new nodes. The node features of the new nodes will be created
by initializers defined with :func:`set_n_initializer` (default
initializer fills zeros). In certain cases, it is recommended to
invoke add_nodes first and then add_edges.
* If the key of ``data`` does not contain some existing feature fields,
those features for the new edges will be created by initializers
defined with :func:`set_n_initializer` (default initializer fills zeros).
* If the key of ``data`` contains new feature fields, those features for
the old edges will be created by initializers defined with
:func:`set_n_initializer` (default initializer fills zeros).
Examples
--------
The following example uses PyTorch backend.
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Edge Type**
>>> g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])))
>>> g.num_edges()
2
>>> g = dgl.add_edges(g, torch.tensor([1, 3]), torch.tensor([0, 1]))
>>> g.num_edges()
4
Since ``u`` or ``v`` contains a non-existing node ID, the nodes are
added implicitly.
>>> g.num_nodes()
4
If the graph has some edge features and new edges are added without
features, their features will be created by initializers defined
with :func:`set_n_initializer`.
>>> g.edata['h'] = torch.ones(4, 1)
>>> g = dgl.add_edges(g, torch.tensor([1]), torch.tensor([1]))
>>> g.edata['h']
tensor([[1.], [1.], [1.], [1.], [0.]])
We can also assign features for the new edges in adding new edges.
>>> g = dgl.add_edges(g, torch.tensor([0, 0]), torch.tensor([2, 2]),
>>> {'h': torch.tensor([[1.], [2.]]), 'w': torch.ones(2, 1)})
>>> g.edata['h']
tensor([[1.], [1.], [1.], [1.], [0.], [1.], [2.]])
Since ``data`` contains new feature fields, the features for old edges
will be created by initializers defined with :func:`set_n_initializer`.
>>> g.edata['w']
tensor([[0.], [0.], [0.], [0.], [0.], [1.], [1.]])
**Heterogeneous Graphs with Multiple Edge Types**
>>> g = dgl.heterograph({
>>> ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
>>> torch.tensor([0, 0, 1, 1])),
>>> ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
>>> torch.tensor([0, 1]))
>>> })
>>> g = dgl.add_edges(g, torch.tensor([3]), torch.tensor([3]))
DGLError: Edge type name must be specified
if there are more than one edge types.
>>> g.number_of_edges('plays')
4
>>> g = dgl.add_edges(g, torch.tensor([3]), torch.tensor([3]), etype='plays')
>>> g.number_of_edges('plays')
5
See Also
--------
add_nodes
remove_nodes
remove_edges
""" """
g = g.clone()
g.add_edges(u, v, data=data, etype=etype)
return g
def remove_edges(g, eids, etype=None):
r"""Remove multiple edges with the specified edge type.
A new graph with certain edges deleted is returned.
Nodes will not be removed. After removing edges, the remaining
edges will be re-indexed using consecutive integers from 0,
with their relative order preserved.
The features for the removed edges will be removed accordingly.
Parameters
----------
eids : int, tensor, numpy.ndarray, list
IDs for the edges to remove.
etype : str or tuple of str, optional
The type of the edges to remove. Can be omitted if there is
only one edge type in the graph.
Return
------
DGLHeteroGraph
The graph with edges deleted.
Examples
--------
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Edge Type**
>>> g = dgl.graph((torch.tensor([0, 0, 2]), torch.tensor([0, 1, 2])))
>>> g.edata['he'] = torch.arange(3).float().reshape(-1, 1)
>>> g = dgl.remove_edges(g, torch.tensor([0, 1]))
>>> g
Graph(num_nodes=3, num_edges=1,
ndata_schemes={}
edata_schemes={'he': Scheme(shape=(1,), dtype=torch.float32)})
>>> g.edges('all')
(tensor([2]), tensor([2]), tensor([0]))
>>> g.edata['he']
tensor([[2.]])
**Heterogeneous Graphs with Multiple Edge Types**
>>> g = dgl.heterograph({
>>> ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
>>> torch.tensor([0, 0, 1, 1])),
>>> ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
>>> torch.tensor([0, 1]))
>>> })
>>> g = dgl.remove_edges(g, torch.tensor([0, 1]))
DGLError: Edge type name must be specified
if there are more than one edge types.
>>> g = dgl.remove_edges(g, torch.tensor([0, 1]), 'plays')
>>> g.edges('all', etype='plays')
(tensor([0, 1]), tensor([0, 0]), tensor([0, 1]))
See Also
--------
add_nodes
add_edges
remove_nodes
"""
g = g.clone()
g.remove_edges(eids, etype=etype)
return g
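All of these functional mutation wrappers follow the same clone-then-mutate pattern, so the input graph is left untouched. A quick usage check with the PyTorch backend:

```python
import dgl
import torch

g = dgl.graph((torch.tensor([0, 0, 2]), torch.tensor([0, 1, 2])))
g2 = dgl.remove_edges(g, torch.tensor([0]))  # returns a modified clone
print(g.num_edges(), g2.num_edges())         # 3 2
```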
def remove_nodes(g, nids, ntype=None):
r"""Remove multiple nodes with the specified node type.
A new graph with certain nodes deleted is returned.
Edges that connect to the nodes will be removed as well. After removing
nodes and edges, the remaining nodes and edges will be re-indexed using
consecutive integers from 0, with their relative order preserved.
The features for the removed nodes/edges will be removed accordingly.
Parameters
----------
nids : int, tensor, numpy.ndarray, list
Nodes to remove.
ntype : str, optional
The type of the nodes to remove. Can be omitted if there is
only one node type in the graph.
Return
------
DGLHeteroGraph
The graph with nodes deleted.
Examples
--------
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Node Type**
>>> g = dgl.graph((torch.tensor([0, 0, 2]), torch.tensor([0, 1, 2])))
>>> g.ndata['hv'] = torch.arange(3).float().reshape(-1, 1)
>>> g.edata['he'] = torch.arange(3).float().reshape(-1, 1)
>>> g = dgl.remove_nodes(g, torch.tensor([0, 1]))
>>> g
Graph(num_nodes=1, num_edges=1,
ndata_schemes={'hv': Scheme(shape=(1,), dtype=torch.float32)}
edata_schemes={'he': Scheme(shape=(1,), dtype=torch.float32)})
>>> g.ndata['hv']
tensor([[2.]])
>>> g.edata['he']
tensor([[2.]])
**Heterogeneous Graphs with Multiple Node Types**
>>> g = dgl.heterograph({
>>> ('user', 'plays', 'game'): (torch.tensor([0, 1, 1, 2]),
>>> torch.tensor([0, 0, 1, 1])),
>>> ('developer', 'develops', 'game'): (torch.tensor([0, 1]),
>>> torch.tensor([0, 1]))
>>> })
>>> g = dgl.remove_nodes(g, torch.tensor([0, 1]))
DGLError: Node type name must be specified
if there are more than one node types.
>>> g = dgl.remove_nodes(g, torch.tensor([0, 1]), ntype='game')
>>> g.num_nodes('user')
3
>>> g.num_nodes('game')
0
>>> g.num_edges('plays')
0
See Also
--------
add_nodes
add_edges
remove_edges
"""
g = g.clone()
g.remove_nodes(nids, ntype=ntype)
return g
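# A minimal usage sketch (not part of the library source): removing a node also drops
# its incident edges together with their features. The helper name
# `_example_remove_nodes` is hypothetical and exists only for illustration.
def _example_remove_nodes():
    import dgl
    import torch
    g = dgl.graph((torch.tensor([0, 0, 2]), torch.tensor([0, 1, 2])))
    g.ndata['hv'] = torch.arange(3).float().reshape(-1, 1)
    # Remove node 0; nodes 1 and 2 are re-indexed to 0 and 1, and the two edges
    # incident to node 0 disappear with it.
    g2 = dgl.remove_nodes(g, torch.tensor([0]))
    assert g2.num_nodes() == 2 and g2.num_edges() == 1
    assert g2.ndata['hv'].tolist() == [[1.0], [2.0]]
    return g2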
def add_self_loop(g, etype=None):
r""" Add self loop for each node in the graph.
A new graph with self-loop is returned.
Since **selfloop is not well defined for unidirectional
bipartite graphs**, we simply skip the nodes corresponding
to unidirectional bipartite graphs.
Return
------
DGLHeteroGraph
The graph with self-loop.
Notes
-----
* It is recommended to call ``remove_self_loop`` before invoking
``add_self_loop``.
* Features for the new edges (self-loop edges) will be created
by the initializers defined with :func:`set_e_initializer`
(the default initializer fills zeros).
Examples
--------
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Node Type**
>>> g = dgl.graph((torch.tensor([0, 0, 2]), torch.tensor([2, 1, 0])))
>>> g.ndata['hv'] = torch.arange(3).float().reshape(-1, 1)
>>> g.edata['he'] = torch.arange(3).float().reshape(-1, 1)
>>> g = dgl.add_self_loop(g)
>>> g
Graph(num_nodes=3, num_edges=6,
ndata_schemes={'hv': Scheme(shape=(1,), dtype=torch.float32)}
edata_schemes={'he': Scheme(shape=(1,), dtype=torch.float32)})
>>> g.edata['he']
tensor([[0.],
[1.],
[2.],
[0.],
[0.],
[0.]])
**Heterogeneous Graphs with Multiple Node Types**
>>> g = dgl.heterograph({
>>>     ('user', 'follows', 'user'): (torch.tensor([1, 2]),
>>>                                   torch.tensor([0, 1])),
>>>     ('user', 'plays', 'game'): (torch.tensor([0, 1]),
>>>                                 torch.tensor([0, 1]))})
>>> g = dgl.add_self_loop(g, etype='follows')
>>> g
Graph(num_nodes={'user': 3, 'game': 2},
num_edges={('user', 'plays', 'game'): 2, ('user', 'follows', 'user'): 5},
metagraph=[('user', 'user'), ('user', 'game')])
"""
etype = g.to_canonical_etype(etype)
if etype[0] != etype[2]:
raise DGLError(
'add_self_loop does not support unidirectional bipartite graphs: {}.' \
'Please make sure the types of head node and tail node are identical.' \
''.format(etype))
nodes = g.nodes(etype[0])
new_g = add_edges(g, nodes, nodes, etype=etype)
return new_g
DGLHeteroGraph.add_self_loop = add_self_loop
def remove_self_loop(g, etype=None):
r"""Remove self-loops for each node in the graph.
A new graph with self-loops removed is returned.
If there are multiple self-loops for a certain node,
all of them will be removed.
Parameters
----------
etype : str or tuple of str, optional
The type of the edges to remove. Can be omitted if there is
only one edge type in the graph.
Examples
--------
>>> import dgl
>>> import torch
**Homogeneous Graphs or Heterogeneous Graphs with A Single Node Type**
>>> g = dgl.graph((torch.tensor([0, 0, 0, 1]), torch.tensor([1, 0, 0, 2])))
>>> g.edata['he'] = torch.arange(4).float().reshape(-1, 1)
>>> g = dgl.remove_self_loop(g)
>>> g
Graph(num_nodes=3, num_edges=2,
edata_schemes={'he': Scheme(shape=(1,), dtype=torch.float32)})
>>> g.edata['he']
tensor([[0.],
        [3.]])
**Heterogeneous Graphs with Multiple Node Types**
>>> g = dgl.heterograph({
>>>     ('user', 'follows', 'user'): (torch.tensor([0, 1, 1, 1, 2]),
>>>                                   torch.tensor([0, 0, 1, 1, 1])),
>>>     ('user', 'plays', 'game'): (torch.tensor([0, 1]),
>>>                                 torch.tensor([0, 1]))
>>> })
>>> g = dgl.remove_self_loop(g, etype='follows')
>>> g.num_nodes('user')
3
>>> g.num_nodes('game')
2
>>> g.num_edges('follows')
2
>>> g.num_edges('plays')
2
See Also
--------
add_self_loop
"""
etype = g.to_canonical_etype(etype)
if etype[0] != etype[2]:
raise DGLError(
'remove_self_loop does not support unidirectional bipartite graphs: {}.' \
'Please make sure the types of head node and tail node are identical.' \
''.format(etype))
u, v = g.edges(form='uv', order='eid', etype=etype)
self_loop_eids = F.tensor(F.nonzero_1d(u == v), dtype=F.dtype(u))
new_g = remove_edges(g, self_loop_eids, etype=etype)
return new_g
DGLHeteroGraph.remove_self_loop = remove_self_loop
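# A minimal usage sketch (not part of the library source): the Notes of add_self_loop
# recommend calling remove_self_loop first so that every node ends up with exactly one
# self-loop. The helper name `_example_self_loop_roundtrip` is hypothetical.
def _example_self_loop_roundtrip():
    import dgl
    import torch
    # Nodes 0 and 2 already have self-loops; node 1 has none.
    g = dgl.graph((torch.tensor([0, 0, 2]), torch.tensor([0, 1, 2])))
    g = dgl.add_self_loop(dgl.remove_self_loop(g))
    assert g.num_edges() == 4  # the (0, 1) edge plus exactly one self-loop per node
    return g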
def reorder_nodes(g, new_node_ids):
""" Generate a new graph with new node Ids.
@@ -914,7 +1241,7 @@ def reorder_nodes(g, new_node_ids):
and F.asnumpy(sorted_ids[-1]) == g.number_of_nodes() - 1, \
"The new node Ids are incorrect."
new_gidx = _CAPI_DGLReorderGraph(g._graph, new_node_ids.todgltensor())
new_g = DGLGraphStale(new_gidx)
new_g.ndata['orig_id'] = idx
return new_g
@@ -981,7 +1308,7 @@ def partition_graph_with_halo(g, node_part, extra_cached_hops, reshuffle=False):
# This creates a subgraph from the subgraphs returned from the CAPI above.
def create_subgraph(subg, induced_nodes, induced_edges):
subg1 = DGLGraphStale(graph_data=subg.graph, readonly=True)
subg1.ndata[NID] = induced_nodes.tousertensor()
subg1.edata[EID] = induced_edges.tousertensor()
return subg1
@@ -1246,20 +1573,22 @@ def compact_graphs(graphs, always_preserve=None):
return_single = True
if len(graphs) == 0:
return []
if graphs[0].is_block:
raise DGLError('Compacting a block graph is not allowed.')
# Ensure the node types are ordered the same.
# TODO(BarclayII): we ideally need to remove this constraint.
ntypes = graphs[0].ntypes
idtype = graphs[0].idtype
device = graphs[0].device
for g in graphs:
assert ntypes == g.ntypes, \
("All graphs should have the same node types in the same order, got %s and %s" %
(ntypes, g.ntypes))
assert idtype == g.idtype, "Expect graph data type to be {}, but got {}".format(
idtype, g.idtype)
assert device == g.device, "Expect graph device to be {}, but got {}".format(
device, g.device)
# Process the dictionary or tensor of "always preserve" nodes
if always_preserve is None:
@@ -1269,19 +1598,18 @@ def compact_graphs(graphs, always_preserve=None):
raise ValueError("Node type must be given if multiple node types exist.")
always_preserve = {ntypes[0]: always_preserve}
always_preserve = utils.prepare_tensor_dict(graphs[0], always_preserve, 'always_preserve')
always_preserve_nd = []
for ntype in ntypes:
nodes = always_preserve.get(ntype, None)
if nodes is None:
nodes = F.copy_to(F.tensor([], idtype), device)
always_preserve_nd.append(F.to_dgl_nd(nodes))
# Compact and construct heterographs
new_graph_indexes, induced_nodes = _CAPI_DGLCompactGraphs(
[g._graph for g in graphs], always_preserve_nd)
induced_nodes = [F.from_dgl_nd(nodes) for nodes in induced_nodes]
new_graphs = [
DGLHeteroGraph(new_graph_index, graph.ntypes, graph.etypes)
@@ -1446,7 +1774,7 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True, copy_ndata=True, copy_e
g._graph, dst_nodes_nd, include_dst_in_src)
# The new graph duplicates the original node types to SRC and DST sets.
new_ntypes = (g.ntypes, g.ntypes)
new_graph = DGLBlock(new_graph_index, new_ntypes, g.etypes)
assert new_graph.is_unibipartite  # sanity check
@@ -1494,55 +1822,6 @@ def to_block(g, dst_nodes=None, include_dst_in_src=True, copy_ndata=True, copy_e
return new_graph
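# A minimal usage sketch (not part of the library source) of the sampling pattern this
# function is built for: take the in-edges of a seed node with in_subgraph, then turn
# them into a block whose destination side is just the seed. The helper name
# `_example_to_block` is hypothetical, and the num_src_nodes/num_dst_nodes helpers are
# assumed from the 0.5 API; only node and edge counts are checked here.
def _example_to_block():
    import dgl
    import torch
    g = dgl.graph((torch.tensor([0, 1, 2]), torch.tensor([1, 2, 0])))
    sg = dgl.in_subgraph(g, torch.tensor([1]))             # keeps only the edge 0 -> 1
    block = dgl.to_block(sg, dst_nodes=torch.tensor([1]))
    assert block.num_dst_nodes() == 1                      # the seed node
    assert block.num_src_nodes() == 2                      # the seed plus its in-neighbor
    assert block.num_edges() == 1
    return block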
def remove_edges(g, edge_ids):
"""Return a new graph with given edge IDs removed.
The nodes are preserved.
Parameters
----------
graph : DGLHeteroGraph
The graph
edge_ids : Tensor or dict[etypes, Tensor]
The edge IDs for each edge type.
Returns
-------
DGLHeteroGraph
The new graph.
The edge ID mapping from the new graph to the original graph is stored as
``dgl.EID`` on edge features.
"""
if not isinstance(edge_ids, Mapping):
if len(g.etypes) != 1:
raise ValueError(
"Graph has more than one edge type; specify a dict for edge_id instead.")
edge_ids = {g.canonical_etypes[0]: edge_ids}
edge_ids_nd = [nd.NULL[g._idtype_str]] * len(g.etypes)
for key, value in edge_ids.items():
if value.dtype != g.idtype:
# if didn't check, this function still works, but returns wrong result
raise utils.InconsistentDtypeException("Expect edge id tensors({}) to have \
the same index type as graph({})".format(value.dtype, g.idtype))
edge_ids_nd[g.get_etype_id(key)] = F.zerocopy_to_dgl_ndarray(value)
new_graph_index, induced_eids_nd = _CAPI_DGLRemoveEdges(g._graph, edge_ids_nd)
new_graph = DGLHeteroGraph(new_graph_index, g.ntypes, g.etypes)
for i, canonical_etype in enumerate(g.canonical_etypes):
data = induced_eids_nd[i]
if len(data) == 0:
# Empty means that either
# (1) no edges are removed and edges are not shuffled.
# (2) all edges are removed.
# The following statement deals with both cases.
new_graph.edges[canonical_etype].data[EID] = F.arange(
0, new_graph.number_of_edges(canonical_etype))
else:
new_graph.edges[canonical_etype].data[EID] = F.zerocopy_from_dgl_ndarray(data)
return new_graph
def in_subgraph(g, nodes):
"""Extract the subgraph containing only the in edges of the given nodes.
@@ -1564,14 +1843,17 @@ def in_subgraph(g, nodes):
DGLHeteroGraph
The subgraph.
"""
if g.is_block:
raise DGLError('Extracting subgraph of a block graph is not allowed.')
if not isinstance(nodes, dict):
if len(g.ntypes) > 1:
raise DGLError("Must specify node type when the graph is not homogeneous.")
nodes = {g.ntypes[0] : nodes}
nodes = utils.prepare_tensor_dict(g, nodes, 'nodes')
nodes_all_types = []
for ntype in g.ntypes:
if ntype in nodes:
nodes_all_types.append(F.to_dgl_nd(nodes[ntype]))
else:
nodes_all_types.append(nd.NULL[g._idtype_str])
@@ -1579,7 +1861,7 @@ def in_subgraph(g, nodes):
induced_edges = subgidx.induced_edges
ret = DGLHeteroGraph(subgidx.graph, g.ntypes, g.etypes)
for i, etype in enumerate(ret.canonical_etypes):
ret.edges[etype].data[EID] = induced_edges[i]
return ret
def out_subgraph(g, nodes):
@@ -1603,14 +1885,17 @@ def out_subgraph(g, nodes):
DGLHeteroGraph
The subgraph.
"""
if g.is_block:
raise DGLError('Extracting subgraph of a block graph is not allowed.')
if not isinstance(nodes, dict):
if len(g.ntypes) > 1:
raise DGLError("Must specify node type when the graph is not homogeneous.")
nodes = {g.ntypes[0] : nodes}
nodes = utils.prepare_tensor_dict(g, nodes, 'nodes')
nodes_all_types = []
for ntype in g.ntypes:
if ntype in nodes:
nodes_all_types.append(F.to_dgl_nd(nodes[ntype]))
else:
nodes_all_types.append(nd.NULL[g._idtype_str])
@@ -1618,7 +1903,7 @@ def out_subgraph(g, nodes):
induced_edges = subgidx.induced_edges
ret = DGLHeteroGraph(subgidx.graph, g.ntypes, g.etypes)
for i, etype in enumerate(ret.canonical_etypes):
ret.edges[etype].data[EID] = induced_edges[i]
return ret
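# A minimal usage sketch (not part of the library source): out_subgraph keeps every
# node but only the edges whose source lies in the given node set, and records the
# original IDs of the kept edges under dgl.EID; in_subgraph is the mirror image for
# incoming edges. The helper name `_example_out_subgraph` is hypothetical.
def _example_out_subgraph():
    import dgl
    import torch
    g = dgl.graph((torch.tensor([0, 1, 2]), torch.tensor([1, 2, 0])))
    sg = dgl.out_subgraph(g, torch.tensor([1]))
    assert sg.num_nodes() == 3                   # all nodes are preserved
    assert sg.num_edges() == 1                   # only the edge 1 -> 2 survives
    assert sg.edata[dgl.EID].tolist() == [1]     # its ID in the original graph
    return sg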
def to_simple(g, return_counts='count', writeback_mapping=False, copy_ndata=True, copy_edata=False):
@@ -1775,6 +2060,8 @@ def to_simple(g, return_counts='count', writeback_mapping=False, copy_ndata=True
{('user', 'wins', 'user'): tensor([1, 2, 1, 1])
('user', 'plays', 'game'): tensor([1, 1, 1])}
"""
if g.is_block:
raise DGLError('Cannot convert a block graph to a simple graph.')
simple_graph_index, counts, edge_maps = _CAPI_DGLToSimpleHetero(g._graph)
simple_graph = DGLHeteroGraph(simple_graph_index, g.ntypes, g.etypes)
counts = [F.zerocopy_from_dgl_ndarray(count) for count in counts]
@@ -1814,50 +2101,24 @@ def to_simple(g, return_counts='count', writeback_mapping=False, copy_ndata=True
DGLHeteroGraph.to_simple = to_simple
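# A minimal usage sketch (not part of the library source): to_simple collapses
# duplicate edges, and with the default return_counts='count' the multiplicity of each
# kept edge is written to the edge data under that name (an assumption based on the
# signature above). The helper name `_example_to_simple` is hypothetical.
def _example_to_simple():
    import dgl
    import torch
    # The edge (0, 1) appears twice and should be collapsed into one edge.
    g = dgl.graph((torch.tensor([0, 0, 1]), torch.tensor([1, 1, 2])))
    sg = dgl.to_simple(g)
    assert sg.num_edges() == 2
    assert sorted(sg.edata['count'].tolist()) == [1, 2]
    return sg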
def as_heterograph(g, ntype='_U', etype='_E'):  # pylint: disable=unused-argument
"""Convert a DGLGraph to a DGLHeteroGraph with one node and edge type.
DEPRECATED: DGLGraph and DGLHeteroGraph have been merged. This function will
do nothing and can be removed safely in all cases.
"""
dgl_warning('DEPRECATED: DGLGraph and DGLHeteroGraph have been merged in v0.5.\n'
'\tdgl.as_heterograph will do nothing and can be removed safely in all cases.')
return g
def as_immutable_graph(hg):
"""Convert a DGLHeteroGraph with one node and edge type into a DGLGraph.
DEPRECATED: DGLGraph and DGLHeteroGraph have been merged. This function will
do nothing and can be removed safely in all cases.
"""
dgl_warning('DEPRECATED: DGLGraph and DGLHeteroGraph have been merged in v0.5.\n'
'\tdgl.as_immutable_graph will do nothing and can be removed safely in all cases.')
return hg
_init_api("dgl.transform") _init_api("dgl.transform")
"""Internal utilities."""
from .internal import *
from .data import *
from .checks import *