Commit e452179c authored by Mufei Li, committed by GitHub

[Deprecation] Dataset Attributes (#4666)



* Update from master (#4584)

* [Example][Refactor] Refactor graphsage multigpu and full-graph example (#4430)

* Add refactors for multi-gpu and full-graph example

* Fix format

* Update

* Update

* Update

* [Cleanup] Remove async_transferer (#4505)

* Remove async_transferer

* remove test

* Remove AsyncTransferer
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>

* [Cleanup] Remove duplicate entries of CUB submodule   (issue# 4395) (#4499)

* remove third_party/cub

* remove from third_party
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>

* [Bug] Enable turn on/off libxsmm at runtime (#4455)

* enable turn on/off libxsmm at runtime by adding a global config and related API
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>

* [Feature] Unify the cuda stream used in core library (#4480)

* Use an internal cuda stream for CopyDataFromTo

* small fix white space

* Fix to compile

* Make stream optional in copydata for compile

* fix lint issue

* Update cub functions to use internal stream

* Lint check

* Update CopyTo/CopyFrom/CopyFromTo to use internal stream

* Address comments

* Fix backward CUDA stream

* Avoid overloading CopyFromTo()

* Minor comment update

* Overload copydatafromto in cuda device api
Co-authored-by: xiny <xiny@nvidia.com>

* [Feature] Added exclude_self and output_batch to knn graph construction (Issues #4323 #4316) (#4389)

* * Added "exclude_self" and "output_batch" options to knn_graph and segmented_knn_graph
* Updated out-of-date comments on remove_edges and remove_self_loop, since they now preserve batch information

* * Changed defaults on new knn_graph and segmented_knn_graph function parameters, for compatibility; pytorch/test_geometry.py was failing

* * Added test to ensure dgl.remove_self_loop function correctly updates batch information

* * Added new knn_graph and segmented_knn_graph parameters to dgl.nn.KNNGraph and dgl.nn.SegmentedKNNGraph

* * Formatting

* * Oops, I missed the one in segmented_knn_graph when I fixed the similar thing in knn_graph

* * Fixed edge case handling when invalid k specified, since it still needs to be handled consistently for tests to pass
* Fixed context of batch info, since it must match the context of the input position data for remove_self_loop to succeed

* * Fixed batch info resulting from knn_graph when output_batch is true, for case of 3D input tensor, representing multiple segments

* * Added testing of new exclude_self and output_batch parameters on knn_graph and segmented_knn_graph, and their wrappers, KNNGraph and SegmentedKNNGraph, into the test_knn_cuda test

* * Added doc comments for new parameters

* * Added correct handling for uncommon case of k or more coincident points when excluding self edges in knn_graph and segmented_knn_graph
* Added test cases for more than k coincident points

* * Updated doc comments for output_batch parameters for clarity

* * Linter formatting fixes

* * Extracted out common function for test_knn_cpu and test_knn_cuda, to add the new test cases to test_knn_cpu

* * Rewording in doc comments

* * Removed output_batch parameter from knn_graph and segmented_knn_graph, in favour of always setting the batch information, except in knn_graph if x is a 2D tensor
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [CI] only known devs are authorized to trigger CI (#4518)

* [CI] only known devs are authorized to trigger CI

* fix if author is null

* add comments

* [Readability] Auto fix setup.py and update-version.py (#4446)

* Auto fix update-version

* Auto fix setup.py

* Auto fix update-version

* Auto fix setup.py

* [Doc] Change random.py to random_partition.py in guide on distributed partition pipeline (#4438)

* Update distributed-preprocessing.rst

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* fix unpinning when tensoradaptor is not available (#4450)

* [Doc] fix print issue in tutorial (#4459)

* [Example][Refactor] Refactor RGCN example (#4327)

* Refactor full graph entity classification

* Refactor rgcn with sampling

* README update

* Update

* Results update

* Respect default setting of self_loop=false in entity.py

* Update

* Update README

* Update for multi-gpu

* Update

* [doc] fix invalid link in user guide (#4468)

* [Example] directional_GSN for ogbg-molpcba (#4405)

* version-1

* version-2

* version-3

* update examples/README

* Update .gitignore

* update performance in README, delete scripts

* 1st approving review

* 2nd approving review
Co-authored-by: Mufei Li <mufeili1996@gmail.com>

* Clarify the message name, which is 'm'. (#4462)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* [Refactor] Auto fix view.py. (#4461)
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Example] SEAL for OGBL (#4291)

* [Example] SEAL for OGBL

* update index

* update

* fix readme typo

* add seal sampler

* modify set ops

* prefetch

* efficiency test

* update

* optimize

* fix ScatterAdd dtype issue

* update sampler style

* update
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* [CI] use https instead of http (#4488)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block() (#4487)

* [BugFix] fix crash due to incorrect dtype in dgl.to_block()

* fix test failure in TF

* [Feature] Make TensorAdapter Stream Aware (#4472)

* Allocate tensors in DGL's current stream

* make tensoradaptor stream-aware

* replace TAempty with cpu allocator

* fix typo

* try fix cpu allocation

* clean header

* redirect AllocDataSpace as well

* resolve comments

* [Build][Doc] Specify the sphinx version (#4465)
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* reformat

* reformat

* Auto fix update-version

* Auto fix setup.py

* reformat

* reformat
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Mufei Li <mufeili1996@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>

* Move mock version of dgl_sparse library to DGL main repo (#4524)

* init

* Add api doc for sparse library

* support op between matrices with different sparsity

* Fixed docstring

* addresses comments

* lint check

* change keyword format to fmt
Co-authored-by: Israt Nisa <nisisrat@amazon.com>

* [DistPart] expose timeout config for process group (#4532)

* [DistPart] expose timeout config for process group

* refine code

* Update tools/distpartitioning/data_proc_pipeline.py
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>

* [Feature] Import PyTorch's CUDA stream management (#4503)

* add set_stream

* add .record_stream for NDArray and HeteroGraph

* refactor dgl stream Python APIs

* test record_stream

* add unit test for record stream

* use pytorch's stream

* fix lint

* fix cpu build

* address comments

* address comments

* add record stream tests for dgl.graph

* record frames and update dataloader

* add docstring

* update frame

* add backend check for record_stream

* remove CUDAThreadEntry::stream

* record stream for newly created formats

* fix bug

* fix cpp test

* fix None c_void_p to c_handle

* [examples] Reduce memory consumption (#4558)

* [examples] Reduce memory consumption

* refine help message

* refine

* [Feature][REVIEW] Enable DGL cugaph nightly CI  (#4525)

* Added cugraph nightly scripts

* Removed nvcr.io//nvidia/pytorch:22.04-py3 reference
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>

* Revert "[Feature][REVIEW] Enable DGL cugaph nightly CI  (#4525)" (#4563)

This reverts commit ec171c64.

* [Misc] Add flake8 lint workflow. (#4566)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.

* Add flake8 annotation in workflow.

* remove

* add

* clean up
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Try use official pylint workflow. (#4568)

* polish update_version

* update pylint workflow.

* add

* revert.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [CI] refine stage logic (#4565)

* [CI] refine stage logic

* refine

* refine

* remove (#4570)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Add Pylint workflow for flake8. (#4571)

* remove

* Add pylint.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Update the python version in Pylint workflow for flake8. (#4572)

* remove

* Add pylint.

* Change the python version for pylint.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4574)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Misc] Use another workflow. (#4575)

* Update pylint.

* Use another workflow.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint. (#4576)
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* Update pylint.yml

* Update pylint.yml

* Delete pylint.yml

* [Misc]Add pyproject.toml for autopep8 & black. (#4543)

* Add pyproject.toml for autopep8.

* Add pyproject.toml for autopep8.
Co-authored-by: Steve <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>

* [Feature] Bump DLPack to v0.7 and decouple DLPack from the core library (#4454)

* rename `DLContext` to `DGLContext`

* rename `kDLGPU` to `kDLCUDA`

* replace DLTensor with DGLArray

* fix linting

* Unify DGLType and DLDataType to DGLDataType

* Fix FFI

* rename DLDeviceType to DGLDeviceType

* decouple dlpack from the core library

* fix bug

* fix lint

* fix merge

* fix build

* address comments

* rename dl_converter to dlpack_convert

* remove redundant comments
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>

* [Deprecation] Dataset Attributes (#4546)

* Update

* CI

* CI

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* [Example] Bug Fix (#4665)

* Update

* CI

* CI

* Update

* Update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>

* Update
Co-authored-by: Chang Liu <chang.liu@utexas.edu>
Co-authored-by: nv-dlasalle <63612878+nv-dlasalle@users.noreply.github.com>
Co-authored-by: Xin Yao <xiny@nvidia.com>
Co-authored-by: Xin Yao <yaox12@outlook.com>
Co-authored-by: Israt Nisa <neesha295@gmail.com>
Co-authored-by: Israt Nisa <nisisrat@amazon.com>
Co-authored-by: peizhou001 <110809584+peizhou001@users.noreply.github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-19-194.ap-northeast-1.compute.internal>
Co-authored-by: ndickson-nvidia <99772994+ndickson-nvidia@users.noreply.github.com>
Co-authored-by: Minjie Wang <wmjlyjemaine@gmail.com>
Co-authored-by: Rhett Ying <85214957+Rhett-Ying@users.noreply.github.com>
Co-authored-by: Hongzhi (Steve), Chen <chenhongzhi.nkcs@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-29.ap-northeast-1.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-26.ap-northeast-1.compute.internal>
Co-authored-by: Zhiteng Li <55398076+ZHITENGLI@users.noreply.github.com>
Co-authored-by: rudongyu <ru_dongyu@outlook.com>
Co-authored-by: Quan Gan <coin2028@hotmail.com>
Co-authored-by: Vibhu Jawa <vibhujawa@gmail.com>
parent f846d902
......@@ -23,18 +23,18 @@ def main(args):
     # load and preprocess dataset
     data = load_data(args)
     g = data[0]
-    features = torch.FloatTensor(data.features)
-    labels = torch.LongTensor(data.labels)
+    features = torch.FloatTensor(g.ndata['feat'])
+    labels = torch.LongTensor(g.ndata['label'])
     if hasattr(torch, 'BoolTensor'):
-        train_mask = torch.BoolTensor(data.train_mask)
-        val_mask = torch.BoolTensor(data.val_mask)
-        test_mask = torch.BoolTensor(data.test_mask)
+        train_mask = torch.BoolTensor(g.ndata['train_mask'])
+        val_mask = torch.BoolTensor(g.ndata['val_mask'])
+        test_mask = torch.BoolTensor(g.ndata['test_mask'])
     else:
-        train_mask = torch.ByteTensor(data.train_mask)
-        val_mask = torch.ByteTensor(data.val_mask)
-        test_mask = torch.ByteTensor(data.test_mask)
+        train_mask = torch.ByteTensor(g.ndata['train_mask'])
+        val_mask = torch.ByteTensor(g.ndata['val_mask'])
+        test_mask = torch.ByteTensor(g.ndata['test_mask'])
     in_feats = features.shape[1]
-    n_classes = data.num_labels
+    n_classes = data.num_classes
     n_edges = g.number_of_edges()
     if args.gpu < 0:
......
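The hunk above moves callers from dataset-level attributes (`data.features`, `data.train_mask`, `data.num_labels`) to the graph's `ndata` dictionary and `data.num_classes`. A minimal sketch of that access pattern follows. `FakeGraph`, `FakeDataset`, and `unpack_node_data` are hypothetical stand-ins, not part of DGL, used so the snippet runs without DGL or PyTorch installed.

```python
# Hypothetical stand-ins for a DGL graph and dataset; only the access
# pattern (dataset[0], g.ndata[...], dataset.num_classes) mirrors the diff.
class FakeGraph:
    def __init__(self, ndata):
        self.ndata = ndata


class FakeDataset:
    num_classes = 3

    def __init__(self, graph):
        self._graph = graph

    def __getitem__(self, idx):
        return self._graph


def unpack_node_data(dataset):
    """Read node-level data from the graph rather than from dataset attributes."""
    g = dataset[0]
    return {
        "features": g.ndata["feat"],
        "labels": g.ndata["label"],
        "train_mask": g.ndata["train_mask"],
        "val_mask": g.ndata["val_mask"],
        "test_mask": g.ndata["test_mask"],
        "n_classes": dataset.num_classes,
    }


graph = FakeGraph({
    "feat": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "label": [0, 1, 2],
    "train_mask": [True, False, False],
    "val_mask": [False, True, False],
    "test_mask": [False, False, True],
})
fields = unpack_node_data(FakeDataset(graph))
print(fields["n_classes"], len(fields["features"][0]))
```

With a real DGL dataset the values under `g.ndata` would be framework tensors rather than Python lists; the lookup keys are the ones shown in the hunk.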
......@@ -251,31 +251,6 @@ class CitationGraphDataset(DGLBuiltinDataset):
We preserve these properties for compatability.
"""
@property
def train_mask(self):
deprecate_property('dataset.train_mask', 'g.ndata[\'train_mask\']')
return F.asnumpy(self._g.ndata['train_mask'])
@property
def val_mask(self):
deprecate_property('dataset.val_mask', 'g.ndata[\'val_mask\']')
return F.asnumpy(self._g.ndata['val_mask'])
@property
def test_mask(self):
deprecate_property('dataset.test_mask', 'g.ndata[\'test_mask\']')
return F.asnumpy(self._g.ndata['test_mask'])
@property
def labels(self):
deprecate_property('dataset.label', 'g.ndata[\'label\']')
return F.asnumpy(self._g.ndata['label'])
@property
def features(self):
deprecate_property('dataset.feat', 'g.ndata[\'feat\']')
return self._g.ndata['feat']
@property
def reverse_edge(self):
return self._reverse_edge
......@@ -306,43 +281,6 @@ def _sample_mask(idx, l):
class CoraGraphDataset(CitationGraphDataset):
r""" Cora citation network dataset.
.. deprecated:: 0.5.0
- ``graph`` is deprecated, it is replaced by:
>>> dataset = CoraGraphDataset()
>>> graph = dataset[0]
- ``train_mask`` is deprecated, it is replaced by:
>>> dataset = CoraGraphDataset()
>>> graph = dataset[0]
>>> train_mask = graph.ndata['train_mask']
- ``val_mask`` is deprecated, it is replaced by:
>>> dataset = CoraGraphDataset()
>>> graph = dataset[0]
>>> val_mask = graph.ndata['val_mask']
- ``test_mask`` is deprecated, it is replaced by:
>>> dataset = CoraGraphDataset()
>>> graph = dataset[0]
>>> test_mask = graph.ndata['test_mask']
- ``labels`` is deprecated, it is replaced by:
>>> dataset = CoraGraphDataset()
>>> graph = dataset[0]
>>> labels = graph.ndata['label']
- ``feat`` is deprecated, it is replaced by:
>>> dataset = CoraGraphDataset()
>>> graph = dataset[0]
>>> feat = graph.ndata['feat']
Nodes mean paper and edges mean citation
relationships. Each node has a predefined
feature with 1433 dimensions. The dataset is
......@@ -383,18 +321,6 @@ class CoraGraphDataset(CitationGraphDataset):
----------
num_classes: int
Number of label classes
graph: networkx.DiGraph
Graph structure
train_mask: numpy.ndarray
Mask of training nodes
val_mask: numpy.ndarray
Mask of validation nodes
test_mask: numpy.ndarray
Mask of test nodes
labels: numpy.ndarray
Ground truth labels of each node
features: Tensor
Node features
Notes
-----
......@@ -454,43 +380,6 @@ class CoraGraphDataset(CitationGraphDataset):
class CiteseerGraphDataset(CitationGraphDataset):
r""" Citeseer citation network dataset.
.. deprecated:: 0.5.0
- ``graph`` is deprecated, it is replaced by:
>>> dataset = CiteseerGraphDataset()
>>> graph = dataset[0]
- ``train_mask`` is deprecated, it is replaced by:
>>> dataset = CiteseerGraphDataset()
>>> graph = dataset[0]
>>> train_mask = graph.ndata['train_mask']
- ``val_mask`` is deprecated, it is replaced by:
>>> dataset = CiteseerGraphDataset()
>>> graph = dataset[0]
>>> val_mask = graph.ndata['val_mask']
- ``test_mask`` is deprecated, it is replaced by:
>>> dataset = CiteseerGraphDataset()
>>> graph = dataset[0]
>>> test_mask = graph.ndata['test_mask']
- ``labels`` is deprecated, it is replaced by:
>>> dataset = CiteseerGraphDataset()
>>> graph = dataset[0]
>>> labels = graph.ndata['label']
- ``feat`` is deprecated, it is replaced by:
>>> dataset = CiteseerGraphDataset()
>>> graph = dataset[0]
>>> feat = graph.ndata['feat']
Nodes mean scientific publications and edges
mean citation relationships. Each node has a
predefined feature with 3703 dimensions. The
......@@ -531,18 +420,6 @@ class CiteseerGraphDataset(CitationGraphDataset):
----------
num_classes: int
Number of label classes
graph: networkx.DiGraph
Graph structure
train_mask: numpy.ndarray
Mask of training nodes
val_mask: numpy.ndarray
Mask of validation nodes
test_mask: numpy.ndarray
Mask of test nodes
labels: numpy.ndarray
Ground truth labels of each node
features: Tensor
Node features
Notes
-----
......@@ -605,43 +482,6 @@ class CiteseerGraphDataset(CitationGraphDataset):
class PubmedGraphDataset(CitationGraphDataset):
r""" Pubmed citation network dataset.
.. deprecated:: 0.5.0
- ``graph`` is deprecated, it is replaced by:
>>> dataset = PubmedGraphDataset()
>>> graph = dataset[0]
- ``train_mask`` is deprecated, it is replaced by:
>>> dataset = PubmedGraphDataset()
>>> graph = dataset[0]
>>> train_mask = graph.ndata['train_mask']
- ``val_mask`` is deprecated, it is replaced by:
>>> dataset = PubmedGraphDataset()
>>> graph = dataset[0]
>>> val_mask = graph.ndata['val_mask']
- ``test_mask`` is deprecated, it is replaced by:
>>> dataset = PubmedGraphDataset()
>>> graph = dataset[0]
>>> test_mask = graph.ndata['test_mask']
- ``labels`` is deprecated, it is replaced by:
>>> dataset = PubmedGraphDataset()
>>> graph = dataset[0]
>>> labels = graph.ndata['label']
- ``feat`` is deprecated, it is replaced by:
>>> dataset = PubmedGraphDataset()
>>> graph = dataset[0]
>>> feat = graph.ndata['feat']
Nodes mean scientific publications and edges
mean citation relationships. Each node has a
predefined feature with 500 dimensions. The
......@@ -682,18 +522,6 @@ class PubmedGraphDataset(CitationGraphDataset):
----------
num_classes: int
Number of label classes
graph: networkx.DiGraph
Graph structure
train_mask: numpy.ndarray
Mask of training nodes
val_mask: numpy.ndarray
Mask of validation nodes
test_mask: numpy.ndarray
Mask of test nodes
labels: numpy.ndarray
Ground truth labels of each node
features: Tensor
Node features
Notes
-----
......
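The properties removed in this file all followed one pattern: call `deprecate_property(old, new)` and then return the underlying value. The sketch below illustrates that pattern with the standard `warnings` module. The body of `deprecate_property` here is an assumption (the diff does not show DGL's implementation), and `TinyDataset` is a hypothetical class.

```python
import warnings


def deprecate_property(old, new):
    """Assumed behaviour: warn that `old` should be replaced by `new`."""
    warnings.warn(
        "Property {} will be deprecated, please use {} instead.".format(old, new),
        DeprecationWarning,
        stacklevel=3,  # point the warning at the code that read the property
    )


class TinyDataset:
    """Hypothetical dataset that keeps an old attribute alive while warning."""

    def __init__(self, ndata):
        self._ndata = ndata

    @property
    def train_mask(self):
        deprecate_property("dataset.train_mask", "g.ndata['train_mask']")
        return self._ndata["train_mask"]


dataset = TinyDataset({"train_mask": [True, False, True]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    mask = dataset.train_mask        # still works, but records a warning

print(mask, len(caught))
```

Once such a property is deleted, as in this commit, the same attribute access raises `AttributeError` instead of warning, so callers need to switch to the replacement named in the message.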
......@@ -106,11 +106,6 @@ class GNNBenchmarkDataset(DGLBuiltinDataset):
"""Number of classes."""
raise NotImplementedError
@property
def data(self):
deprecate_property('dataset.data', 'dataset[0]')
return self._data
def __getitem__(self, idx):
r""" Get graph by index
......@@ -142,13 +137,6 @@ class GNNBenchmarkDataset(DGLBuiltinDataset):
class CoraFullDataset(GNNBenchmarkDataset):
r"""CORA-Full dataset for node classification task.
.. deprecated:: 0.5.0
- ``data`` is deprecated, it is repalced by:
>>> dataset = CoraFullDataset()
>>> graph = dataset[0]
Extended Cora dataset. Nodes represent paper and edges represent citations.
Reference: `<https://github.com/shchur/gnn-benchmark#datasets>`_
......@@ -179,8 +167,6 @@ class CoraFullDataset(GNNBenchmarkDataset):
----------
num_classes : int
Number of classes for each node.
data : list
A list of DGLGraph objects
Examples
--------
......@@ -211,13 +197,6 @@ class CoraFullDataset(GNNBenchmarkDataset):
class CoauthorCSDataset(GNNBenchmarkDataset):
r""" 'Computer Science (CS)' part of the Coauthor dataset for node classification task.
.. deprecated:: 0.5.0
- ``data`` is deprecated, it is repalced by:
>>> dataset = CoauthorCSDataset()
>>> graph = dataset[0]
Coauthor CS and Coauthor Physics are co-authorship graphs based on the Microsoft Academic Graph
from the KDD Cup 2016 challenge. Here, nodes are authors, that are connected by an edge if they
co-authored a paper; node features represent paper keywords for each author’s papers, and class
......@@ -251,8 +230,6 @@ class CoauthorCSDataset(GNNBenchmarkDataset):
----------
num_classes : int
Number of classes for each node.
data : list
A list of DGLGraph objects
Examples
--------
......@@ -283,13 +260,6 @@ class CoauthorCSDataset(GNNBenchmarkDataset):
class CoauthorPhysicsDataset(GNNBenchmarkDataset):
r""" 'Physics' part of the Coauthor dataset for node classification task.
.. deprecated:: 0.5.0
- ``data`` is deprecated, it is repalced by:
>>> dataset = CoauthorPhysicsDataset()
>>> graph = dataset[0]
Coauthor CS and Coauthor Physics are co-authorship graphs based on the Microsoft Academic Graph
from the KDD Cup 2016 challenge. Here, nodes are authors, that are connected by an edge if they
co-authored a paper; node features represent paper keywords for each author’s papers, and class
......@@ -323,8 +293,6 @@ class CoauthorPhysicsDataset(GNNBenchmarkDataset):
----------
num_classes : int
Number of classes for each node.
data : list
A list of DGLGraph objects
Examples
--------
......@@ -355,13 +323,6 @@ class CoauthorPhysicsDataset(GNNBenchmarkDataset):
class AmazonCoBuyComputerDataset(GNNBenchmarkDataset):
r""" 'Computer' part of the AmazonCoBuy dataset for node classification task.
.. deprecated:: 0.5.0
- ``data`` is deprecated, it is repalced by:
>>> dataset = AmazonCoBuyComputerDataset()
>>> graph = dataset[0]
Amazon Computers and Amazon Photo are segments of the Amazon co-purchase graph [McAuley et al., 2015],
where nodes represent goods, edges indicate that two goods are frequently bought together, node
features are bag-of-words encoded product reviews, and class labels are given by the product category.
......@@ -394,8 +355,6 @@ class AmazonCoBuyComputerDataset(GNNBenchmarkDataset):
----------
num_classes : int
Number of classes for each node.
data : list
A list of DGLGraph objects
Examples
--------
......@@ -426,13 +385,6 @@ class AmazonCoBuyComputerDataset(GNNBenchmarkDataset):
class AmazonCoBuyPhotoDataset(GNNBenchmarkDataset):
r"""AmazonCoBuy dataset for node classification task.
.. deprecated:: 0.5.0
- ``data`` is deprecated, it is repalced by:
>>> dataset = AmazonCoBuyPhotoDataset()
>>> graph = dataset[0]
Amazon Computers and Amazon Photo are segments of the Amazon co-purchase graph [McAuley et al., 2015],
where nodes represent goods, edges indicate that two goods are frequently bought together, node
features are bag-of-words encoded product reviews, and class labels are given by the product category.
......@@ -465,8 +417,6 @@ class AmazonCoBuyPhotoDataset(GNNBenchmarkDataset):
----------
num_classes : int
Number of classes for each node.
data : list
A list of DGLGraph objects
Examples
--------
......
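The removed `data` property is replaced throughout by indexing, `dataset[0]`. A rough sketch of a single-graph dataset exposing its graph only through `__getitem__` is shown below. `SingleGraphDataset` is a hypothetical class, and raising `IndexError` for other indices is an assumption, not necessarily what DGL does.

```python
class SingleGraphDataset:
    """Hypothetical dataset holding exactly one graph, reachable as dataset[0]."""

    def __init__(self, graph):
        self._graph = graph

    def __len__(self):
        return 1

    def __getitem__(self, idx):
        # Assumed policy: only index 0 is valid for a one-graph dataset.
        if idx != 0:
            raise IndexError("this dataset contains a single graph; use index 0")
        return self._graph


dataset = SingleGraphDataset(graph={"num_nodes": 4})
graph = dataset[0]
print(len(dataset), graph)
```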
......@@ -14,13 +14,6 @@ __all__ = ['KarateClubDataset', 'KarateClub']
class KarateClubDataset(DGLDataset):
r""" Karate Club dataset for Node Classification
.. deprecated:: 0.5.0
- ``data`` is deprecated, it is replaced by:
>>> dataset = KarateClubDataset()
>>> g = dataset[0]
Zachary's karate club is a social network of a university
karate club, described in the paper "An Information Flow
Model for Conflict and Fission in Small Groups" by Wayne W. Zachary.
......@@ -45,8 +38,6 @@ class KarateClubDataset(DGLDataset):
----------
num_classes : int
Number of node classes
data : list
A list of :class:`dgl.DGLGraph` objects
Examples
--------
......@@ -73,11 +64,6 @@ class KarateClubDataset(DGLDataset):
"""Number of classes."""
return 2
@property
def data(self):
deprecate_property('dataset.data', 'dataset[0]')
return self._data
def __getitem__(self, idx):
r""" Get graph object
......
......@@ -191,21 +191,6 @@ class KnowledgeGraphDataset(DGLBuiltinDataset):
def save_name(self):
return self.name + '_dgl_graph'
@property
def train(self):
deprecate_property('dataset.train', 'g.edata[\'train_mask\']')
return self._train
@property
def valid(self):
deprecate_property('dataset.valid', 'g.edata[\'val_mask\']')
return self._valid
@property
def test(self):
deprecate_property('dataset.test', 'g.edata[\'test_mask\']')
return self._test
def _read_dictionary(filename):
d = {}
with open(filename, 'r+') as f:
......@@ -344,35 +329,6 @@ def build_knowledge_graph(num_nodes, num_rels, train, valid, test, reverse=True)
class FB15k237Dataset(KnowledgeGraphDataset):
r"""FB15k237 link prediction dataset.
.. deprecated:: 0.5.0
- ``train`` is deprecated, it is replaced by:
>>> dataset = FB15k237Dataset()
>>> graph = dataset[0]
>>> train_mask = graph.edata['train_mask']
>>> train_idx = th.nonzero(train_mask, as_tuple=False).squeeze()
>>> src, dst = graph.find_edges(train_idx)
>>> rel = graph.edata['etype'][train_idx]
- ``valid`` is deprecated, it is replaced by:
>>> dataset = FB15k237Dataset()
>>> graph = dataset[0]
>>> val_mask = graph.edata['val_mask']
>>> val_idx = th.nonzero(val_mask, as_tuple=False).squeeze()
>>> src, dst = graph.find_edges(val_idx)
>>> rel = graph.edata['etype'][val_idx]
- ``test`` is deprecated, it is replaced by:
>>> dataset = FB15k237Dataset()
>>> graph = dataset[0]
>>> test_mask = graph.edata['test_mask']
>>> test_idx = th.nonzero(test_mask, as_tuple=False).squeeze()
>>> src, dst = graph.find_edges(test_idx)
>>> rel = graph.edata['etype'][test_idx]
FB15k-237 is a subset of FB15k where inverse
relations are removed. When creating the dataset,
a reverse edge with reversed relation types are
......@@ -411,12 +367,6 @@ class FB15k237Dataset(KnowledgeGraphDataset):
Number of nodes
num_rels: int
Number of relation types
train: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the training graph
valid: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the validation graph
test: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the test graph
Examples
----------
......@@ -484,35 +434,6 @@ class FB15k237Dataset(KnowledgeGraphDataset):
class FB15kDataset(KnowledgeGraphDataset):
r"""FB15k link prediction dataset.
.. deprecated:: 0.5.0
- ``train`` is deprecated, it is replaced by:
>>> dataset = FB15kDataset()
>>> graph = dataset[0]
>>> train_mask = graph.edata['train_mask']
>>> train_idx = th.nonzero(train_mask, as_tuple=False).squeeze()
>>> src, dst = graph.edges(train_idx)
>>> rel = graph.edata['etype'][train_idx]
- ``valid`` is deprecated, it is replaced by:
>>> dataset = FB15kDataset()
>>> graph = dataset[0]
>>> val_mask = graph.edata['val_mask']
>>> val_idx = th.nonzero(val_mask, as_tuple=False).squeeze()
>>> src, dst = graph.edges(val_idx)
>>> rel = graph.edata['etype'][val_idx]
- ``test`` is deprecated, it is replaced by:
>>> dataset = FB15kDataset()
>>> graph = dataset[0]
>>> test_mask = graph.edata['test_mask']
>>> test_idx = th.nonzero(test_mask, as_tuple=False).squeeze()
>>> src, dst = graph.edges(test_idx)
>>> rel = graph.edata['etype'][test_idx]
The FB15K dataset was introduced in `Translating Embeddings for Modeling
Multi-relational Data <http://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf>`_.
It is a subset of Freebase which contains about
......@@ -554,12 +475,6 @@ class FB15kDataset(KnowledgeGraphDataset):
Number of nodes
num_rels: int
Number of relation types
train: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the training graph
valid: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the validation graph
test: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the test graph
Examples
----------
......@@ -627,35 +542,6 @@ class FB15kDataset(KnowledgeGraphDataset):
class WN18Dataset(KnowledgeGraphDataset):
r""" WN18 link prediction dataset.
.. deprecated:: 0.5.0
- ``train`` is deprecated, it is replaced by:
>>> dataset = WN18Dataset()
>>> graph = dataset[0]
>>> train_mask = graph.edata['train_mask']
>>> train_idx = th.nonzero(train_mask, as_tuple=False).squeeze()
>>> src, dst = graph.edges(train_idx)
>>> rel = graph.edata['etype'][train_idx]
- ``valid`` is deprecated, it is replaced by:
>>> dataset = WN18Dataset()
>>> graph = dataset[0]
>>> val_mask = graph.edata['val_mask']
>>> val_idx = th.nonzero(val_mask, as_tuple=False).squeeze()
>>> src, dst = graph.edges(val_idx)
>>> rel = graph.edata['etype'][val_idx]
- ``test`` is deprecated, it is replaced by:
>>> dataset = WN18Dataset()
>>> graph = dataset[0]
>>> test_mask = graph.edata['test_mask']
>>> test_idx = th.nonzero(test_mask, as_tuple=False).squeeze()
>>> src, dst = graph.edges(test_idx)
>>> rel = graph.edata['etype'][test_idx]
The WN18 dataset was introduced in `Translating Embeddings for Modeling
Multi-relational Data <http://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf>`_.
It included the full 18 relations scraped from
......@@ -696,12 +582,6 @@ class WN18Dataset(KnowledgeGraphDataset):
Number of nodes
num_rels: int
Number of relation types
train: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the training graph
valid: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the validation graph
test: numpy.ndarray
A numpy array of triplets (src, rel, dst) for the test graph
Examples
----------
......
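The removed `train`, `valid`, and `test` docstrings in this file describe recovering (src, rel, dst) triplets from an edge mask: take the indices where the mask is true, then look up the endpoints and relation type of those edges. The same steps can be sketched on plain Python lists. `triplets_from_mask` and the toy edge list are hypothetical and stand in for `graph.edata['train_mask']`, `graph.find_edges`, and `graph.edata['etype']`.

```python
def triplets_from_mask(src, dst, etype, mask):
    """Return (src, rel, dst) triplets for edges whose mask entry is true."""
    # Indices of selected edges; plays the role of th.nonzero(mask).squeeze().
    idx = [i for i, keep in enumerate(mask) if keep]
    return [(src[i], etype[i], dst[i]) for i in idx]


# Toy edge list: edge i goes from src[i] to dst[i] with relation etype[i].
src = [0, 1, 2, 3]
dst = [1, 2, 3, 0]
etype = [5, 5, 7, 9]
train_mask = [True, False, True, False]

train = triplets_from_mask(src, dst, etype, train_mask)
print(train)  # [(0, 5, 1), (2, 7, 3)]
```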
......@@ -15,48 +15,6 @@ from ..transforms import reorder_graph
class RedditDataset(DGLBuiltinDataset):
r""" Reddit dataset for community detection (node classification)
.. deprecated:: 0.5.0
- ``graph`` is deprecated, it is replaced by:
>>> dataset = RedditDataset()
>>> graph = dataset[0]
- ``num_labels`` is deprecated, it is replaced by:
>>> dataset = RedditDataset()
>>> num_classes = dataset.num_classes
- ``train_mask`` is deprecated, it is replaced by:
>>> dataset = RedditDataset()
>>> graph = dataset[0]
>>> train_mask = graph.ndata['train_mask']
- ``val_mask`` is deprecated, it is replaced by:
>>> dataset = RedditDataset()
>>> graph = dataset[0]
>>> val_mask = graph.ndata['val_mask']
- ``test_mask`` is deprecated, it is replaced by:
>>> dataset = RedditDataset()
>>> graph = dataset[0]
>>> test_mask = graph.ndata['test_mask']
- ``features`` is deprecated, it is replaced by:
>>> dataset = RedditDataset()
>>> graph = dataset[0]
>>> features = graph.ndata['feat']
- ``labels`` is deprecated, it is replaced by:
>>> dataset = RedditDataset()
>>> graph = dataset[0]
>>> labels = graph.ndata['label']
This is a graph dataset from Reddit posts made in the month of September, 2014.
The node label in this case is the community, or “subreddit”, that a post belongs to.
The authors sampled 50 large communities and built a post-to-post graph, connecting
......@@ -95,20 +53,6 @@ class RedditDataset(DGLBuiltinDataset):
----------
num_classes : int
Number of classes for each node
graph : :class:`dgl.DGLGraph`
Graph of the dataset
num_labels : int
Number of classes for each node
train_mask: numpy.ndarray
Mask of training nodes
val_mask: numpy.ndarray
Mask of validation nodes
test_mask: numpy.ndarray
Mask of test nodes
features : Tensor
Node features
labels : Tensor
Node labels
Examples
--------
......@@ -202,41 +146,6 @@ class RedditDataset(DGLBuiltinDataset):
r"""Number of classes for each node."""
return 41
@property
def num_labels(self):
deprecate_property('dataset.num_labels', 'dataset.num_classes')
return self.num_classes
@property
def graph(self):
deprecate_property('dataset.graph', 'dataset[0]')
return self._graph
@property
def train_mask(self):
deprecate_property('dataset.train_mask', 'graph.ndata[\'train_mask\']')
return F.asnumpy(self._graph.ndata['train_mask'])
@property
def val_mask(self):
deprecate_property('dataset.val_mask', 'graph.ndata[\'val_mask\']')
return F.asnumpy(self._graph.ndata['val_mask'])
@property
def test_mask(self):
deprecate_property('dataset.test_mask', 'graph.ndata[\'test_mask\']')
return F.asnumpy(self._graph.ndata['test_mask'])
@property
def features(self):
deprecate_property('dataset.features', 'graph.ndata[\'feat\']')
return self._graph.ndata['feat']
@property
def labels(self):
deprecate_property('dataset.labels', 'graph.ndata[\'label\']')
return self._graph.ndata['label']
def __getitem__(self, idx):
r""" Get graph by index
......
......@@ -22,16 +22,6 @@ __all__ = ['SST', 'SSTDataset']
class SSTDataset(DGLBuiltinDataset):
r"""Stanford Sentiment Treebank dataset.
.. deprecated:: 0.5.0
- ``trees`` is deprecated, it is replaced by:
>>> dataset = SSTDataset()
>>> for tree in dataset:
.... # your code here
- ``num_vocabs`` is deprecated, it is replaced by ``vocab_size``.
Each sample is the constituency tree of a sentence. The leaf nodes
represent words. The word is a int value stored in the ``x`` feature field.
The non-leaf node has a special value ``PAD_WORD`` in the ``x`` field.
......@@ -74,16 +64,12 @@ class SSTDataset(DGLBuiltinDataset):
----------
vocab : OrderedDict
Vocabulary of the dataset
trees : list
A list of DGLGraph objects
num_classes : int
Number of classes for each node
pretrained_emb: Tensor
Pretrained glove embedding with respect the vocabulary.
vocab_size : int
The size of the vocabulary
num_vocabs : int
The size of the vocabulary
Notes
-----
......@@ -224,11 +210,6 @@ class SSTDataset(DGLBuiltinDataset):
if os.path.exists(emb_path):
self._pretrained_emb = load_info(emb_path)['embed']
@property
def trees(self):
deprecate_property('dataset.trees', '[dataset[i] for i in len(dataset)]')
return self._trees
@property
def vocab(self):
r""" Vocabulary
......@@ -270,11 +251,6 @@ class SSTDataset(DGLBuiltinDataset):
r"""Number of graphs in the dataset."""
return len(self._trees)
@property
def num_vocabs(self):
deprecate_property('dataset.num_vocabs', 'dataset.vocab_size')
return self.vocab_size
@property
def vocab_size(self):
r"""Vocabulary size."""
......
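Note: the properties removed from `RedditDataset` and `SSTDataset` above all follow one pattern: call `deprecate_property(old, new)`, then forward to the replacement. Below is a minimal sketch of that pattern using only the standard library. The body of `deprecate_property` is an assumption, since DGL's own helper is not shown in this diff, and `ToyDataset` is a hypothetical stand-in rather than a DGL class.

```python
import warnings


def deprecate_property(old, new):
    """Hypothetical stand-in for DGL's helper: warn that `old` is replaced by `new`."""
    warnings.warn(
        "Property {} is deprecated, please use {} instead.".format(old, new),
        DeprecationWarning,
        stacklevel=3,  # attribute the warning to the code that read the property
    )


class ToyDataset:
    """Hypothetical dataset used only to illustrate the shim pattern."""

    @property
    def num_classes(self):
        return 41

    @property
    def num_labels(self):
        # Deprecated alias: warn, then forward to the supported property.
        deprecate_property("dataset.num_labels", "dataset.num_classes")
        return self.num_classes


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dataset = ToyDataset()
    old_value = dataset.num_labels   # goes through the deprecated alias
    new_value = dataset.num_classes  # supported accessor, no warning
```

Once such aliases are deleted, as in this commit, reading `dataset.num_labels` would presumably raise `AttributeError` instead of warning, so callers need to switch to the replacement names.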
......@@ -97,7 +97,7 @@ from dgl.data import citation_graph as citegrh
data = citegrh.load_cora()
G = data[0]
-labels = th.tensor(data.labels)
+labels = th.tensor(G.ndata['label'])
# find all the nodes labeled with class 0
label0_nodes = th.nonzero(labels == 0, as_tuple=False).squeeze()
......
......@@ -303,11 +303,9 @@ import networkx as nx
def load_cora_data():
data = citegrh.load_cora()
-features = torch.FloatTensor(data.features)
-labels = torch.LongTensor(data.labels)
-mask = torch.BoolTensor(data.train_mask)
 g = data[0]
-return g, features, labels, mask
+mask = torch.BoolTensor(g.ndata['train_mask'])
+return g, g.ndata['feat'], g.ndata['label'], mask
##############################################################################
# The training loop is exactly the same as in the GCN tutorial.
......
......@@ -69,8 +69,8 @@ SSTBatch = namedtuple('SSTBatch', ['graph', 'mask', 'wordid', 'label'])
# The non-leaf nodes have a special word PAD_WORD. The sentiment
# label is stored in the "y" feature field.
trainset = SSTDataset(mode='tiny') # the "tiny" set has only five trees
-tiny_sst = trainset.trees
-num_vocabs = trainset.num_vocabs
+tiny_sst = [tr for tr in trainset]
+num_vocabs = trainset.vocab_size
num_classes = trainset.num_classes
vocab = trainset.vocab # vocabulary dict: key -> id
......@@ -337,7 +337,7 @@ weight_decay = 1e-4
epochs = 10
# create the model
-model = TreeLSTM(trainset.num_vocabs,
+model = TreeLSTM(trainset.vocab_size,
x_size,
h_size,
trainset.num_classes,
......
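Note: the tutorial hunk above swaps `trainset.trees` for a list comprehension over the dataset itself. The sketch below shows why iteration and explicit indexing give the same list for an object that exposes `__getitem__` and `__len__`. `ToyTreeDataset` is a hypothetical stand-in. Whether `SSTDataset` supports iteration through `__getitem__` or a dedicated `__iter__` is not visible in this diff, so the sketch assumes the former.

```python
class ToyTreeDataset:
    """Hypothetical stand-in exposing only __getitem__ and __len__."""

    def __init__(self, trees):
        self._trees = list(trees)

    def __getitem__(self, idx):
        # Raises IndexError past the end, which is what stops iteration.
        return self._trees[idx]

    def __len__(self):
        return len(self._trees)


trainset = ToyTreeDataset(["tree0", "tree1", "tree2"])

# Replacement used in the tutorial: iterate over the dataset directly.
by_iteration = [tr for tr in trainset]

# Equivalent explicit indexing. Note range(len(...)): len(...) alone is an
# int and cannot be iterated.
by_index = [trainset[i] for i in range(len(trainset))]
```

Both forms build a plain Python list, so code that previously indexed or sliced `trainset.trees` should keep working on the result.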