OpenDAS / dgl
Commit 9d63f3ea
Authored Jul 11, 2023 by Rhett Ying. Committed by GitHub, Jul 11, 2023.
[GraphBolt] call preprocess function first when init OnDiskDataset (#5982)
parent 6efd2ca1
Showing 1 changed file with 13 additions and 1 deletion

python/dgl/graphbolt/impl/ondisk_dataset.py (+13 -1)
@@ -13,7 +13,16 @@ from .torch_based_feature_store import (
     TorchBasedFeatureStore,
 )
 
-__all__ = ["OnDiskDataset"]
+__all__ = ["OnDiskDataset", "preprocess_ondisk_dataset"]
+
+
+def preprocess_ondisk_dataset(metadata_path: str) -> str:
+    """Preprocess the on-disk dataset."""
+    # [TODO]
+    print("Start to preprocess the on-disk dataset.")
+    new_metadata_path = metadata_path
+    print("Finish preprocessing the on-disk dataset.")
+    return new_metadata_path
 
 
 class OnDiskDataset(Dataset):
@@ -71,6 +80,9 @@ class OnDiskDataset(Dataset):
     """
 
     def __init__(self, path: str) -> None:
+        # Always call the preprocess function first. If already preprocessed,
+        # the function will return the original path directly.
+        path = preprocess_ondisk_dataset(path)
         with open(path, "r") as f:
             self._meta = OnDiskMetaData.parse_raw(f.read(), proto="yaml")
         self._dataset_name = self._meta.dataset_name
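
The comment in `__init__` states a contract that the `[TODO]` stub does not yet implement: preprocessing always runs first, and an already-preprocessed path is returned unchanged. Below is a minimal sketch of one way such an idempotent step might look. Everything in it is an assumption for illustration: the `.preprocessed.yaml` suffix, the names `preprocess_sketch` and `DatasetSketch`, and the copy-only "preprocessing" are hypothetical and are not part of this commit or of the DGL API.

```python
import os
import tempfile

# Hypothetical suffix marking an already-preprocessed metadata file.
# This naming scheme is an assumption, not something the commit defines.
PREPROCESSED_SUFFIX = ".preprocessed.yaml"


def preprocess_sketch(metadata_path: str) -> str:
    """Return the path of a preprocessed metadata file (illustrative only)."""
    # Already preprocessed: return the original path directly.
    if metadata_path.endswith(PREPROCESSED_SUFFIX):
        return metadata_path
    base, _ = os.path.splitext(metadata_path)
    new_metadata_path = base + PREPROCESSED_SUFFIX
    # Reuse an earlier result if one already exists on disk.
    if not os.path.exists(new_metadata_path):
        with open(metadata_path, "r") as src, open(new_metadata_path, "w") as dst:
            # Placeholder for real preprocessing: copy the metadata unchanged.
            dst.write(src.read())
    return new_metadata_path


class DatasetSketch:
    """Stand-in for OnDiskDataset that only keeps the raw metadata text."""

    def __init__(self, path: str) -> None:
        # Always preprocess first, mirroring the order used in the commit.
        path = preprocess_sketch(path)
        with open(path, "r") as f:
            self.raw_meta = f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        meta = os.path.join(tmp_dir, "metadata.yaml")
        with open(meta, "w") as f:
            f.write("dataset_name: demo\n")
        first = preprocess_sketch(meta)
        # Calling again on the output returns the same path.
        print(preprocess_sketch(first) == first)
```

Because the function returns its input when that input is already a preprocessed file, the constructor can call it unconditionally, which is the calling pattern this commit introduces.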