Unverified Commit b400af48, authored by Hongzhi (Steve), Chen; committed by GitHub

[Graphbolt] Use dynamic buffer size for shuffle. (#6040)


Co-authored-by: Ubuntu <ubuntu@ip-172-31-28-63.ap-northeast-1.compute.internal>
parent 80d16efa
@@ -186,7 +186,10 @@ class MinibatchSampler(IterDataPipe):
         # Shuffle before batch.
         if self._shuffle:
             # `torchdata.datapipes.iter.Shuffler` works with stream too.
-            data_pipe = data_pipe.shuffle()
+            # To ensure randomness, make sure the buffer size is at least 10
+            # times the batch size.
+            buffer_size = max(10000, 10 * self._batch_size)
+            data_pipe = data_pipe.shuffle(buffer_size=buffer_size)
         # Batch.
         data_pipe = data_pipe.batch(
......
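
Note on why the buffer size matters: a streaming shuffle can only pick from the items currently held in its buffer, so a buffer that is small relative to the batch size yields batches that stay close to the input order. The following is a minimal standard-library sketch of that behaviour. It is not torchdata's implementation; `buffered_shuffle` and `choose_buffer_size` are hypothetical names introduced here for illustration, and only the `max(10000, 10 * batch_size)` rule is taken from the diff above.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def buffered_shuffle(
    items: Iterable[T], buffer_size: int, seed: int = 0
) -> Iterator[T]:
    """Illustrative streaming shuffle (hypothetical helper, not torchdata code).

    Fills a buffer of `buffer_size` items, then for every further input item
    emits a randomly chosen buffered item and stores the new one in its slot.
    """
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


def choose_buffer_size(batch_size: int, floor: int = 10000, factor: int = 10) -> int:
    """Mirror the rule from the diff: at least `floor`, at least `factor` batches."""
    return max(floor, factor * batch_size)


if __name__ == "__main__":
    shuffled = list(buffered_shuffle(range(100), buffer_size=8))
    # With a buffer of 8, the k-th output always comes from the first k + 8 inputs.
    print(all(x < k + 8 for k, x in enumerate(shuffled)))
    print(choose_buffer_size(512), choose_buffer_size(2000))
```

In this sketch the k-th emitted item can only come from the first `k + buffer_size` inputs, so a batch much larger than the buffer is drawn from an almost contiguous window of the stream. Scaling the buffer with the batch size, as the commit does, widens that window for large batches while the 10000 floor keeps small batches well mixed.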