Commit 5856878d authored by Reed, committed by Taylor Robie

Fix crash caused by race in the async process. (#5250)

When constructing the evaluation records, data_async_generation.py would copy the records into the final directory. The main process would wait until the eval records existed. However, the main process would sometimes read the eval records before they were fully copied, causing a DataLossError.
parent dda23ecf
...
@@ -329,8 +329,7 @@ def _construct_eval_record(cache_paths, eval_batch_size):
 items=items[i, :]
 )
 writer.write(batch_bytes)
-tf.gfile.Copy(intermediate_fpath, dest_fpath)
-tf.gfile.Remove(intermediate_fpath)
+tf.gfile.Rename(intermediate_fpath, dest_fpath)
 log_msg("Eval TFRecords file successfully constructed.")
...
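The change above replaces a copy with a rename, so the destination path appears only once the file is complete. The same write-then-rename pattern can be sketched with the Python standard library. This is a minimal illustration, not the code from this commit: `publish_atomically` is a hypothetical helper, and `os.replace` stands in for `tf.gfile.Rename`. It assumes the intermediate file and the destination are on the same local filesystem, where the rename is atomic on POSIX; other filesystems may give weaker guarantees.

```python
import os
import tempfile


def publish_atomically(payload: bytes, dest_fpath: str) -> None:
    """Write payload to an intermediate file, then rename it into place.

    Hypothetical helper for illustration; not part of data_async_generation.py.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_fpath))
    # Create the intermediate file next to the destination so the rename
    # does not cross a filesystem boundary.
    fd, intermediate_fpath = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as writer:
            writer.write(payload)
            writer.flush()
            os.fsync(writer.fileno())  # push the bytes to disk before publishing
        # A reader polling for dest_fpath sees either no file or the whole
        # file, never a partially written one.
        os.replace(intermediate_fpath, dest_fpath)
    except BaseException:
        # Clean up the intermediate file if anything failed before the rename.
        if os.path.exists(intermediate_fpath):
            os.remove(intermediate_fpath)
        raise
```

With a copy, the destination path exists as soon as the first bytes land, so a process that waits only for the file to exist can read a truncated record. With a rename, existence implies completeness, which is the property the main process relies on.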