Use argument for preprocessing workers in run_summarization (#15394)

Commit c98a6ac2 authored Jan 28, 2022 by Sylvain Gugger, committed by GitHub Jan 28, 2022

Showing 1 changed file with 1 addition and 0 deletions

examples/pytorch/summarization/run_summarization_no_trainer.py +1 -0

--- a/examples/pytorch/summarization/run_summarization_no_trainer.py
+++ b/examples/pytorch/summarization/run_summarization_no_trainer.py
@@ -443,6 +443,7 @@ def main():
        processed_datasets = raw_datasets.map(
            preprocess_function,
            batched=True,
+            num_proc=args.preprocessing_num_workers,
            remove_columns=column_names,
            load_from_cache_file=not args.overwrite_cache,
            desc="Running tokenizer on dataset",
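
Note: the added line forwards the `--preprocessing_num_workers` command-line value to `map` as `num_proc`, so the flag now takes effect during tokenization. The snippet below is a minimal, standalone sketch of that wiring, not the script's actual code. The parser definitions (an optional integer defaulting to `None`, and a boolean `--overwrite_cache` switch) and the helper name `build_map_kwargs` are assumptions made for illustration; only the argument names and the `map` keyword arguments come from the diff.

```python
import argparse


def build_parser():
    # Hypothetical parser: the flag names match the diff, but the types and
    # defaults shown here are assumptions, not copied from the script.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--preprocessing_num_workers",
        type=int,
        default=None,
        help="Number of processes to use for preprocessing.",
    )
    parser.add_argument("--overwrite_cache", action="store_true")
    return parser


def build_map_kwargs(args, column_names):
    # Hypothetical helper that collects the keyword arguments the diff passes
    # to raw_datasets.map(preprocess_function, ...).
    return {
        "batched": True,
        "num_proc": args.preprocessing_num_workers,  # the line added by this commit
        "remove_columns": column_names,
        "load_from_cache_file": not args.overwrite_cache,
        "desc": "Running tokenizer on dataset",
    }


if __name__ == "__main__":
    args = build_parser().parse_args(["--preprocessing_num_workers", "4"])
    print(build_map_kwargs(args, ["document", "summary"]))
```

When the flag is omitted, `num_proc` is `None`, which `datasets` treats as single-process mapping, so the default behavior should be unchanged by this commit.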