Commit c98a6ac2 authored by Sylvain Gugger, committed by GitHub

Use argument for preprocessing workers in run_summarization (#15394)

parent db079567
@@ -443,6 +443,7 @@ def main():
     processed_datasets = raw_datasets.map(
         preprocess_function,
         batched=True,
+        num_proc=args.preprocessing_num_workers,
         remove_columns=column_names,
         load_from_cache_file=not args.overwrite_cache,
         desc="Running tokenizer on dataset",
......
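
The hunk above forwards a command-line value into the `num_proc` keyword of `map`. The argument definition is not part of this diff, so the following is only a minimal sketch of how that value might travel from the parser to the call. The parser setup, the helper `build_map_kwargs`, and the column names are assumptions made for illustration. Only the keyword arguments themselves come from the diff.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical parser: the real script defines many more options.
    parser = argparse.ArgumentParser()
    # Assumed definition; None means "do not use multiprocessing".
    parser.add_argument("--preprocessing_num_workers", type=int, default=None)
    parser.add_argument("--overwrite_cache", action="store_true")
    return parser.parse_args(argv)


def build_map_kwargs(args, column_names):
    # Hypothetical helper that mirrors the keyword arguments shown in the diff.
    return dict(
        batched=True,
        num_proc=args.preprocessing_num_workers,
        remove_columns=column_names,
        load_from_cache_file=not args.overwrite_cache,
        desc="Running tokenizer on dataset",
    )


# Example invocation with four preprocessing workers.
args = parse_args(["--preprocessing_num_workers", "4"])
# Column names are placeholders, not taken from the commit.
map_kwargs = build_map_kwargs(args, ["article", "highlights"])
```

With these kwargs, the call in the diff would correspond to `raw_datasets.map(preprocess_function, **map_kwargs)`. Leaving the flag unset yields `num_proc=None`, which keeps preprocessing in a single process.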