Commit f7afe29e authored by Zhaoheng Ni, committed by Facebook GitHub Bot

Disable multiprocessing when dumping features in hubert preprocessing (#2311)

Summary:
Multiprocessing works well for MFCC features, but it sometimes makes the script hang when dumping HuBERT features. Changing it to a for-loop resolves the issue.

Pull Request resolved: https://github.com/pytorch/audio/pull/2311

Reviewed By: mthrok

Differential Revision: D35393813

Pulled By: nateanl

fbshipit-source-id: afdc14557a1102b20ecd5fafba0964a913250a11
parent 11328d23
@@ -8,7 +8,6 @@ The script includes:
 """
 import logging
 from argparse import ArgumentParser, RawTextHelpFormatter
-from multiprocessing import Pool
 from pathlib import Path
 import torch
@@ -99,9 +98,8 @@ def main(args):
     feat_dir.mkdir()
     for split in ["train", "valid"]:
-        p = Pool(args.num_rank)
-        inputs = [
-            (
+        for rank in range(1, args.num_rank + 1):
+            dump_features(
                 tsv_dir / f"{args.dataset}_{split}.tsv",
                 feat_dir,
                 split,
@@ -113,11 +111,6 @@
                 args.checkpoint_path,
                 16_000,
             )
-            for rank in range(1, args.num_rank + 1)
-        ]
-        _ = p.starmap(dump_features, inputs)
-        p.close()
-        p.join()
     # Fit KMeans clustering model
     learn_kmeans(
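The change boils down to replacing a `Pool.starmap` fan-out over ranks with a plain sequential loop. A minimal, self-contained sketch of the two patterns is below; `process_shard` is a hypothetical stand-in for the repo's `dump_features`, not its actual signature.

```python
from multiprocessing import Pool


def process_shard(split, rank):
    # Hypothetical stand-in for the real per-rank feature-dumping work.
    return f"{split}-{rank}"


def run_with_pool(split, num_rank):
    # Old pattern: build one argument tuple per rank and fan out with
    # Pool.starmap. Heavyweight workers (e.g. ones holding torch state)
    # can make this hang, which motivated the change.
    inputs = [(split, rank) for rank in range(1, num_rank + 1)]
    with Pool(num_rank) as p:
        return p.starmap(process_shard, inputs)


def run_sequential(split, num_rank):
    # New pattern adopted by this commit: call the worker once per rank
    # in a plain for-loop. Slower, but robust.
    return [process_shard(split, rank) for rank in range(1, num_rank + 1)]
```

Both functions produce the same results for the same inputs; only the execution strategy differs.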