Commit f7afe29e authored by Zhaoheng Ni's avatar Zhaoheng Ni Committed by Facebook GitHub Bot

Disable multiprocessing when dumping features in hubert preprocessing (#2311)

Summary:
Multiprocessing works well for MFCC features, but it sometimes makes the script hang when dumping HuBERT features. Changing it to a for-loop resolves the issue.
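
For reference, the change follows the general pattern sketched below: a `multiprocessing.Pool.starmap` fan-out over per-rank argument tuples is replaced by a plain sequential loop. This is a minimal, hypothetical sketch rather than the actual script; `process_shard` and its arguments stand in for the real `dump_features` call and its parameters.

```python
from multiprocessing import Pool


def process_shard(split, rank, num_rank):
    # Hypothetical stand-in for dump_features: process one shard of one split.
    print(f"{split}: shard {rank}/{num_rank}")


def run_with_pool(split, num_rank):
    # Before: fan the shards out across a process pool with starmap.
    inputs = [(split, rank, num_rank) for rank in range(1, num_rank + 1)]
    with Pool(num_rank) as p:
        p.starmap(process_shard, inputs)


def run_sequentially(split, num_rank):
    # After: run the shards one at a time in a plain for-loop.
    for rank in range(1, num_rank + 1):
        process_shard(split, rank, num_rank)


if __name__ == "__main__":
    run_with_pool("train", 4)
    run_sequentially("train", 4)
```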

Pull Request resolved: https://github.com/pytorch/audio/pull/2311

Reviewed By: mthrok

Differential Revision: D35393813

Pulled By: nateanl

fbshipit-source-id: afdc14557a1102b20ecd5fafba0964a913250a11
parent 11328d23
@@ -8,7 +8,6 @@ The script includes:
 """
 import logging
 from argparse import ArgumentParser, RawTextHelpFormatter
-from multiprocessing import Pool
 from pathlib import Path
 
 import torch
@@ -99,9 +98,8 @@ def main(args):
         feat_dir.mkdir()
 
     for split in ["train", "valid"]:
-        p = Pool(args.num_rank)
-        inputs = [
-            (
+        for rank in range(1, args.num_rank + 1):
+            dump_features(
                 tsv_dir / f"{args.dataset}_{split}.tsv",
                 feat_dir,
                 split,
@@ -113,11 +111,6 @@ def main(args):
                 args.checkpoint_path,
                 16_000,
             )
-            for rank in range(1, args.num_rank + 1)
-        ]
-        _ = p.starmap(dump_features, inputs)
-        p.close()
-        p.join()
 
     # Fit KMeans clustering model
     learn_kmeans(