Unverified Commit 2e1b8bc2 authored by zhoukz, committed by GitHub

[Model][Bugfix] Fix MiDashengLM audio encoder mask by removing incorrect `logical_not` (#25925)


Signed-off-by: zhoukz <me@zhoukz.com>
parent e47433b3
@@ -426,8 +426,7 @@ class DashengAudioTransformer(nn.Module):
             assert x_length.ndim == 1, "Lengths are of size (B,)"
             scaled_lengths = (x_length / (self.hop_length * 4)).long()
             mask = self._to_mask(max_length=t, lengths=scaled_lengths)
-            split_masks = mask.logical_not().split(target_length_in_patches,
-                                                   dim=-1)
+            split_masks = mask.split(target_length_in_patches, dim=-1)
         else:
             mask = None
             split_masks = [None] * len(input_splits)
......
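
To illustrate why the inversion matters, here is a minimal pure-Python sketch of the mask handling in the hunk above. It is not the vLLM implementation: `to_mask` and `split_last_dim` are hypothetical stand-ins for `_to_mask` and `Tensor.split(..., dim=-1)`, and it assumes that `_to_mask` marks valid (non-padded) positions with `True`, which is what the commit title suggests but the diff itself does not show.

```python
def to_mask(lengths, max_length):
    """Hypothetical stand-in for `_to_mask`.

    Assumption: True marks a valid (non-padded) position.
    """
    return [[i < n for i in range(max_length)] for n in lengths]


def split_last_dim(mask, chunk):
    """Split every row into pieces of `chunk` columns (the last piece may be shorter)."""
    width = len(mask[0])
    return [[row[s:s + chunk] for row in mask] for s in range(0, width, chunk)]


lengths = [3, 5]                       # valid positions per batch item
mask = to_mask(lengths, max_length=6)  # shape (B=2, T=6)

# After the fix: the mask is split as-is, so True still means "valid".
split_masks = split_last_dim(mask, chunk=4)

# Before the fix: the mask was inverted first, so True meant "padding".
inverted = [[not v for v in row] for row in mask]
old_split_masks = split_last_dim(inverted, chunk=4)

# Count positions flagged True for the first batch item under each convention.
valid_after_fix = sum(sum(part[0]) for part in split_masks)
valid_before_fix = sum(sum(part[0]) for part in old_split_masks)
```

Under the stated assumption, the first batch item has three positions flagged in the fixed version (its real length) and three different positions flagged in the inverted version (its padding), so a consumer that expects `True` to mean "valid" would attend to padding only.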