Add layer normalization to wav2vec2 large+ pretrained models (#2873)
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2873

The original fairseq implementation applies an extra layer-normalization preprocessing step for large/xlarge models:
https://github.com/facebookresearch/fairseq/blob/fcca32258c8e8bcc9f9890bf4714fa2f96b6b3e1/fairseq/data/audio/hubert_dataset.py#L355-L357

This commit modifies the pre-trained model bundles to include this preprocessing for the affected pre-trained models listed below. To keep the interface identical to the other models, and because the additional preprocessing is simple, the returned pre-trained model instance is modified to include the preprocessing, rather than adding a separate preprocessing method.

- WAV2VEC2_LARGE_LV60K
- WAV2VEC2_ASR_LARGE_LV60K_10M
- WAV2VEC2_ASR_LARGE_LV60K_100H
- WAV2VEC2_ASR_LARGE_LV60K_960H
- WAV2VEC2_XLSR53
- HUBERT_LARGE
- HUBERT_XLARGE
- HUBERT_ASR_LARGE
- HUBERT_ASR_XLARGE
- WAVLM_LARGE

Reviewed By: nateanl

Differential Revision: D41520183

fbshipit-source-id: 83d72fe692e8b9fc25df144deb4ca946fcd09615
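The preprocessing in question is a per-utterance layer normalization of the raw waveform (zero mean, unit variance over the samples), as in the linked fairseq code. A minimal sketch of that step, assuming only `torch`; the helper name `normalize_waveform` is hypothetical:

```python
import torch
import torch.nn.functional as F


def normalize_waveform(waveform: torch.Tensor) -> torch.Tensor:
    # Per-utterance layer normalization: subtract the mean and divide by the
    # (biased) standard deviation computed over all samples of the waveform.
    # This mirrors fairseq's F.layer_norm(wav, wav.shape) postprocessing for
    # large/xlarge models.
    return F.layer_norm(waveform, waveform.shape)
```

The pretrained-model bundles listed above fold this step into the model instance itself, so callers can keep passing raw waveforms without any interface change.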