Unverified commit 604a21b1 authored by Patrick von Platen, committed via GitHub

[Docs] Improve docs for MMS loading of other languages (#24292)



* Improve docs

* Apply suggestions from code review

* upload readme

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
MMS's architecture is based on the Wav2Vec2 model, so one can refer to Wav2Vec2's documentation page.

The original code can be found [here](https://github.com/facebookresearch/fairseq/tree/main/examples/mms).
## Loading
By default, MMS loads adapter weights for English. If you want to load adapter weights of another language, make sure to specify `target_lang=<your-chosen-target-lang>` as well as `ignore_mismatched_sizes=True`. The `ignore_mismatched_sizes=True` keyword has to be passed to allow the language model head to be resized according to the vocabulary of the specified language.

Similarly, the processor should be loaded with the same target language:
```py
from transformers import Wav2Vec2ForCTC, AutoProcessor
model_id = "facebook/mms-1b-all"
target_lang = "fra"
processor = AutoProcessor.from_pretrained(model_id, target_lang=target_lang)
model = Wav2Vec2ForCTC.from_pretrained(model_id, target_lang=target_lang, ignore_mismatched_sizes=True)
```
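To see why the sizes must be allowed to mismatch: the CTC head is a linear projection from the model's hidden size to the tokenizer's vocabulary size, and each language has its own vocabulary. A minimal sketch with plain `torch` (the hidden size 1280 and the vocabulary sizes 154 and 314 are the values that appear in the shape-mismatch warning for this checkpoint):

```python
import torch

hidden_size = 1280    # hidden size of the mms-1b-all backbone
eng_vocab_size = 154  # English vocabulary size (from the warning below)
fra_vocab_size = 314  # French vocabulary size (from the warning below)

# Head stored in the checkpoint (English) vs. the freshly
# initialized head for the requested target language (French).
checkpoint_head = torch.nn.Linear(hidden_size, eng_vocab_size)
new_head = torch.nn.Linear(hidden_size, fra_vocab_size)

# The weight shapes differ, so the checkpoint head cannot be copied into
# the new model; `ignore_mismatched_sizes=True` tells `from_pretrained`
# to keep the newly initialized head instead of raising an error.
print(checkpoint_head.weight.shape)  # torch.Size([154, 1280])
print(new_head.weight.shape)         # torch.Size([314, 1280])
```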
<Tip>
You can safely ignore a warning such as:

```text
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/mms-1b-all and are newly initialized because the shapes did not match:
- lm_head.bias: found shape torch.Size([154]) in the checkpoint and torch.Size([314]) in the model instantiated
- lm_head.weight: found shape torch.Size([154, 1280]) in the checkpoint and torch.Size([314, 1280]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
```
</Tip>
If you want to use the ASR pipeline, you can load your chosen target language as such:
```py
from transformers import pipeline
model_id = "facebook/mms-1b-all"
target_lang = "fra"
pipe = pipeline(model=model_id, model_kwargs={"target_lang": target_lang, "ignore_mismatched_sizes": True})
```
## Inference
Next, let's look at how we can run MMS in inference and change adapter layers after having called [`~PreTrainedModel.from_pretrained`].
First, we load audio data in different languages using the [Datasets](https://github.com/huggingface/datasets) library.