Update README.md

f88322b7 · Suraj Patil · GitHub · a82d2592 · f88322b7
Unverified commit f88322b7, authored Jun 13, 2022 by Suraj Patil, committed by GitHub Jun 13, 2022

Showing 1 changed file with 34 additions and 0 deletions

README.md +34 -0

--- a/README.md
+++ b/README.md
@@ -186,6 +186,40 @@ image_pil = PIL.Image.fromarray(image_processed[0])
 image_pil.save("test.png")
 ```
+**Text to speech with BDDM**
+_Follow the instructions [here](https://pytorch.org/hub/nvidia_deeplearningexamples_tacotron2/) to load the tacotron2 model._
+```python
+import torch
+from diffusers import BDDM, DiffusionPipeline
+torch_device = "cuda"
+# load the BDDM pipeline
+bddm = DiffusionPipeline.from_pretrained("fusing/diffwave-vocoder")
+# load tacotron2 to get the mel spectrograms
+tacotron2 = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_tacotron2', model_math='fp16')
+tacotron2 = tacotron2.to(torch_device).eval()
+# encode the input text and predict its mel spectrogram with tacotron2
+text = "Hello world, I missed you so much."
+utils = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_tts_utils')
+sequences, lengths = utils.prepare_input_sequence([text])
+with torch.no_grad():
+    mel, _, _ = tacotron2.infer(sequences, lengths)
+# generate the waveform from the mel spectrogram, seeded for reproducibility
+generator = torch.manual_seed(0)
+audio = bddm(mel, generator, torch_device)
+# save the generated audio as a WAV file
+from scipy.io.wavfile import write as wavwrite
+sampling_rate = 22050
+wavwrite("generated_audio.wav", sampling_rate, audio.squeeze().cpu().numpy())
+```
 ## Library structure:
 ```