# Parler-TTS
[[Paper we reproduce]](https://arxiv.org/abs/2402.01912)
[[Models]](https://huggingface.co/parler-tts)
[[Training Code]](training)
[[Interactive Demo]](https://huggingface.co/spaces/parler-tts/parler_tts_mini)
> [!IMPORTANT]
> We're proud to release Parler-TTS v0.1, our first 300M-parameter model, trained on 10.5k hours of audio data.
> In the coming weeks, we'll be working on scaling up to 50k hours of data, in preparation for the v1 model.
Unlike other TTS models, Parler-TTS is a **fully open-source** release. All of the datasets, pre-processing, training code and weights are released publicly under a permissive license, enabling the community to build on our work and develop their own powerful TTS models.
This repository contains the inference and training code for Parler-TTS. It is designed to accompany the [Data-Speech](https://github.com/huggingface/dataspeech) repository for dataset annotation.
## 📖 Quick Index
* [Installation](#installation)
* [Usage](#usage)
* [Training](#training)
* [Demo](https://huggingface.co/spaces/parler-tts/parler_tts_mini)
* [Model weights and datasets](https://huggingface.co/parler-tts)
## Usage
> [!TIP]
> You can directly try it out in the [interactive demo](https://huggingface.co/spaces/parler-tts/parler_tts_mini)!

Inference ultimately produces a waveform array that is written to disk:

```
# ... model loading, prompt/description tokenization and generation go here (see the complete sketch below)
audio_arr = generation.cpu().numpy().squeeze()
sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
```
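For reference, here is a complete, self-contained sketch of that inference snippet. The checkpoint name (`parler-tts/parler_tts_mini_v0.1`) and the prompt/description strings are illustrative assumptions; substitute the model and text you actually want to use.

```
import torch
import soundfile as sf
from transformers import AutoTokenizer

from parler_tts import ParlerTTSForConditionalGeneration

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the model and the tokenizer (the same tokenizer handles both the prompt and the description).
model = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler_tts_mini_v0.1").to(device)
tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler_tts_mini_v0.1")

prompt = "Hey, how are you doing today?"
description = "A female speaker delivers her words quite expressively, in a quiet environment with clear audio quality."

# The description conditions the voice and speaking style; the prompt is the text to be spoken.
input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Generate audio and write it out as a WAV file.
generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
audio_arr = generation.cpu().numpy().squeeze()
sf.write("parler_tts_out.wav", audio_arr, model.config.sampling_rate)
```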
## Installation
Parler-TTS has light-weight dependencies and can be installed in one line:
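A minimal sketch of that one-liner, assuming you are installing directly from the public GitHub repository:

```
pip install git+https://github.com/huggingface/parler-tts.git
```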
## Acknowledgements

Special thanks to:
- Descript for the [DAC codec model](https://github.com/descriptinc/descript-audio-codec)
- Hugging Face 🤗 for providing compute resources and time to explore!
## Citation
If you found this repository useful, please consider citing this work and also the original paper:

```
@misc{lyth2024natural,
      title={Natural language guidance of high-fidelity text-to-speech with synthetic annotations},
      author={Dan Lyth and Simon King},
      year={2024},
      eprint={2402.01912},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}
```
## Contribution
Contributions are welcome, as the project offers many possibilities for improvement and exploration.
In particular, we're looking at ways to improve both quality and speed:

- Datasets:
  - Train on more data
  - Add more features such as accents
- Training:
  - Add PEFT compatibility for LoRA fine-tuning
  - Add the option to train without a description column
  - Add a training notebook
  - Explore multilingual training
  - Explore mono-speaker fine-tuning
  - Explore more architectures
- Optimization:
  - Compilation and static cache
  - Support for Flash Attention 2 (FA2) and SDPA
- Evaluation:
  - Add more evaluation metrics
Thus, the script generalises to any number of training datasets.
> [!IMPORTANT]
> Starting training a new model from scratch can easily be overwhelming, so here's what training looked like for v0.1: [logs](https://api.wandb.ai/links/ylacombe/ea449l81)