init

2025/04/10 15:55:52

Commit 033f82a9 authored Apr 10, 2025 by guobj
20 changed files
--- a/LICENSE
+++ b/LICENSE
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
--- a/README.md
+++ b/README.md
-# kokoro_pytorch
-
--- a/README_origin.md
+++ b/README_origin.md
+# kokoro
+
+An inference library for [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M). You can [`pip install kokoro`](https://pypi.org/project/kokoro/).
+
+> **Kokoro** is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects.
+
+### Usage
+You can run this basic cell on [Google Colab](https://colab.research.google.com/). [Listen to samples](https://huggingface.co/hexgrad/Kokoro-82M/blob/main/SAMPLES.md).
+```py
+!pip install -q "kokoro>=0.9.4" soundfile
+!apt-get -qq -y install espeak-ng > /dev/null 2>&1
+from kokoro import KPipeline
+from IPython.display import display, Audio
+import soundfile as sf
+import torch
+pipeline = KPipeline(lang_code='a')
+text = '''
+[Kokoro](/kˈOkəɹO/) is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, [Kokoro](/kˈOkəɹO/) can be deployed anywhere from production environments to personal projects.
+'''
+generator = pipeline(text, voice='af_heart')
+for i, (gs, ps, audio) in enumerate(generator):
+    print(i, gs, ps)
+    display(Audio(data=audio, rate=24000, autoplay=i==0))
+    sf.write(f'{i}.wav', audio, 24000)
+```
+Under the hood, `kokoro` uses [`misaki`](https://pypi.org/project/misaki/), a G2P library at https://github.com/hexgrad/misaki
+
+### Advanced Usage
+You can run this advanced cell on [Google Colab](https://colab.research.google.com/).
+```py
+# 1️⃣ Install kokoro
+!pip install -q "kokoro>=0.9.4" soundfile
+# 2️⃣ Install espeak, used for English OOD fallback and some non-English languages
+!apt-get -qq -y install espeak-ng > /dev/null 2>&1
+
+# 3️⃣ Initialize a pipeline
+from kokoro import KPipeline
+from IPython.display import display, Audio
+import soundfile as sf
+import torch
+# 🇺🇸 'a' => American English, 🇬🇧 'b' => British English
+# 🇪🇸 'e' => Spanish es
+# 🇫🇷 'f' => French fr-fr
+# 🇮🇳 'h' => Hindi hi
+# 🇮🇹 'i' => Italian it
+# 🇯🇵 'j' => Japanese: pip install misaki[ja]
+# 🇧🇷 'p' => Brazilian Portuguese pt-br
+# 🇨🇳 'z' => Mandarin Chinese: pip install misaki[zh]
+pipeline = KPipeline(lang_code='a') # <= make sure lang_code matches voice, reference above.
+
+# This text is for demonstration purposes only, unseen during training
+text = '''
+The sky above the port was the color of television, tuned to a dead channel.
+"It's not like I'm using," Case heard someone say, as he shouldered his way through the crowd around the door of the Chat. "It's like my body's developed this massive drug deficiency."
+It was a Sprawl voice and a Sprawl joke. The Chatsubo was a bar for professional expatriates; you could drink there for a week and never hear two words in Japanese.
+
+These were to have an enormous impact, not only because they were associated with Constantine, but also because, as in so many other areas, the decisions taken by Constantine (or in his name) were to have great significance for centuries to come. One of the main issues was the shape that Christian churches were to take, since there was not, apparently, a tradition of monumental church buildings when Constantine decided to help the Christian church build a series of truly spectacular structures. The main form that these churches took was that of the basilica, a multipurpose rectangular structure, based ultimately on the earlier Greek stoa, which could be found in most of the great cities of the empire. Christianity, unlike classical polytheism, needed a large interior space for the celebration of its religious services, and the basilica aptly filled that need. We naturally do not know the degree to which the emperor was involved in the design of new churches, but it is tempting to connect this with the secular basilica that Constantine completed in the Roman forum (the so-called Basilica of Maxentius) and the one he probably built in Trier, in connection with his residence in the city at a time when he was still caesar.
+
+[Kokoro](/kˈOkəɹO/) is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, [Kokoro](/kˈOkəɹO/) can be deployed anywhere from production environments to personal projects.
+'''
+# text = '「もしおれがただ偶然、そしてこうしようというつもりでなくここに立っているのなら、ちょっとばかり絶望するところだな」と、そんなことが彼の頭に思い浮かんだ。'
+# text = '中國人民不信邪也不怕邪，不惹事也不怕事，任何外國不要指望我們會拿自己的核心利益做交易，不要指望我們會吞下損害我國主權、安全、發展利益的苦果！'
+# text = 'Los partidos políticos tradicionales compiten con los populismos y los movimientos asamblearios.'
+# text = 'Le dromadaire resplendissant déambulait tranquillement dans les méandres en mastiquant de petites feuilles vernissées.'
+# text = 'ट्रांसपोर्टरों की हड़ताल लगातार पांचवें दिन जारी, दिसंबर से इलेक्ट्रॉनिक टोल कलेक्शनल सिस्टम'
+# text = "Allora cominciava l'insonnia, o un dormiveglia peggiore dell'insonnia, che talvolta assumeva i caratteri dell'incubo."
+# text = 'Elabora relatórios de acompanhamento cronológico para as diferentes unidades do Departamento que propõem contratos.'
+
+# 4️⃣ Generate, display, and save audio files in a loop.
+generator = pipeline(
+    text, voice='af_heart', # <= change voice here
+    speed=1, split_pattern=r'\n+'
+)
+# Alternatively, load voice tensor directly:
+# voice_tensor = torch.load('path/to/voice.pt', weights_only=True)
+# generator = pipeline(
+#     text, voice=voice_tensor,
+#     speed=1, split_pattern=r'\n+'
+# )
+
+for i, (gs, ps, audio) in enumerate(generator):
+    print(i)  # i => index
+    print(gs) # gs => graphemes/text
+    print(ps) # ps => phonemes
+    display(Audio(data=audio, rate=24000, autoplay=i==0))
+    sf.write(f'{i}.wav', audio, 24000) # save each audio file
+```
+
+### Windows Installation
+To install espeak-ng on Windows:
+1. Go to [espeak-ng releases](https://github.com/espeak-ng/espeak-ng/releases)
+2. Click on **Latest release** 
+3. Download the appropriate `*.msi` file (e.g. **espeak-ng-20191129-b702b03-x64.msi**)
+4. Run the downloaded installer
+
+For advanced configuration and usage on Windows, see the [official espeak-ng Windows guide](https://github.com/espeak-ng/espeak-ng/blob/master/docs/guide.md)
+
+### Conda Environment
+Use the following conda `environment.yml` if you're facing any dependency issues.
+```yaml
+name: kokoro
+channels:
+  - defaults
+dependencies:
+  - python==3.9       
+  - libstdcxx~=12.4.0 # Needed to load espeak correctly. Try removing this if you're facing issues with Espeak fallback. 
+  - pip:
+      - kokoro>=0.3.1
+      - soundfile
+      - misaki[en]
+```
+
+### Acknowledgements
+- 🛠️ [@yl4579](https://huggingface.co/yl4579) for architecting StyleTTS 2.
+- 🏆 [@Pendrokar](https://huggingface.co/Pendrokar) for adding Kokoro as a contender in the TTS Spaces Arena.
+- 📊 Thank you to everyone who contributed synthetic training data.
+- ❤️ Special thanks to all compute sponsors.
+- 👾 Discord server: https://discord.gg/QuGxSWBfQy
+- 🪽 Kokoro is a Japanese word that translates to "heart" or "spirit". Kokoro is also a [character in the Terminator franchise](https://terminator.fandom.com/wiki/Kokoro) along with [Misaki](https://github.com/hexgrad/misaki?tab=readme-ov-file#acknowledgements).
+
+<img src="https://static0.gamerantimages.com/wordpress/wp-content/uploads/2024/08/terminator-zero-41-1.jpg" width="400" alt="kokoro" />
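Review note on README_origin.md: the advanced cell asks the reader to make sure `lang_code` matches the voice, and demo/app.py in this commit picks its pipeline with `pipelines[voice[0]]`, which suggests that the first character of a voice name doubles as the language code. Under that assumption, a pre-flight check might look like the sketch below. `LANG_CODES` and `check_voice` are hypothetical names used only for illustration and are not part of kokoro; the table is copied from the README comments rather than read from the library.

```python
# Language codes as listed in the README comments (an assumption of this sketch,
# not something queried from kokoro itself).
LANG_CODES = {
    'a': 'American English',
    'b': 'British English',
    'e': 'Spanish',
    'f': 'French',
    'h': 'Hindi',
    'i': 'Italian',
    'j': 'Japanese',
    'p': 'Brazilian Portuguese',
    'z': 'Mandarin Chinese',
}


def check_voice(lang_code: str, voice: str) -> str:
    """Hypothetical helper: return the language name when the voice prefix
    matches lang_code, otherwise raise ValueError."""
    if lang_code not in LANG_CODES:
        raise ValueError(f'unknown lang_code {lang_code!r}')
    # Assumed convention: voice names such as 'af_heart' start with the language code.
    if not voice or voice[0] != lang_code:
        raise ValueError(f'voice {voice!r} does not match lang_code {lang_code!r}')
    return LANG_CODES[lang_code]
```

Calling `check_voice('a', 'af_heart')` before constructing the pipeline would surface a mismatch such as `('a', 'bf_emma')` as an explicit error instead of as odd-sounding audio.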
--- a/demo/README.md
+++ b/demo/README.md
+---
+title: Kokoro TTS
+emoji: ❤️
+colorFrom: indigo
+colorTo: pink
+sdk: gradio
+sdk_version: 5.12.0
+app_file: app.py
+pinned: true
+license: apache-2.0
+short_description: Upgraded to v1.0!
+disable_embedding: true
+---
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
\ No newline at end of file
--- a/demo/app.py
+++ b/demo/app.py
+import spaces
+from kokoro import KModel, KPipeline
+import gradio as gr
+import os
+import random
+import torch
+
+CUDA_AVAILABLE = torch.cuda.is_available()
+models = {gpu: KModel().to('cuda' if gpu else 'cpu').eval() for gpu in [False] + ([True] if CUDA_AVAILABLE else [])}
+pipelines = {lang_code: KPipeline(lang_code=lang_code, model=False) for lang_code in 'ab'}
+pipelines['a'].g2p.lexicon.golds['kokoro'] = 'kˈOkəɹO'
+pipelines['b'].g2p.lexicon.golds['kokoro'] = 'kˈQkəɹQ'
+
+@spaces.GPU(duration=30)
+def forward_gpu(ps, ref_s, speed):
+    return models[True](ps, ref_s, speed)
+
+def generate_first(text, voice='af_heart', speed=1, use_gpu=CUDA_AVAILABLE):
+    pipeline = pipelines[voice[0]]
+    pack = pipeline.load_voice(voice)
+    use_gpu = use_gpu and CUDA_AVAILABLE
+    for _, ps, _ in pipeline(text, voice, speed):
+        ref_s = pack[len(ps)-1]
+        try:
+            if use_gpu:
+                audio = forward_gpu(ps, ref_s, speed)
+            else:
+                audio = models[False](ps, ref_s, speed)
+        except gr.exceptions.Error as e:
+            if use_gpu:
+                gr.Warning(str(e))
+                gr.Info('Retrying with CPU. To avoid this error, change Hardware to CPU.')
+                audio = models[False](ps, ref_s, speed)
+            else:
+                raise gr.Error(e)
+        return (24000, audio.numpy()), ps
+    return None, ''
+
+# Arena API
+def predict(text, voice='af_heart', speed=1):
+    return generate_first(text, voice, speed, use_gpu=False)[0]
+
+def tokenize_first(text, voice='af_heart'):
+    pipeline = pipelines[voice[0]]
+    for _, ps, _ in pipeline(text, voice):
+        return ps
+    return ''
+
+def generate_all(text, voice='af_heart', speed=1, use_gpu=CUDA_AVAILABLE):
+    pipeline = pipelines[voice[0]]
+    pack = pipeline.load_voice(voice)
+    use_gpu = use_gpu and CUDA_AVAILABLE
+    first = True
+    for _, ps, _ in pipeline(text, voice, speed):
+        ref_s = pack[len(ps)-1]
+        try:
+            if use_gpu:
+                audio = forward_gpu(ps, ref_s, speed)
+            else:
+                audio = models[False](ps, ref_s, speed)
+        except gr.exceptions.Error as e:
+            if use_gpu:
+                gr.Warning(str(e))
+                gr.Info('Switching to CPU')
+                audio = models[False](ps, ref_s, speed)
+            else:
+                raise gr.Error(e)
+        yield 24000, audio.numpy()
+        if first:
+            first = False
+            yield 24000, torch.zeros(1).numpy()
+
+with open('en.txt', 'r') as r:
+    random_quotes = [line.strip() for line in r]
+
+def get_random_quote():
+    return random.choice(random_quotes)
+
+def get_gatsby():
+    with open('gatsby5k.md', 'r') as r:
+        return r.read().strip()
+
+def get_frankenstein():
+    with open('frankenstein5k.md', 'r') as r:
+        return r.read().strip()
+
+CHOICES = {
+'🇺🇸 🚺 Heart ❤️': 'af_heart',
+'🇺🇸 🚺 Bella 🔥': 'af_bella',
+'🇺🇸 🚺 Nicole 🎧': 'af_nicole',
+'🇺🇸 🚺 Aoede': 'af_aoede',
+'🇺🇸 🚺 Kore': 'af_kore',
+'🇺🇸 🚺 Sarah': 'af_sarah',
+'🇺🇸 🚺 Nova': 'af_nova',
+'🇺🇸 🚺 Sky': 'af_sky',
+'🇺🇸 🚺 Alloy': 'af_alloy',
+'🇺🇸 🚺 Jessica': 'af_jessica',
+'🇺🇸 🚺 River': 'af_river',
+'🇺🇸 🚹 Michael': 'am_michael',
+'🇺🇸 🚹 Fenrir': 'am_fenrir',
+'🇺🇸 🚹 Puck': 'am_puck',
+'🇺🇸 🚹 Echo': 'am_echo',
+'🇺🇸 🚹 Eric': 'am_eric',
+'🇺🇸 🚹 Liam': 'am_liam',
+'🇺🇸 🚹 Onyx': 'am_onyx',
+'🇺🇸 🚹 Santa': 'am_santa',
+'🇺🇸 🚹 Adam': 'am_adam',
+'🇬🇧 🚺 Emma': 'bf_emma',
+'🇬🇧 🚺 Isabella': 'bf_isabella',
+'🇬🇧 🚺 Alice': 'bf_alice',
+'🇬🇧 🚺 Lily': 'bf_lily',
+'🇬🇧 🚹 George': 'bm_george',
+'🇬🇧 🚹 Fable': 'bm_fable',
+'🇬🇧 🚹 Lewis': 'bm_lewis',
+'🇬🇧 🚹 Daniel': 'bm_daniel',
+}
+for v in CHOICES.values():
+    pipelines[v[0]].load_voice(v)
+
+TOKEN_NOTE = '''
+💡 Customize pronunciation with Markdown link syntax and /slashes/ like `[Kokoro](/kˈOkəɹO/)`
+
+💬 To adjust intonation, try punctuation `;:,.!?—…"()“”` or stress `ˈ` and `ˌ`
+
+⬇️ Lower stress `[1 level](-1)` or `[2 levels](-2)`
+
+⬆️ Raise stress 1 level `[or](+2)` 2 levels (only works on less stressed, usually short words)
+'''
+
+with gr.Blocks() as generate_tab:
+    out_audio = gr.Audio(label='Output Audio', interactive=False, streaming=False, autoplay=True)
+    generate_btn = gr.Button('Generate', variant='primary')
+    with gr.Accordion('Output Tokens', open=True):
+        out_ps = gr.Textbox(interactive=False, show_label=False, info='Tokens used to generate the audio, up to 510 context length.')
+        tokenize_btn = gr.Button('Tokenize', variant='secondary')
+        gr.Markdown(TOKEN_NOTE)
+        predict_btn = gr.Button('Predict', variant='secondary', visible=False)
+
+STREAM_NOTE = ['⚠️ There is an unknown Gradio bug that might yield no audio the first time you click `Stream`.']
+STREAM_NOTE = '\n\n'.join(STREAM_NOTE)
+
+with gr.Blocks() as stream_tab:
+    out_stream = gr.Audio(label='Output Audio Stream', interactive=False, streaming=True, autoplay=True)
+    with gr.Row():
+        stream_btn = gr.Button('Stream', variant='primary')
+        stop_btn = gr.Button('Stop', variant='stop')
+    with gr.Accordion('Note', open=True):
+        gr.Markdown(STREAM_NOTE)
+        gr.DuplicateButton()
+
+API_OPEN = True
+with gr.Blocks() as app:
+    with gr.Row():
+        with gr.Column():
+            text = gr.Textbox(label='Input Text', info='Arbitrarily many characters supported')
+            with gr.Row():
+                voice = gr.Dropdown(list(CHOICES.items()), value='af_heart', label='Voice', info='Quality and availability vary by language')
+                use_gpu = gr.Dropdown(
+                    [('ZeroGPU 🚀', True), ('CPU 🐌', False)],
+                    value=CUDA_AVAILABLE,
+                    label='Hardware',
+                    info='GPU is usually faster, but has a usage quota',
+                    interactive=CUDA_AVAILABLE
+                )
+            speed = gr.Slider(minimum=0.5, maximum=2, value=1, step=0.1, label='Speed')
+            random_btn = gr.Button('🎲 Random Quote 💬', variant='secondary')
+            with gr.Row():
+                gatsby_btn = gr.Button('🥂 Gatsby 📕', variant='secondary')
+                frankenstein_btn = gr.Button('💀 Frankenstein 📗', variant='secondary')
+        with gr.Column():
+            gr.TabbedInterface([generate_tab, stream_tab], ['Generate', 'Stream'])
+    random_btn.click(fn=get_random_quote, inputs=[], outputs=[text])
+    gatsby_btn.click(fn=get_gatsby, inputs=[], outputs=[text])
+    frankenstein_btn.click(fn=get_frankenstein, inputs=[], outputs=[text])
+    generate_btn.click(fn=generate_first, inputs=[text, voice, speed, use_gpu], outputs=[out_audio, out_ps])
+    tokenize_btn.click(fn=tokenize_first, inputs=[text, voice], outputs=[out_ps])
+    stream_event = stream_btn.click(fn=generate_all, inputs=[text, voice, speed, use_gpu], outputs=[out_stream])
+    stop_btn.click(fn=None, cancels=stream_event)
+    predict_btn.click(fn=predict, inputs=[text, voice, speed], outputs=[out_audio])
+
+if __name__ == '__main__':
+    app.queue(api_open=API_OPEN).launch(server_name="0.0.0.0", server_port=40001, show_api=API_OPEN)
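Review note on demo/app.py: `generate_first` and `generate_all` repeat the same try/except that attempts the GPU model and retries on CPU after a `gr.exceptions.Error`. One possible way to factor that out is sketched below. This is a generic illustration under stated assumptions, not code from this commit: `run_with_fallback` and its parameters are hypothetical names, and the sketch deliberately avoids importing gradio or torch so it runs on its own.

```python
from typing import Callable, Type, TypeVar

T = TypeVar('T')


def run_with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    on_error: Callable[[Exception], None] = lambda e: None,
    catch: Type[Exception] = Exception,
) -> T:
    """Hypothetical helper: call primary(); if it raises `catch`,
    report the error through on_error and return fallback() instead."""
    try:
        return primary()
    except catch as e:
        on_error(e)
        return fallback()
```

In the demo, `primary` could wrap `forward_gpu(ps, ref_s, speed)`, `fallback` could wrap `models[False](ps, ref_s, speed)`, `catch` could be `gr.exceptions.Error`, and `on_error` could emit the `gr.Warning` and `gr.Info` messages. The CPU-only branch, which re-raises as `gr.Error`, would stay outside the helper.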
--- a/demo/en.txt
+++ b/demo/en.txt
--- a/demo/frankenstein5k.md
+++ b/demo/frankenstein5k.md
+You will rejoice to hear that no disaster has accompanied the commencement of an enterprise which you have regarded with such evil forebodings. I arrived here yesterday, and my first task is to assure my dear sister of my welfare and increasing confidence in the success of my undertaking.
+
+I am already far north of London, and as I walk in the streets of Petersburgh, I feel a cold northern breeze play upon my cheeks, which braces my nerves and fills me with delight. Do you understand this feeling? This breeze, which has travelled from the regions towards which I am advancing, gives me a foretaste of those icy climes. Inspirited by this wind of promise, my daydreams become more fervent and vivid. I try in vain to be persuaded that the pole is the seat of frost and desolation; it ever presents itself to my imagination as the region of beauty and delight. There, Margaret, the sun is for ever visible, its broad disk just skirting the horizon and diffusing a perpetual splendour. There—for with your leave, my sister, I will put some trust in preceding navigators—there snow and frost are banished; and, sailing over a calm sea, we may be wafted to a land surpassing in wonders and in beauty every region hitherto discovered on the habitable globe. Its productions and features may be without example, as the phenomena of the heavenly bodies undoubtedly are in those undiscovered solitudes. What may not be expected in a country of eternal light? I may there discover the wondrous power which attracts the needle and may regulate a thousand celestial observations that require only this voyage to render their seeming eccentricities consistent for ever. I shall satiate my ardent curiosity with the sight of a part of the world never before visited, and may tread a land never before imprinted by the foot of man. These are my enticements, and they are sufficient to conquer all fear of danger or death and to induce me to commence this laborious voyage with the joy a child feels when he embarks in a little boat, with his holiday mates, on an expedition of discovery up his native river. 
+But supposing all these conjectures to be false, you cannot contest the inestimable benefit which I shall confer on all mankind, to the last generation, by discovering a passage near the pole to those countries, to reach which at present so many months are requisite; or by ascertaining the secret of the magnet, which, if at all possible, can only be effected by an undertaking such as mine.
+
+These reflections have dispelled the agitation with which I began my letter, and I feel my heart glow with an enthusiasm which elevates me to heaven, for nothing contributes so much to tranquillise the mind as a steady purpose—a point on which the soul may fix its intellectual eye. This expedition has been the favourite dream of my early years. I have read with ardour the accounts of the various voyages which have been made in the prospect of arriving at the North Pacific Ocean through the seas which surround the pole. You may remember that a history of all the voyages made for purposes of discovery composed the whole of our good Uncle Thomas’s library. My education was neglected, yet I was passionately fond of reading. These volumes were my study day and night, and my familiarity with them increased that regret which I had felt, as a child, on learning that my father’s dying injunction had forbidden my uncle to allow me to embark in a seafaring life.
+
+These visions faded when I perused, for the first time, those poets whose effusions entranced my soul and lifted it to heaven. I also became a poet and for one year lived in a paradise of my own creation; I imagined that I also might obtain a niche in the temple where the names of Homer and Shakespeare are consecrated. You are well acquainted with my failure and how heavily I bore the disappointment. But just at that time I inherited the fortune of my cousin, and my thoughts were turned into the channel of their earlier bent.
+
+Six years have passed since I resolved on my present undertaking. I can, even now, remember the hour from which I dedicated myself to this great enterprise. I commenced by inuring my body to hardship. I accompanied the whale-fishers on several expeditions to the North Sea; I voluntarily endured cold, famine, thirst, and want of sleep; I often worked harder than the common sailors during the day and devoted my nights to the study of mathematics, the theory of medicine, and those branches of physical science from which a naval adventurer might derive the greatest practical advantage. Twice I actually hired myself as an under-mate in a Greenland whaler, and acquitted myself to admiration. I must own I felt a little proud when my captain offered me the second dignity in the vessel and entreated me to remain with the greatest earnestness, so valuable did he consider my services.
+
+And now, dear Margaret, do I not deserve to accomplish some great purpose?
\ No newline at end of file
--- a/demo/gatsby5k.md
+++ b/demo/gatsby5k.md
+In my younger and more vulnerable years my father gave me some advice that I’ve been turning over in my mind ever since.
+
+“Whenever you feel like criticizing anyone,” he told me, “just remember that all the people in this world haven’t had the advantages that you’ve had.”
+
+He didn’t say any more, but we’ve always been unusually communicative in a reserved way, and I understood that he meant a great deal more than that. In consequence, I’m inclined to reserve all judgements, a habit that has opened up many curious natures to me and also made me the victim of not a few veteran bores. The abnormal mind is quick to detect and attach itself to this quality when it appears in a normal person, and so it came about that in college I was unjustly accused of being a politician, because I was privy to the secret griefs of wild, unknown men. Most of the confidences were unsought—frequently I have feigned sleep, preoccupation, or a hostile levity when I realized by some unmistakable sign that an intimate revelation was quivering on the horizon; for the intimate revelations of young men, or at least the terms in which they express them, are usually plagiaristic and marred by obvious suppressions. Reserving judgements is a matter of infinite hope. I am still a little afraid of missing something if I forget that, as my father snobbishly suggested, and I snobbishly repeat, a sense of the fundamental decencies is parcelled out unequally at birth.
+
+And, after boasting this way of my tolerance, I come to the admission that it has a limit. Conduct may be founded on the hard rock or the wet marshes, but after a certain point I don’t care what it’s founded on. When I came back from the East last autumn I felt that I wanted the world to be in uniform and at a sort of moral attention forever; I wanted no more riotous excursions with privileged glimpses into the human heart. Only Gatsby, the man who gives his name to this book, was exempt from my reaction—Gatsby, who represented everything for which I have an unaffected scorn. If personality is an unbroken series of successful gestures, then there was something gorgeous about him, some heightened sensitivity to the promises of life, as if he were related to one of those intricate machines that register earthquakes ten thousand miles away. This responsiveness had nothing to do with that flabby impressionability which is dignified under the name of the “creative temperament”—it was an extraordinary gift for hope, a romantic readiness such as I have never found in any other person and which it is not likely I shall ever find again. No—Gatsby turned out all right at the end; it is what preyed on Gatsby, what foul dust floated in the wake of his dreams that temporarily closed out my interest in the abortive sorrows and short-winded elations of men.
+
+My family have been prominent, well-to-do people in this Middle Western city for three generations. The Carraways are something of a clan, and we have a tradition that we’re descended from the Dukes of Buccleuch, but the actual founder of my line was my grandfather’s brother, who came here in fifty-one, sent a substitute to the Civil War, and started the wholesale hardware business that my father carries on today.
+
+I never saw this great-uncle, but I’m supposed to look like him—with special reference to the rather hard-boiled painting that hangs in father’s office. I graduated from New Haven in 1915, just a quarter of a century after my father, and a little later I participated in that delayed Teutonic migration known as the Great War. I enjoyed the counter-raid so thoroughly that I came back restless. Instead of being the warm centre of the world, the Middle West now seemed like the ragged edge of the universe—so I decided to go East and learn the bond business. Everybody I knew was in the bond business, so I supposed it could support one more single man. All my aunts and uncles talked it over as if they were choosing a prep school for me, and finally said, “Why—[ye-es](/jˈɛ ɛs/),” with very grave, hesitant faces. Father agreed to finance me for a year, and after various delays I came East, permanently, I thought, in the spring of twenty-two.
+
+The practical thing was to find rooms in the city, but it was a warm season, and I had just left a country of wide lawns and friendly trees, so when a young man at the office suggested that we take a house together in a commuting town, it sounded like a great idea. He found the house, a weather-beaten cardboard bungalow at eighty a month, but at the last minute the firm ordered him to Washington, and I went out to the country alone. I had a dog—at least I had him for a few days until he ran away—and an old Dodge and a Finnish woman, who made my bed and cooked breakfast and muttered Finnish wisdom to herself over the electric stove.
+
+It was lonely for a day or so until one morning some man, more recently arrived than I, stopped me on the road.
+
+“How do you get to West Egg village?” he asked helplessly.
\ No newline at end of file
--- a/demo/packages.txt
+++ b/demo/packages.txt
+espeak-ng
\ No newline at end of file
--- a/demo/requirements.txt
+++ b/demo/requirements.txt
+kokoro>=0.7.13
+gradio
+pip
--- a/examples/device_examples.py
+++ b/examples/device_examples.py
+"""
+Quick example showing how device selection can be controlled and verified.
+"""
+import time
+from kokoro import KPipeline
+from loguru import logger
+
+def generate_audio(pipeline, text):
+    for _, _, audio in pipeline(text, voice='af_bella'):
+        samples = audio.shape[0] if audio is not None else 0
+        assert samples > 0, "No audio generated"
+        return samples
+
+def time_synthesis(device=None):
+    try:
+        start = time.perf_counter()
+        pipeline = KPipeline(lang_code='a', device=device)
+        samples = generate_audio(pipeline, "The quick brown fox jumps over the lazy dog.")
+        ms = (time.perf_counter() - start) * 1000
+        logger.info(f"✓ {device or 'auto':<6} | {ms:>5.1f}ms total | {samples:>6,d} samples")
+    except RuntimeError as e:
+        logger.error(f"✗ {'cuda' if 'CUDA' in str(e) else device or 'auto':<6} | {'not available' if 'CUDA' in str(e) else str(e)}")
+
+def compare_shared_model():
+    try:
+        start = time.perf_counter()
+        en_us = KPipeline(lang_code='a')
+        # Reuse the already-loaded model so the weights are only loaded once
+        en_uk = KPipeline(lang_code='b', model=en_us.model)
+
+        for pipeline in [en_us, en_uk]:
+            generate_audio(pipeline, "Testing model reuse.")
+
+        ms = (time.perf_counter() - start) * 1000
+        logger.info(f"✓ reuse  | {ms:>5.1f}ms for both pipelines")
+    except Exception as e:
+        logger.error(f"✗ reuse  | {str(e)}")
+
+if __name__ == '__main__':
+    logger.info("Device Selection & Performance")
+    logger.info("-" * 40)
+    time_synthesis()
+    time_synthesis('cuda')
+    time_synthesis('cpu') 
+    logger.info("-" * 40)
+    compare_shared_model()
\ No newline at end of file
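Note on `examples/device_examples.py`: the script exercises three cases (`None` for automatic selection, `'cuda'`, and `'cpu'`) and treats a `RuntimeError` mentioning CUDA as "not available". The selection rule it appears to probe can be sketched without loading a model. This is only an illustration: `resolve_device` and its `cuda_available` flag are hypothetical names, not part of the `kokoro` API, and the real `KPipeline` logic may differ (for example by considering other backends).

```python
from typing import Optional


def resolve_device(requested: Optional[str], cuda_available: bool) -> str:
    """Hypothetical helper: pick the device string a pipeline would run on."""
    if requested is None:
        # 'auto': prefer the GPU when one is present, otherwise fall back to CPU.
        return "cuda" if cuda_available else "cpu"
    if requested == "cuda" and not cuda_available:
        # Mirrors the RuntimeError branch that time_synthesis() catches.
        raise RuntimeError("CUDA requested but not available")
    return requested


if __name__ == "__main__":
    # Simulate a machine without a GPU.
    for requested in (None, "cuda", "cpu"):
        try:
            chosen = resolve_device(requested, cuda_available=False)
            print(f"{requested or 'auto':<6} -> {chosen}")
        except RuntimeError as err:
            print(f"{requested:<6} -> error: {err}")
```

In a real script the flag would typically come from `torch.cuda.is_available()`; passing it in explicitly keeps the sketch free of any dependency.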
--- a/examples/export.py
+++ b/examples/export.py
+import argparse
+import os
+import torch
+import onnx
+import onnxruntime as ort
+import sounddevice as sd
+
+from kokoro import KModel, KPipeline
+from kokoro.model import KModelForONNX
+
+def export_onnx(model, output):
+    onnx_file = os.path.join(output, "kokoro.onnx")
+
+    input_ids = torch.randint(1, 100, (48,)).numpy()
+    input_ids = torch.LongTensor([[0, *input_ids, 0]])
+    style = torch.randn(1, 256)
+    speed = torch.randint(1, 10, (1,)).int()
+
+    torch.onnx.export(
+        model, 
+        args = (input_ids, style, speed), 
+        f = onnx_file, 
+        export_params = True, 
+        verbose = True, 
+        input_names = [ 'input_ids', 'style', 'speed' ], 
+        output_names = [ 'waveform', 'duration' ],
+        opset_version = 17, 
+        dynamic_axes = {
+            'input_ids': { 1: 'input_ids_len' }, 
+            'waveform': { 0: 'num_samples' }, 
+        }, 
+        do_constant_folding = True, 
+    )
+
+    print('export kokoro.onnx ok!')
+
+    onnx_model = onnx.load(onnx_file)
+    onnx.checker.check_model(onnx_model)
+    print('onnx check ok!')
+
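+def load_input_ids(pipeline, text):
+    ps = ''
+    if pipeline.lang_code in 'ab':
+        _, tokens = pipeline.g2p(text)
+        # English text may be split into several chunks; only one chunk is
+        # exported, so keep the last non-empty phoneme string
+        for _, chunk_ps, _ in pipeline.en_tokenize(tokens):
+            if chunk_ps:
+                ps = chunk_ps
+    else:
+        ps, _ = pipeline.g2p(text)
+
+    # Keep at most 510 phonemes; the two padding ids added below bring the total to 512
+    if len(ps) > 510:
+        ps = ps[:510]
+
+    # Map phonemes to vocabulary ids, dropping symbols the vocabulary does not know
+    input_ids = [i for i in (pipeline.model.vocab.get(p) for p in ps) if i is not None]
+    print(f"text: {text} -> phonemes: {ps} -> input_ids: {input_ids}")
+    input_ids = torch.LongTensor([[0, *input_ids, 0]]).to(pipeline.model.device)
+    return ps, input_ids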
+
+def load_voice(pipeline, voice, phonemes):
+    pack = pipeline.load_voice(voice).to('cpu')
+    return pack[len(phonemes) - 1]
+
+def load_sample(model):
+    # English samples (uncomment one block to use it instead of the Chinese sample):
+    # pipeline = KPipeline(lang_code='a', model=model.kmodel, device='cpu')
+    # text = '''
+    # In today's fast-paced tech world, building software applications has never been easier — thanks to AI-powered coding assistants.
+    # '''
+    # text = '''
+    # The sky above the port was the color of television, tuned to a dead channel.
+    # '''
+    # voice = 'checkpoints/voices/af_heart.pt'
+
+    # Chinese sample (active)
+    pipeline = KPipeline(lang_code='z', model=model.kmodel, device='cpu')
+    text = '''
+    2月15日晚，猫眼专业版数据显示，截至发稿，《哪吒之魔童闹海》（或称《哪吒2》）今日票房已达7.8亿元，累计票房（含预售）超过114亿元。
+    '''
+    voice = 'checkpoints/voices/zf_xiaoxiao.pt'
+
+    phonemes, input_ids = load_input_ids(pipeline, text)
+    style = load_voice(pipeline, voice, phonemes)
+    speed = torch.IntTensor([1])
+
+    return input_ids, style, speed
+
+def inference_onnx(model, output):
+    onnx_file = os.path.join(output, "kokoro.onnx")
+    session = ort.InferenceSession(onnx_file)
+
+    input_ids, style, speed = load_sample(model)
+
+    outputs = session.run(None, {
+        'input_ids': input_ids.numpy(), 
+        'style': style.numpy(), 
+        'speed': speed.numpy(), 
+    })
+
+    output = torch.from_numpy(outputs[0])
+    print(f'output: {output.shape}')
+    print(output)
+
+    audio = output.numpy()
+    sd.play(audio, 24000)
+    sd.wait()
+
+def check_model(model):
+    input_ids, style, speed = load_sample(model)
+    output, duration = model(input_ids, style, speed)
+
+    print(f'output: {output.shape}')
+    print(f'duration: {duration.shape}')
+    print(output)
+
+    # Detach first: the wrapper is not guaranteed to run under torch.no_grad()
+    audio = output.detach().cpu().numpy()
+    sd.play(audio, 24000)
+    sd.wait()
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="Export kokoro Model to ONNX", add_help=True)
+    parser.add_argument("--inference", "-t", help="test kokoro.onnx model", action="store_true")
+    parser.add_argument("--check", "-m", help="check kokoro model", action="store_true")
+    parser.add_argument(
+        "--config_file", "-c", type=str, default="checkpoints/config.json", help="path to config file"
+    )
+    parser.add_argument(
+        "--checkpoint_path", "-p", type=str, default="checkpoints/kokoro-v1_0.pth", help="path to checkpoint file"
+    )
+    parser.add_argument(
+        "--output_dir", "-o", type=str, default="onnx", help="output directory"
+    )
+
+    args = parser.parse_args()
+
+    # cfg
+    config_file = args.config_file  # change the path of the model config file
+    checkpoint_path = args.checkpoint_path  # change the path of the model
+    output_dir = args.output_dir
+    
+    # make dir
+    os.makedirs(output_dir, exist_ok=True)
+
+    kmodel = KModel(config=config_file, model=checkpoint_path, disable_complex=True)
+    model = KModelForONNX(kmodel).eval()
+
+    if args.inference:
+        inference_onnx(model, output_dir)
+    elif args.check:
+        check_model(model)
+    else:
+        export_onnx(model, output_dir)
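Note on `examples/export.py`: `load_input_ids()` caps the phoneme string at 510 symbols, looks each symbol up in the model vocabulary, drops unknown symbols, and wraps the result in a leading and trailing `0`, so the framed sequence never exceeds 512 ids. The sketch below reproduces just that framing step with a toy vocabulary. `encode_phonemes` and `toy_vocab` are hypothetical names chosen for illustration, and the ids are made up; the real vocabulary comes from the model config.

```python
from typing import Dict, List

MAX_PHONEMES = 510  # same cap that load_input_ids() applies before the lookup


def encode_phonemes(phonemes: str, vocab: Dict[str, int]) -> List[int]:
    """Map phoneme characters to ids, drop unknown symbols, pad with 0 on both ends."""
    phonemes = phonemes[:MAX_PHONEMES]
    ids = [vocab[p] for p in phonemes if p in vocab]
    return [0, *ids, 0]


if __name__ == "__main__":
    toy_vocab = {"h": 1, "ə": 2, "l": 3, "o": 4}  # made-up ids, for illustration only
    # '!' is not in the toy vocabulary, so it is silently dropped.
    print(encode_phonemes("həlo!", toy_vocab))
```

Because unknown symbols are dropped after truncation, the id sequence can be shorter than the phoneme string. In the script, `load_voice()` indexes the voice pack with the phoneme count, `len(phonemes) - 1`, rather than with the id count.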
--- a/examples/phoneme_example.py
+++ b/examples/phoneme_example.py
+from kokoro import KPipeline
+import torch
+from scipy.io import wavfile
+
+def save_audio(audio: torch.Tensor, filename: str):
+    """Helper function to save audio tensor as WAV file"""
+    if audio is not None:
+        # Ensure audio is on CPU and in the right format
+        audio_cpu = audio.cpu().numpy()
+        
+        # Save using scipy.io.wavfile
+        wavfile.write(
+            filename,
+            24000,  # Kokoro uses 24kHz sample rate
+            audio_cpu
+        )
+        print(f"Audio saved as '{filename}'")
+    else:
+        print("No audio was generated")
+
+def main():
+    # Initialize pipeline with American English
+    pipeline = KPipeline(lang_code='a')
+    
+    # The phoneme string for:
+    # "How are you today? I am doing reasonably well, thank you for asking"
+    phonemes = "hˌW ɑɹ ju tədˈA? ˌI ɐm dˈuɪŋ ɹˈizənəbli wˈɛl, θˈæŋk ju fɔɹ ˈæskɪŋ"
+    
+    try:
+        print("\nExample 1: Using generate_from_tokens with raw phonemes")
+        results = list(pipeline.generate_from_tokens(
+            tokens=phonemes,
+            voice="af_bella",
+            speed=1.0
+        ))
+        if results:
+            save_audio(results[0].audio, 'phoneme_output_new.wav')
+        
+        # Example 2: Using generate_from_tokens with pre-processed tokens
+        print("\nExample 2: Using generate_from_tokens with pre-processed tokens")
+        #  get the tokens through G2P or any other method
+        text = "How are you today? I am doing reasonably well, thank you for asking"
+        _, tokens = pipeline.g2p(text)
+        
+        # Then generate from tokens
+        for i, result in enumerate(pipeline.generate_from_tokens(
+            tokens=tokens,
+            voice="af_bella",
+            speed=1.0
+        )):
+            # Each result may contain timestamps if available
+            if result.tokens:
+                for token in result.tokens:
+                    start_ts = getattr(token, 'start_ts', None)
+                    end_ts = getattr(token, 'end_ts', None)
+                    if start_ts is not None and end_ts is not None:
+                        print(f"Token: {token.text} ({start_ts:.2f}s - {end_ts:.2f}s)")
+            # Name files by chunk index; hash() of a str differs between runs
+            save_audio(result.audio, f'token_output_{i}.wav')
+            
+    except Exception as e:
+        print(f"An error occurred: {str(e)}")
+
+if __name__ == "__main__":
+    main()
\ No newline at end of file
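Note on `examples/phoneme_example.py`: `save_audio()` hands the tensor to `scipy.io.wavfile.write` at 24 kHz. Where scipy is not available, roughly the same result can be had from the standard library. The sketch below assumes mono audio whose samples are floats in [-1, 1] and stores them as 16-bit PCM; `write_pcm16_wav` is a hypothetical helper, not something the repository provides.

```python
import io
import struct
import wave
from typing import Iterable

SAMPLE_RATE = 24000  # the rate save_audio() passes to scipy


def write_pcm16_wav(samples: Iterable[float], target, sample_rate: int = SAMPLE_RATE) -> int:
    """Write mono float samples (assumed in [-1, 1]) as 16-bit PCM; return the frame count."""
    frames = bytearray()
    count = 0
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))
        count += 1
    # target may be a file path or a writable, seekable binary file object
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return count


if __name__ == "__main__":
    buffer = io.BytesIO()
    n = write_pcm16_wav([0.0, 0.5, -0.5, 1.0], buffer)
    print(f"wrote {n} frames at {SAMPLE_RATE} Hz")
```

With a torch tensor, something like `write_pcm16_wav(audio.cpu().tolist(), "out.wav")` should work. The per-sample loop is slow for long clips, so this is best kept for small tests.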
--- a/kokoro.js/.gitignore
+++ b/kokoro.js/.gitignore
+node_modules/
+dist
+types
+LICENSE
--- a/kokoro.js/.prettierignore
+++ b/kokoro.js/.prettierignore
+dist
+types
--- a/kokoro.js/README.md
+++ b/kokoro.js/README.md
+# Kokoro TTS
+
+<p align="center">
+    <a href="https://www.npmjs.com/package/kokoro-js"><img alt="NPM" src="https://img.shields.io/npm/v/kokoro-js"></a>
+    <a href="https://www.npmjs.com/package/kokoro-js"><img alt="NPM Downloads" src="https://img.shields.io/npm/dw/kokoro-js"></a>
+    <a href="https://www.jsdelivr.com/package/npm/kokoro-js"><img alt="jsDelivr Hits" src="https://img.shields.io/jsdelivr/npm/hw/kokoro-js"></a>
+    <a href="https://github.com/hexgrad/kokoro/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/github/license/hexgrad/kokoro?color=blue"></a>
+    <a href="https://huggingface.co/spaces/webml-community/kokoro-webgpu"><img alt="Demo" src="https://img.shields.io/badge/Hugging_Face-demo-green"></a>
+</p>
+
+Kokoro is a frontier TTS model for its size of 82 million parameters (text in/audio out). This JavaScript library allows the model to be run 100% locally in the browser thanks to [🤗 Transformers.js](https://huggingface.co/docs/transformers.js). Try it out using our [online demo](https://huggingface.co/spaces/webml-community/kokoro-webgpu)!
+
+## Usage
+
+First, install the `kokoro-js` library from [NPM](https://npmjs.com/package/kokoro-js) using:
+
+```bash
+npm i kokoro-js
+```
+
+You can then generate speech as follows:
+
+```js
+import { KokoroTTS } from "kokoro-js";
+
+const model_id = "onnx-community/Kokoro-82M-v1.0-ONNX";
+const tts = await KokoroTTS.from_pretrained(model_id, {
+  dtype: "q8", // Options: "fp32", "fp16", "q8", "q4", "q4f16"
+  device: "wasm", // Options: "wasm", "webgpu" (web) or "cpu" (node). If using "webgpu", we recommend using dtype="fp32".
+});
+
+const text = "Life is like a box of chocolates. You never know what you're gonna get.";
+const audio = await tts.generate(text, {
+  // Use `tts.list_voices()` to list all available voices
+  voice: "af_heart",
+});
+audio.save("audio.wav");
+```
+
+Or if you'd prefer to stream the output, you can do that with:
+
+```js
+import { KokoroTTS, TextSplitterStream } from "kokoro-js";
+
+const model_id = "onnx-community/Kokoro-82M-v1.0-ONNX";
+const tts = await KokoroTTS.from_pretrained(model_id, {
+  dtype: "fp32", // Options: "fp32", "fp16", "q8", "q4", "q4f16"
+  // device: "webgpu", // Options: "wasm", "webgpu" (web) or "cpu" (node).
+});
+
+// First, set up the stream
+const splitter = new TextSplitterStream();
+const stream = tts.stream(splitter);
+(async () => {
+  let i = 0;
+  for await (const { text, phonemes, audio } of stream) {
+    console.log({ text, phonemes });
+    audio.save(`audio-${i++}.wav`);
+  }
+})();
+
+// Next, add text to the stream. Note that the text can be added at different times.
+// For this example, let's pretend we're consuming text from an LLM, one word at a time.
+const text = "Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects. It can even run 100% locally in your browser, powered by Transformers.js!";
+const tokens = text.match(/\s*\S+/g);
+for (const token of tokens) {
+  splitter.push(token);
+  await new Promise((resolve) => setTimeout(resolve, 10));
+}
+
+// Finally, close the stream to signal that no more text will be added.
+splitter.close();
+
+// Alternatively, if you'd like to keep the stream open, but flush any remaining text, you can use the `flush` method.
+// splitter.flush();
+```
+
+## Voices/Samples
+
+> [!TIP]
+> You can find samples for each of the voices in the [model card](https://huggingface.co/onnx-community/Kokoro-82M-v1.0-ONNX#samples) on Hugging Face.
+
+### American English
+
+| Name         | Traits | Target Quality | Training Duration | Overall Grade |
+| ------------ | ------ | -------------- | ----------------- | ------------- |
+| **af_heart** | 🚺❤️   |                |                   | **A**         |
+| af_alloy     | 🚺     | B              | MM minutes        | C             |
+| af_aoede     | 🚺     | B              | H hours           | C+            |
+| af_bella     | 🚺🔥   | **A**          | **HH hours**      | **A-**        |
+| af_jessica   | 🚺     | C              | MM minutes        | D             |
+| af_kore      | 🚺     | B              | H hours           | C+            |
+| af_nicole    | 🚺🎧   | B              | **HH hours**      | B-            |
+| af_nova      | 🚺     | B              | MM minutes        | C             |
+| af_river     | 🚺     | C              | MM minutes        | D             |
+| af_sarah     | 🚺     | B              | H hours           | C+            |
+| af_sky       | 🚺     | B              | _M minutes_ 🤏    | C-            |
+| am_adam      | 🚹     | D              | H hours           | F+            |
+| am_echo      | 🚹     | C              | MM minutes        | D             |
+| am_eric      | 🚹     | C              | MM minutes        | D             |
+| am_fenrir    | 🚹     | B              | H hours           | C+            |
+| am_liam      | 🚹     | C              | MM minutes        | D             |
+| am_michael   | 🚹     | B              | H hours           | C+            |
+| am_onyx      | 🚹     | C              | MM minutes        | D             |
+| am_puck      | 🚹     | B              | H hours           | C+            |
+| am_santa     | 🚹     | C              | _M minutes_ 🤏    | D-            |
+
+### British English
+
+| Name        | Traits | Target Quality | Training Duration | Overall Grade |
+| ----------- | ------ | -------------- | ----------------- | ------------- |
+| bf_alice    | 🚺     | C              | MM minutes        | D             |
+| bf_emma     | 🚺     | B              | **HH hours**      | B-            |
+| bf_isabella | 🚺     | B              | MM minutes        | C             |
+| bf_lily     | 🚺     | C              | MM minutes        | D             |
+| bm_daniel   | 🚹     | C              | MM minutes        | D             |
+| bm_fable    | 🚹     | B              | MM minutes        | C             |
+| bm_george   | 🚹     | B              | MM minutes        | C             |
+| bm_lewis    | 🚹     | C              | H hours           | D+            |
--- a/kokoro.js/demo/.gitignore
+++ b/kokoro.js/demo/.gitignore
+# Logs
+logs
+*.log
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+pnpm-debug.log*
+lerna-debug.log*
+
+node_modules
+dist
+dist-ssr
+*.local
+
+# Editor directories and files
+.vscode/*
+!.vscode/extensions.json
+.idea
+.DS_Store
+*.suo
+*.ntvs*
+*.njsproj
+*.sln
+*.sw?
--- a/kokoro.js/demo/README.md
+++ b/kokoro.js/demo/README.md
+---
+title: Kokoro Text-to-Speech
+emoji: 🗣️
+colorFrom: indigo
+colorTo: purple
+sdk: static
+pinned: false
+license: apache-2.0
+short_description: High-quality speech synthesis powered by Kokoro TTS
+header: mini
+models:
+  - onnx-community/Kokoro-82M-ONNX
+custom_headers:
+  cross-origin-embedder-policy: require-corp
+  cross-origin-opener-policy: same-origin
+  cross-origin-resource-policy: cross-origin
+---
+
+# Kokoro Text-to-Speech
+
+A simple React + Vite application for running [Kokoro](https://github.com/hexgrad/kokoro), a frontier text-to-speech model for its size. The model runs 100% locally in the browser using [kokoro-js](https://www.npmjs.com/package/kokoro-js) and [🤗 Transformers.js](https://www.npmjs.com/package/@huggingface/transformers)!
+
+## Getting Started
+
+Follow the steps below to set up and run the application.
+
+### 1. Clone the Repository
+
+```sh
+git clone https://github.com/hexgrad/kokoro.git
+```
+
+### 2. Build the Dependencies
+
+```sh
+cd kokoro/kokoro.js
+npm i
+npm run build
+```
+
+### 3. Set Up the Demo Project
+
+Note that this step depends on the build output from the previous step.
+
+```sh
+cd demo
+npm i
+```
+
+### 4. Start the Development Server
+
+```sh
+npm run dev
+```
+
+The application should now be running locally. Open your browser and go to [http://localhost:5173](http://localhost:5173) to see it in action.
--- a/kokoro.js/demo/eslint.config.js
+++ b/kokoro.js/demo/eslint.config.js
+import js from "@eslint/js";
+import globals from "globals";
+import react from "eslint-plugin-react";
+import reactHooks from "eslint-plugin-react-hooks";
+import reactRefresh from "eslint-plugin-react-refresh";
+
+export default [
+  { ignores: ["dist"] },
+  {
+    files: ["**/*.{js,jsx}"],
+    languageOptions: {
+      ecmaVersion: 2020,
+      globals: globals.browser,
+      parserOptions: {
+        ecmaVersion: "latest",
+        ecmaFeatures: { jsx: true },
+        sourceType: "module",
+      },
+    },
+    settings: { react: { version: "18.3" } },
+    plugins: {
+      react,
+      "react-hooks": reactHooks,
+      "react-refresh": reactRefresh,
+    },
+    rules: {
+      ...js.configs.recommended.rules,
+      ...react.configs.recommended.rules,
+      ...react.configs["jsx-runtime"].rules,
+      ...reactHooks.configs.recommended.rules,
+      "react/jsx-no-target-blank": "off",
+      "react-refresh/only-export-components": ["warn", { allowConstantExport: true }],
+    },
+  },
+];
--- a/kokoro.js/demo/index.html
+++ b/kokoro.js/demo/index.html
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <link rel="icon" type="image/svg+xml" href="/hf-logo.svg" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>Kokoro Text-to-Speech</title>
+  </head>
+  <body>
+    <div id="root"></div>
+    <script type="module" src="/src/main.jsx"></script>
+  </body>
+</html>