# TinyChat: Efficient and Minimal Chatbot with AWQ
We introduce TinyChat, an efficient and minimal chatbot interface built on a fast GPU inference library for LLMs quantized by AWQ (W4A16). It enables seamless, real-time LLM deployment on consumer-level GPUs such as the RTX 3090/4090 and on low-power edge devices like the NVIDIA Jetson Orin, delivering a responsive conversational experience on the edge. TinyChat runs LLaMA2-7B at 8.7 ms/token on the RTX 4090 and 75.1 ms/token on Jetson Orin.
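To illustrate what the W4A16 scheme means (4-bit weights, 16-bit activations), below is a minimal, hypothetical sketch of group-wise 4-bit weight quantization. It emulates the precision loss with a fake-quantize (quantize-then-dequantize) step; it is not TinyChat's actual packed-INT4 CUDA kernel, which keeps weights in INT4 and dequantizes on the fly during the GEMM. The function name, group size, and shapes are illustrative assumptions.

```python
import torch

def pseudo_quantize_w4(weight: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """Group-wise 4-bit weight quantization (W4A16-style), illustrative only.

    Each contiguous group of `group_size` input channels shares one scale and
    zero point; weights are rounded to the 4-bit range [0, 15] and then
    dequantized back so the precision loss can be observed in full precision.
    """
    out_features, in_features = weight.shape
    assert in_features % group_size == 0, "in_features must be divisible by group_size"
    w = weight.reshape(out_features, in_features // group_size, group_size)

    # Asymmetric per-group quantization parameters.
    w_max = w.amax(dim=-1, keepdim=True)
    w_min = w.amin(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-5) / 15
    zero = (-w_min / scale).round()

    # Quantize to INT4 levels, then dequantize (fake quantization).
    w_int4 = torch.clamp((w / scale).round() + zero, 0, 15)
    w_deq = (w_int4 - zero) * scale
    return w_deq.reshape(out_features, in_features)

# Example: fake-quantize one linear layer's weight and apply it to an activation.
# (Real W4A16 inference keeps activations in FP16 and the weights packed as INT4.)
weight = torch.randn(4096, 4096)
x = torch.randn(1, 4096)
y = x @ pseudo_quantize_w4(weight).t()
```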