- 25 Oct, 2023 1 commit
RunningLeon authored
* add * import fire in main * wrap to speed up fire cli * update * update docs * update docs * fix * resolve comments * resolve conflict and add test for cli
- 24 Oct, 2023 1 commit
Chen Xin authored
* fix crash * update profile_generation.py * format * use self.bos_id * remove sys_instruct
- 11 Sep, 2023 1 commit
Lyu Han authored
* tmp * add demo for codellama inference * update * update * update * update codellama.md * export rope_theta * update * update doc * fix client.py * define SamplingParam * rollback 'end' * rotary_emb_base to rotary_embedding_base * change to baichuan2-7b
- 07 Aug, 2023 2 commits
- 23 Jul, 2023 1 commit
lvhan028 authored
* refactor model.py and support baichuan-7b * remove model_name * remove hard session_len * export tokenizer.py to target dir * remove model_name from client * remove model_name * update * correct throughput equation * fix session.response * update serving.md * update readme * update according to review comments * update * update * update * update
- 19 Jul, 2023 1 commit
lvhan028 authored
- 12 Jul, 2023 1 commit
lvhan028 authored
* add docstring * update * update * fix according to review results
- 05 Jul, 2023 1 commit
lvhan028 authored
* update internlm model * update * update * update * update * update temperature, topk and top_p * update * update * loosen log level
- 30 Jun, 2023 2 commits
- 25 Jun, 2023 1 commit
lvhan028 authored
* remove constraints on model name * remove duplicate model converter * add profile * get eos and bos from server * update stop_words * update sequence_length when the last generated token is eos_id * fix * fix * check-in models * validate model_name * make stop_words as property * debug profiling * better stats * fix assistant response * update profile serving * update * update
- 20 Jun, 2023 1 commit
lvhan028 authored
* add scripts for deploying llama family models via fastertransformer * fix * fix * set symlinks True when copying triton models templates * pack model repository for triton inference server * add exception * fix * update config.pbtxt and launching scripts
- 18 Jun, 2023 1 commit
lvhan028 authored