Add profile (#15)
* remove constraints on model name
* remove duplicate model converter
* add profile
* get eos and bos from server
* update stop_words
* update sequence_length when the last generated token is eos_id
* fix
* fix
* check-in models
* validate model_name
* make stop_words a property
* debug profiling
* better stats
* fix assistant response
* update profile serving
* update
* update
Showing
benchmark/profile_serving.py
0 → 100644
llmdeploy/model.py
0 → 100644