- 22 Mar, 2024 1 commit
zhouxiang authored
-
- 18 Dec, 2023 1 commit
AllentDan authored
* launch gradio server directly with hf model
* end session
* end session
* fix api_server backend for gradio
* fix out of boundary index
* remove log
-
- 22 Nov, 2023 1 commit
Chen Xin authored
* turbomind support export model params
* fix overflow
* support turbomind.from_pretrained
* fix tp
* support AutoModel
* support load kv qparams
* update auto_awq
* update docstring
* export lmdeploy version
* update doc
* remove download_hf_repo
* LmdeployForCausalLM -> LmdeployForCausalLM
* refactor turbomind.py
* update comment
* add bfloat16 convert back
* support gradio run_local load hf
* support restful api server load hf
* add docs
* support loading previous quantized model
* adapt pr 690
* update docs
* not export turbomind config when quantize a model
* check model_name when can not get it from config.json
* update readme
* remove model_name in auto_awq
* update
* update
* update
* fix build
* absolute import
-
- 01 Nov, 2023 1 commit
AllentDan authored
* make IPv6 compatible, safe run for coroutine interrupting
* instance_id -> session_id and fix api_client.py
* update doc
* remove useless faq
* safe ip mapping
* update app.py
* WIP completion
* completion
* update doc
* disable interactive mode for /v1/chat/completions
* docstring
* docstring
* refactor gradio
* update gradio
* update
* update doc
* rename
* session_id default -1
* missed two files
* add an APIClient
* add chat func for APIClient
* refine
* add concurrent function
* sequence_start, sequence_end --> interactive_mode
* update doc
* comments
* doc
* better text completion
* remove /v1/embeddings
* comments
* deprecate generate and use /v1/interactive/completions
* /v1/interactive/completion -> /v1/chat/interactive
* embeddings
* rename
* remove wrong arg description
* docstring
* fix
* update cli
* update doc
* strict session_len limit condition
* pass model args to api_server
-
- 25 Oct, 2023 1 commit
RunningLeon authored
* add
* import fire in main
* wrap to speed up fire cli
* update
* update docs
* update docs
* fix
* resolve comments
* resolve conflict and add test for cli
-
- 11 Oct, 2023 1 commit
AllentDan authored
* make IPv6 compatible, safe run for coroutine interrupting
* instance_id -> session_id and fix api_client.py
* update doc
* remove useless faq
* safe ip mapping
* update app.py
* remove print
* update doc
-
- 24 Aug, 2023 1 commit
AllentDan authored
* app use async engine
* add stop logic
* app update cancel
* app support restful-api
* update doc and use the right model name
* set doc url root
* add comments
* add an example
* renew_session
* update readme.md
* resolve comments
* Update restful_api.md
* Update restful_api.md
* Update restful_api.md

---------

Co-authored-by: tpoisonooo <khj.application@aliyun.com>
-
- 16 Aug, 2023 1 commit
AllentDan authored
* import if lib directory exists
* only modify app.py
-
- 04 Aug, 2023 1 commit
AllentDan authored
* use local model for webui
* local model for app.py
* lint
* remove print
* add seed
* comments
* fixed session_id
* support turbomind batch inference
* update app.py
* lint and docstring
* move webui to serve/gradio
* update doc
* update doc
* update docstring and remove print conversation
* log
* Update docs/zh_cn/build.md Co-authored-by: Chen Xin <xinchen.tju@gmail.com>
* Update docs/en/build.md Co-authored-by: Chen Xin <xinchen.tju@gmail.com>
* use latest gradio
* fix
* replace partial with InterFace
* use host ip instead of cookie

---------

Co-authored-by: Chen Xin <xinchen.tju@gmail.com>
-