Unverified commit 06125966 authored by RunningLeon, committed by GitHub

Add extra_requires to reduce dependencies (#580)

* update reqs

* update docs

* resolve comments

* upgrade pydantic

* fix rebase

* update doc

* update

* update

* update readme

* update

* add flash-attn
parent 7b20cfdf
@@ -103,6 +103,14 @@ Install lmdeploy with pip (Python 3.8+) or [from source](./docs/en/build.md)
pip install lmdeploy
```
> **Note**<br />
> `pip install lmdeploy` installs only the packages required at runtime. To run code from modules such as `lmdeploy.lite` and `lmdeploy.serve`, install the corresponding extra dependencies. For instance, `pip install lmdeploy[lite]` installs the extra dependencies for the `lmdeploy.lite` module (a way to inspect these extras is sketched after this note).
>
> - `all`: Install lmdeploy with all dependencies in `requirements.txt`
> - `lite`: Install lmdeploy with the extra dependencies in `requirements/lite.txt`
> - `serve`: Install lmdeploy with the extra dependencies in `requirements/serve.txt`
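As a quick sanity check, Python's standard library can list the extras an installed distribution declares; a minimal sketch, assuming lmdeploy is already installed and that the extras show up as `Provides-Extra` entries in its metadata:

```python
# Sketch: list the optional extras an installed distribution declares, and
# the requirements gated behind each one (standard library, Python 3.8+).
from importlib.metadata import metadata, requires

dist = 'lmdeploy'  # assumes lmdeploy is already installed
print(metadata(dist).get_all('Provides-Extra'))  # expected: ['all', 'lite', 'serve']
for req in requires(dist) or []:
    if 'extra ==' in req:
        print(req)  # e.g. gradio<4.0.0; extra == "serve"
```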
### Deploy InternLM
#### Get InternLM model
@@ -140,6 +148,9 @@ lmdeploy chat turbomind ./workspace
#### Serving with gradio
```shell
# install lmdeploy with extra dependencies
pip install lmdeploy[serve]
lmdeploy serve gradio ./workspace
```
@@ -150,6 +161,9 @@ lmdeploy serve gradio ./workspace
Launch the inference server with:
```shell
# install lmdeploy with extra dependencies
pip install lmdeploy[serve]
lmdeploy serve api_server ./workspace --instance_num 32 --tp 1
```
@@ -182,6 +196,7 @@ bash workspace/service_docker_up.sh
Then, you can communicate with the inference server from the command line:
```shell
python3 -m pip install tritonclient[grpc]
lmdeploy serve triton_client {server_ip_address}:33337
```
......
@@ -104,6 +104,13 @@ TurboMind's output token throughput exceeds 2000 token/s; overall, compared with DeepSpeed
pip install lmdeploy
```
> **Note**<br />
> By default, `pip install lmdeploy` installs only the runtime dependencies. To use lmdeploy's lite and serve features, install the extra dependencies. For example, `pip install lmdeploy[lite]` additionally installs the dependencies of the `lmdeploy.lite` module.
>
> - `all`: Install all of lmdeploy's dependencies, listed in `requirements.txt`
> - `lite`: Additionally install the dependencies of the `lmdeploy.lite` module, listed in `requirements/lite.txt`
> - `serve`: Additionally install the dependencies of the `lmdeploy.serve` module, listed in `requirements/serve.txt`
### Deploy InternLM
#### Get the InternLM model
@@ -140,6 +147,9 @@ lmdeploy chat turbomind ./workspace
#### Launch the gradio server
```shell
# install lmdeploy with extra dependencies
pip install lmdeploy[serve]
lmdeploy serve gradio ./workspace
```
@@ -150,6 +160,9 @@ lmdeploy serve gradio ./workspace
Launch the inference server with the following command:
```shell
# install lmdeploy with extra dependencies
pip install lmdeploy[serve]
lmdeploy serve api_server ./workspace --server_name 0.0.0.0 --server_port ${server_port} --instance_num 32 --tp 1
```
@@ -182,6 +195,7 @@ bash workspace/service_docker_up.sh
You can chat with the inference server from the command line:
```shell
python3 -m pip install tritonclient[grpc]
lmdeploy serve triton_client {server_ip_address}:33337
```
......
@@ -17,7 +17,7 @@ It may have been caused by the following reasons.
1. You haven't installed lmdeploy's precompiled package. `_turbomind` is the pybind package of the C++ turbomind, which involves compilation. It is recommended that you install the precompiled one.
```shell
-pip install lmdeploy
+pip install lmdeploy[all]
```
2. If you have installed it and still encounter this issue, it is probably because you are executing turbomind-related commands in the root directory of the lmdeploy source code. Switching to another directory will fix it.
@@ -26,7 +26,7 @@ pip install lmdeploy
### libnccl.so.2 not found
-Make sure you have installed lmdeploy (>=v0.0.5) through `pip install lmdeploy`.
+Make sure you have installed lmdeploy (>=v0.0.5) through `pip install lmdeploy[all]`.
If the issue still exists after installing lmdeploy, add the path of `libnccl.so.2` to the `LD_LIBRARY_PATH` environment variable.
......
@@ -26,7 +26,7 @@ Based on the above table, download the model that meets your requirements. Execu
```shell
# install lmdeploy
-python3 -m pip install lmdeploy
+python3 -m pip install lmdeploy[all]
# convert weight layout
lmdeploy convert codellama /the/path/of/codellama/model
......
@@ -5,7 +5,7 @@ LMDeploy supports LLM model inference of 4-bit weight, with the minimum requirem
Before proceeding with the inference, please ensure that lmdeploy is installed.
```shell
-pip install lmdeploy
+pip install lmdeploy[all]
```
## 4-bit LLM model Inference
......
@@ -17,7 +17,7 @@ pip install --upgrade mmengine
1. You haven't installed lmdeploy's precompiled package. `_turbomind` is the pybind part of the turbomind C++ code, which involves compilation. It is recommended that you install the precompiled package directly.
```
-pip install lmdeploy
+pip install lmdeploy[all]
```
2. If it is already installed and this problem still occurs, check the working directory: do not run the packages under `python -m lmdeploy.turbomind.*` from the root directory of the lmdeploy source code; switch to another directory first.
@@ -26,7 +26,7 @@ pip install lmdeploy
### libnccl.so.2 not found
-Make sure you have installed lmdeploy (>=v0.0.5) through `pip install lmdeploy`.
+Make sure you have installed lmdeploy (>=v0.0.5) through `pip install lmdeploy[all]`.
If the problem persists after installation, add the path of `libnccl.so.2` to the `LD_LIBRARY_PATH` environment variable.
......
@@ -26,7 +26,7 @@
```shell
# install lmdeploy
-python3 -m pip install lmdeploy
+python3 -m pip install lmdeploy[all]
# convert the model format
lmdeploy convert codellama /path/of/codellama/model
......
@@ -5,7 +5,7 @@ LMDeploy supports 4-bit weight model inference; **the minimum requirement for NVIDIA GPUs
Before inference, please make sure lmdeploy is installed:
```shell
-pip install lmdeploy
+pip install lmdeploy[all]
```
## 4-bit weight model inference
......
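# requirements.txt, old contents (the file names in this section are inferred; the diff viewer dropped them)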
accelerate
datasets
fastapi
fire
gradio<4.0.0
mmengine
numpy
pybind11
safetensors
sentencepiece
setuptools
shortuuid
tiktoken
torch
transformers>=4.33.0
tritonclient[all]
uvicorn
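# requirements.txt, new contents: the flat list is replaced by includes of the split files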
-r requirements/build.txt
-r requirements/runtime.txt
-r requirements/lite.txt
-r requirements/serve.txt
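# requirements/build.txt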
pybind11
setuptools
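# requirements/lite.txt and requirements/runtime.txt (the boundary between the two files is not visible in this diff)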
accelerate
datasets
flash-attn
fire
mmengine
numpy
safetensors
sentencepiece
tiktoken
torch
transformers>=4.33.0
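# requirements/serve.txt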
fastapi
gradio<4.0.0
pydantic>2.0.0
shortuuid
uvicorn
@@ -134,7 +134,14 @@ if __name__ == '__main__':
'lmdeploy': lmdeploy_package_data,
},
include_package_data=True,
-install_requires=parse_requirements('requirements.txt'),
+setup_requires=parse_requirements('requirements/build.txt'),
+tests_require=parse_requirements('requirements/test.txt'),
+install_requires=parse_requirements('requirements/runtime.txt'),
+extras_require={
+    'all': parse_requirements('requirements.txt'),
+    'lite': parse_requirements('requirements/lite.txt'),
+    'serve': parse_requirements('requirements/serve.txt')
+},
has_ext_modules=check_ext_modules,
classifiers=[
'Programming Language :: Python :: 3.8',
......
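For context, the `parse_requirements` helper used above is defined earlier in `setup.py` and is not shown in this diff. A minimal sketch of what such a helper typically does, assuming a plain line-oriented format with `#` comments and nested `-r` includes (not lmdeploy's actual implementation):

```python
# Minimal sketch of a parse_requirements helper (assumed behavior): read one
# requirements file, skip comments and blank lines, and follow `-r` includes.
import os


def parse_requirements(fname='requirements.txt'):
    """Collect requirement strings from `fname`, following `-r` includes."""
    reqs = []
    base = os.path.dirname(fname)
    with open(fname, encoding='utf-8') as f:
        for line in f:
            line = line.split('#', 1)[0].strip()  # strip comments and whitespace
            if not line:
                continue
            if line.startswith('-r '):
                # recurse into the included requirements file
                reqs.extend(parse_requirements(os.path.join(base, line[3:].strip())))
            else:
                reqs.append(line)
    return reqs
```

With requirements.txt reduced to four `-r` lines, `extras_require['all']` expands to the union of the build, runtime, lite, and serve requirements, while a plain `pip install lmdeploy` pulls in only `requirements/runtime.txt`.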