import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Docker Deployment
## Docker image preparation
There are two ways to prepare a Docker image.
1. Pull from the official image
2. Build locally, see [Build Docker Image](./build_image.md)
You can **choose either one** in actual use.
## Deploy With Proxy Model
In this deployment, you don't need a GPU environment.
1. Pull from the official image repository, [Eosphoros AI Docker Hub](https://hub.docker.com/u/eosphorosai)
```bash
docker pull eosphorosai/dbgpt-openai:latest
```
2. Run the Docker container
This example requires you to provide a valid API key for the SiliconFlow API. You can obtain one by signing up at [SiliconFlow](https://siliconflow.cn/) and creating an API key at [API Key](https://cloud.siliconflow.cn/account/ak).
```bash
docker run -it --rm -e SILICONFLOW_API_KEY=${SILICONFLOW_API_KEY} \
-p 5670:5670 --name dbgpt eosphorosai/dbgpt-openai
```
Please replace `${SILICONFLOW_API_KEY}` with your own API key.
Then you can visit [http://localhost:5670](http://localhost:5670) in the browser.
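If you prefer to keep the service running in the background, a detached variant of the same command works too (a sketch; the exported key value is only a placeholder):
```bash
# Replace the placeholder with your own SiliconFlow API key
export SILICONFLOW_API_KEY="sk-xxxx"
# Run detached; remove the container later with `docker rm -f dbgpt`
docker run -d -e SILICONFLOW_API_KEY=${SILICONFLOW_API_KEY} \
    -p 5670:5670 --name dbgpt eosphorosai/dbgpt-openai
```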
## Deploy With GPU (Local Model)
In this deployment, you need a GPU environment.
Before running the Docker container, you need to install the NVIDIA Container Toolkit. For more information, please refer to the official documentation [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
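Before pulling large images, you can quickly verify that Docker can see your GPUs (a sanity check, assuming the toolkit is already installed):
```bash
# Should print the usual nvidia-smi table if the NVIDIA Container Toolkit is working
docker run --rm --gpus all ubuntu nvidia-smi
```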
In this deployment, you will use a local model instead of downloading it from the Hugging Face or ModelScope model hub. This is useful if you have already downloaded the model to your local machine or if you want to use a model from a different source.
### Step 1: Download the Model
Before running the Docker container, you need to download the model to your local machine. You can use either Hugging Face or ModelScope (recommended for users in China) to download the model.
<Tabs>
<TabItem value="modelscope" label="Download from ModelScope">
1. Install `git` and `git-lfs` if you haven't already:
```bash
sudo apt-get install git git-lfs
```
2. Create a `models` directory in your current working directory:
```bash
mkdir -p ./models
```
3. Use `git` to clone the model repositories into the `models` directory:
```bash
cd ./models
git lfs install
git clone https://www.modelscope.cn/Qwen/Qwen2.5-Coder-0.5B-Instruct.git
git clone https://www.modelscope.cn/BAAI/bge-large-zh-v1.5.git
cd ..
```
This will download the models into the `./models/Qwen2.5-Coder-0.5B-Instruct` and `./models/bge-large-zh-v1.5` directories.
</TabItem>
<TabItem value="huggingface" label="Download from Hugging Face">
1. Install `git` and `git-lfs` if you haven't already:
```bash
sudo apt-get install git git-lfs
```
2. Create a `models` directory in your current working directory:
```bash
mkdir -p ./models
```
3. Use `git` to clone the model repositories into the `models` directory:
```bash
cd ./models
git lfs install
git clone https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct
git clone https://huggingface.co/BAAI/bge-large-zh-v1.5
cd ..
```
This will download the models into the `./models/Qwen2.5-Coder-0.5B-Instruct` and `./models/bge-large-zh-v1.5` directories.
</TabItem>
</Tabs>
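If you already use the `huggingface_hub` command-line tool, the same models can also be fetched without `git-lfs` (an optional sketch, not required by DB-GPT; install the CLI first with `pip install -U "huggingface_hub[cli]"`):
```bash
huggingface-cli download Qwen/Qwen2.5-Coder-0.5B-Instruct --local-dir ./models/Qwen2.5-Coder-0.5B-Instruct
huggingface-cli download BAAI/bge-large-zh-v1.5 --local-dir ./models/bge-large-zh-v1.5
```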
---
### Step 2: Prepare the Configuration File
Create a `toml` file named `dbgpt-local-gpu.toml` and add the following content:
```toml
[models]
[[models.llms]]
name = "Qwen2.5-Coder-0.5B-Instruct"
provider = "hf"
# Specify the model path in the local file system
path = "/app/models/Qwen2.5-Coder-0.5B-Instruct"
[[models.embeddings]]
name = "BAAI/bge-large-zh-v1.5"
provider = "hf"
# Specify the model path in the local file system
path = "/app/models/bge-large-zh-v1.5"
```
This configuration file specifies the local paths to the models inside the Docker container.
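Before starting the container, it is worth confirming that the directory names on disk match the `path` values above (assuming the Step 1 downloads):
```bash
ls ./models
# Expected output:
# Qwen2.5-Coder-0.5B-Instruct  bge-large-zh-v1.5
```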
---
### Step 3: Run the Docker Container
Run the Docker container with the local `models` directory mounted:
```bash
docker run --ipc host --gpus all \
-it --rm \
-p 5670:5670 \
-v ./dbgpt-local-gpu.toml:/app/configs/dbgpt-local-gpu.toml \
-v ./models:/app/models \
--name dbgpt \
eosphorosai/dbgpt \
dbgpt start webserver --config /app/configs/dbgpt-local-gpu.toml
```
#### Explanation of the Command:
- `--ipc host`: Enables host IPC mode for better performance.
- `--gpus all`: Allows the container to use all available GPUs.
- `-v ./dbgpt-local-gpu.toml:/app/configs/dbgpt-local-gpu.toml`: Mounts the local configuration file into the container.
- `-v ./models:/app/models`: Mounts the local `models` directory into the container.
- `eosphorosai/dbgpt`: The Docker image to use.
- `dbgpt start webserver --config /app/configs/dbgpt-local-gpu.toml`: Starts the webserver with the specified configuration file.
---
### Step 4: Access the Application
Once the container is running, you can visit [http://localhost:5670](http://localhost:5670) in your browser to access the application.
---
### Step 5: Persist Data (Optional)
To ensure that your data is not lost when the container is stopped or removed, you can map the `pilot/data` and `pilot/message` directories to your local machine. These directories store application data and messages.
1. Create local directories for data persistence:
```bash
mkdir -p ./pilot/data
mkdir -p ./pilot/message
mkdir -p ./pilot/alembic_versions
```
2. Modify the `dbgpt-local-gpu.toml` configuration file to point to the correct paths:
```toml
[service.web.database]
type = "sqlite"
path = "/app/pilot/message/dbgpt.db"
```
3. Run the Docker container with the additional volume mounts:
```bash
docker run --ipc host --gpus all \
-it --rm \
-p 5670:5670 \
-v ./dbgpt-local-gpu.toml:/app/configs/dbgpt-local-gpu.toml \
-v ./models:/app/models \
-v ./pilot/data:/app/pilot/data \
-v ./pilot/message:/app/pilot/message \
-v ./pilot/alembic_versions:/app/pilot/meta_data/alembic/versions \
--name dbgpt \
eosphorosai/dbgpt \
dbgpt start webserver --config /app/configs/dbgpt-local-gpu.toml
```
This ensures that the `pilot/data` and `pilot/message` directories are persisted on your local machine.
---
### Summary of Directory Structure
After completing the steps, your directory structure should look like this:
```
.
├── dbgpt-local-gpu.toml
├── models
│ ├── Qwen2.5-Coder-0.5B-Instruct
│ └── bge-large-zh-v1.5
├── pilot
│ ├── data
│ └── message
```
This setup ensures that the models and application data are stored locally and mounted into the Docker container, allowing you to use them without losing data.
# Docker-Compose Deployment
## Run via Docker-Compose
This example requires you to provide a valid API key for the SiliconFlow API. You can obtain one by signing up at [SiliconFlow](https://siliconflow.cn/) and creating an API key at [API Key](https://cloud.siliconflow.cn/account/ak).
```bash
SILICONFLOW_API_KEY=${SILICONFLOW_API_KEY} docker compose up -d
```
You will see the following output if the deployment is successful.
```bash
[+] Running 3/3
✔ Network dbgptnet Created 0.0s
✔ Container db-gpt-db-1 Started 0.2s
✔ Container db-gpt-webserver-1 Started 0.2s
```
## View log
```bash
docker logs db-gpt-webserver-1 -f
```
:::info note
For more configuration content, you can view the `docker-compose.yml` file
:::
## Visit
Open the browser and visit [http://localhost:5670](http://localhost:5670)
# DB-GPT Integrations
DB-GPT integrates with many datasources and RAG storage providers. The tables below list the available integration packages and the extras used to install them.
# Datasource Providers
| Provider | Supported | Install Packages |
|-------------|-----------|----------------------|
| MySQL | ✅ | --extra datasource_mysql |
| OceanBase | ✅ | |
| ClickHouse | ✅ | --extra datasource_clickhouse |
| Hive | ✅ | --extra datasource_hive |
| MSSQL | ✅ | --extra datasource_mssql |
| PostgreSQL | ✅ | --extra datasource_postgres |
| ApacheDoris | ✅ | |
| StarRocks | ✅ | --extra datasource_starroks |
| Spark | ✅ | --extra datasource_spark |
| Oracle | ❌ | |
# RAG Storage Providers
| Provider | Supported | Install Packages |
|-------------|-----------|--------------------------------|
| Chroma | ✅ | --extra storage_chroma |
| Milvus | ✅ | --extra storage_milvus |
| Elasticsearch | ✅ | --extra storage_elasticsearch |
| OceanBase | ✅ | --extra storage_obvector |
# Graph RAG Storage Providers
| Provider | Supported | Install Packages |
|----------|-----------|------------------|
| TuGraph | ✅ | --extra graph_rag|
| Neo4j | ❌ | |
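Each entry in the **Install Packages** column is an extra passed to `uv sync`, following the same pattern as the examples in the sections below. For instance, a hypothetical installation that enables the MySQL datasource together with the Milvus vector store might look like this:
```bash
uv sync --all-packages \
--extra "base" \
--extra "proxy_openai" \
--extra "rag" \
--extra "datasource_mysql" \
--extra "storage_milvus" \
--extra "dbgpts"
```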
# BM25 RAG
In this example, we will show how to use Elasticsearch as the DB-GPT RAG storage. Using Elasticsearch full-text (BM25) retrieval can, to some extent, alleviate the uncertainty and interpretability issues brought about by vector database retrieval.
### Install Dependencies
First, you need to install the `dbgpt elasticsearch storage` library.
```bash
uv sync --all-packages --frozen \
--extra "base" \
--extra "proxy_openai" \
--extra "rag" \
--extra "storage_elasticsearch" \
--extra "dbgpts"
```
### Prepare Elasticsearch
Prepare an Elasticsearch database service; see [Elasticsearch Installation](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html).
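If you only need a local instance to try this out, a single-node Elasticsearch container is usually enough (a sketch with security disabled; not suitable for production):
```bash
docker run -d --name es-dbgpt \
    -p 9200:9200 \
    -e "discovery.type=single-node" \
    -e "xpack.security.enabled=false" \
    docker.elastic.co/elasticsearch/elasticsearch:8.13.0
```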
### Elasticsearch Configuration
Set the RAG storage variables below in the `configs/dbgpt-bm25-rag.toml` file so that DB-GPT knows how to connect to Elasticsearch.
```toml
[rag.storage]
[rag.storage.full_text]
type = "ElasticSearch"
uri = "127.0.0.1"
port = "9200"
```
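You can verify that the instance is reachable before starting the webserver (assuming an unauthenticated local Elasticsearch as configured above):
```bash
curl http://127.0.0.1:9200
```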
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-bm25-rag.toml
```
# ClickHouse
In this example, we will show how to use ClickHouse as a DB-GPT datasource. Using a column-oriented database as a datasource can, to some extent, alleviate the uncertainty and interpretability issues brought about by vector database retrieval.
### Install Dependencies
First, you need to install the `dbgpt clickhouse datasource` library.
```bash
uv sync --all-packages \
--extra "base" \
--extra "datasource_clickhouse" \
--extra "rag" \
--extra "storage_chromadb" \
--extra "dbgpts"
```
### Prepare ClickHouse
Prepare a ClickHouse database service; see [ClickHouse Installation](https://clickhouse.tech/docs/en/getting-started/install/).
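For a quick local test, the official ClickHouse image can be used (a sketch; adjust ports and credentials for real deployments):
```bash
docker run -d --name clickhouse-dbgpt \
    -p 8123:8123 -p 9000:9000 \
    clickhouse/clickhouse-server
```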
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-proxy-openai.toml
```
### ClickHouse Configuration
<p align="left">
<img src={'https://github.com/user-attachments/assets/b506dc5e-2930-49da-b0c0-5ca051cb6c3f'} width="1000px"/>
</p>
# DuckDB
DuckDB is a high-performance analytical database system. It is designed to execute analytical SQL queries fast and efficiently, and it can also be used as an embedded analytical database.
In this example, we will show how to use DuckDB as a DB-GPT datasource. Using DuckDB as a datasource can, to some extent, alleviate the uncertainty and interpretability issues brought about by vector database retrieval.
### Install Dependencies
First, you need to install the `dbgpt duckdb datasource` library.
```bash
uv sync --all-packages \
--extra "base" \
--extra "datasource_duckdb" \
--extra "rag" \
--extra "storage_chromadb" \
```
### Prepare DuckDB
Prepare DuckDB; see [DuckDB Installation](https://duckdb.org/docs/installation).
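Because DuckDB is an embedded database, there is no server to start; you only need a database file that DB-GPT can point at. A minimal sketch that creates a sample file (the file name `duckdb_demo.db` and the table are hypothetical):
```bash
pip install duckdb
python -c "import duckdb; con = duckdb.connect('duckdb_demo.db'); con.execute('CREATE TABLE IF NOT EXISTS demo(id INTEGER, name VARCHAR)'); print(con.execute('SHOW TABLES').fetchall())"
```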
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-proxy-openai.toml
```
### DuckDB Configuration
<p align="left">
<img src={'https://github.com/user-attachments/assets/bc5ffc20-4b5b-4e24-8c29-bf5702b0e840'} width="1000px"/>
</p>
# Graph RAG
In this example, we will show how to use the Graph RAG framework in DB-GPT. Using a graph database to implement RAG can, to some extent, alleviate the uncertainty and interpretability issues brought about by vector database retrieval.
You can refer to the python example file `DB-GPT/examples/rag/graph_rag_example.py` in the source code. This example demonstrates how to load knowledge from a document and store it in a graph store. Subsequently, it recalls knowledge relevant to your question by searching for triplets in the graph store.
### Install Dependencies
First, you need to install the `dbgpt graph_rag` library.
```bash
uv sync --all-packages \
--extra "base" \
--extra "proxy_openai" \
--extra "rag" \
--extra "storage_chromadb" \
--extra "dbgpts" \
--extra "graph_rag"
```
### Prepare Graph Database
To store the knowledge in a graph, we need a graph database; [TuGraph](https://github.com/TuGraph-family/tugraph-db) is the first graph database supported by DB-GPT.
Visit the TuGraph GitHub repository to view the [Quick Start](https://tugraph-db.readthedocs.io/zh-cn/latest/3.quick-start/1.preparation.html#id5) document, then follow the instructions to pull the TuGraph Docker image (latest / version >= 4.5.1) and launch it.
```bash
docker pull tugraph/tugraph-runtime-centos7:4.5.1
docker run -d -p 7070:7070 -p 7687:7687 -p 9090:9090 --name tugraph_demo tugraph/tugraph-runtime-centos7:latest lgraph_server -d run --enable_plugin true
```
The default port for the bolt protocol is `7687`.
> **Download Tips:**
>
> There is also a corresponding version of the TuGraph Docker image package on OSS. You can also directly download and import it.
>
> ```
> wget 'https://tugraph-web.oss-cn-beijing.aliyuncs.com/tugraph/tugraph-4.5.1/tugraph-runtime-centos7-4.5.1.tar' -O tugraph-runtime-centos7-4.5.1.tar
> docker load -i tugraph-runtime-centos7-4.5.1.tar
> ```
### TuGraph Configuration
Set the variables below in the `configs/dbgpt-graphrag.toml` file so that DB-GPT knows how to connect to TuGraph.
```toml
[rag.storage.graph]
type = "TuGraph"
host="127.0.0.1"
port=7687
username="admin"
password="73@TuGraph"
enable_summary="True"
enable_similarity_search="True"
```
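Optionally, confirm that the bolt port is reachable before starting DB-GPT (assuming `nc` from netcat is available on the host):
```bash
nc -zv 127.0.0.1 7687
```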
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-graphrag.toml
```
Optionally, you can also use the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-graphrag.toml
```
# Hive
In this example, we will show how to use Hive as a DB-GPT datasource. Using Hive as a datasource can, to some extent, alleviate the uncertainty and interpretability issues brought about by vector database retrieval.
### Install Dependencies
First, you need to install the `dbgpt hive datasource` library.
```bash
uv sync --all-packages \
--extra "base" \
--extra "datasource_hive" \
--extra "rag" \
--extra "storage_chromadb" \
--extra "dbgpts"
```
### Prepare Hive
Prepare a Hive database service; see [Hive Installation](https://cwiki.apache.org/confluence/display/Hive/GettingStarted).
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-proxy-openai.toml
```
### Hive Configuration
<p align="left">
<img src={'https://github.com/user-attachments/assets/40fb83c5-9b12-496f-8249-c331adceb76f'} width="1000px"/>
</p>
# Milvus RAG
In this example, we will show how to use Milvus as the vector storage for DB-GPT RAG.
### Install Dependencies
First, you need to install the `dbgpt milvus storage` library.
```bash
uv sync --all-packages \
--extra "base" \
--extra "proxy_openai" \
--extra "rag" \
--extra "storage_milvus" \
--extra "dbgpts"
```
### Prepare Milvus
Prepare a Milvus database service; see [Milvus Installation](https://milvus.io/docs/install_standalone-docker-compose.md).
### Milvus Configuration
Set the RAG storage variables below in the `configs/dbgpt-proxy-openai.toml` file so that DB-GPT knows how to connect to Milvus.
```toml
[rag.storage]
[rag.storage.vector]
type = "Milvus"
uri = "127.0.0.1"
port = "19530"
#username="dbgpt"
#password=19530
```
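For a standalone Milvus deployment, a quick health check can confirm the service is up before starting DB-GPT (the `9091` health port is an assumption based on a default standalone install):
```bash
curl http://127.0.0.1:9091/healthz
```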
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-proxy-openai.toml
```
# MSSQL
In this example, we will show how to use MSSQL as a DB-GPT datasource. Using MSSQL as a datasource can, to some extent, alleviate the uncertainty and interpretability issues brought about by vector database retrieval.
### Install Dependencies
First, you need to install the `dbgpt mssql datasource` library.
```bash
uv sync --all-packages \
--extra "base" \
--extra "datasource_mssql" \
--extra "rag" \
--extra "storage_chromadb" \
--extra "dbgpts"
```
### Prepare MSSQL
Prepare an MSSQL database service; see [MSSQL Installation](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver15).
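If you only need a test instance, the official SQL Server container image works on Linux hosts (a sketch; the SA password below is just an example and must satisfy SQL Server's complexity rules):
```bash
docker run -d --name mssql-dbgpt \
    -e "ACCEPT_EULA=Y" \
    -e "MSSQL_SA_PASSWORD=YourStrong!Passw0rd" \
    -p 1433:1433 \
    mcr.microsoft.com/mssql/server:2022-latest
```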
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-proxy-openai.toml
```
### MSSQL Configuration
<p align="left">
<img src={'https://github.com/user-attachments/assets/2798aaf7-b16f-453e-844a-6ad5dec1d58f'} width="1000px"/>
</p>
# OceanBase Vector RAG
In this example, we will show how to use OceanBase Vector as the vector storage for DB-GPT RAG.
### Install Dependencies
First, you need to install the `dbgpt Oceanbase Vector storage` library.
```bash
uv sync --all-packages \
--extra "base" \
--extra "proxy_openai" \
--extra "rag" \
--extra "storage_obvector" \
--extra "dbgpts"
```
### Prepare Oceanbase Vector
Prepare an OceanBase Vector database service; see [OceanBase](https://open.oceanbase.com/).
### OceanBase Vector Configuration
Set the RAG storage variables below in the `configs/dbgpt-proxy-openai.toml` file so that DB-GPT knows how to connect to OceanBase Vector.
```toml
[rag.storage]
[rag.storage.vector]
type = "Oceanbase"
uri = "127.0.0.1"
port = "19530"
#username="dbgpt"
#password=19530
```
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-proxy-openai.toml
```
# Postgres
Postgres is a powerful, open source object-relational database system. It is a multi-user database management system and has sophisticated features such as Multi-Version Concurrency Control (MVCC), point in time recovery, tablespaces, asynchronous replication, nested transactions (savepoints), online/hot backups, a sophisticated query planner/optimizer, and write ahead logging for fault tolerance.
In this example, we will show how to use Postgres as a DB-GPT datasource. Using Postgres as a datasource can, to some extent, alleviate the uncertainty and interpretability issues brought about by vector database retrieval.
### Install Dependencies
First, you need to install the `dbgpt postgres datasource` library.
```bash
uv sync --all-packages \
--extra "base" \
--extra "datasource_postgres" \
--extra "rag" \
--extra "storage_chromadb" \
--extra "dbgpts"
```
### Prepare Postgres
Prepare a Postgres database service; see [Postgres Installation](https://www.postgresql.org/download/).
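For local testing, the official PostgreSQL image is the quickest way to get a server running (a sketch; the credentials and database name are examples):
```bash
docker run -d --name postgres-dbgpt \
    -e POSTGRES_PASSWORD=example \
    -e POSTGRES_DB=dbgpt \
    -p 5432:5432 \
    postgres:16
```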
Then run the following command to start the webserver:
```bash
uv run python packages/dbgpt-app/src/dbgpt_app/dbgpt_server.py --config configs/dbgpt-proxy-openai.toml
```
### Postgres Configuration
<p align="left">
<img src={'https://github.com/user-attachments/assets/affa5ef2-09d6-404c-951e-1220a0dce235'} width="1000px"/>
</p>
# Cluster Deployment
## Install command line tools
All the following operations are completed through the `dbgpt` command. To use the `dbgpt` command, you first need to install the `DB-GPT` project. You can install it through the following command
```shell
$ pip install -e ".[default]"
```
It can also be used in script mode
```shell
$ python pilot/scripts/cli_scripts.py
```
## Start Model Controller
```shell
$ dbgpt start controller
```
By default, the `Model Controller` starts on port `8000`.
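If you need to query a specific controller instance later, the `dbgpt model` client accepts an `--address` option (see `dbgpt model --help` below). For example, after the workers in the following steps have registered:
```shell
dbgpt model list --address http://127.0.0.1:8000
```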
## Start Model Worker
:::tip
Start `glm-4-9b-chat` model Worker
:::
```shell
dbgpt start worker --model_name glm-4-9b-chat \
--model_path /app/models/glm-4-9b-chat \
--port 8001 \
--controller_addr http://127.0.0.1:8000
```
:::tip
Start `vicuna-13b-v1.5` model Worker
:::
```shell
dbgpt start worker --model_name vicuna-13b-v1.5 \
--model_path /app/models/vicuna-13b-v1.5 \
--port 8002 \
--controller_addr http://127.0.0.1:8000
```
:::info note
⚠️ Make sure to use your own model name and model path.
:::
## Start Embedding Model Worker
```shell
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000
```
:::info note
⚠️ Make sure to use your own model name and model path.
:::
## Start Reranking Model Worker
```shell
dbgpt start worker --worker_type text2vec \
--rerank \
--model_path /app/models/bge-reranker-base \
--model_name bge-reranker-base \
--port 8004 \
--controller_addr http://127.0.0.1:8000
```
:::info note
⚠️ Make sure to use your own model name and model path.
:::
:::tip
View and inspect deployed models
:::
```shell
$ dbgpt model list
+-------------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| Model Name | Model Type | Host | Port | Healthy | Enabled | Prompt Template | Last Heartbeat |
+-------------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| glm-4-9b-chat | llm | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.287654 |
| WorkerManager | service | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.286668 |
| WorkerManager | service | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.845617 |
| WorkerManager | service | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.598439 |
| WorkerManager | service | 172.21.0.5 | 8004 | True | True | | 2023-09-12T23:04:24.598439 |
| text2vec | text2vec | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.844796 |
| vicuna-13b-v1.5 | llm | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.597775 |
| bge-reranker-base | text2vec | 172.21.0.5 | 8004 | True | True | | 2024-05-15T11:36:12.935012 |
+-------------------+------------+------------+------+---------+---------+-----------------+----------------------------+
```
## Use model serving
The model services deployed above can be used through `dbgpt_server`. First, modify the `.env` configuration file to point to the model controller address:
```shell
LLM_MODEL=vicuna-13b-v1.5
# The current default MODEL_SERVER address is the address of the Model Controller
MODEL_SERVER=http://127.0.0.1:8000
```
## Start Webserver
```shell
dbgpt start webserver --light
```
`--light` means the webserver does not start an embedded model service.
Alternatively, the model can be specified directly on the command line:
```shell
LLM_MODEL=glm-4-9b-chat dbgpt start webserver --light --remote_embedding
```
## Command line usage
For more information about the use of the command line, you can view the command line help. The following is a reference example.
:::tip
View dbgpt help `dbgpt --help`
:::
```shell
dbgpt --help
Already connect 'dbgpt'
Usage: dbgpt [OPTIONS] COMMAND [ARGS]...
Options:
--log-level TEXT Log level
--version Show the version and exit.
--help Show this message and exit.
Commands:
install Install dependencies, plugins, etc.
knowledge Knowledge command line tool
model Clients that manage model serving
start Start specific server.
stop Stop specific server.
trace Analyze and visualize trace spans.
```
:::tip
Check the dbgpt start command `dbgpt start --help`
:::
```shell
dbgpt start --help
Already connect 'dbgpt'
Usage: dbgpt start [OPTIONS] COMMAND [ARGS]...
Start specific server.
Options:
--help Show this message and exit.
Commands:
apiserver Start apiserver
controller Start model controller
webserver Start webserver(dbgpt_server.py)
worker Start model worker
```
:::tip
View the dbgpt start model service help command `dbgpt start worker --help`
:::
```shell
dbgpt start worker --help
Already connect 'dbgpt'
Usage: dbgpt start worker [OPTIONS]
Start model worker
Options:
--model_name TEXT Model name [required]
--model_path TEXT Model path [required]
--worker_type TEXT Worker type
--worker_class TEXT Model worker class,
pilot.model.cluster.DefaultModelWorker
--model_type TEXT Model type: huggingface, llama.cpp, proxy
and vllm [default: huggingface]
--host TEXT Model worker deploy host [default: 0.0.0.0]
--port INTEGER Model worker deploy port [default: 8001]
--daemon Run Model Worker in background
--limit_model_concurrency INTEGER
Model concurrency limit [default: 5]
--standalone Standalone mode. If True, embedded Run
ModelController
--register Register current worker to model controller
[default: True]
--worker_register_host TEXT The ip address of current worker to register
to ModelController. If None, the address is
automatically determined
--controller_addr TEXT The Model controller address to register
--send_heartbeat Send heartbeat to model controller
[default: True]
--heartbeat_interval INTEGER The interval for sending heartbeats
(seconds) [default: 20]
--log_level TEXT Logging level
--log_file TEXT The filename to store log [default:
dbgpt_model_worker_manager.log]
--tracer_file TEXT The filename to store tracer span records
[default:
dbgpt_model_worker_manager_tracer.jsonl]
--tracer_storage_cls TEXT The storage class to storage tracer span
records
--device TEXT Device to run model. If None, the device is
automatically determined
--prompt_template TEXT Prompt template. If None, the prompt
template is automatically determined from
model path, supported template: zero_shot,vi
cuna_v1.1,llama-2,codellama,alpaca,baichuan-
chat,internlm-chat
--max_context_size INTEGER Maximum context size [default: 4096]
--num_gpus INTEGER The number of gpus you expect to use, if it
is empty, use all of them as much as
possible
--max_gpu_memory TEXT The maximum memory limit of each GPU, only
valid in multi-GPU configuration
--cpu_offloading CPU offloading
--load_8bit 8-bit quantization
--load_4bit 4-bit quantization
--quant_type TEXT Quantization datatypes, `fp4` (four bit
float) and `nf4` (normal four bit float),
only valid when load_4bit=True [default:
nf4]
--use_double_quant Nested quantization, only valid when
load_4bit=True [default: True]
--compute_dtype TEXT Model compute type
--trust_remote_code Trust remote code [default: True]
--verbose Show verbose output.
--help Show this message and exit.
```
:::tip
View dbgpt model service related commands `dbgpt model --help`
:::
```shell
dbgpt model --help
Already connect 'dbgpt'
Usage: dbgpt model [OPTIONS] COMMAND [ARGS]...
Clients that manage model serving
Options:
--address TEXT Address of the Model Controller to connect to. Just support
light deploy model, If the environment variable
CONTROLLER_ADDRESS is configured, read from the environment
variable
--help Show this message and exit.
Commands:
chat Interact with your bot from the command line
list List model instances
restart Restart model instances
start Start model instances
stop Stop model instances
```
# High Availability
## Architecture
Here is the architecture of the high availability cluster, more details can be found in
the [cluster deployment](./cluster.md) mode and [SMMF](../../modules/smmf.md) module.
<p align="center">
<img src={'/img/module/smmf.png'} width="600px" />
</p>
The model worker and API server can be deployed on different machines, and the model
worker and API server can be deployed with multiple instances.
But the model controller has only one instance by default, because it is a stateful
service and stores all metadata of the model service, specifically, all metadata are
stored in the component named **Model Registry**.
The default model registry is `EmbeddedModelRegistry`, which is a simple in-memory component.
To support high availability, we can use `StorageModelRegistry` as the model registry,
it can use a database as the storage backend, such as MySQL, SQLite, etc.
So we can deploy the model controller with multiple instances, and they can share the metadata by connecting to the same database.
Now let's see how to deploy the high availability cluster.
## Deploy High Availability Cluster
For simplicity, we will deploy two model controllers on two machines(`server1` and `server2`),
and deploy a model worker, an embedding model worker, and a web server on another machine(`server3`).
(Of course, you can deploy all of them on the same machine with different ports.)
### Prepare A MySQL Database
1. Install MySQL, create a database and a user for the model controller.
2. Create a table for the model controller, you can use the following SQL script to create the table.
```sql
-- For deploy model cluster of DB-GPT(StorageModelRegistry)
CREATE TABLE IF NOT EXISTS `dbgpt_cluster_registry_instance` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Auto increment id',
`model_name` varchar(128) NOT NULL COMMENT 'Model name',
`host` varchar(128) NOT NULL COMMENT 'Host of the model',
`port` int(11) NOT NULL COMMENT 'Port of the model',
`weight` float DEFAULT 1.0 COMMENT 'Weight of the model',
`check_healthy` tinyint(1) DEFAULT 1 COMMENT 'Whether to check the health of the model',
`healthy` tinyint(1) DEFAULT 0 COMMENT 'Whether the model is healthy',
`enabled` tinyint(1) DEFAULT 1 COMMENT 'Whether the model is enabled',
`prompt_template` varchar(128) DEFAULT NULL COMMENT 'Prompt template for the model instance',
`last_heartbeat` datetime DEFAULT NULL COMMENT 'Last heartbeat time of the model instance',
`user_name` varchar(128) DEFAULT NULL COMMENT 'User name',
`sys_code` varchar(128) DEFAULT NULL COMMENT 'System code',
`gmt_created` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'Record creation time',
`gmt_modified` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Record update time',
PRIMARY KEY (`id`),
UNIQUE KEY `uk_model_instance` (`model_name`, `host`, `port`, `sys_code`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COMMENT='Cluster model instance table, for registering and managing model instances';
```
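For step 1, a minimal sketch of creating the database and a dedicated user might look like the following (the user name and password are examples only; the controller commands below connect as `root` with password `aa123456`):
```sql
-- Example only: create the registry database and a dedicated user
CREATE DATABASE IF NOT EXISTS dbgpt DEFAULT CHARACTER SET utf8mb4;
CREATE USER IF NOT EXISTS 'dbgpt'@'%' IDENTIFIED BY 'aa123456';
GRANT ALL PRIVILEGES ON dbgpt.* TO 'dbgpt'@'%';
FLUSH PRIVILEGES;
```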
### Start Model Controller With Storage Model Registry
We need to start the model controllers on two machines(`server1` and `server2`), and
they will share the metadata by connecting to the same database.
1. Start the model controller on `server1`:
```bash
dbgpt start controller \
--port 8000 \
--registry_type database \
--registry_db_type mysql \
--registry_db_name dbgpt \
--registry_db_host 127.0.0.1 \
--registry_db_port 3306 \
--registry_db_user root \
--registry_db_password aa123456
```
2. Start the model controller on `server2`:
```bash
dbgpt start controller \
--port 8000 \
--registry_type database \
--registry_db_type mysql \
--registry_db_name dbgpt \
--registry_db_host 127.0.0.1 \
--registry_db_port 3306 \
--registry_db_user root \
--registry_db_password aa123456
```
Note: please modify the parameters according to your actual situation.
### Start Model Worker
:::tip
Start `glm-4-9b-chat` model Worker
:::
```shell
dbgpt start worker --model_name glm-4-9b-chat \
--model_path /app/models/glm-4-9b-chat \
--port 8001 \
--controller_addr "http://server1:8000,http://server2:8000"
```
Here we use `server1` and `server2` as the controller address, so the model worker can
register to any healthy controller.
### Start Embedding Model Worker
```shell
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr "http://server1:8000,http://server2:8000"
```
:::info note
⚠️ Make sure to use your own model name and model path.
:::
### Deploy Web Server
```shell
LLM_MODEL=glm-4-9b-chat EMBEDDING_MODEL=text2vec \
dbgpt start webserver \
--light \
--remote_embedding \
--controller_addr "http://server1:8000,http://server2:8000"
```
### Show Your Model Instances
```bash
CONTROLLER_ADDRESS="http://server1:8000,http://server2:8000" dbgpt model list
```
Congratulations! You have successfully deployed a high availability cluster of DB-GPT.
## Deploy High Availability Cluster With Docker Compose
If you want to know more about deploying a high availability DB-GPT cluster, you can see
the Docker Compose example in `docker/compose_examples/ha-cluster-docker-compose.yml`.
It uses an OpenAI LLM and an OpenAI embedding model, so you can run it directly.
Here we will show you how to deploy a high availability cluster of DB-GPT with docker compose.
First, build a Docker image that includes only the OpenAI (proxy) dependencies:
```bash
bash ./docker/base/build_proxy_image.sh --pip-index-url https://pypi.tuna.tsinghua.edu.cn/simple
```
Then, run the following command to start the high availability cluster:
```bash
OPENAI_API_KEY="{your api key}" OPENAI_API_BASE="https://api.openai.com/v1" \
docker compose -f ha-cluster-docker-compose.yml up -d
```
## QA
### Will more model registry types be supported in the future?
Yes. We will support more model registry types in the future, such as `etcd`, `consul`, etc.
### How to deploy the high availability cluster with Kubernetes?
We will provide a Helm chart to deploy the high availability cluster with Kubernetes in the future.
# Stand-alone Deployment
## Preparation
```bash
# download source code
git clone https://github.com/eosphoros-ai/DB-GPT.git
cd DB-GPT
```
## Environment installation
```bash
# create a virtual environment
conda create -n dbgpt_env python=3.10
# activate virtual environment
conda activate dbgpt_env
```
## Install dependencies
```bash
pip install -e ".[default]"
```
## Model download
Download LLM and Embedding model
:::info note
⚠️ If there are no GPU resources, it is recommended to use the proxy model, such as OpenAI, Qwen, ERNIE Bot, etc.
:::
```bash
mkdir models && cd models
# download embedding model, eg: text2vec-large-chinese
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
```
:::tip
Set up the proxy API and modify the `.env` configuration
:::
```bash
#set LLM_MODEL TYPE
LLM_MODEL=proxyllm
#set your Proxy Api key and Proxy Server url
PROXY_API_KEY={your-openai-sk}
PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions
```
:::info note
⚠️ If you have GPU resources, you can deploy with local models
:::
```bash
mkdir models && cd models
# download LLM, eg: glm-4-9b-chat
git clone https://huggingface.co/THUDM/glm-4-9b-chat
# download embedding model, eg: text2vec-large-chinese
git clone https://huggingface.co/GanymedeNil/text2vec-large-chinese
cd ..
```
## Command line startup
```bash
LLM_MODEL=glm-4-9b-chat dbgpt start webserver --port 6006
```
By default, the `dbgpt start webserver` command will start the `webserver`, `model controller`, and `model worker` in a single Python process. In the above command, port `6006` is specified.
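Once it starts, a simple request can confirm that the webserver is listening on the chosen port (any HTTP response means it is up):
```bash
curl -I http://127.0.0.1:6006
```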
## View and verify model serving
:::tip
view and display all model services
:::
```bash
dbgpt model list
```
```bash
# result
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| Model Name | Model Type | Host | Port | Healthy | Enabled | Prompt Template | Last Heartbeat |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| glm-4-9b-chat | llm | 172.17.0.9 | 6006 | True | True | | 2023-10-16T19:49:59.201313 |
| WorkerManager | service | 172.17.0.9 | 6006 | True | True | | 2023-10-16T19:49:59.246756 |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
```
Here, `WorkerManager` is the management process of the `Model Workers`.
:::tip
check and verify model serving
:::
```bash
dbgpt model chat --model_name glm-4-9b-chat
```
The above command will launch an interactive page that allows you to talk to the model through the terminal.
```bash
Chatbot started with model glm-4-9b-chat. Type 'exit' to leave the chat.
You: Hello
Bot: Hello! How can I assist you today?
You:
```