Commit 5a5f1843 authored by Chayenne, committed by GitHub

Fix docs ci (#1888)

parent 7b394e5f
...@@ -4,18 +4,19 @@ ...@@ -4,18 +4,19 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Native API\n", "# Native APIs\n",
"\n", "\n",
"Apart from the OpenAI compatible API, the SGLang Runtime also provides its native server API. We introduce these following API:\n", "Apart from the OpenAI compatible APIs, the SGLang Runtime also provides its native server APIs. We introduce these following APIs:\n",
"\n", "\n",
"- `/generate`\n", "- `/generate`\n",
"- `/update_weights`\n",
"- `/get_server_args`\n", "- `/get_server_args`\n",
"- `/get_model_info`\n", "- `/get_model_info`\n",
"- `/health`\n", "- `/health`\n",
"- `/health_generate`\n", "- `/health_generate`\n",
"- `/flush_cache`\n", "- `/flush_cache`\n",
"- `/get_memory_pool_size`\n", "- `/get_memory_pool_size`\n",
"- `/update_weights`\n",
"- `/encode`\n",
"\n", "\n",
"We mainly use `requests` to test these APIs in the following examples. You can also use `curl`." "We mainly use `requests` to test these APIs in the following examples. You can also use `curl`."
] ]
...@@ -68,7 +69,7 @@ ...@@ -68,7 +69,7 @@
"import requests\n", "import requests\n",
"\n", "\n",
"url = \"http://localhost:30010/generate\"\n", "url = \"http://localhost:30010/generate\"\n",
"data = {\"text\": \"List 3 countries and their capitals.\"}\n", "data = {\"text\": \"What is the capital of France?\"}\n",
"\n", "\n",
"response = requests.post(url, json=data)\n", "response = requests.post(url, json=data)\n",
"print_highlight(response.text)" "print_highlight(response.text)"
...@@ -78,7 +79,7 @@ ...@@ -78,7 +79,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Get Server Args\n", "## Get Server Args\n",
"\n", "\n",
"Used to get the serving args when the server is launched." "Used to get the serving args when the server is launched."
] ]
...@@ -252,13 +253,57 @@ ...@@ -252,13 +253,57 @@
")" ")"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Encode\n",
"\n",
"Used to encode text into embeddings. Note that this API is only available for [embedding models](./openai_embedding_api.ipynb) and will raise an error for generation models.\n",
"Therefore, we launch a new server to serve an embedding model.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"terminate_process(server_process)\n",
"\n",
"embedding_process = execute_shell_command(\n",
" \"\"\"\n",
"python -m sglang.launch_server --model-path Alibaba-NLP/gte-Qwen2-7B-instruct \\\n",
" --port 30020 --host 0.0.0.0 --is-embedding\n",
"\"\"\"\n",
")\n",
"\n",
"wait_for_server(\"http://localhost:30020\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# successful encode for embedding model\n",
"\n",
"url = \"http://localhost:30020/encode\"\n",
"data = {\"model\": \"Alibaba-NLP/gte-Qwen2-7B-instruct\", \"text\": \"Once upon a time\"}\n",
"\n",
"response = requests.post(url, json=data)\n",
"response_json = response.json()\n",
"print_highlight(f\"Text embedding (first 10): {response_json['embedding'][:10]}\")"
]
},
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 22, "execution_count": 43,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"terminate_process(server_process)" "terminate_process(embedding_process)"
] ]
} }
], ],
......
...@@ -201,7 +201,7 @@ ...@@ -201,7 +201,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": 6,
"metadata": { "metadata": {
"execution": { "execution": {
"iopub.execute_input": "2024-11-01T02:48:01.875204Z", "iopub.execute_input": "2024-11-01T02:48:01.875204Z",
......
...@@ -81,7 +81,7 @@ ...@@ -81,7 +81,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Using Requests" "## Using Python Requests"
] ]
}, },
{ {
......
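The `/encode` cells added in this commit can be condensed into a small standalone sketch. The port (30020), the model name, and the `embedding` response key are taken from the diff above; the helper names `build_encode_request`, `embedding_preview`, and `encode_text` are hypothetical and exist only for illustration. It assumes an embedding server has already been launched with `--is-embedding`.

```python
def build_encode_request(base_url, model, text):
    """Return the (url, payload) pair for a POST to the /encode endpoint."""
    # Hypothetical helper: mirrors the payload shape used in the notebook cell.
    return f"{base_url.rstrip('/')}/encode", {"model": model, "text": text}


def embedding_preview(response_json, n=10):
    """Return the first n values of the embedding, as the notebook prints them."""
    return response_json["embedding"][:n]


def encode_text(base_url, model, text, timeout=30.0):
    """Send the text to a running embedding server and return the full embedding."""
    # Imported lazily so the two pure helpers above work without the dependency.
    import requests

    url, payload = build_encode_request(base_url, model, text)
    response = requests.post(url, json=payload, timeout=timeout)
    # Assumption: a generation model answers /encode with an HTTP error status,
    # which raise_for_status() turns into an exception.
    response.raise_for_status()
    return response.json()["embedding"]


if __name__ == "__main__":
    embedding = encode_text(
        "http://localhost:30020",
        "Alibaba-NLP/gte-Qwen2-7B-instruct",
        "Once upon a time",
    )
    print(embedding_preview({"embedding": embedding}))
```

Keeping the payload construction separate from the network call makes it possible to check the request shape without a live server.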