This document describes how to set up an AMD-based environment for [SGLang](https://github.com/sgl-project/sglang). If you encounter issues or have questions, please [open an issue](https://github.com/sgl-project/sglang/issues) on the SGLang repository.
## System Configuration
When using AMD GPUs (such as MI300X), certain system-level optimizations help ensure stable performance. Here we take MI300X as an example. AMD provides official documentation for MI300X optimization and system tuning:
...
@@ -13,9 +11,9 @@ When using AMD GPUs (such as MI300X), certain system-level optimizations help en
- [AMD Instinct MI300X System Optimization](https://rocm.docs.amd.com/en/latest/how-to/system-optimization/mi300x.html)
4. To verify that the server works, run a benchmark in another terminal, or refer to [other docs](https://docs.sglang.ai/backend/openai_api_completions.html) to send requests to the engine.
```bash
drun sglang_image \
    python3 -m sglang.bench_serving \
    --backend sglang \
    --dataset-name random \
    --num-prompts 4000 \
    --random-input 128 \
    --random-output 128
```
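As an alternative to the benchmark, a single request can confirm that the engine responds. The sketch below assumes the server is reachable on `localhost:30000` (the port published with `drun -p 30000:30000`) and that it serves the OpenAI-compatible `/v1/completions` route described in the linked docs. The helper names `build_payload` and `send_completion`, the `SGLANG_URL` variable, and the model name `default` are illustrative placeholders, not part of SGLang.

```bash
#!/usr/bin/env bash
# Assumption: the server listens on localhost:30000; override with SGLANG_URL.
SGLANG_URL="${SGLANG_URL:-http://localhost:30000}"

# Build a minimal completion request body.
# "default" is a placeholder model name; replace it with the model you serve.
# The prompt is inserted verbatim, so it must not contain double quotes.
build_payload() {
  local prompt="$1" max_tokens="${2:-32}"
  printf '{"model": "default", "prompt": "%s", "max_tokens": %d}' \
    "$prompt" "$max_tokens"
}

# Send the request; --fail makes curl exit non-zero on an HTTP error status.
send_completion() {
  curl --silent --show-error --fail \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1" "${2:-32}")" \
    "${SGLANG_URL}/v1/completions"
}
```

After sourcing the script, `send_completion "The capital of France is" 16` should print a JSON response if the server is up.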
With your AMD system properly configured and SGLang installed, you can now fully leverage AMD hardware to power SGLang’s machine learning capabilities.
...
@@ -108,7 +105,7 @@ With your AMD system properly configured and SGLang installed, you can now fully
### Running DeepSeek-V3
The only difference when running DeepSeek-V3 is in how you start the server. Here's an example command:
```bash
drun -p 30000:30000 \
...
@@ -128,7 +125,7 @@ drun -p 30000:30000 \
### Running Llama3.1
Running Llama3.1 is nearly identical to running DeepSeek-V3. The only difference is the model specified when starting the server, as in the following example command:
```bash
drun -p 30000:30000 \
...
@@ -146,4 +143,4 @@ drun -p 30000:30000 \
### Warmup Step
When the server displays `The server is fired up and ready to roll!`, startup has succeeded.
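When the server is started from a script, the ready message can be detected automatically instead of by watching the terminal. The sketch below assumes the server's output has been redirected to a log file. The file name `server.log` and the helper `wait_for_ready` are hypothetical, and the default timeout of 600 seconds is an arbitrary choice.

```bash
#!/usr/bin/env bash
# Message the server prints once startup has finished.
READY_MSG="The server is fired up and ready to roll!"

# Poll a log file once per second until the ready message appears.
# Returns 0 when the message is found, 1 if the timeout (seconds) expires first.
wait_for_ready() {
  local log_file="$1" timeout="${2:-600}" waited=0
  # -F matches the message as a fixed string, -q suppresses output.
  until grep -qF "$READY_MSG" "$log_file" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

For example, `wait_for_ready server.log 900 && echo "ready"` blocks for up to 15 minutes and prints `ready` once the message appears in the log.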