Commit 2d23b42d, authored by youkaichao, committed by GitHub

[doc] update pipeline parallel in readme (#6347)

parent 1df43de9
@@ -56,7 +56,7 @@ vLLM is flexible and easy to use with:
 - Seamless integration with popular Hugging Face models
 - High-throughput serving with various decoding algorithms, including *parallel sampling*, *beam search*, and more
-- Tensor parallelism support for distributed inference
+- Tensor parallelism and pipeline parallelism support for distributed inference
 - Streaming outputs
 - OpenAI-compatible API server
 - Support NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs and GPUs, PowerPC CPUs
......
@@ -38,7 +38,7 @@ vLLM is flexible and easy to use with:
 * Seamless integration with popular HuggingFace models
 * High-throughput serving with various decoding algorithms, including *parallel sampling*, *beam search*, and more
-* Tensor parallelism support for distributed inference
+* Tensor parallelism and pipeline parallelism support for distributed inference
 * Streaming outputs
 * OpenAI-compatible API server
 * Support NVIDIA GPUs and AMD GPUs
......