Unverified commit d1d49397, authored by Alberto Ferrer, committed by GitHub

Update bnb.md with example for OpenAI (#11718)

parent 9c93636d
@@ -37,3 +37,10 @@ model_id = "huggyllama/llama-7b"
llm = LLM(model=model_id, dtype=torch.bfloat16, trust_remote_code=True, \
quantization="bitsandbytes", load_format="bitsandbytes")
```
## OpenAI Compatible Server
Append the following to your 4-bit model arguments:
```
--quantization bitsandbytes --load-format bitsandbytes
```
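As a rough illustration of where these flags go, the sketch below assembles a full launch command for the server in Python without running it. It assumes the server is started through the `vllm.entrypoints.openai.api_server` module and reuses `huggyllama/llama-7b` from the context above as a placeholder model; `build_server_command` is a hypothetical helper, not part of vLLM.

```python
import shlex
import sys
from typing import List, Sequence


def build_server_command(model_id: str, extra_args: Sequence[str] = ()) -> List[str]:
    """Assemble an argv list for the OpenAI-compatible server.

    Hypothetical helper for illustration only: it builds the command
    and does not start the server.
    """
    cmd = [
        sys.executable, "-m", "vllm.entrypoints.openai.api_server",
        "--model", model_id,
        # The two flags this section asks you to append:
        "--quantization", "bitsandbytes",
        "--load-format", "bitsandbytes",
    ]
    return cmd + list(extra_args)


# Placeholder model ID (assumption); substitute your own model.
command = build_server_command("huggyllama/llama-7b")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command)` on a machine where vLLM and bitsandbytes are installed; any other server arguments you already use can go in `extra_args`.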