Update README.md (#6847)

Unverified commit b5f49ee5, authored Jul 26, 2024 by Gurpreet Singh Dhami; committed by GitHub Jul 27, 2024

Showing with 1 addition and 1 deletion

examples/fp8/quantizer/README.md +1 -1
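The one-line change in this file swaps underscores for hyphens in the long option names of the example command. As a rough sketch of how hyphenated flags usually behave, assuming `quantize.py` parses its arguments with Python's `argparse` (the diff does not show this), the parser below is hypothetical and only mirrors the flag names from the updated command. With `argparse`, hyphens in a long option are converted to underscores when the attribute name is derived, so `--model-dir` is read back as `args.model_dir`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: NOT the real quantize.py, only the flag names
    # that appear in the updated README command.
    parser = argparse.ArgumentParser(description="illustrative sketch only")
    parser.add_argument("--model-dir", type=str)
    parser.add_argument("--dtype", type=str)
    parser.add_argument("--qformat", type=str)
    parser.add_argument("--kv-cache-dtype", type=str)
    parser.add_argument("--output-dir", type=str)
    parser.add_argument("--calib-size", type=int)
    parser.add_argument("--tp-size", type=int)
    return parser


# Same arguments as the updated README command, passed as an explicit list.
args = build_parser().parse_args([
    "--model-dir", "./ll2-7b",
    "--dtype", "float16",
    "--qformat", "fp8",
    "--kv-cache-dtype", "fp8",
    "--output-dir", "./ll2_7b_fp8",
    "--calib-size", "512",
    "--tp-size", "1",
])

# Hyphenated flags are exposed as underscore-named attributes.
print(args.model_dir, args.kv_cache_dtype, args.output_dir, args.calib_size, args.tp_size)
```

Under that assumption, code that reads `args.model_dir` keeps working after a rename from `--model_dir` to `--model-dir`; only the spelling typed on the command line changes.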

--- a/examples/fp8/quantizer/README.md
+++ b/examples/fp8/quantizer/README.md
@@ -16,7 +16,7 @@
 #### Run on H100 system for speed if FP8; number of GPUs depends on the model size

 #### Example: quantize Llama2-7b model from HF to FP8 with FP8 KV Cache:
-`python quantize.py --model_dir ./ll2-7b --dtype float16 --qformat fp8 --kv_cache_dtype fp8 --output_dir ./ll2_7b_fp8 --calib_size 512 --tp_size 1`
+`python quantize.py --model-dir ./ll2-7b --dtype float16 --qformat fp8 --kv-cache-dtype fp8 --output-dir ./ll2_7b_fp8 --calib-size 512 --tp-size 1`

 Outputs: model structure, quantized model & parameters (with scaling factors) are in JSON and Safetensors (npz is generated only for the reference)
 ```