[Doc] Update examples to remove SparseAutoModelForCausalLM (#12062)

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

Commit 3f9b7ab9 (unverified) · authored Jan 15, 2025 by Kyle Sayers · committed by GitHub, Jan 15, 2025

Showing with 8 additions and 10 deletions

docs/source/features/quantization/fp8.md +5 -6

docs/source/features/quantization/int8.md +3 -4

--- a/docs/source/features/quantization/fp8.md
+++ b/docs/source/features/quantization/fp8.md
@@ -54,16 +54,15 @@ The quantization process involves three main steps:

 ### 1. Loading the Model

-Use `SparseAutoModelForCausalLM`, which wraps `AutoModelForCausalLM`, for saving and loading quantized models:
+Load your model and tokenizer using the standard `transformers` AutoModel classes:

 ```python
-from llmcompressor.transformers import SparseAutoModelForCausalLM
-from transformers import AutoTokenizer
+from transformers import AutoTokenizer, AutoModelForCausalLM

 MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
-
-model = SparseAutoModelForCausalLM.from_pretrained(
-  MODEL_ID, device_map="auto", torch_dtype="auto")
+model = AutoModelForCausalLM.from_pretrained(
+    MODEL_ID, device_map="auto", torch_dtype="auto",
+)
 tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
 ```


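Scripts written against the old examples can be updated with the same two substitutions this commit makes in the docs: swap the import, then rename the class. The following is a minimal sketch of that rewrite, not part of this commit. `migrate_source`, `OLD_CLASS`, and `NEW_CLASS` are hypothetical names introduced for illustration, and the sketch assumes the old class is imported alone on its own line, as it is in the removed example.

```python
import re

# Class names taken from the diff; the constant names are illustrative.
OLD_CLASS = "SparseAutoModelForCausalLM"
NEW_CLASS = "AutoModelForCausalLM"


def migrate_source(source: str) -> str:
    """Rewrite a script's text to load models via AutoModelForCausalLM.

    Hypothetical helper. It only handles the import form shown in the
    removed docs example; an import line that lists several names from
    llmcompressor.transformers needs to be edited by hand.
    """
    # The old class came from llmcompressor.transformers, while the
    # replacement lives in transformers, so swap the import line first.
    source = re.sub(
        r"^from llmcompressor\.transformers import SparseAutoModelForCausalLM$",
        "from transformers import AutoModelForCausalLM",
        source,
        flags=re.MULTILINE,
    )
    # Then rename every remaining use of the class.
    return source.replace(OLD_CLASS, NEW_CLASS)
```

Passing the text of the removed fp8.md example through `migrate_source` should yield an import from `transformers` and a call to `AutoModelForCausalLM.from_pretrained`. It leaves a separate `from transformers import AutoTokenizer` line in place rather than merging the two imports as the commit does.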
--- a/docs/source/features/quantization/int8.md
+++ b/docs/source/features/quantization/int8.md
@@ -30,14 +30,13 @@ The quantization process involves four main steps:

 ### 1. Loading the Model

-Use `SparseAutoModelForCausalLM`, which wraps `AutoModelForCausalLM`, for saving and loading quantized models:
+Load your model and tokenizer using the standard `transformers` AutoModel classes:

 ```python
-from llmcompressor.transformers import SparseAutoModelForCausalLM
-from transformers import AutoTokenizer
+from transformers import AutoTokenizer, AutoModelForCausalLM

 MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
-model = SparseAutoModelForCausalLM.from_pretrained(
+model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto",
 )
 tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)