Commit cd00c443 authored by Zhiyu, committed by GitHub

[Misc] Rename TensorRT Model Optimizer to Model Optimizer (#30091)


Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
parent d1432712
@@ -14,7 +14,7 @@ Contents:
 - [INT4 W4A16](int4.md)
 - [INT8 W8A8](int8.md)
 - [FP8 W8A8](fp8.md)
-- [NVIDIA TensorRT Model Optimizer](modelopt.md)
+- [NVIDIA Model Optimizer](modelopt.md)
 - [AMD Quark](quark.md)
 - [Quantized KV Cache](quantized_kvcache.md)
 - [TorchAO](torchao.md)
......
-# NVIDIA TensorRT Model Optimizer
+# NVIDIA Model Optimizer
-The [NVIDIA TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer) is a library designed to optimize models for inference with NVIDIA GPUs. It includes tools for Post-Training Quantization (PTQ) and Quantization Aware Training (QAT) of Large Language Models (LLMs), Vision Language Models (VLMs), and diffusion models.
+The [NVIDIA Model Optimizer](https://github.com/NVIDIA/Model-Optimizer) is a library designed to optimize models for inference with NVIDIA GPUs. It includes tools for Post-Training Quantization (PTQ) and Quantization Aware Training (QAT) of Large Language Models (LLMs), Vision Language Models (VLMs), and diffusion models.
 We recommend installing the library with:
@@ -10,7 +10,7 @@ pip install nvidia-modelopt
 ## Quantizing HuggingFace Models with PTQ
-You can quantize HuggingFace models using the example scripts provided in the TensorRT Model Optimizer repository. The primary script for LLM PTQ is typically found within the `examples/llm_ptq` directory.
+You can quantize HuggingFace models using the example scripts provided in the Model Optimizer repository. The primary script for LLM PTQ is typically found within the `examples/llm_ptq` directory.
 Below is an example showing how to quantize a model using modelopt's PTQ API:
......
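A rename like this one is easy to leave half-finished, so a reviewer may want to scan the docs for leftover references to the old name. The following is a minimal sketch of such a check, not part of this commit: the old and new strings are copied from the diff, while the helper names `apply_renames` and `find_stale_lines` are hypothetical and chosen only for illustration.

```python
from typing import List

# Old/new pairs copied from the diff. The URL pair comes first so the
# repository link is rewritten as a whole before the plain-text name.
RENAMES = [
    ("https://github.com/NVIDIA/TensorRT-Model-Optimizer",
     "https://github.com/NVIDIA/Model-Optimizer"),
    ("TensorRT Model Optimizer", "Model Optimizer"),
]


def apply_renames(text: str) -> str:
    """Return text with every old name or URL replaced by its new form."""
    for old, new in RENAMES:
        text = text.replace(old, new)
    return text


def find_stale_lines(text: str) -> List[int]:
    """Return 1-based line numbers that still mention an old name or URL."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if any(old in line for old, _ in RENAMES)
    ]
```

Running `find_stale_lines` over each Markdown file in the docs tree and failing when the result is non-empty would be one way to guard against regressions. The check is a plain substring match, so it would not catch variants such as different capitalization.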