Note that `device_map` is optional but setting `device_map = 'auto'` is prefered for inference as it will dispatch efficiently the model on the available ressources.
### Running FP4 models - multi GPU setup
The way to load your mixed 8-bit model in multiple GPUs is as follows (same command as single GPU setup):
The way to load your mixed 4-bit model in multiple GPUs is as follows (same command as single GPU setup):