--- title: "Unsloth" description: "Hyper-optimized QLoRA finetuning for single GPUs" --- ### Overview Unsloth provides hand-written optimized kernels for LLM finetuning that slightly improve speed and VRAM over standard industry baselines. ::: {.callout-important} Due to breaking changes in transformers `v4.48.0`, users will need to downgrade to `<=v4.47.1` to use this patch. This will later be deprecated in favor of [LoRA Optimizations](lora_optims.qmd). ::: ### Installation The following will install the correct unsloth and extras from source. ```bash python scripts/unsloth_install.py | sh ``` ### Usage Axolotl exposes a few configuration options to try out unsloth and get most of the performance gains. Our unsloth integration is currently limited to the following model architectures: - llama These options are specific to LoRA finetuning and cannot be used for multi-GPU finetuning ```yaml unsloth_lora_mlp: true unsloth_lora_qkv: true unsloth_lora_o: true ``` These options are composable and can be used with multi-gpu finetuning ```yaml unsloth_cross_entropy_loss: true unsloth_rms_norm: true unsloth_rope: true ``` ### Limitations - Single GPU only; e.g. no multi-gpu support - No deepspeed or FSDP support (requires multi-gpu) - LoRA + QLoRA support only. No full fine tunes or fp8 support. - Limited model architecture support. Llama, Phi, Gemma, Mistral only - No MoE support.