    4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479) · 9d73b922
    Tim Dettmers authored
    
    
    * Added lion and paged optimizers and made original tests pass.
    
    * Added tests for paged and lion optimizers.
    
    * Added and fixed optimizer tests.
    
    * Style and quality checks.
    
    * Initial draft. Some tests fail.
    
    * Fixed dtype bug.
    
    * Fixed bug caused by torch_dtype='auto'.
    
    * All tests green for 8-bit and 4-bit layers.
    
    * Added fix for fp32 layer norms and bf16 compute in LLaMA.
    
    * Fixing issues for PR #23479.
    
    * Reverted variable name change.
    
    * Added missing tests.
    
    * Fixup changes.
    
    * Added fixup changes.
    
    * Missed some variables to rename.
    
    * revert trainer tests
    
    * revert test trainer
    
    * another revert
    
    * fix tests and safety checkers
    
    * protect import
    
    * simplify a bit
    
    * Update src/transformers/trainer.py
    
    * few fixes
    
    * add warning
    
    * replace with `load_in_kbit = load_in_4bit or load_in_8bit`
    
    * fix test
    
    * fix tests
    
    * this time fix tests
    
    * safety checker
    
    * add docs
    
    * revert torch_dtype
    
    * Apply suggestions from code review
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    
    * multiple fixes
    
    * update docs
    
    * version checks and multiple fixes
    
    * replace `is_loaded_in_kbit`
    
    * replace `load_in_kbit`
    
    * change method names
    
    * better checks
    
    * oops
    
    * oops
    
    * address final comments
    
    ---------
    Co-authored-by: younesbelkada <younesbelkada@gmail.com>
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
perf_infer_gpu_one.mdx 8.67 KB