# 4-bit quantization

[QLoRA](https://hf.co/papers/2305.14314) is a finetuning method that quantizes a model to 4-bit and adds a set of low-rank adaptation (LoRA) weights, which are tuned by backpropagating gradients through the frozen quantized weights. This method also introduces a new data type, 4-bit NormalFloat (`LinearNF4`), in addition to the standard Float4 data type (`LinearFP4`). `LinearNF4` is a quantization data type for normally distributed data and can improve performance.
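
A minimal sketch of how a [`Linear4bit`] layer can stand in for a standard `torch.nn.Linear` layer; the layer sizes and dtypes here are illustrative. Passing `quant_type="nf4"` selects the NF4 data type, while `quant_type="fp4"` selects Float4, and the weights are quantized when the module is moved to a CUDA device.

```py
import torch
import bitsandbytes as bnb

# A full-precision linear layer to quantize (sizes are illustrative).
fp16_layer = torch.nn.Linear(64, 64)

# quant_type="nf4" uses the 4-bit NormalFloat data type;
# quant_type="fp4" uses the standard Float4 data type instead.
quantized_layer = bnb.nn.Linear4bit(
    fp16_layer.in_features,
    fp16_layer.out_features,
    bias=True,
    compute_dtype=torch.float16,
    quant_type="nf4",
)
quantized_layer.load_state_dict(fp16_layer.state_dict())

# Quantization happens when the module is moved to the GPU.
quantized_layer = quantized_layer.to("cuda")

x = torch.randn(1, 64, dtype=torch.float16, device="cuda")
out = quantized_layer(x)
```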

## Linear4bit

[[autodoc]] bitsandbytes.nn.Linear4bit
    - __init__

## LinearFP4

[[autodoc]] bitsandbytes.nn.LinearFP4
    - __init__

## LinearNF4

[[autodoc]] bitsandbytes.nn.LinearNF4
    - __init__

## Params4bit

[[autodoc]] bitsandbytes.nn.Params4bit
    - __init__