# 4-bit quantization

[QLoRA](https://hf.co/papers/2305.14314) is a finetuning method that quantizes a model to 4-bit and adds a set of low-rank adaptation (LoRA) weights to the model, tuning them by backpropagating gradients through the quantized weights. This method also introduces a new data type, 4-bit NormalFloat (`LinearNF4`), in addition to the standard Float4 data type (`LinearFP4`). `LinearNF4` is a quantization data type for normally distributed data and can improve performance.
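A minimal sketch of using a 4-bit linear layer (the 64x64 shape and batch size below are hypothetical, chosen only for illustration; the weights are quantized when the module is moved to a CUDA device):

```py
import torch
import bitsandbytes as bnb

# create a 4-bit linear layer backed by the NF4 data type
# (hypothetical 64x64 shape for illustration)
layer = bnb.nn.LinearNF4(64, 64, compute_dtype=torch.float16)

# the higher-precision weights are quantized to 4-bit on the device transfer
layer = layer.to("cuda")

x = torch.randn(8, 64, dtype=torch.float16, device="cuda")
y = layer(x)  # matmul runs in compute_dtype (float16 here)
```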

## Linear4bit

[[autodoc]] bitsandbytes.nn.Linear4bit
    - __init__

## LinearFP4

[[autodoc]] bitsandbytes.nn.LinearFP4
    - __init__

## LinearNF4

[[autodoc]] bitsandbytes.nn.LinearNF4
    - __init__

## Params4bit

[[autodoc]] bitsandbytes.nn.Params4bit
    - __init__
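
A rough sketch of wrapping the weights of an existing layer in `Params4bit` (the shapes are made up, and exact constructor arguments may vary across bitsandbytes versions; quantization again happens on the transfer to a CUDA device):

```py
import torch
import bitsandbytes as bnb

fp16_layer = torch.nn.Linear(64, 64)

# wrap the existing fp16 weights in Params4bit using the NF4 data type
quantized = bnb.nn.Linear4bit(64, 64, quant_type="nf4", compute_dtype=torch.float16)
quantized.weight = bnb.nn.Params4bit(
    data=fp16_layer.weight.data, requires_grad=False, quant_type="nf4"
)
quantized = quantized.to("cuda")  # weights are quantized to 4-bit here
```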