# LAMB

[LAMB (Layerwise adaptive large batch optimization)](https://hf.co/papers/1904.00962) is an adaptive optimizer designed to accelerate training with large batch sizes. It combines ideas from [`LARS`] and [`Adam`] to automatically scale the learning rate for each layer:

- calculates a *trust ratio* between the norms of a layer's weights and its update, and clips the ratio to prevent overly large or small updates
- updates weights using Adam-style first and second moment estimates of the gradient
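The per-layer update can be sketched as follows. This is an illustrative NumPy implementation of one LAMB step, not the bitsandbytes kernel; the function name `lamb_step` and the `max_trust` clipping bound are assumptions for the example.

```python
import numpy as np

def lamb_step(w, g, m, v, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.0, max_trust=10.0):
    """One illustrative LAMB update for a single layer's weights `w`."""
    # Adam-style first and second moment estimates of the gradient.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    # Adam update direction plus decoupled weight decay.
    update = m / (np.sqrt(v) + eps) + weight_decay * w
    # Layerwise trust ratio: ||w|| / ||update||, clipped to avoid
    # overly large or small steps.
    w_norm = np.linalg.norm(w)
    u_norm = np.linalg.norm(update)
    if w_norm > 0 and u_norm > 0:
        trust = min(w_norm / u_norm, max_trust)
    else:
        trust = 1.0
    # Scale the per-layer step by the trust ratio.
    w = w - lr * trust * update
    return w, m, v
```

Because the trust ratio is computed per layer, layers with small weights take proportionally small steps, which is what makes large-batch training stable.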

## LAMB[[api-class]]

[[autodoc]] bitsandbytes.optim.LAMB
    - __init__

## LAMB8bit

[[autodoc]] bitsandbytes.optim.LAMB8bit
    - __init__

## LAMB32bit

[[autodoc]] bitsandbytes.optim.LAMB32bit
    - __init__