# Candle Cuda Layer Norm

A fused Layer Norm operation for the Candle ML framework.

This layer was adapted from https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm.

It implements a fused dropout + residual + LayerNorm operation, building on Apex's FastLayerNorm.
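
For illustration, here is a minimal unfused sketch in candle of the computation the kernel fuses into a single pass, `layer_norm(dropout(x) + residual)`. The function name and signature are hypothetical, not this crate's API; `candle_nn::ops::dropout` stands in for the fused dropout step.

```rust
use candle_core::{DType, Result, Tensor, D};
use candle_nn::ops::dropout;

/// Unfused reference for the computation the fused kernel performs in one pass:
/// layer_norm(dropout(x) + residual). Hypothetical name, not this crate's API.
fn dropout_add_layer_norm_ref(
    x: &Tensor,        // (batch, hidden)
    residual: &Tensor, // (batch, hidden)
    weight: &Tensor,   // (hidden,)
    bias: &Tensor,     // (hidden,)
    p: f32,            // dropout probability
    eps: f64,
) -> Result<Tensor> {
    // Dropout on the input, then the residual add.
    let summed = (dropout(x, p)? + residual)?;
    // Standard LayerNorm over the last dimension, computed in f32 for stability.
    let xs = summed.to_dtype(DType::F32)?;
    let mean = xs.mean_keepdim(D::Minus1)?;
    let centered = xs.broadcast_sub(&mean)?;
    let var = centered.sqr()?.mean_keepdim(D::Minus1)?;
    centered
        .broadcast_div(&(var + eps)?.sqrt()?)?
        .to_dtype(summed.dtype())?
        .broadcast_mul(weight)?
        .broadcast_add(bias)
}
```

The fused kernel avoids materializing the intermediate dropout and residual-sum tensors, doing the whole computation in one kernel launch.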

Major changes:

- Add residual.
- Make it work for both pre-norm and post-norm architectures.
- Support more hidden dimensions (all dimensions divisible by 8, up to 8192).
- Implement RMSNorm as an option (a reference sketch follows this list).
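
The RMSNorm option replaces LayerNorm's mean-centering and bias with a plain root-mean-square rescale. A minimal unfused reference sketch (hypothetical name, not this crate's API):

```rust
use candle_core::{DType, Result, Tensor, D};

/// Illustrative RMSNorm reference: x / sqrt(mean(x^2) + eps) * weight,
/// i.e. LayerNorm without mean subtraction or bias.
fn rms_norm_ref(x: &Tensor, weight: &Tensor, eps: f64) -> Result<Tensor> {
    let xs = x.to_dtype(DType::F32)?;
    let rms = (xs.sqr()?.mean_keepdim(D::Minus1)? + eps)?.sqrt()?;
    xs.broadcast_div(&rms)?
        .to_dtype(x.dtype())?
        .broadcast_mul(weight)
}
```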