---
id: ref-liger-kernel
repo: linkedin/Liger-Kernel
title: Liger Kernel Triton Kernels for LLM Training
url: https://github.com/linkedin/Liger-Kernel
source_type: source-reference
source_category: open-triton-kernel-library
architectures:
- amd
- nvidia
- rocm
- dcu
tags:
- triton
- liger-kernel
- llm-training
- rmsnorm
- rope
- swiglu
- cross-entropy
- fused-linear-cross-entropy
- loss
- amd
techniques:
- llm-training-kernel
- autograd-wrapper
- fused-epilogue
- memory-reduction
- benchmark
hardware_features:
- wavefront
- vectorization
- cache
kernel_types:
- normalization
- rotary
- activation
- loss
- fused-linear
languages:
- python
- triton
captured_at: '2026-05-26'
license: not-captured
source_paths:
- src/liger_kernel
- test
- benchmark
- examples
- docs
- README.md
---

# Liger Kernel Triton Kernels For LLM Training

- Repository: `linkedin/Liger-Kernel`
- Source: [linkedin/Liger-Kernel](https://github.com/linkedin/Liger-Kernel)

## Route Fit

Use Liger Kernel when the Triton task is training-side LLM work: RMSNorm, RoPE, SwiGLU, cross entropy, fused linear + loss, or memory-saving fused epilogues. It is most useful for direct-file Triton structure, PyTorch autograd wrappers, and correctness/benchmark patterns around training kernels.

## What To Inspect

- `src/liger_kernel` for wrappers and Triton kernels.
- `test` for numerical tolerance and training-shape coverage.
- `benchmark` and `examples` for measuring memory and runtime tradeoffs.

## DCU Use Notes

Use Liger as an algorithmic and harness reference. When porting to DCU, verify dtype support, generated ISA, and resource pressure on the target card before claiming a performance win.

## Query Hooks

```bash
python3 scripts/query.py "liger triton rmsnorm swiglu" --type source-reference --compact
python3 scripts/query.py "liger fused linear cross entropy" --type source-reference --compact
python3 scripts/get_page.py ref-liger-kernel
```
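
The numerical-tolerance pattern mentioned for the `test` directory can be illustrated with a minimal, dependency-free sketch. This is not code from Liger Kernel: `rmsnorm_reference`, `candidate_rmsnorm`, and `allclose` are hypothetical names, the tolerances are placeholder values, and a real harness would compare a Triton kernel's output against a PyTorch reference across training shapes and dtypes.

```python
import math


def rmsnorm_reference(x, weight, eps=1e-6):
    """Reference RMSNorm for one row: y_i = x_i / sqrt(mean(x^2) + eps) * w_i."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def candidate_rmsnorm(x, weight, eps=1e-6):
    """Hypothetical stand-in for the kernel under test.

    It is mathematically equivalent to the reference but orders the
    arithmetic differently, which is the usual source of small
    floating-point differences between a fused kernel and its reference.
    """
    n = len(x)
    scale = math.sqrt(n / (sum(v * v for v in x) + eps * n))
    return [w * (v * scale) for v, w in zip(x, weight)]


def allclose(actual, expected, rtol=1e-5, atol=1e-8):
    """Elementwise check: |a - e| <= atol + rtol * |e| (placeholder tolerances)."""
    return len(actual) == len(expected) and all(
        abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected)
    )


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    w = [0.5, 1.0, 1.5, 2.0]
    ok = allclose(candidate_rmsnorm(x, w), rmsnorm_reference(x, w))
    print("within tolerance:", ok)
```

When porting to a new target, the same comparison would be repeated per dtype, with tolerances loosened for lower-precision types, before any timing is recorded.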