---
id: ref-huggingface-kernels
repo: huggingface/kernels
title: Hugging Face Kernels and kernels-community Hub
url: https://github.com/huggingface/kernels
source_type: source-reference
source_category: open-triton-kernel-library
architectures:
- amd
- nvidia
- rocm
- dcu
tags:
- triton
- huggingface
- kernels-community
- kernel-hub
- paged-attention
- triton-moe
- triton-scaled-mm
- rmsnorm
- rotary
- quantization
techniques:
- kernel-package
- direct-file-harness
- reference-discovery
- hub-source-search
hardware_features:
- wavefront
- lds
- mfma
- cache
kernel_types:
- attention
- moe
- gemm
- normalization
- rotary
- quantization
languages:
- python
- triton
captured_at: '2026-05-26'
license: not-captured
source_paths:
- src
- kernel-builder
- tests
- README.md
- https://huggingface.co/kernels
- https://huggingface.co/kernels-community
---
# Hugging Face Kernels And kernels-community Hub

- Repository: `huggingface/kernels`
- Source: [huggingface/kernels](https://github.com/huggingface/kernels)
- Hub: [huggingface.co/kernels](https://huggingface.co/kernels)

## Route Fit

Use Hugging Face Kernels as a discovery route for small, packageable Triton
kernels and direct-file references. The Hub is useful for finding paged
attention, Triton MoE, scaled MM, RMSNorm, rotary, and quantization examples
that are easier to inspect than a full serving framework.

## What To Inspect

- Kernel package metadata and source links on the Hub.
- `tests` and package examples for minimal wrappers.
- Whether the package declares ROCm/AMD support or only works on CUDA.
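
The backend check in the last item can be partly automated. The sketch below is a minimal illustration, not the Hub's own tooling: it assumes build variants sit under `build/<variant>/` and that the variant name carries a backend token such as `cu126` or `rocm63`. The function name `classify_build_variants` and the example paths are hypothetical; confirm the layout against the real repository listing before relying on it.

```python
import re
from collections import defaultdict

# Assumption: variant directory names embed a backend token ("cu126", "rocm63").
# This naming pattern should be verified against the actual package listing.
_BACKEND_PATTERNS = {
    "cuda": re.compile(r"(?:^|-)cu\d+(?:-|$)"),
    "rocm": re.compile(r"(?:^|-)rocm\d+(?:-|$)"),
}


def classify_build_variants(paths):
    """Group build-variant directory names by the backend token they contain."""
    found = defaultdict(set)
    for path in paths:
        parts = path.split("/")
        # Only paths of the form "build/<variant>/..." are considered.
        if len(parts) < 2 or parts[0] != "build":
            continue
        variant = parts[1]
        for backend, pattern in _BACKEND_PATTERNS.items():
            if pattern.search(variant):
                found[backend].add(variant)
    return {backend: sorted(variants) for backend, variants in found.items()}


# Hypothetical file listing, for illustration only.
example_paths = [
    "README.md",
    "build/torch27-cxx11-cu126-x86_64-linux/activation/__init__.py",
    "build/torch27-cxx11-rocm63-x86_64-linux/activation/__init__.py",
    "build/torch-universal/activation/__init__.py",
]
support = classify_build_variants(example_paths)
print(support)
```

A variant with no backend token (such as the hypothetical `torch-universal` entry) is not classified either way, so a missing `rocm` key means "not declared", not "does not work". Such packages still need a direct run on the target device.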

## DCU Use Notes

Treat Hub kernels as candidates, not proof. Before adapting one, capture the
source URL, package revision, and license/notice. Then run direct correctness
checks and benchmarks on DCU, and collect Triton cache and profiler evidence
that the kernel actually compiled and ran there.
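
The capture-then-verify steps could be sketched as follows. This is a stdlib-only illustration under stated assumptions: `KernelProvenance`, `check_candidate`, and `candidate_rmsnorm` are hypothetical names, the revision and license values are placeholders, and the plain-Python RMSNorm stands in for whatever reference implementation the real harness uses. In practice `candidate_rmsnorm` would wrap the kernel launch on DCU.

```python
import json
import math
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class KernelProvenance:
    """Hypothetical record of where a candidate kernel came from."""
    source_url: str
    revision: str  # pinned commit hash or tag, not a moving branch
    license: str

    def validate(self):
        missing = [k for k, v in asdict(self).items() if not v.strip()]
        if missing:
            raise ValueError(f"missing provenance fields: {missing}")
        if self.revision in {"main", "master", "HEAD"}:
            raise ValueError("pin a commit or tag instead of a moving branch")


def rmsnorm_reference(x, weight, eps=1e-6):
    """Plain-Python RMSNorm over one row, used as ground truth."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


def check_candidate(candidate_fn, x, weight, tol=1e-5):
    """Return (passed, max_abs_error) for a candidate against the reference."""
    got = candidate_fn(x, weight)
    want = rmsnorm_reference(x, weight)
    err = max(abs(g - w) for g, w in zip(got, want))
    return err <= tol, err


def candidate_rmsnorm(x, weight, eps=1e-6):
    # Stand-in for the kernel under test; a real harness would call the
    # adapted Triton kernel here and copy the result back to the host.
    norm = math.sqrt(math.fsum(v * v for v in x) / len(x) + eps)
    return [w * v / norm for v, w in zip(x, weight)]


# Placeholder values: replace with the URL, revision, and license actually captured.
record = KernelProvenance(
    source_url="https://huggingface.co/kernels-community",
    revision="0123abc",
    license="placeholder-license",
)
record.validate()
passed, err = check_candidate(
    candidate_rmsnorm, [1.0, -2.0, 3.0, 0.5], [1.0, 1.0, 0.5, 2.0]
)
print(json.dumps({"provenance": asdict(record), "passed": passed}))
```

Rejecting a moving branch as the revision keeps the correctness result tied to the exact source that was tested. Benchmark and profiler evidence are separate steps and are not covered by this sketch.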

## Query Hooks

```bash
python3 scripts/query.py "huggingface kernels triton moe" --type source-reference --compact
python3 scripts/query.py "kernels-community paged attention triton" --type source-reference --compact
python3 scripts/query.py "triton scaled mm kernels community" --type source-reference --compact
python3 scripts/get_page.py ref-huggingface-kernels
```