whlwhlwhl / Lightop-SKIILS
Commit 067b04c0, authored May 21, 2026 by whlwhlwhl
Add fused operator limit
parent 4b893124
Showing 1 changed file with 44 additions and 3 deletions
humanize/skills/humanize-kernel-agent-loop/SKILL.md (+44 -3)
@@ -105,6 +105,15 @@ that split and both paths point at the same compiled extension.
For a new or modified operator, inspect the nearest existing operator family
first and follow its style.
For a new fused operator, first recover the fusion ingredients from the new
operator name, requested semantics, or provided implementation sketch. Search
LightOp for the pre-fusion single operators and related fused implementations
before designing a new kernel. Use those local implementations as the primary
baseline for API shape, tensor validation, dispatch/config style, correctness
reference, benchmark comparison, and performance expectations. If LightOp has
no matching local baseline, record the search terms and absence, then fall back
to a PyTorch or literal oracle reference.
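
The baseline rule in the paragraph above can be sketched as a small decision helper. This is an illustrative sketch only, not part of the commit: `BaselineChoice`, `choose_baseline`, and the operator names in the usage lines are hypothetical, and preferring the unfused composition over a related fused kernel is an assumption about ordering.

```python
from dataclasses import dataclass, field


@dataclass
class BaselineChoice:
    kind: str                                          # which baseline category was chosen
    sources: list = field(default_factory=list)        # local implementations backing it
    search_terms: list = field(default_factory=list)   # recorded only when nothing matched


def choose_baseline(component_ops, local_single_ops, local_fused_ops, search_terms):
    """Pick a baseline for a new fused operator (hypothetical helper)."""
    # Component single operators that already exist locally.
    found = [op for op in component_ops if op in local_single_ops]
    if component_ops and len(found) == len(component_ops):
        return BaselineChoice("unfused LightOp composition", found)
    if local_fused_ops:
        return BaselineChoice("nearest fused LightOp implementation", list(local_fused_ops))
    # Nothing local matched: record the search terms and fall back to a reference.
    return BaselineChoice(
        "no local baseline found; PyTorch or literal oracle reference",
        [],
        list(search_terms),
    )


# Hypothetical usage: both components of the fused op exist as single operators.
choice = choose_baseline(["rmsnorm", "quant"], {"rmsnorm", "quant", "gelu"}, [], ["rmsnorm_quant"])
print(choice.kind)
```

Returning the search terms only in the fallback case mirrors the requirement to record what was searched when no local baseline exists.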
Core source locations:
```text
...
...
@@ -120,6 +129,12 @@ lightop/config*.py
Typical add-operator checklist:
- For fused operators, inspect each component single-op wrapper, binding,
  kernel, test, benchmark, and config path, plus any neighboring fused kernels
  with similar data movement or epilogue structure.
- Build the first correctness and benchmark baseline from the unfused LightOp
  composition when those component operators exist; otherwise use the nearest
  LightOp implementation plus a PyTorch reference.
- Add or modify HIP/C++ implementation under the closest `lightop/csrc/`
  family. Create a new family only when no existing family fits.
- Expose the C++ symbol in `lightop/csrc/export.cpp` with `m.def(...)`.
...
...
@@ -218,13 +233,18 @@ evidence.
    threshold.
 3. Inspect the existing wrapper, binding, kernel, config table, tests, and
    benchmarks.
-4. Before the first optimization edit, it is recommended to query
+4. For a new fused operator, search by the requested operator name, name
+   tokens, component op names, and semantics to find LightOp's pre-fusion
+   single operators and related fused kernels. Record the chosen baseline:
+   unfused LightOp composition, nearest fused LightOp implementation, PyTorch
+   reference, or explicit "no local baseline found".
+5. Before the first optimization edit, it is recommended to query
    `lightop-kernel-knowledge` for local LightOp patterns, ROCm/DCU upstream
    evidence, Hygon/DCU source references, and portable ideas from the bundled
    corpus. Use this whenever it can shape the first implementation route.
-5. Write a concise research digest in the loop state before the first serious
+6. Write a concise research digest in the loop state before the first serious
    implementation lineage.
-6. Define the benchmark contract before editing code: exact target shape(s),
+7. Define the benchmark contract before editing code: exact target shape(s),
    dtype/layout/contiguity, axis/mode/epsilon, effective-bandwidth formula,
    warmup/repeat counts, selected summary statistic, noise band, and the
    benchmark command that will be used for baseline and candidates.
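
A benchmark contract of the kind described in the last step could be pinned down in code before any kernel edit. The following is a minimal sketch and not part of the commit: `BenchmarkContract`, `effective_bandwidth_gbs`, and `is_real_gain` are hypothetical names, the bandwidth formula assumes exactly one full read and one full write of the tensor, and the median is an assumed choice of summary statistic.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class BenchmarkContract:
    shape: tuple        # exact target shape
    dtype_bytes: int    # bytes per element, e.g. 2 for fp16
    warmup: int         # untimed iterations
    repeat: int         # timed iterations
    noise_band: float   # relative band, e.g. 0.02 for +/-2%

    def bytes_moved(self):
        # Assumption: one full read plus one full write of the tensor.
        n = 1
        for d in self.shape:
            n *= d
        return 2 * n * self.dtype_bytes


def effective_bandwidth_gbs(contract, times_s):
    """Effective bandwidth in GB/s (1 GB = 1e9 bytes) from the median time."""
    return contract.bytes_moved() / median(times_s) / 1e9


def is_real_gain(baseline_gbs, candidate_gbs, noise_band):
    """A candidate counts as faster only if it clears the noise band."""
    return candidate_gbs > baseline_gbs * (1.0 + noise_band)


contract = BenchmarkContract(shape=(1024, 1024), dtype_bytes=2, warmup=10, repeat=50, noise_band=0.02)
print(effective_bandwidth_gbs(contract, [0.001, 0.002, 0.004]))
```

Freezing the dataclass keeps the contract identical between the baseline run and every candidate run, which is the point of defining it before editing code.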
...
...
@@ -327,6 +347,7 @@ Keep Humanize state local and untracked:
.humanize/lightop-agent/research-digest.md
.humanize/lightop-agent/attempt-ledger.md
.humanize/lightop-agent/kernel_opt_readme.md
.humanize/lightop-agent/rlcr-fallback.md
.humanize/lightop-agent/optimization-ledger.md
.humanize/lightop-agent/lineage.jsonl
.humanize/lightop-agent/performance-map.json
...
...
@@ -456,6 +477,9 @@ Write `.humanize/lightop-agent/refined-plan.md` using the Humanize gen-plan
schema. Include acceptance criteria for:
- LightOp root, target operator family, public API, and modified files.
- For new fused operators, the LightOp pre-fusion single-op search result,
  related fused implementation search result, and chosen baseline/reference
  path.
- Explicit `K`, `R`, `W`, target gfx arch, baseline command, comparison target,
  and hard scope exclusions.
- Workload contract: target shape(s), dtype, layout/contiguity, axis/mode,
...
...
@@ -508,6 +532,23 @@ the loop from the LightOp root:
```
If setup exits non-zero, stop and report the error. Do not bypass the gate.
Exception: if setup fails only because the `codex` CLI is unavailable, manual
fallback mode is allowed. Before continuing, write:
```text
.humanize/lightop-agent/rlcr-fallback.md
.humanize/lightop-agent/refined-plan.md
.humanize/lightop-agent/research-digest.md
.humanize/lightop-agent/attempt-ledger.md
.humanize/lightop-agent/kernel_opt_readme.md
```
`rlcr-fallback.md` must state that Codex review gate is unavailable, include
the exact setup command and error output, name the missing dependency, and
declare that all build/test/benchmark/profile, device-selection, evidence,
performance-target, low-gain, and logging constraints from this skill still
apply. In fallback mode, proceed manually with the same optimization loop, but
do not claim that Humanize/Codex review was active.
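
The required contents of `rlcr-fallback.md` can be sketched as a small renderer. This is illustrative only and not part of the commit: `render_fallback_note` is a hypothetical helper, the note layout is an assumption, and the setup command is passed in as a placeholder because the diff does not show it.

```python
def render_fallback_note(setup_command, error_output, missing_dependency="codex CLI"):
    """Build the text of rlcr-fallback.md (hypothetical layout)."""
    lines = [
        "# RLCR fallback",
        "",
        "Codex review gate is unavailable; manual fallback mode is in use.",
        "",
        f"Setup command: {setup_command}",
        f"Missing dependency: {missing_dependency}",
        "",
        "Error output:",
        error_output.rstrip(),
        "",
        "All build/test/benchmark/profile, device-selection, evidence,",
        "performance-target, low-gain, and logging constraints from the skill",
        "still apply. Humanize/Codex review was not active for this run.",
    ]
    return "\n".join(lines) + "\n"


# Placeholder arguments: substitute the exact command and its captured output.
note = render_fallback_note("<exact setup command>", "codex: command not found\n")
print(note)
```

Writing the returned string to `.humanize/lightop-agent/rlcr-fallback.md` before continuing keeps a record that the review gate was skipped rather than passed.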
After setup succeeds:
...
...