feat: pythonized model and QwenImage Support (#593)
* start refactoring the codebase
* update
* update
* start to implement ops
* add gemm
* write the docstrings
* define the w4a4 svdq linear
* update
* make the linter happy
* finished the SVDQW4A4Linear
* finished the SVDQW4A4Linear
* update
* update
* add a patcher to the model
* update
* add adanormsinglezero
* update
* update
* finished the naive implementation of nunchaku flux
* add ff
* finished the naive forward
* update
* svdq linear
* start debugging
* fix some issues
* successfully built the model
* update
* successfully load the model
* update
* update
* update
* try to make it runnable
* debugging
* debugging
* debugging
* add bias to awq linear
* run through
* fix the normalization
* update
* update
* update
* fix the attention
* fix the no-fuse nvfp models
* update
* finished the fused ff
* make linter happy
* make linter happy
* make linter happy
* debugging the fp16 attn
* nunchaku fp16 is buggy
* finish the fp16 attn
* fp4 done
* fix the lora scales
* add a default value for alpha; need to debug int4
* fix input4
* update
* update
* ff does not work
* specialize the processors
* qwen transformer done, start debugging
* make linter happy
* add schnell v2 for metrics eval
* chore: schnellv2 eval
* update
* update
* ff and attention correct
* need to check what happened to module
* fp4 done
* make linter happy
* update an example script
* reformat
* add an example script
* add the announcement
* remove misleading info
* ready to release
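The commit history above centers on a pure-Python `SVDQW4A4Linear` (an SVDQuant-style W4A4 linear layer). Below is a minimal numerical sketch of that technique, not nunchaku's actual implementation: the class name `SVDQuantLinearSketch`, the `fake_quantize_int4` helper, and parameters such as `rank` and `group_size` are illustrative assumptions, and the real layer also quantizes activations to 4 bits and dispatches to fused CUDA kernels rather than the plain matmuls shown here.

```python
import torch
import torch.nn as nn


def fake_quantize_int4(w: torch.Tensor, group_size: int = 64) -> torch.Tensor:
    """Simulate symmetric per-group 4-bit weight quantization (round-trip to float)."""
    out_features, in_features = w.shape
    assert in_features % group_size == 0, "in_features must be divisible by group_size"
    w = w.reshape(out_features, in_features // group_size, group_size)
    scale = w.abs().amax(dim=-1, keepdim=True).clamp_min(1e-8) / 7.0  # int4 range [-8, 7]
    q = torch.clamp(torch.round(w / scale), -8, 7)
    return (q * scale).reshape(out_features, in_features)


class SVDQuantLinearSketch(nn.Module):
    """Numerical sketch of an SVDQuant-style linear layer.

    The original weight W is split as W = L1 @ L2 + R: a small low-rank branch
    (L1, L2) kept in high precision plus a residual R that is quantized to
    4 bits. The quantized GEMM is simulated here with a plain matmul on a
    fake-quantized residual.
    """

    def __init__(self, in_features: int, out_features: int, rank: int = 32, bias: bool = True):
        super().__init__()
        self.proj_down = nn.Linear(in_features, rank, bias=False)  # holds L2
        self.proj_up = nn.Linear(rank, out_features, bias=False)   # holds L1
        self.register_buffer("qweight", torch.zeros(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None

    @classmethod
    def from_linear(cls, linear: nn.Linear, rank: int = 32) -> "SVDQuantLinearSketch":
        """Decompose an existing nn.Linear into a low-rank branch plus quantized residual."""
        w = linear.weight.data.float()
        u, s, vh = torch.linalg.svd(w, full_matrices=False)
        l1 = u[:, :rank] * s[:rank]   # (out_features, rank)
        l2 = vh[:rank, :]             # (rank, in_features)
        residual = w - l1 @ l2
        mod = cls(w.shape[1], w.shape[0], rank=rank, bias=linear.bias is not None)
        mod.proj_down.weight.data.copy_(l2)
        mod.proj_up.weight.data.copy_(l1)
        mod.qweight.copy_(fake_quantize_int4(residual))
        if linear.bias is not None:
            mod.bias.data.copy_(linear.bias.data)
        return mod

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        low_rank = self.proj_up(self.proj_down(x))  # 16-bit low-rank branch
        quantized = x @ self.qweight.t()            # stand-in for the W4A4 GEMM
        out = low_rank + quantized
        if self.bias is not None:
            out = out + self.bias
        return out
```

The low-rank branch stays in 16-bit because it absorbs the dominant singular directions of the weight, leaving a flatter residual that loses less accuracy under 4-bit quantization.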
New files added in this PR:

* examples/v1/qwen-image.py
* nunchaku/models/attention.py
* nunchaku/models/linear.py
* nunchaku/models/utils.py
* nunchaku/ops/fused.py
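One of the new files, `examples/v1/qwen-image.py`, is an end-to-end example for the Qwen-Image support added in this PR. A hypothetical sketch of what such a script could look like is below; the Nunchaku class name, import path, and checkpoint repo id are assumptions inferred from the PR description rather than verified API, so consult the actual file in the diff for the real names.

```python
# Hypothetical sketch of examples/v1/qwen-image.py.
# The Nunchaku class name, import path, and checkpoint repo id below are
# assumptions, not verified API; see the actual example in this PR.
import torch
from diffusers import QwenImagePipeline

# Assumed import for the new pythonized Qwen-Image transformer.
from nunchaku import NunchakuQwenImageTransformer2DModel

# Load the SVDQuant 4-bit transformer (repo id is a placeholder).
transformer = NunchakuQwenImageTransformer2DModel.from_pretrained(
    "nunchaku-tech/nunchaku-qwen-image",  # placeholder checkpoint id
    torch_dtype=torch.bfloat16,
)

# Drop the quantized transformer into the stock diffusers pipeline.
pipe = QwenImagePipeline.from_pretrained(
    "Qwen/Qwen-Image",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    prompt="a cup of coffee on a wooden table, soft morning light",
    num_inference_steps=30,
).images[0]
image.save("qwen-image.png")
```

This likely mirrors the existing Flux examples, where the quantized Nunchaku transformer is loaded separately and handed to the standard diffusers pipeline via the `transformer` argument.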