"megatron/inference/gpt/__init__.py" did not exist on "3aca141586a4b8cdc983c3ecf5f7baf60506c7f8"
tpoisonooo authored
* feat(src): add int8 and compile passed
* feat(kernels): fix
* feat(llama): update kernel
* feat(src): add debug
* fix(kernel): k_cache use int8_t pointer
* style(llama): clean code
* feat(deploy.py): revert to enable fmha
* style(LlamaV2): clean code
* feat(deploy.py): add default quant policy
cc93136e