* add CI, update Dockerfile
* remove useless loop in inference, add some comments to Attention
* update inference test and CI
* fix path
* add pytest for test env
* add einops install
* add cuda cache, loosen the condition