* modify requirements, fix CI container
* fix format error
* remove torch1.11 kernel cache
* add cache back
* add CI, update Dockerfile
* remove useless loop in inference, add some comments to Attention
* update inference test and CI
* fix path
* add pytest for test env
* add einops install
* add CUDA cache, loosen the condition