Add Phi-3 medium support (#2039)
Add support for Phi-3-medium

The main difference between the medium and mini models is that medium uses grouped-query attention (GQA) with a packed QKV matrix. This change adds support for GQA with packed matrices to `Weights.get_weights_col_packed` and uses it for Phi-3. This also allows us to remove the custom implementation of GQA from dbrx attention loading.
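To illustrate what "GQA with a packed QKV matrix" means for sharded loading, here is a minimal sketch. It is not the code from this change: the helper names `gqa_block_sizes` and `packed_block_slices`, the row layout (Q, then K, then V, stacked along one dimension), and the head counts are assumptions chosen for illustration. With GQA the K and V blocks are smaller than the Q block, so the packed matrix cannot be split into three equal parts; each rank instead takes its share of every block separately.

```python
def gqa_block_sizes(num_heads, num_kv_heads, head_dim):
    """Row counts of the Q, K and V blocks in a packed QKV matrix (assumed layout)."""
    return [num_heads * head_dim, num_kv_heads * head_dim, num_kv_heads * head_dim]


def packed_block_slices(block_sizes, rank, world_size):
    """Return the (start, stop) row range that `rank` reads from each packed block.

    Hypothetical helper: each block is divided evenly across `world_size` ranks,
    and a rank's shard is the concatenation of its slice from every block.
    """
    slices = []
    offset = 0  # first row of the current block within the packed matrix
    for size in block_sizes:
        if size % world_size != 0:
            raise ValueError(f"block of {size} rows is not divisible by {world_size}")
        shard = size // world_size
        start = offset + rank * shard
        slices.append((start, start + shard))
        offset += size
    return slices


# Illustrative values only: 8 query heads, 2 KV heads, head_dim 4, two ranks.
sizes = gqa_block_sizes(num_heads=8, num_kv_heads=2, head_dim=4)
rank0 = packed_block_slices(sizes, rank=0, world_size=2)
rank1 = packed_block_slices(sizes, rank=1, world_size=2)
```

Passing equal block sizes (for example `[32, 32, 32]`) covers the ordinary multi-head case, which suggests why a single packed-loading path can serve both kinds of model.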