[PyTorch] Add sliding window support to FlashAttention (#551)

* add sliding window to FA
* fix forward logic
* fix lint
* change bert test to causal as unfused does not support padding
* fix FlashAttention for v2-2.3 versions
* verify FA swa works
* fix mask related restrictions and duplicate code after merge
* fix swa test
* add docstring for get_swa func
* move repeated code into a function
* revert mask change
* add determinism filter and fix FA warning message
* add message for determinism filter
* simplify check_set_window_size()
* fix check_set_window_size in transformer layers
* fix indent

---------

Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
Signed-off-by: cyanguwa <8636796+cyanguwa@users.noreply.github.com>