    Improve support for GPUs with capability < 8 (#2575) · 5b6b74e2
    Daniël de Kok authored
    * Improve support for GPUs with capability < 8
    
    - For models that cannot use flashinfer, use flash-attn v1 + paged
      attention on GPUs with a compute capability older than 8.
    - Disable prefix caching when using paged attention.
    - When using flash-attn v1, pass the key/value, rather than the
      cache, since v1 cannot use block tables.
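
The selection rules in the list above can be sketched roughly as follows. This is only an illustration, not the code from this commit: `BackendChoice`, `choose_backend`, and `prefill_inputs` are hypothetical names, and two behaviours are assumptions not stated in the commit message: that prefix caching stays enabled on the flashinfer path, and that GPUs with capability 8 or newer use flash-attn v2.

```python
from typing import NamedTuple, Optional


class BackendChoice(NamedTuple):
    attention: str                     # "flashinfer" or "paged"
    flash_attn_version: Optional[int]  # None when flashinfer handles attention
    prefix_caching: bool


def choose_backend(can_use_flashinfer: bool, capability_major: int) -> BackendChoice:
    """Pick an attention backend (hypothetical helper, see assumptions above)."""
    if can_use_flashinfer:
        # Assumption: prefix caching stays enabled with flashinfer.
        return BackendChoice("flashinfer", None, True)
    # Paged attention path: prefix caching is disabled.
    # Assumption: capability >= 8 uses flash-attn v2; older GPUs use v1.
    version = 1 if capability_major < 8 else 2
    return BackendChoice("paged", version, False)


def prefill_inputs(choice: BackendChoice, key, value, kv_cache):
    """Return what the attention call receives during prefill."""
    if choice.flash_attn_version == 1:
        # flash-attn v1 cannot use block tables, so pass key/value directly.
        return key, value
    # Otherwise pass the (key_cache, value_cache) pair.
    return kv_cache
```

For example, `choose_backend(False, 7)` selects paged attention with flash-attn v1 and prefix caching off, and `prefill_inputs` then hands the raw key/value tensors to the kernel instead of the cache.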
    
    * nix: add flash-attn-v1 to the server environment
    
    * Move disabling prefix caching into the block of exceptions
    
    * Represent capability as `usize`s
impure-shell.nix 874 Bytes