  • Prefix caching (#2402) · b70ae096
    Nicolas Patry authored
    
    
    * Prefix caching WIP
    
    * Fixing prefix attention.
    
    * Fixing flashinfer import.
    
    * Fixing black.
    
    * Fixing medusa (still wrong outputs, but functional).
    
    * Just medusa values now.
    
    * Fixing medusa without prefix caching.
    
    * Fixing prefix caching.
    
    * Medusa requires reshaping.
    
    * Removing the logs.
    
    * Remove router.nix
    
    * Fixup:
    
    - Remove logs
    - Disable VLMs (they do not work)
    - Disable prefix caching when user wants prefill logprobs.
    
    * Update flake.lock
    
    ---------
    Co-authored-by: Daniël de Kok <me@danieldk.eu>
flake.lock 28.5 KB