1. 10 Sep, 2025 1 commit
  2. 03 Sep, 2025 1 commit
      feat: async CPU offloading for Python backend (#624) · eb901251
      Muyang Li authored
      * tmp
      
      * update
      
      * update
      
      * finished the offloading impl
      
      * the offloading is buggy
      
      * update utils
      
      * the offloading is still buggy
      
      * update
      
      * correctness and speedup done; need to check the vram overhead
      
      * done
      
      * final debugging
      
      * update
      
      * update
      
      * correct now
      
      * fix
      
      * update
      
      * use per-layer offloading
      
      * fix the offloading on 5090
      
      * support setting the num_blocks_on_gpu
      
      * change the import name
  3. 27 Aug, 2025 1 commit