* partial offloading: allow flash attention and disable mmap
* allow mmap with num_gpu=0