commit aa45f7ce
Author: Jesse Gross

    discover: Disable flash attention for Jetson Xavier (CC 7.2)

    GGML picks the wrong kernel and these systems fail with:
    Sep 28 22:25:39 xavier ollama[48999]: //ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu:437:
    ERROR: CUDA kernel flash_attn_ext_f16 has no device code compatible with
    CUDA arch 720. ggml-cuda.cu was compiled for: __CUDA_ARCH_LIST__
    
    Fixes #12442
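
    The gist of the fix can be sketched as a capability check: Jetson Xavier reports CUDA compute capability 7.2, for which ggml's flash_attn_ext_f16 WMMA kernel has no compiled device code, so flash attention has to be turned off on that hardware. The type and function names below are hypothetical illustrations, not Ollama's actual gpu.go API.

    ```go
    package main

    import "fmt"

    // computeCapability is a hypothetical stand-in for however the
    // discovery code represents a CUDA device's compute capability.
    type computeCapability struct {
    	Major, Minor int
    }

    // supportsFlashAttention sketches the guard this commit describes:
    // CC 7.2 (Jetson Xavier) is excluded because the WMMA flash-attention
    // kernel has no device code for arch 720. The exact predicate used in
    // gpu.go is an assumption here.
    func supportsFlashAttention(cc computeCapability) bool {
    	if cc.Major == 7 && cc.Minor == 2 {
    		return false // Jetson Xavier: fall back to non-flash attention
    	}
    	// Assume other Volta-and-newer parts keep flash attention enabled.
    	return cc.Major >= 7
    }

    func main() {
    	xavier := computeCapability{Major: 7, Minor: 2}
    	ampere := computeCapability{Major: 8, Minor: 6}
    	fmt.Println(supportsFlashAttention(xavier)) // false: disabled on Xavier
    	fmt.Println(supportsFlashAttention(ampere)) // true: unaffected elsewhere
    }
    ```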
    aa45f7ce
Changed file: gpu.go (4.5 KB)