ModelZoo / DeepSeek-R1_ollama · Issue #2 (Closed)
Created Mar 04, 2025 by ychan@ychan

Tensile cannot be found

time=2025-03-05T04:03:25.037+08:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-0000000000001000 library=rocm variant="" compute=gfx926 driver=5.16 name=1d94:61b7 total="32.0 GiB" available="32.0 GiB"
[GIN] 2025/03/05 - 04:03:59 | 200 | 155.921µs | 127.0.0.1 | HEAD "/"
[GIN] 2025/03/05 - 04:03:59 | 200 | 25.547044ms | 127.0.0.1 | POST "/api/show"
time=2025-03-05T04:04:00.022+08:00 level=INFO source=sched.go:730 msg="new model will fit in available VRAM, loading" model=/root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc library=rocm parallel=4 required="4.9 GiB"
time=2025-03-05T04:04:00.022+08:00 level=INFO source=server.go:104 msg="system memory" total="249.5 GiB" free="198.0 GiB" free_swap="0 B"
time=2025-03-05T04:04:00.023+08:00 level=INFO source=memory.go:356 msg="offload to rocm" layers.requested=-1 layers.model=29 layers.offload=29 layers.split=8,7,7,7 memory.available="[32.0 GiB 32.0 GiB 32.0 GiB 32.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="4.9 GiB" memory.required.partial="4.9 GiB" memory.required.kv="224.0 MiB" memory.required.allocations="[1.4 GiB 1.2 GiB 1.2 GiB 1.2 GiB]" memory.weights.total="976.1 MiB" memory.weights.repeating="793.5 MiB" memory.weights.nonrepeating="182.6 MiB" memory.graph.full="482.3 MiB" memory.graph.partial="482.3 MiB"
time=2025-03-05T04:04:00.034+08:00 level=INFO source=server.go:376 msg="starting llama server" cmd="/mnt/ollama/inference/down_load_ollama/ollama/llama/build/linux-amd64/runners/rocm_avx/ollama_llama_server runner --model /root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc --ctx-size 8192 --batch-size 512 --n-gpu-layers 29 --threads 32 --parallel 4 --tensor-split 8,7,7,7 --port 41549"
time=2025-03-05T04:04:00.034+08:00 level=INFO source=sched.go:449 msg="loaded runners" count=1
time=2025-03-05T04:04:00.034+08:00 level=INFO source=server.go:555 msg="waiting for llama runner to start responding"
time=2025-03-05T04:04:00.035+08:00 level=INFO source=server.go:589 msg="waiting for server to become available" status="llm server error"
time=2025-03-05T04:04:00.113+08:00 level=INFO source=runner.go:936 msg="starting go runner"

rocBLAS error: Could not initialize Tensile host: No devices found
time=2025-03-05T04:04:00.285+08:00 level=ERROR source=sched.go:455 msg="error loading llama server" error="llama runner process has terminated: error:Could not initialize Tensile host: No devices found"
[GIN] 2025/03/05 - 04:04:00 | 500 | 339.189093ms | 127.0.0.1 | POST "/api/generate"
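Note on the failure mode: "Could not initialize Tensile host: No devices found" is rocBLAS reporting that the runner process could not enumerate any HIP devices at all, even though the Ollama server process detected the GPU earlier in the log. A quick way to check device visibility from the same shell/environment is a minimal HIP program like the sketch below; the file name and build step are assumptions, not part of this project:

// check_hip_devices.cpp — a minimal sketch, assuming ROCm/HIP is installed.
// Build (hypothetical name): hipcc check_hip_devices.cpp -o check_hip_devices
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    hipError_t err = hipGetDeviceCount(&count);
    if (err != hipSuccess) {
        // If the HIP runtime fails here, rocBLAS/Tensile will fail the same way.
        std::printf("hipGetDeviceCount failed: %s\n", hipGetErrorString(err));
        return 1;
    }
    std::printf("HIP devices visible: %d\n", count);
    for (int i = 0; i < count; ++i) {
        hipDeviceProp_t prop;
        if (hipGetDeviceProperties(&prop, i) == hipSuccess) {
            // gcnArchName should report the arch, e.g. gfx926 as in the log above.
            std::printf("  device %d: %s (%s)\n", i, prop.name, prop.gcnArchName);
        }
    }
    return 0;
}

If this prints 0 devices (or fails) when run as the same user and environment as the ollama service, the problem is GPU visibility rather than the model load itself; common culprits include missing access to /dev/kfd and /dev/dri, container device mappings, or a ROCR_VISIBLE_DEVICES / HIP_VISIBLE_DEVICES setting that filters out every device.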
