Unverified Commit 8ef7163a authored by Yifan Xiong's avatar Yifan Xiong Committed by GitHub
Browse files

Deployment - Refine error message when GPU is not detected (#368)

Refine error message when GPU is not detected.

Possible solutions if hardware exists and drivers are already installed:
* nvidia gpus:
  ```sh
  /sbin/modprobe nvidia-uvm
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
  mknod -m 666 /dev/nvidia-uvm c $D 0
  ```

* amd gpus
  ```sh
  modprobe amdgpu
  ```
parent 325a7338
...@@ -48,8 +48,8 @@ ...@@ -48,8 +48,8 @@
- name: Print GPU Checking Result - name: Print GPU Checking Result
debug: debug:
msg: msg:
- "NVIDIA GPU {{ 'detected' if nvidia_gpu_exist else 'not detected' }}" - "NVIDIA GPU {{ 'detected' if nvidia_gpu_exist else 'not operational, pls confirm nvidia_uvm kernel module is loaded and /dev/nvidia-uvm exists' }}"
- "AMD GPU {{ 'detected' if amd_gpu_exist else 'not detected' }}" - "AMD GPU {{ 'detected' if amd_gpu_exist else 'not operational, pls confirm amdgpu kernel module is loaded' }}"
- name: Remote Deployment - name: Remote Deployment
hosts: all hosts: all
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment