1. 07 Jul, 2025 1 commit
      fix: resolve issue with inability to correctly specify non-zero GPUs in multi-GPU systems (#404) · 1e621a58
      Hu Yaoqi authored
      * Fix: Correctly specify non-zero GPUs in multi-GPU environments
      
      This commit resolves an issue where the Nunchaku model could not be
      correctly initialized and run on a user-specified non-zero GPU in
      multi-GPU systems.
      
      Key changes include:
      - Using CUDADeviceContext in the FluxModel constructor to ensure
        the model and its submodules are created within the specified GPU context.
      - Modifying the logic in FluxModel::forward for copying residual data
        from CPU back to GPU, ensuring it returns to the correct original GPU device.
      - Adding explicit CUDA context management in Tensor::copy_ for data
        copy operations involving CUDA devices (H2D, D2H, D2D) to guarantee
        cudaMemcpyAsync executes on the correct device.
      
      These changes allow users to reliably run Nunchaku on any specified
      GPU in a multi-GPU setup.
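
The scoped device switching described in the key changes can be sketched as an RAII guard. This is a minimal illustration and not the code from this commit: `DeviceGuard`, `device_used_for_copy`, and the `fake_cuda` namespace are hypothetical names, and `fake_cuda` only stands in for the CUDA runtime calls `cudaGetDevice` and `cudaSetDevice` so that the sketch compiles and runs without a GPU.

```cpp
#include <stdexcept>

// Stand-ins (assumption) for cudaGetDevice / cudaSetDevice from <cuda_runtime.h>.
// They only track an integer so the sketch can run on a machine without CUDA.
namespace fake_cuda {
inline int &current_device() {
    // The CUDA runtime also tracks the current device per host thread.
    thread_local int device = 0;
    return device;
}
inline int get_device() { return current_device(); }
inline void set_device(int device) {
    if (device < 0) throw std::invalid_argument("negative device index");
    current_device() = device;
}
}  // namespace fake_cuda

// Hypothetical RAII guard: makes `device` current on construction and
// restores the previously current device on destruction.
class DeviceGuard {
public:
    explicit DeviceGuard(int device) : previous_(fake_cuda::get_device()) {
        if (device != previous_) fake_cuda::set_device(device);
    }
    ~DeviceGuard() { fake_cuda::set_device(previous_); }

    // Non-copyable: two guards restoring the same state would be a bug.
    DeviceGuard(const DeviceGuard &) = delete;
    DeviceGuard &operator=(const DeviceGuard &) = delete;

private:
    int previous_;
};

// Returns the device that is current at the point where a copy would be issued.
int device_used_for_copy(int target_device) {
    DeviceGuard guard(target_device);
    // Real code would call cudaMemcpyAsync here, while target_device is current.
    return fake_cuda::get_device();
}
```

With a guard like this, a copy targeting GPU 1 is issued while GPU 1 is current, and the caller's device (GPU 0 by default) is current again once the function returns.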
      
      * finish pre-commit
  2. 01 May, 2025 1 commit
  3. 01 Apr, 2025 1 commit
  4. 07 Mar, 2025 1 commit
  5. 27 Feb, 2025 1 commit
  6. 20 Feb, 2025 1 commit
  7. 23 Jan, 2025 1 commit
  8. 08 Nov, 2024 1 commit