Commits · 61ec4f1aa215ca3381e7b79e98f002dc0c021d77 · tianlh / LightGBM-DCU

22 Sep, 2025 1 commit

[ROCm] re-add support for ROCm builds · 61ec4f1a

Jeff Daily authored Sep 22, 2025

Previously #6086 added ROCm support but after numerous rebases it lost
critical changes. This PR restores the ROCm build.

There are many source file changes but most were automated using the
following:

```bash
for f in `grep -rl '#ifdef USE_CUDA'`
do
    sed -i 's@#ifdef USE_CUDA@#if defined(USE_CUDA) || defined(USE_ROCM)@g' $f
done

for f in `grep -rl '#endif  // USE_CUDA'`
do
    sed -i 's@#endif  // USE_CUDA@#endif  // USE_CUDA || USE_ROCM@g' $f
done
```
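To make the effect of the two substitutions concrete, here is a minimal sketch of what a rewritten guard looks like in a C++ source file. `GpuBackendName` is a hypothetical helper invented for illustration and is not taken from the LightGBM sources.

```cpp
#include <string>

// Before the rewrite this block was guarded by `#ifdef USE_CUDA`, so it was
// compiled out of ROCm builds. After the rewrite either macro enables it.
#if defined(USE_CUDA) || defined(USE_ROCM)

// Hypothetical helper: reports which GPU backend the build targets.
inline std::string GpuBackendName() {
#ifdef USE_ROCM
  return "rocm";
#else
  return "cuda";
#endif
}

#else

// Fallback when neither GPU macro is defined.
inline std::string GpuBackendName() { return "cpu"; }

#endif  // USE_CUDA || USE_ROCM
```

The second `sed` pattern is a prefix of its own replacement, so running that loop a second time would presumably append another `|| USE_ROCM` to comments that were already rewritten.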

24 Jul, 2025 1 commit

[ROCm] add support for ROCm/HIP device (#6086) · a0fde1b0

Jeff Daily authored Jul 23, 2025



* [ROCm] add support for ROCm/HIP

- CMakeLists.txt ROCm updates, also replace glob with explicit file list
- initial warpSize interop changes
- helpers/hipify.sh script added
- .gitignore to ignore generated hip source files

* more rocm updates

- disable compiler warnings
- move PercentileDevice __device__ template function into header
- bug fixes for __host__ __device__ and __HIP__ preprocessor symbols

* more bug fixes

* warp 32 vs 64 updates

* lint fixes

* missing device_index variable

* accidental inclusion of hip headers

* copyright notice compliance

* Update CMakeLists.txt
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* fix lint issue

* clean up

* Update CMakeLists.txt
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update CMakeLists.txt
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* clean up CMakeLists.txt

use WARPSIZE

* use WARPSIZE

* fix share buffer size

---------
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Yu Shi <yushi2@microsoft.com>
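The "warp 32 vs 64" and "fix share buffer size" items relate to a hardware difference: NVIDIA warps have 32 lanes, while AMD wavefronts have 64 lanes on GCN/CDNA hardware. Any buffer sized per warp therefore changes between backends. The following is a rough host-side sketch of that arithmetic. `kWarpSize`, `WarpsPerBlock` and `ReduceBufferBytes` are hypothetical names, and selecting 64 whenever `USE_ROCM` is defined is an assumption that may not match how the commit defines `WARPSIZE`.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 64 lanes per wavefront under ROCm, 32 lanes per warp under CUDA.
#ifdef USE_ROCM
constexpr std::size_t kWarpSize = 64;
#else
constexpr std::size_t kWarpSize = 32;
#endif

// Number of warps needed to cover `block_size` threads, rounded up.
inline std::size_t WarpsPerBlock(std::size_t block_size,
                                 std::size_t warp_size = kWarpSize) {
  assert(warp_size > 0);
  return (block_size + warp_size - 1) / warp_size;
}

// Bytes of scratch space needed to hold one partial result per warp,
// as used by a typical two-stage block reduction.
inline std::size_t ReduceBufferBytes(std::size_t block_size,
                                     std::size_t elem_bytes,
                                     std::size_t warp_size = kWarpSize) {
  return WarpsPerBlock(block_size, warp_size) * elem_bytes;
}
```

With a 1024-thread block, a 32-lane device needs 32 per-warp slots while a 64-lane device needs only 16, which is why a hard-coded 32 in buffer sizes or shuffle loops is unsafe once both backends are supported.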

08 Oct, 2023 1 commit

[CUDA] CUDA Quantized Training (fixes #5606) (#5933) · f901f471

shiyu1994 authored Oct 08, 2023

* add quantized training (first stage)

* add histogram construction functions for integer gradients

* add stochastic rounding

* update docs

* fix compilation errors by adding template instantiations

* update files for compilation

* fix compilation of gpu version

* initialize gradient discretizer before share states

* add a test case for quantized training

* add quantized training for data distributed training

* Delete origin.pred

* Delete ifelse.pred

* Delete LightGBM_model.txt

* remove useless changes

* fix lint error

* remove debug loggings

* fix mismatch of vector and allocator types

* remove changes in main.cpp

* fix bugs with uninitialized gradient discretizer

* initialize ordered gradients in gradient discretizer

* disable quantized training with gpu and cuda

fix msvc compilation errors and warnings

* fix bug in data parallel tree learner

* make quantized training test deterministic

* make quantized training in test case more accurate

* refactor test_quantized_training

* fix leaf splits initialization with quantized training

* check distributed quantized training result

* add cuda gradient discretizer

* add quantized training for CUDA version in tree learner

* remove CUDA compute capability 6.1 and 6.2

* fix parts of gpu quantized training errors and warnings

* fix build-python.sh to install locally built version

* fix memory access bugs

* fix lint errors

* mark CUDA quantized training with categorical features as unsupported

* rename cuda_utils.h to cuda_utils.hu

* enable quantized training with cuda

* fix cuda quantized training with sparse row data

* allow using global memory buffer in histogram construction with cuda quantized training

* recover build-python.sh

enlarge allowed package size to 100M
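The "add stochastic rounding" step is central to quantized training: gradients are mapped to small integers so that the rounding error is zero in expectation. Below is a minimal sketch of the general technique, not of LightGBM's gradient discretizer itself. `StochasticRound`, its `scale` parameter and the 8-bit output range are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

// Rounds x / scale to one of its two neighbouring integers. The upper
// neighbour is chosen with probability equal to the fractional part, so the
// expected result equals x / scale (before clamping).
inline std::int8_t StochasticRound(double x, double scale, std::mt19937* rng) {
  assert(scale > 0.0);
  const double scaled = x / scale;
  const double lower = std::floor(scaled);
  const double frac = scaled - lower;  // in [0, 1)
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  double rounded = lower + (uniform(*rng) < frac ? 1.0 : 0.0);
  // Clamp to the assumed 8-bit range to avoid overflow on conversion.
  rounded = std::max(-127.0, std::min(127.0, rounded));
  return static_cast<std::int8_t>(rounded);
}
```

Values that are already integer multiples of `scale` round deterministically, and averaging many roundings of the same input converges on the unrounded value.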

16 Jun, 2023 1 commit

[CUDA] Add more CUDA Regression Metrics (#5924) · 07e3cf47

Xuweijia-buaa authored Jun 16, 2023

* add l1 metric for cuda_exp

* add huber/fair metric for cuda_exp

* add poisson/mape/gamma/gamma_deviance/tweedie metrics for cuda_exp

* fix cpplint error

* fix return error
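For reference, the pointwise losses behind the first few metrics named in this commit can be sketched on the host as follows. These are the usual textbook forms; the function names and the `alpha` and `c` parameters are illustrative, and the CUDA implementations added by the commit may differ in detail.

```cpp
#include <cassert>
#include <cmath>

// L1 (absolute error) loss for one prediction.
inline double L1Loss(double score, double label) {
  return std::fabs(score - label);
}

// Huber loss: quadratic within `alpha` of the label, linear beyond it.
inline double HuberLoss(double score, double label, double alpha) {
  assert(alpha > 0.0);
  const double diff = std::fabs(score - label);
  if (diff <= alpha) {
    return 0.5 * diff * diff;
  }
  return alpha * (diff - 0.5 * alpha);
}

// Fair loss: c * |d| - c^2 * log(1 + |d| / c).
inline double FairLoss(double score, double label, double c) {
  assert(c > 0.0);
  const double x = std::fabs(score - label);
  return c * x - c * c * std::log1p(x / c);
}
```

A metric value is then the (optionally weighted) mean of one of these losses over all rows.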

16 Mar, 2023 1 commit

[CUDA] Add quantile metric for new CUDA version (contribute to #5163) (#5665) · 54486b4f

shiyu1994 authored Mar 16, 2023

Co-authored-by: James Lamb <jaylamb20@gmail.com>

01 Feb, 2023 1 commit

[CUDA] consolidate CUDA versions (#5677) · 4f47547c

James Lamb authored Jan 31, 2023



* [ci] speed up if-else, swig, and lint conda setup

* add 'source activate'

* python constraint

* start removing cuda v1

* comment out CI

* remove more references

* revert some unnecessary changes

* revert a few more mistakes

* revert another change that ignored params

* sigh

* remove CUDATreeLearner

* fix tests, docs

* fix quoting in setup.py

* restore all CI

* Apply suggestions from code review
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Apply suggestions from code review

* completely remove cuda_exp, update docs

---------
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

29 Dec, 2022 1 commit

[CUDA] Add binary logloss metric for new CUDA version (#5635) · 73531662

shiyu1994 authored Dec 29, 2022

27 Dec, 2022 1 commit

[CUDA] Add L2 metric for new CUDA version (#5633) · 6482b47e

shiyu1994 authored Dec 27, 2022

* add rmse metric for new cuda version

* add Init for CUDAMetricInterface

* fix lint errors

* fix rmse and add l2 metric for new cuda version

* use CUDAL2Metric

* explicit template instantiation

* write result only with the first thread

* pre allocate buffer for output converting

* fix l2 regression with cuda metric evaluation

* weighting loss in cuda metric evaluation

* mark CUDATree::AsConstantTree as override
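The "weighting loss" and "fix rmse and add l2 metric" items come down to the same reduction: a weighted mean of squared errors, with RMSE as its square root. Here is a host-side sketch of that computation, assuming per-row weights. `WeightedL2` and `WeightedRmse` are hypothetical names and do not reproduce the `CUDAL2Metric` code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted mean squared error: sum(w_i * (s_i - y_i)^2) / sum(w_i).
inline double WeightedL2(const std::vector<double>& scores,
                         const std::vector<double>& labels,
                         const std::vector<double>& weights) {
  assert(scores.size() == labels.size());
  assert(scores.size() == weights.size());
  double loss_sum = 0.0;
  double weight_sum = 0.0;
  for (std::size_t i = 0; i < scores.size(); ++i) {
    const double diff = scores[i] - labels[i];
    loss_sum += weights[i] * diff * diff;
    weight_sum += weights[i];
  }
  assert(weight_sum > 0.0);
  return loss_sum / weight_sum;
}

// RMSE is the square root of the weighted L2 value.
inline double WeightedRmse(const std::vector<double>& scores,
                           const std::vector<double>& labels,
                           const std::vector<double>& weights) {
  return std::sqrt(WeightedL2(scores, labels, weights));
}
```

On a GPU the two sums are accumulated by a parallel reduction, and only one thread writes the final value, which is presumably what the "write result only with the first thread" item addresses.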

02 Dec, 2022 1 commit

[CUDA] Add rmse metric for new CUDA version (#5611) · f0cfbff6

shiyu1994 authored Dec 02, 2022

* add rmse metric for new cuda version

* add Init for CUDAMetricInterface

* fix lint errors
