Commits · 20996c92acfac13f4157937310a97229d09a654c · tianlh / LightGBM-DCU

23 Sep, 2025 1 commit

partial revert of · 20996c92

Jeff Daily authored Sep 23, 2025

Instead of replacing every `#ifdef USE_CUDA`, just add the USE_CUDA define to the ROCm build.

20996c92
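
A minimal sketch of the idea in this partial revert, assuming the ROCm build sets a `USE_ROCM` flag. Here the flag is written in source only so the snippet is self-contained; a real build would presumably pass both defines from the build system. `kGpuPathEnabled` is a hypothetical name used for illustration.

```cpp
// Stand-in for the flag a ROCm build would pass on the compiler command line
// (assumption: normally set by the build system, not written in source).
#define USE_ROCM 1

// The approach described in the commit: a ROCm build also defines USE_CUDA ...
#if defined(USE_ROCM) && !defined(USE_CUDA)
#define USE_CUDA 1
#endif

// ... so an existing guard like this one compiles unchanged.
#ifdef USE_CUDA
constexpr bool kGpuPathEnabled = true;   // hypothetical name
#else
constexpr bool kGpuPathEnabled = false;
#endif  // USE_CUDA
```

With this arrangement the guards in the source files need no edits; only the build configuration knows about ROCm.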

22 Sep, 2025 1 commit

[ROCm] re-add support for ROCm builds · 61ec4f1a

Jeff Daily authored Sep 22, 2025

Previously #6086 added ROCm support but after numerous rebases it lost
critical changes. This PR restores the ROCm build.

There are many source file changes but most were automated using the
following:

```bash
for f in `grep -rl '#ifdef USE_CUDA'`
do
    sed -i 's@#ifdef USE_CUDA@#if defined(USE_CUDA) || defined(USE_ROCM)@g' $f
done

for f in `grep -rl '#endif  // USE_CUDA'`
do
    sed -i 's@#endif  // USE_CUDA@#endif  // USE_CUDA || USE_ROCM@g' $f
done
```

61ec4f1a
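
The effect of the two `sed` substitutions on a single file's contents can be sketched in C++ as follows. This is only an illustration of the textual rewrite, not part of the commit; `ReplaceAll` and `RewriteGuards` are hypothetical helper names.

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of `from` in `text` with `to`
// (the same effect as sed's s@from@to@g applied to each line).
std::string ReplaceAll(std::string text, const std::string& from,
                       const std::string& to) {
  assert(!from.empty());  // an empty pattern would never advance
  std::size_t pos = 0;
  while ((pos = text.find(from, pos)) != std::string::npos) {
    text.replace(pos, from.size(), to);
    pos += to.size();  // continue after the inserted text
  }
  return text;
}

// Apply both guard rewrites from the commit message to one file's contents.
std::string RewriteGuards(const std::string& source) {
  const std::string opened = ReplaceAll(
      source, "#ifdef USE_CUDA", "#if defined(USE_CUDA) || defined(USE_ROCM)");
  return ReplaceAll(opened, "#endif  // USE_CUDA",
                    "#endif  // USE_CUDA || USE_ROCM");
}
```

Note that the second substitution is not idempotent: running it again on already rewritten text would append `|| USE_ROCM` a second time.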

02 Oct, 2024 1 commit

[c++] Add Bagging by Query for Lambdarank (#6623) · d1d218c3

shiyu1994 authored Oct 03, 2024



* add bagging by query for lambdarank

* fix pre-commit

* fix bagging by query with cuda

* fix bagging by query test case

* fix bagging by query test case

* fix bagging by query test case

* add #include <vector>

* Update include/LightGBM/objective_function.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_engine.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_engine.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

---------
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

d1d218c3
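
The general idea of bagging by query can be sketched as below: whole queries are kept or dropped, so the rows of a ranking group are never split across the bag. This is a simplified illustration, not the code from the commit; the function name, the Bernoulli sampling scheme, and the boundary layout are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// query_boundaries has num_queries + 1 entries; the rows of query q are
// [query_boundaries[q], query_boundaries[q + 1]).
// Returns the row indices of the queries that were kept.
std::vector<int32_t> BagByQuery(const std::vector<int32_t>& query_boundaries,
                                double bagging_fraction, uint32_t seed) {
  assert(!query_boundaries.empty());
  assert(bagging_fraction > 0.0 && bagging_fraction <= 1.0);
  std::mt19937 rng(seed);
  std::bernoulli_distribution keep(bagging_fraction);
  std::vector<int32_t> selected_rows;
  for (std::size_t q = 0; q + 1 < query_boundaries.size(); ++q) {
    if (!keep(rng)) continue;  // drop the whole query, never single rows
    for (int32_t row = query_boundaries[q]; row < query_boundaries[q + 1];
         ++row) {
      selected_rows.push_back(row);
    }
  }
  return selected_rows;
}
```

Sampling at the query level keeps each pairwise comparison set intact, which matters for a ranking objective such as lambdarank where gradients are computed within a query.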

08 Oct, 2023 1 commit

[CUDA] CUDA Quantized Training (fixes #5606) (#5933) · f901f471

shiyu1994 authored Oct 08, 2023

* add quantized training (first stage)

* add histogram construction functions for integer gradients

* add stochastic rounding

* update docs

* fix compilation errors by adding template instantiations

* update files for compilation

* fix compilation of gpu version

* initialize gradient discretizer before share states

* add a test case for quantized training

* add quantized training for data distributed training

* Delete origin.pred

* Delete ifelse.pred

* Delete LightGBM_model.txt

* remove useless changes

* fix lint error

* remove debug loggings

* fix mismatch of vector and allocator types

* remove changes in main.cpp

* fix bugs with uninitialized gradient discretizer

* initialize ordered gradients in gradient discretizer

* disable quantized training with gpu and cuda

* fix msvc compilation errors and warnings

* fix bug in data parallel tree learner

* make quantized training test deterministic

* make quantized training in test case more accurate

* refactor test_quantized_training

* fix leaf splits initialization with quantized training

* check distributed quantized training result

* add cuda gradient discretizer

* add quantized training for CUDA version in tree learner

* remove cuda compute capability 6.1 and 6.2

* fix parts of gpu quantized training errors and warnings

* fix build-python.sh to install locally built version

* fix memory access bugs

* fix lint errors

* mark cuda quantized training on cuda with categorical features as unsupported

* rename cuda_utils.h to cuda_utils.hu

* enable quantized training with cuda

* fix cuda quantized training with sparse row data

* allow using global memory buffer in histogram construction with cuda quantized training

* recover build-python.sh

* enlarge allowed package size to 100M

f901f471
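
The stochastic rounding step mentioned in this commit can be illustrated with a small sketch. It shows the general technique of quantizing gradients to a few integer levels with unbiased rounding; the scaling rule, the type names, and the function names are assumptions for illustration and are not taken from the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

// Round x to floor(x) or floor(x) + 1. The probability of rounding up equals
// the fractional part of x, so the expected result equals x.
inline int32_t StochasticRound(double x, std::mt19937* rng) {
  const double lower = std::floor(x);
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  const bool round_up = unit(*rng) < (x - lower);
  return static_cast<int32_t>(lower) + (round_up ? 1 : 0);
}

struct QuantizedGradients {
  std::vector<int8_t> values;  // integers in [-num_levels / 2, num_levels / 2]
  double scale;                // gradient is approximately value * scale
};

// Hypothetical quantizer: scale by the largest absolute gradient, then round
// each scaled gradient stochastically.
QuantizedGradients QuantizeGradients(const std::vector<double>& gradients,
                                     int num_levels, uint32_t seed) {
  assert(num_levels >= 2 && num_levels <= 254);
  double max_abs = 0.0;
  for (double g : gradients) max_abs = std::max(max_abs, std::fabs(g));
  const int32_t half = num_levels / 2;
  QuantizedGradients out;
  out.scale = max_abs > 0.0 ? max_abs / half : 1.0;
  out.values.reserve(gradients.size());
  std::mt19937 rng(seed);
  for (double g : gradients) {
    int32_t q = StochasticRound(g / out.scale, &rng);
    q = std::min(std::max(q, -half), half);  // guard against float error
    out.values.push_back(static_cast<int8_t>(q));
  }
  return out;
}
```

Because the rounding is unbiased, sums of quantized gradients (as accumulated in histograms) stay close to the sums of the original gradients in expectation, which deterministic rounding would not guarantee.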

13 Aug, 2023 1 commit

[CUDA] Set GPU device ID in threads (#6028) · 5c9e61d1

shiyu1994 authored Aug 13, 2023



* set gpu device id in open mp threads

* move SetCUDADevice outside for loop

---------
Co-authored-by: James Lamb <jaylamb20@gmail.com>

5c9e61d1

01 Feb, 2023 1 commit

[CUDA] consolidate CUDA versions (#5677) · 4f47547c

James Lamb authored Jan 31, 2023



* [ci] speed up if-else, swig, and lint conda setup

* add 'source activate'

* python constraint

* start removing cuda v1

* comment out CI

* remove more references

* revert some unnecessary changes

* revert a few more mistakes

* revert another change that ignored params

* sigh

* remove CUDATreeLearner

* fix tests, docs

* fix quoting in setup.py

* restore all CI

* Apply suggestions from code review
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Apply suggestions from code review

* completely remove cuda_exp, update docs

---------
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

4f47547c

02 Dec, 2022 1 commit

[CUDA] Add rmse metric for new CUDA version (#5611) · f0cfbff6

shiyu1994 authored Dec 02, 2022

* add rmse metric for new cuda version

* add Init for CUDAMetricInterface

* fix lint errors

f0cfbff6

27 Nov, 2022 1 commit

[CUDA] Add Poisson regression objective for cuda_exp and refactor objective... · 24af9fa5

shiyu1994 authored Nov 27, 2022


[CUDA] Add Poisson regression objective for cuda_exp and refactor objective functions for cuda_exp (#5486)

* add poisson regression objective for cuda_exp

* enable Poisson regression for cuda_exp

* refactor cuda objective functions

* remove useless changes

* fix linter errors

* remove redundant buffer in cuda poisson regression objective

* fix log of cuda_exp binary objective

* fix threshold of poisson objective result

* remove useless changes

* fix compilation errors

* add cuda quantile regression objective

* remove cuda quantile regression objective
Co-authored-by: James Lamb <jaylamb20@gmail.com>

24af9fa5
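
For reference, the per-row quantities a Poisson regression objective has to produce can be sketched on the CPU as below. This is the textbook form with a log link (loss `exp(score) - label * score`), not the CUDA kernel from the commit; the struct and function names are hypothetical, and any extra safeguards the library applies to the hessian are omitted.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct PoissonGradHess {
  std::vector<double> grad;
  std::vector<double> hess;
};

// Gradient and hessian of exp(score) - label * score with respect to score.
PoissonGradHess ComputePoissonGradHess(const std::vector<double>& scores,
                                       const std::vector<double>& labels) {
  assert(scores.size() == labels.size());
  PoissonGradHess out;
  out.grad.reserve(scores.size());
  out.hess.reserve(scores.size());
  for (std::size_t i = 0; i < scores.size(); ++i) {
    const double mu = std::exp(scores[i]);  // predicted mean under a log link
    out.grad.push_back(mu - labels[i]);
    out.hess.push_back(mu);
  }
  return out;
}
```

Each row is independent, which is why this kind of objective maps naturally onto one GPU thread per data point.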

05 Sep, 2022 1 commit

Fix CUDA `#ifndef` guards (#5466) · c9a3b479

Nikita Titov authored Sep 05, 2022

* Update cuda_column_data.hpp

* Update cuda_metadata.hpp

* Update cuda_objective_function.hpp

* Update cuda_row_data.hpp

* Update cuda_regression_objective.hpp

c9a3b479

31 Aug, 2022 1 commit

[CUDA] Add binary objective for cuda_exp (#5425) · 2b8fe8b4

shiyu1994 authored Aug 31, 2022

* add binary objective for cuda_exp

* include <string> and <vector>

* exchange include ordering

* fix length of score to copy in evaluation

* fix EvalOneMetric

* fix cuda binary objective and prediction when boosting on gpu

* Add white space

* fix BoostFromScore for CUDABinaryLogloss

* update log in test_register_logger

* include <algorithm>

* simplify shared memory buffer

2b8fe8b4
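
A minimal CPU sketch of what a binary log-loss objective computes, including the initial score that a `BoostFromScore`-style step would return. It assumes labels in {0, 1} and a sigmoid scaling of 1; the helper names are hypothetical and the CUDA implementation in the commit may differ in label encoding and scaling.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

inline double Sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

struct BinaryGradHess {
  double grad;
  double hess;
};

// Gradient and hessian of the log loss with respect to the raw score.
inline BinaryGradHess BinaryLoglossGradHess(double score, int label) {
  assert(label == 0 || label == 1);
  const double p = Sigmoid(score);
  return {p - label, p * (1.0 - p)};
}

// Initial raw score: the log-odds of the positive rate, so that the first
// prediction already matches the average label.
double InitScoreFromLabels(const std::vector<int>& labels) {
  assert(!labels.empty());
  double positives = 0.0;
  for (int y : labels) positives += (y == 1) ? 1.0 : 0.0;
  const double eps = 1e-15;  // keep the log finite for one-class data
  const double p =
      std::min(std::max(positives / labels.size(), eps), 1.0 - eps);
  return std::log(p / (1.0 - p));
}
```

Starting from the log-odds rather than zero usually saves the first few boosting iterations from merely learning the class prior.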