- 22 Sep, 2025 1 commit
-
-
Jeff Daily authored
Previously #6086 added ROCm support but after numerous rebases it lost critical changes. This PR restores the ROCm build. There are many source file changes but most were automated using the following: ```bash for f in `grep -rl '#ifdef USE_CUDA'` do sed -i 's@#ifdef USE_CUDA@#if defined(USE_CUDA) || defined(USE_ROCM)@g' $f done for f in `grep -rl '#endif // USE_CUDA'` do sed -i 's@#endif // USE_CUDA@#endif // USE_CUDA || USE_ROCM@g' $f done ```
-
- 24 Jul, 2025 1 commit
-
-
Jeff Daily authored
* [ROCm] add support for ROCm/HIP - CMakeLists.txt ROCm updates, also replace glob with explicit file list - initial warpSize interop changes - helpers/hipify.sh script added - .gitignore to ignore generated hip source files * more rocm updates - disable compiler warnings - move PercentileDevice __device__ template function into header - bug fixes for __host__ __define__ and __HIP__ preprocessor symbols * more bug fixes * warp 32 vs 64 updates * lint fixes * missing device_index variable * accidental inclusion of hip headers * copyright notice compliance * Update CMakeLists.txt Co-authored-by:
James Lamb <jaylamb20@gmail.com> * fix lint issue * clean up * Update CMakeLists.txt Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update CMakeLists.txt Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * clean up CMakeLists.txt use WARPSIZE * use WARPSIZE * fix share buffer size --------- Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com> Co-authored-by:
James Lamb <jaylamb20@gmail.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Yu Shi <yushi2@microsoft.com>
-
- 05 Dec, 2024 1 commit
-
-
James Lamb authored
-
- 01 Dec, 2024 1 commit
-
-
Oliver Borchert authored
Co-authored-by:Nikita Titov <nekit94-08@mail.ru>
-
- 13 Oct, 2024 1 commit
-
-
Atanas Dimitrov authored
Co-authored-by:
Atanas Dimitrov <nasko119@abv.bg> Co-authored-by:
James Lamb <jaylamb20@gmail.com>
-
- 02 Oct, 2024 1 commit
-
-
shiyu1994 authored
* add bagging by query for lambdarank * fix pre-commit * fix bagging by query with cuda * fix bagging by query test case * fix bagging by query test case * fix bagging by query test case * add #include <vector> * Update include/LightGBM/objective_function.h Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update tests/python_package_test/test_engine.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update tests/python_package_test/test_engine.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> --------- Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 08 Oct, 2023 1 commit
-
-
shiyu1994 authored
* add quantized training (first stage) * add histogram construction functions for integer gradients * add stochastic rounding * update docs * fix compilation errors by adding template instantiations * update files for compilation * fix compilation of gpu version * initialize gradient discretizer before share states * add a test case for quantized training * add quantized training for data distributed training * Delete origin.pred * Delete ifelse.pred * Delete LightGBM_model.txt * remove useless changes * fix lint error * remove debug loggings * fix mismatch of vector and allocator types * remove changes in main.cpp * fix bugs with uninitialized gradient discretizer * initialize ordered gradients in gradient discretizer * disable quantized training with gpu and cuda fix msvc compilation errors and warnings * fix bug in data parallel tree learner * make quantized training test deterministic * make quantized training in test case more accurate * refactor test_quantized_training * fix leaf splits initialization with quantized training * check distributed quantized training result * add cuda gradient discretizer * add quantized training for CUDA version in tree learner * remove cuda computability 6.1 and 6.2 * fix parts of gpu quantized training errors and warnings * fix build-python.sh to install locally built version * fix memory access bugs * fix lint errors * mark cuda quantized training on cuda with categorical features as unsupported * rename cuda_utils.h to cuda_utils.hu * enable quantized training with cuda * fix cuda quantized training with sparse row data * allow using global memory buffer in histogram construction with cuda quantized training * recover build-python.sh enlarge allowed package size to 100M
-
- 13 Aug, 2023 1 commit
-
-
shiyu1994 authored
* set gpu device id in open mp threads * move SetCUDADevice outside for loop --------- Co-authored-by:James Lamb <jaylamb20@gmail.com>
-
- 16 Jun, 2023 1 commit
-
-
Xuweijia-buaa authored
* add l1 metric for cuda_exp * add huber/fair metric for cuda_exp * add poisson/mape/gamma/gamma_deviance/tweedie metrics for cuda_exp * fix cpplint error * fix return error
-
- 21 Mar, 2023 1 commit
-
-
shiyu1994 authored
* add cuda quantile regression objective * remove white space * resolve merge conflicts * remove useless changes * remove useless changes * enable cuda quantile regression objective * add a test case for quantile regression objective * remove useless changes * remove useless changes * reduce DP_SHARED_HIST_SIZE to 5176 for CUDA 10 --------- Co-authored-by:James Lamb <jaylamb20@gmail.com>
-
- 01 Feb, 2023 1 commit
-
-
James Lamb authored
* [ci] speed up if-else, swig, and lint conda setup * add 'source activate' * python constraint * start removing cuda v1 * comment out CI * remove more references * revert some unnecessaary changes * revert a few more mistakes * revert another change that ignored params * sigh * remove CUDATreeLearner * fix tests, docs * fix quoting in setup.py * restore all CI * Apply suggestions from code review Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com> * Apply suggestions from code review * completely remove cuda_exp, update docs --------- Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com>
-
- 28 Dec, 2022 1 commit
-
-
Yifei Liu authored
* add parameter data_sample_strategy * abstract GOSS as a sample strategy(GOSS1), togetherwith origial GOSS (Normal Bagging has not been abstracted, so do NOT use it now) * abstract Bagging as a subclass (BAGGING), but original Bagging members in GBDT are still kept * fix some variables * remove GOSS(as boost) and Bagging logic in GBDT * rename GOSS1 to GOSS(as sample strategy) * add warning about use GOSS as boosting_type * a little ; bug * remove CHECK when "gradients != nullptr" * rename DataSampleStrategy to avoid confusion * remove and add some ccomments, followingconvention * fix bug about GBDT::ResetConfig (ObjectiveFunction inconsistencty bet… * add std::ignore to avoid compiler warnings (anpotential fails) * update Makevars and vcxproj * handle constant hessian move resize of gradient vectors out of sample strategy * mark override for IsHessianChange * fix lint errors * rerun parameter_generator.py * update config_auto.cpp * delete redundant blank line * update num_data_ when train_data_ is updated set gradients and hessians when GOSS * check bagging_freq is not zero * reset config_ value merge ResetBaggingConfig and ResetGOSS * remove useless check * add ttests in test_engine.py * remove whitespace in blank line * remove arguments verbose_eval and evals_result * Update tests/python_package_test/test_engine.py reduce num_boost_round Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Update tests/python_package_test/test_engine.py reduce num_boost_round Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Update tests/python_package_test/test_engine.py reduce num_boost_round Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Update tests/python_package_test/test_engine.py reduce num_boost_round Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Update tests/python_package_test/test_engine.py reduce num_boost_round Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Update tests/python_package_test/test_engine.py reduce num_boost_round Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Update src/boosting/sample_strategy.cpp modify warning about setting goss as `boosting_type` Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Update tests/python_package_test/test_engine.py replace load_boston() with make_regression() remove value checks of mean_squared_error in test_sample_strategy_with_boosting() * Update tests/python_package_test/test_engine.py add value checks of mean_squared_error in test_sample_strategy_with_boosting() * Modify warnning about using goss as boosting type * Update tests/python_package_test/test_engine.py add random_state=42 for make_regression() reduce the threshold of mean_square_error * Update src/boosting/sample_strategy.cpp Co-authored-by:
James Lamb <jaylamb20@gmail.com> * remove goss from boosting types in documentation * Update src/boosting/bagging.hpp Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update src/boosting/bagging.hpp Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update src/boosting/goss.hpp Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update src/boosting/goss.hpp Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * rename GOSS with GOSSStrategy * update doc * address comments * fix table in doc * Update include/LightGBM/config.h Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * update documentation * update test case * revert useless change in test_engine.py * add tests for evaluation results in test_sample_strategy_with_boosting * include <string> * change to assert_allclose in test_goss_boosting_and_strategy_equivalent * more tolerance in result checking, due to minor difference in results of gpu versions * change == to np.testing.assert_allclose * fix test case * set gpu_use_dp to true * change --report to --report-level for rstcheck * use gpu_use_dp=true in test_goss_boosting_and_strategy_equivalent * revert unexpected changes of non-ascii characters * revert unexpected changes of non-ascii characters * remove useless changes * allocate gradients_pointer_ and hessians_pointer when necessary * add spaces * remove redundant virtual * include <LightGBM/utils/log.h> for USE_CUDA * check for in test_goss_boosting_and_strategy_equivalent * check for identity in test_sample_strategy_with_boosting * remove cuda option in test_sample_strategy_with_boosting * Update tests/python_package_test/test_engine.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update tests/python_package_test/test_engine.py Co-authored-by:
James Lamb <jaylamb20@gmail.com> * ResetGradientBuffers after ResetSampleConfig * ResetGradientBuffers after ResetSampleConfig * ResetGradientBuffers after bagging * remove useless code * check objective_function_ instead of gradients * enable rf with goss simplify params in test cases * remove useless changes * allow rf with feature subsampling alone * change position of ResetGradientBuffers * check for dask * add parameter types for data_sample_strategy Co-authored-by:
Guangda Liu <v-guangdaliu@microsoft.com> Co-authored-by:
Yu Shi <shiyu_k1994@qq.com> Co-authored-by:
GuangdaLiu <90019144+GuangdaLiu@users.noreply.github.com> Co-authored-by:
James Lamb <jaylamb20@gmail.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 27 Dec, 2022 1 commit
-
-
shiyu1994 authored
* add rmse metric for new cuda version * add Init for CUDAMetricInterface * fix lint errors * fix rmse and add l2 metric for new cuda version * use CUDAL2Metric * explicit template instantiation * write result only with the first thread * pre allocate buffer for output converting * fix l2 regression with cuda metric evaluation * weighting loss in cuda metric evaluation * mark CUDATree::AsConstantTree as override
-
- 02 Dec, 2022 1 commit
-
-
shiyu1994 authored
* add rmse metric for new cuda version * add Init for CUDAMetricInterface * fix lint errors
-
- 27 Nov, 2022 1 commit
-
-
shiyu1994 authored
[CUDA] Add Poisson regression objective for cuda_exp and refactor objective functions for cuda_exp (#5486) * add poisson regression objective for cuda_exp * enable Poisson regression for cuda_exp * refactor cuda objective functions * remove useless changes * fix linter errors * remove redundant buffer in cuda poisson regression objective * fix log of cuda_exp binary objective * fix threshold of poisson objective result * remove useless changes * fix compilation errors * add cuda quantile regression objective * remove cuda quantile regression objective Co-authored-by:James Lamb <jaylamb20@gmail.com>
-
- 07 Sep, 2022 1 commit
-
-
shiyu1994 authored
* add feature interaction constraint for cuda_exp * test feature interaction constraints for cuda_exp * remove useless check * update comment
-
- 05 Sep, 2022 2 commits
-
-
shiyu1994 authored
* add lambdarank for cuda_exp * support unlimited number of ranks in labels * fix lint errors * remove warning for lambdarank with cuda_exp * Update src/objective/cuda/cuda_rank_objective.hpp Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update src/objective/cuda/cuda_rank_objective.hpp Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
Nikita Titov authored
* Update cuda_column_data.hpp * Update cuda_metadata.hpp * Update cuda_objective_function.hpp * Update cuda_row_data.hpp * Update cuda_regression_objective.hpp
-
- 01 Sep, 2022 1 commit
-
-
shiyu1994 authored
* add (l1) regression objective for cuda_exp * remove RenewTreeOutputCUDA from CUDARegressionL2loss * remove mutable and use CUDAVector * remove white spaces * remove TODO and document in (#5459)
-
- 31 Aug, 2022 2 commits
-
-
shiyu1994 authored
* add (l2) regression objective for cuda_exp * fix lint errors * correct time tag
-
shiyu1994 authored
* add binary objective for cuda_exp * include <string> and <vector> * exchange include ordering * fix length of score to copy in evaluation * fix EvalOneMetric * fix cuda binary objective and prediction when boosting on gpu * Add white space * fix BoostFromScore for CUDABinaryLogloss update log in test_register_logger * include <algorithm> * simplify shared memory buffer
-
- 29 Aug, 2022 1 commit
-
-
shiyu1994 authored
* fix cuda_exp ci * fix ci failures introduced by #5279 * cleanup cuda.yml * fix test.sh * clean up test.sh * clean up test.sh * skip lines by cuda_exp in test_register_logger * Update tests/python_package_test/test_utilities.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 29 Jul, 2022 1 commit
-
-
shiyu1994 authored
* initial work for boosting and evaluation with CUDA * fix compatibility with CPU code * fix creating objective without USE_CUDA_EXP * fix static analysis errors * fix static analysis errors
-
- 23 Mar, 2022 1 commit
-
-
shiyu1994 authored
* new cuda framework * add histogram construction kernel * before removing multi-gpu * new cuda framework * tree learner cuda kernels * single tree framework ready * single tree training framework * remove comments * boosting with cuda * optimize for best split find * data split * move boosting into cuda * parallel synchronize best split point * merge split data kernels * before code refactor * use tasks instead of features as units for split finding * refactor cuda best split finder * fix configuration error with small leaves in data split * skip histogram construction of too small leaf * skip split finding of invalid leaves stop when no leaf to split * support row wise with CUDA * copy data for split by column * copy data from host to CPU by column for data partition * add synchronize best splits for one leaf from multiple blocks * partition dense row data * fix sync best split from task blocks * add support for sparse row wise for CUDA * remove useless code * add l2 regression objective * sparse multi value bin enabled for CUDA * fix cuda ranking objective * support for number of items <= 2048 per query * speedup histogram construction by interleaving global memory access * split optimization * add cuda tree predictor * remove comma * refactor objective and score updater * before use struct * use structure for split information * use structure for leaf splits * return CUDASplitInfo directly after finding best split * split with CUDATree directly * use cuda row data in cuda histogram constructor * clean src/treelearner/cuda * gather shared cuda device functions * put shared CUDA functions into header file * change smaller leaf from <= back to < for consistent result with CPU * add tree predictor * remove useless cuda_tree_predictor * predict on CUDA with pipeline * add global sort algorithms * add global argsort for queries with many items in ranking tasks * remove limitation of maximum number of items per query in ranking * add cuda metrics * fix CUDA AUC * remove debug code * add regression metrics * remove useless file * don't use mask in shuffle reduce * add more regression objectives * fix cuda mape loss add cuda xentropy loss * use template for different versions of BitonicArgSortDevice * add multiclass metrics * add ndcg metric * fix cross entropy objectives and metrics * fix cross entropy and ndcg metrics * add support for customized objective in CUDA * complete multiclass ova for CUDA * separate cuda tree learner * use shuffle based prefix sum * clean up cuda_algorithms.hpp * add copy subset on CUDA * add bagging for CUDA * clean up code * copy gradients from host to device * support bagging without using subset * add support of bagging with subset for CUDAColumnData * add support of bagging with subset for dense CUDARowData * refactor copy sparse subrow * use copy subset for column subset * add reset train data and reset config for CUDA tree learner add deconstructors for cuda tree learner * add USE_CUDA ifdef to cuda tree learner files * check that dataset doesn't contain CUDA tree learner * remove printf debug information * use full new cuda tree learner only when using single GPU * disable all CUDA code when using CPU version * recover main.cpp * add cpp files for multi value bins * update LightGBM.vcxproj * update LightGBM.vcxproj fix lint errors * fix lint errors * fix lint errors * update Makevars fix lint errors * fix the case with 0 feature and 0 bin fix split finding for invalid leaves create cuda column data when loaded from bin file * fix lint errors hide GetRowWiseData when cuda is not used * recover default device type to cpu * fix na_as_missing case fix cuda feature meta information * fix UpdateDataIndexToLeafIndexKernel * create CUDA trees when needed in CUDADataPartition::UpdateTrainScore * add refit by tree for cuda tree learner * fix test_refit in test_engine.py * create set of large bin partitions in CUDARowData * add histogram construction for columns with a large number of bins * add find best split for categorical features on CUDA * add bitvectors for categorical split * cuda data partition split for categorical features * fix split tree with categorical feature * fix categorical feature splits * refactor cuda_data_partition.cu with multi-level templates * refactor CUDABestSplitFinder by grouping task information into struct * pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder * fix misuse of reference * remove useless changes * add support for path smoothing * virtual destructor for LightGBM::Tree * fix overlapped cat threshold in best split infos * reset histogram pointers in data partition and spllit finder in ResetConfig * comment useless parameter * fix reverse case when na is missing and default bin is zero * fix mfb_is_na and mfb_is_zero and is_single_feature_column * remove debug log * fix cat_l2 when one-hot fix gradient copy when data subset is used * switch shared histogram size according to CUDA version * gpu_use_dp=true when cuda test * revert modification in config.h * fix setting of gpu_use_dp=true in .ci/test.sh * fix linter errors * fix linter error remove useless change * recover main.cpp * separate cuda_exp and cuda * fix ci bash scripts add description for cuda_exp * add USE_CUDA_EXP flag * switch off USE_CUDA_EXP * revert changes in python-packages * more careful separation for USE_CUDA_EXP * fix CUDARowData::DivideCUDAFeatureGroups fix set fields for cuda metadata * revert config.h * fix test settings for cuda experimental version * skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version * fix lint issue by adding a blank line * fix lint errors by resorting imports * fix lint errors by resorting imports * fix lint errors by resorting imports * merge cuda.yml and cuda_exp.yml * update python version in cuda.yml * remove cuda_exp.yml * remove unrelated changes * fix compilation warnings fix cuda exp ci task name * recover task * use multi-level template in histogram construction check split only in debug mode * ignore NVCC related lines in parameter_generator.py * update job name for CUDA tests * apply review suggestions * Update .github/workflows/cuda.yml Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update .github/workflows/cuda.yml Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * update header * remove useless TODOs * remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062 * #include <LightGBM/utils/log.h> for USE_CUDA_EXP only * fix include order * fix include order * remove extra space * address review comments * add warning when cuda_exp is used together with deterministic * add comment about gpu_use_dp in .ci/test.sh * revert changing order of included headers Co-authored-by:
Yu Shi <shiyu1994@qq.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 11 Jan, 2021 1 commit
-
-
Chip Kerchner authored
-
- 26 Oct, 2020 1 commit
-
-
Pengfei Shi authored
-
- 20 Sep, 2020 1 commit
-
-
Chip Kerchner authored
* Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * Initial CUDA work * redirect log to python console (#3090) * redir log to python console * fix pylint * Apply suggestions from code review * Update basic.py * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update c_api.h * Apply suggestions from code review * Apply suggestions from code review * super-minor: better wording Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
StrikerRUS <nekit94-12@hotmail.com> * re-order includes (fixes #3132) (#3133) * Revert "re-order includes (fixes #3132) (#3133)" (#3153) This reverts commit 656d2676 . * Missing change from previous rebase * Minor cleanup and removal of development scripts. * Only set gpu_use_dp on by default for CUDA. Other minor change. * Fix python lint indentation problem. * More python lint issues. * Big lint cleanup - more to come. * Another large lint cleanup - more to come. * Even more lint cleanup. * Minor cleanup so less differences in code. * Revert is_use_subset changes * Another rebase from master to fix recent conflicts. * More lint. * Simple code cleanup - add & remove blank lines, revert unneccessary format changes, remove added dead code. * Removed parameters added for CUDA and various bug fix. * Yet more lint and unneccessary changes. * Revert another change. * Removal of unneccessary code. * temporary appveyor.yml for building and testing * Remove return value in ReSize * Removal of unused variables. * Code cleanup from reviewers suggestions. * Removal of FIXME comments and unused defines. * More reviewers comments cleanup. * More reviewers comments cleanup. * More reviewers comments cleanup. * Fix config variables. * Attempt to fix check-docs failure * Update Paramster.rst for num_gpu * Removing test appveyor.yml * Add CUDA_RESOLVE_DEVICE_SYMBOLS to libraries to fix linking issue. * Fixed handling of data elements less than 2K. * More reviewers comments cleanup. * Removal of TODO and fix printing of int64_t * Add cuda change for CI testing and remove cuda from device_type in python. * Missed one change form previous check-in * Removal AdditionConfig and fix settings. * Limit number of GPUs to one for now in CUDA. * Update Parameters.rst for previous check-in * Whitespace removal. * Cleanup unused code. * Changed uint/ushort/ulong to unsigned int/short/long to help Windows based CUDA compiler work. * Lint change from previous check-in. * Changes based on reviewers comments. * More reviewer comment changes. * Adding warning for is_sparse. Revert tmp_subset code. Only return FeatureGroupData if not is_multi_val_ * Fix so that CUDA code will compile even if you enable the SCORE_T_USE_DOUBLE define. * Reviewer comment cleanup. * Replace warning with Log message. Removal of some of the USE_CUDA. Fix typo and removal of pragma once. * Remove PRINT debug for CUDA code. * Allow to use of multiple GPUs for CUDA. * More multi-GPUs enablement for CUDA. * More code cleanup based on reviews comments. * Update docs with latest config changes. Co-authored-by:
Gordon Fossum <fossum@us.ibm.com> Co-authored-by:
ChipKerchner <ckerchne@linux.vnet.ibm.com> Co-authored-by:
Guolin Ke <guolin.ke@outlook.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
StrikerRUS <nekit94-12@hotmail.com> Co-authored-by:
James Lamb <jaylamb20@gmail.com>
-