    [CUDA] CUDA Quantized Training (fixes #5606) (#5933) · f901f471
    shiyu1994 authored
    * add quantized training (first stage)
    
    * add histogram construction functions for integer gradients
    
    * add stochastic rounding
    
    * update docs
    
    * fix compilation errors by adding template instantiations
    
    * update files for compilation
    
    * fix compilation of gpu version
    
    * initialize gradient discretizer before share states
    
    * add a test case for quantized training
    
    * add quantized training for data distributed training
    
    * Delete origin.pred
    
    * Delete ifelse.pred
    
    * Delete LightGBM_model.txt
    
    * remove useless changes
    
    * fix lint error
    
    * remove debug logging
    
    * fix mismatch of vector and allocator types
    
    * remove changes in main.cpp
    
    * fix bugs with uninitialized gradient discretizer
    
    * initialize ordered gradients in gradient discretizer
    
    * disable quantized training with gpu and cuda
    
    * fix msvc compilation errors and warnings
    
    * fix bug in data parallel tree learner
    
    * make quantized training test deterministic
    
    * make quantized training in test case more accurate
    
    * refactor test_quantized_training
    
    * fix leaf splits initialization with quantized training
    
    * check distributed quantized training result
    
    * add cuda gradient discretizer
    
    * add quantized training for CUDA version in tree learner
    
    * remove cuda compute capability 6.1 and 6.2
    
    * fix some gpu quantized training errors and warnings
    
    * fix build-python.sh to install locally built version
    
    * fix memory access bugs
    
    * fix lint errors
    
    * mark cuda quantized training with categorical features as unsupported
    
    * rename cuda_utils.h to cuda_utils.hu
    
    * enable quantized training with cuda
    
    * fix cuda quantized training with sparse row data
    
    * allow using global memory buffer in histogram construction with cuda quantized training
    
    * recover build-python.sh
    
    * enlarge allowed package size to 100M
cuda_best_split_finder.cu 100 KB