    Optimization of row-wise histogram construction (#3522) · 0655d67c
    shiyu1994 authored
    
    
    * store without offset in multi_val_dense_bin
    
    * fix offset bug
    
    * add comment for offset
    
    * add comment for bin type selection
    
    * faster operations for offset
    
    * keep most freq bin in histogram for multi val dense
    
    * use original feature iterators
    
    * consider 9 cases (3 x 3) for multi val bin construction
    
    * fix dense bin setting
    
    * fix bin data in multi val group
    
    * fix offset of the first feature histogram
    
    * use float hist buf
    
    * avx in histogram construction
    
    * use avx for hist construction without prefetch
    
    * vectorize bin extraction
    
    * use only 128 vec
    
    * use avx2
    
    * use vectorization for sparse row wise
    
    * add bit size for multi val dense bin
    
    * float with no vectorization
    
    * change multithreading strategy to dynamic
    
    * remove intrinsic header
    
    * fix dense multi val col copy
    
    * remove bit size
    
    * use large enough block size when the bin number is large
    
    * calc min block size by sparsity
    
    * rescale gradients
    
    * rollback gradients scaling
    
    * single precision histogram buffer as an option
    
    * add float hist buffer with thread buffer
    
    * fix setting zero in hist data
    
    * fix hist begin pointer in tree learners
    
    * remove debug logs
    
    * remove omp simd
    
    * update Makevars of R-package
    
    * fix feature group binary storing
    
    * two row wise for double hist buffer
    
    * add subfeature for two row wise
    
    * remove useless code and fix two row wise
    
    * refactor code
    
    * grouping the dense feature groups can get sparse multi val bin
    
    * clean format problems
    
    * one thread for two blocks in sep row wise
    
    * use ordered gradients for sep row wise
    
    * fix grad ptr
    
    * ordered grad with combined block for sep row wise
    
    * fix block threading
    
    * use the same min block size
    
    * rollback share min block size
    
    * remove logs
    
    * Update src/io/dataset.cpp
    Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
    
    * fix parameter description
    
    * remove sep_row_wise
    
    * remove check codes
    
    * add check for empty multi val bin
    
    * fix lint error
    
    * rollback changes in config.h
    
    * Apply suggestions from code review
    Co-authored-by: Ubuntu <shiyu@gbdt-04.ren3kv4wanvufliwrpy4k03lsf.xx.internal.cloudapp.net>
    Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
serial_tree_learner.cpp 34.3 KB