1. 16 Jul, 2020 1 commit
  2. 15 Jul, 2020 2 commits
    • Alberto Ferreira's avatar
      Feat/optimize single prediction (#2992) · fc79b366
      Alberto Ferreira authored
      * [performance] Add Fast methods to C API for SingleRow Predictions
      
       * Add methods to C API to make single-row predictions faster:
      
         - LGBM_BoosterPredictForMatSingleRowFastInit (setup)
         - LGBM_BoosterPredictForMatSingleRowFast (predict)
         - LGBM_FastConfigFree (cleanup setup outputs)
      
      * Code syle cleanup
      
      * Fix lint errors
      
      * [performance] Revert FastConfig improvement to pass data at init
      
      This reduces optimization by 5% / 30% with this branch but makes it so it can be used for higher level wrappers in MMLSpark.
      And outside it as well.
      
      * [performance] Introduce Fast variants for SingleRow predictors.
      
      Although this already provides performance gains by itself for any
      callers, two new functions were added to Java's SWIG interfaces to
      exploit that AND the GetPrimitiveArrayCritical data fetches.
      
      * [tests/profiling] Profile Fast predict methods
      
      Build with -DBUILD_PROFILING_TESTS=ON and copy the default
      model trained on the Higgs dataset from the benchmarks repo
      
       https://github.com/guolinke/boosting_tree_benchmarks.git
      
      
      
      to LightGBM repo root and run the lightgbm_profile_* binaries.
      
      The single instance used is the first row from that dataset.
      
      * Update comment on CMakeLists.
      
      * Fix doxygen-introduced issue (#threads)
      
      * Fix conflicts due to new RowFunctionFromCSR signature in master
      
      * Change FastConfig ncol to int32_t.
      
      * Removed profiling folder
      
      * fix doxygen typo include/LightGBM/c_api.h
      Co-authored-by: default avatarNikita Titov <nekit94-08@mail.ru>
      
      * fix doxygen typo include/LightGBM/c_api.h
      Co-authored-by: default avatarNikita Titov <nekit94-08@mail.ru>
      
      * fix doxygen typo include/LightGBM/c_api.h
      Co-authored-by: default avatarNikita Titov <nekit94-08@mail.ru>
      
      * Doxygen: change new docstrings to double back-quote
      Co-authored-by: default avataralberto.ferreira <alberto.ferreira@feedzai.com>
      Co-authored-by: default avatarNikita Titov <nekit94-08@mail.ru>
      fc79b366
    • Guolin Ke's avatar
      feature importance type in saved model file (#3220) · 87d46489
      Guolin Ke authored
      
      
      * feature importance type in saved model file
      
      * fix nullptr
      
      * fixed formatting
      
      * fix python/R
      
      * Update src/c_api.cpp
      
      * Apply suggestions from code review
      Co-authored-by: default avatarJames Lamb <jaylamb20@gmail.com>
      
      * fix c_api test
      
      * fix swig
      
      * minor docs improvements and added defines for importance types
      Co-authored-by: default avatarStrikerRUS <nekit94-12@hotmail.com>
      Co-authored-by: default avatarJames Lamb <jaylamb20@gmail.com>
      87d46489
  3. 28 Jun, 2020 1 commit
    • Ilya Matiach's avatar
      adding sparse support to TreeSHAP in lightgbm (#3000) · 9f367d11
      Ilya Matiach authored
      * adding sparse support to TreeSHAP in lightgbm
      
      * updating based on comments
      
      * updated based on comments, used fromiter instead of frombuffer
      
      * updated based on comments
      
      * fixed limits import order
      
      * fix sparse feature contribs to work with more than int32 max rows
      
      * really fixed int64 max error and build warnings
      
      * added sparse test with >int32 max rows
      
      * fixed python side reshape check on sparse data
      
      * updated based on latest comments
      
      * fixed comments
      
      * added CSC INT32_MAX validation to test, fixed comments
      9f367d11
  4. 11 Jun, 2020 1 commit
  5. 05 Jun, 2020 1 commit
  6. 01 Jun, 2020 1 commit
  7. 20 May, 2020 1 commit
  8. 12 Apr, 2020 1 commit
  9. 21 Mar, 2020 1 commit
  10. 20 Mar, 2020 1 commit
    • Alberto Ferreira's avatar
      Fix SWIG methods that return char** (#2850) · 91185c3a
      Alberto Ferreira authored
      
      
      * [swig] Fix SWIG methods that return char** with StringArray.
      
      + [new] Add StringArray class to manage and manipulate arrays of fixed-length strings:
      
        This class is now used to wrap any char** parameters, manage memory and
        manipulate the strings.
      
        Such class is defined at swig/StringArray.hpp and wrapped in StringArray.i.
      
      + [API+fix] Wrap LGBM_BoosterGetFeatureNames it resulted in segfault before:
      
        Added wrapper LGBM_BoosterGetFeatureNamesSWIG(BoosterHandle) that
        only receives the booster handle and figures how much memory to allocate
        for strings and returns a StringArray which can be easily converted to String[].
      
      + [API+safety] For consistency, LGBM_BoosterGetEvalNamesSWIG was wrapped as well:
      
        * Refactor to detect any kind of errors and removed all the parameters
          besides the BoosterHandle (much simpler API to use in Java).
        * No assumptions are made about the required string space necessary (128 before).
        * The amount of required string memory is computed internally
      
      + [safety] No possibility of undefined behaviour
      
        The two methods wrapped above now compute the necessary string storage space
        prior to allocation, as the low-level C API calls would crash the process
        irreversibly if they write more memory than which is passed to them.
      
      * Changes to C API and wrappers support char**
      
      To support the latest SWIG changes that enable proper char**
      return support that is safe, the C API was changed.
      
      The respecive wrappers in R and Python were changed too.
      
      * Cleanup indentation in new lightgbm_R.cpp code
      
      * Adress review code-style comments.
      
      * Update swig/StringArray.hpp
      Co-Authored-By: default avatarNikita Titov <nekit94-08@mail.ru>
      
      * Update python-package/lightgbm/basic.py
      Co-Authored-By: default avatarNikita Titov <nekit94-08@mail.ru>
      
      * Update src/lightgbm_R.cpp
      Co-Authored-By: default avatarNikita Titov <nekit94-08@mail.ru>
      Co-authored-by: default avataralberto.ferreira <alberto.ferreira@feedzai.com>
      Co-authored-by: default avatarNikita Titov <nekit94-08@mail.ru>
      91185c3a
  11. 17 Mar, 2020 1 commit
  12. 11 Mar, 2020 1 commit
  13. 05 Mar, 2020 1 commit
    • Guolin Ke's avatar
      speed up `FindBestThresholdFromHistogram` (#2867) · 77d92b7c
      Guolin Ke authored
      * speed up for const hessian
      
      * rename template
      
      * some refactorings
      
      * refine
      
      * refine
      
      * simplify codes
      
      * fix random in feature histogram
      
      * code refine
      
      * refine
      
      * try fix
      
      * make gcc happy
      
      * remove timer
      
      * rollback some changes
      
      * more templates
      
      * fix a bug
      
      * reduce the cost of timer
      
      * fix gpu
      
      * fix bug
      
      * fix gpu
      77d92b7c
  14. 04 Mar, 2020 2 commits
  15. 02 Mar, 2020 3 commits
  16. 25 Feb, 2020 1 commit
  17. 24 Feb, 2020 2 commits
  18. 22 Feb, 2020 1 commit
    • Guolin Ke's avatar
      some code refactoring (#2769) · 3e80df7e
      Guolin Ke authored
      * some refines
      
      * more omp refactoring
      
      * format define
      
      * fix merge bug
      
      * some fixes
      
      * fix some warnings
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * remove dup codes
      3e80df7e
  19. 20 Feb, 2020 2 commits
  20. 19 Feb, 2020 1 commit
    • Guolin Ke's avatar
      [python] [R-package] refine the parameters for Dataset (#2594) · 9f79e840
      Guolin Ke authored
      
      
      * reset
      
      * fix a bug
      
      * fix test
      
      * Update c_api.h
      
      * support to no filter features by min_data
      
      * add warning in reset config
      
      * refine warnings for override dataset's parameter
      
      * some cleans
      
      * clean code
      
      * clean code
      
      * refine C API function doxygen comments
      
      * refined new param description
      
      * refined doxygen comments for R API function
      
      * removed stuff related to int8
      
      * break long line in warning message
      
      * removed tests which results cannot be validated anymore
      
      * added test for warnings about unchangeable params
      
      * write parameter from dataset to booster
      
      * consider free_raw_data.
      
      * fix params
      
      * fix bug
      
      * implementing R
      
      * fix typo
      
      * filter params in R
      
      * fix R
      
      * not min_data
      
      * refined tests
      
      * fixed linting
      
      * refine
      
      * pilint
      
      * add docstring
      
      * fix docstring
      
      * R lint
      
      * updated description for C API function
      
      * use param aliases in Python
      
      * fixed typo
      
      * fixed typo
      
      * added more params to test
      
      * removed debug print
      
      * fix dataset construct place
      
      * fix merge bug
      
      * Update feature_histogram.hpp
      
      * add is_sparse back
      
      * remove unused parameters
      
      * fix lint
      
      * add data random seed
      
      * update
      
      * [R-package] centrallized Dataset parameter aliases and added tests on Dataset parameter updating (#2767)
      Co-authored-by: default avatarNikita Titov <nekit94-08@mail.ru>
      Co-authored-by: default avatarJames Lamb <jaylamb20@gmail.com>
      9f79e840
  21. 02 Feb, 2020 1 commit
    • Guolin Ke's avatar
      Support both row-wise and col-wise multi-threading (#2699) · 509c2e50
      Guolin Ke authored
      
      
      * commit
      
      * fix a bug
      
      * fix bug
      
      * reset to track changes
      
      * refine the auto choose logic
      
      * sort the time stats output
      
      * fix include
      
      * change  multi_val_bin_sparse_threshold
      
      * add cmake
      
      * add _mm_malloc and _mm_free for cross platform
      
      * fix cmake bug
      
      * timer for split
      
      * try to fix cmake
      
      * fix tests
      
      * refactor DataPartition::Split
      
      * fix test
      
      * typo
      
      * formating
      
      * Revert "formating"
      
      This reverts commit 5b8de4f7fb9d975ee23701d276a66d40ee6d4222.
      
      * add document
      
      * [R-package] Added tests on use of force_col_wise and force_row_wise in training (#2719)
      
      * naming
      
      * fix gpu code
      
      * Update include/LightGBM/bin.h
      Co-Authored-By: default avatarJames Lamb <jaylamb20@gmail.com>
      
      * Update src/treelearner/ocl/histogram16.cl
      
      * test: swap compilers for CI
      
      * fix omp
      
      * not avx2
      
      * no aligned for feature histogram
      
      * Revert "refactor DataPartition::Split"
      
      This reverts commit 256e6d9641ade966a1f54da1752e998a1149b6f8.
      
      * slightly refactor data partition
      
      * reduce the memory cost
      Co-authored-by: default avatarJames Lamb <jaylamb20@gmail.com>
      Co-authored-by: default avatarNikita Titov <nekit94-08@mail.ru>
      509c2e50
  22. 14 Jan, 2020 1 commit
    • Guolin Ke's avatar
      support most frequent bin (#2689) · c7e90393
      Guolin Ke authored
      * implement
      
      * fix warning
      
      * fix bug
      
      * fix a bug
      
      * remove unneed function
      
      * fix data push bug
      
      * fix valid data push
      
      * fix bug for missing_type=zero
      
      * refine split
      
      * renames
      
      * typo
      c7e90393
  23. 07 Jan, 2020 1 commit
  24. 05 Nov, 2019 1 commit
  25. 21 Oct, 2019 1 commit
  26. 03 Oct, 2019 1 commit
  27. 22 Sep, 2019 1 commit
  28. 30 Aug, 2019 1 commit
  29. 25 Jul, 2019 1 commit
  30. 13 Apr, 2019 2 commits
  31. 11 Apr, 2019 1 commit
  32. 26 Mar, 2019 1 commit
  33. 25 Mar, 2019 1 commit
    • mjmckp's avatar
      Add API method LGBM_BoosterPredictForMats (#2008) · 823fc03c
      mjmckp authored
      * Fix index out-of-range exception generated by BaggingHelper on small datasets.
      
      Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.
      
      * Update goss.hpp
      
      * Update goss.hpp
      
      * Add API method LGBM_BoosterPredictForMats which runs prediction on a data set given as of array of pointers to rows (as opposed to existing method LGBM_BoosterPredictForMat which requires data given as contiguous array)
      
      * Fix incorrect upstream merge
      
      * Add link to LightGBM.NET
      
      * Fix indenting to 2 spaces
      
      * Dummy edit to trigger CI
      
      * Dummy edit to trigger CI
      823fc03c