  1. 28 Dec, 2020 1 commit
    • small code and docs refactoring (#3681) · 5a460846
      Nikita Titov authored
      * small code and docs refactoring
      
      * Update CMakeLists.txt
      
      * Update .vsts-ci.yml
      
      * Update test.sh
      
      * continue
      
      * continue
      
      * revert stable sort for all-unique values
  2. 24 Dec, 2020 1 commit
    • Trees with linear models at leaves (#3299) · fcfd4132
      Belinda Trotta authored
      * Add Eigen library.
      
      * Working for simple test.
      
      * Apply changes to config params.
      
      * Handle nan data.
      
      * Update docs.
      
      * Add test.
      
      * Only load raw data if boosting=gbdt_linear
      
      * Remove unneeded code.
      
      * Minor updates.
      
      * Update to work with sk-learn interface.
      
      * Update to work with chunked datasets.
      
      * Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.
      
      * Save raw data in binary dataset file.
      
      * Update docs and fix parameter checking.
      
      * Fix dataset loading.
      
      * Add test for regularization.
      
      * Fix bugs when saving and loading tree.
      
      * Add test for load/save linear model.
      
      * Remove unneeded code.
      
      * Fix case where not enough leaf data for linear model.
      
      * Simplify code.
      
      * Speed up code.
      
      * Speed up code.
      
      * Simplify code.
      
      * Speed up code.
      
      * Fix bugs.
      
      * Working version.
      
      * Store feature data column-wise (not fully working yet).
      
      * Fix bugs.
      
      * Speed up.
      
      * Speed up.
      
      * Remove unneeded code.
      
      * Small speedup.
      
      * Speed up.
      
      * Minor updates.
      
      * Remove unneeded code.
      
      * Fix bug.
      
      * Fix bug.
      
      * Speed up.
      
      * Speed up.
      
      * Simplify code.
      
      * Remove unneeded code.
      
      * Fix bug, add more tests.
      
      * Fix bug and add test.
      
      * Only store numerical features
      
      * Fix bug and speed up using templates.
      
      * Speed up prediction.
      
      * Fix bug with regularisation
      
      * Visual studio files.
      
      * Working version
      
      * Only check nans if necessary
      
      * Store coeff matrix as an array.
      
      * Align cache lines
      
      * Align cache lines
      
      * Preallocate coefficient calculation matrices
      
      * Small speedups
      
      * Small speedup
      
      * Reverse cache alignment changes
      
      * Change to dynamic schedule
      
      * Update docs.
      
      * Refactor so that linear tree learner is not a separate class.
      
      * Add refit capability.
      
      * Speed up
      
      * Small speedups.
      
      * Speed up add prediction to score.
      
      * Fix bug
      
      * Fix bug and speed up.
      
      * Speed up dataload.
      
      * Speed up dataload
      
      * Use vectors instead of pointers
      
      * Fix bug
      
      * Add OMP exception handling.
      
      * Change return type of LGBM_BoosterGetLinear to bool
      
      * Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change
      
      * Remove unused internal_parent_ property of tree
      
      * Remove unused parameter to CreateTreeLearner
      
      * Remove reference to LinearTreeLearner
      
      * Minor style issues
      
      * Remove unneeded check
      
      * Reverse temporary testing change
      
      * Fix Visual Studio project files
      
      * Restore LightGBM.vcxproj.filters
      
      * Speed up
      
      * Speed up
      
      * Simplify code
      
      * Update docs
      
      * Simplify code
      
      * Initialise storage space for max num threads
      
      * Move Eigen to include directory and delete unused files
      
      * Remove old files.
      
      * Fix so it compiles with mingw
      
      * Fix gpu tree learner
      
      * Change AddPredictionToScore back to const
      
      * Fix python lint error
      
      * Fix C++ lint errors
      
      * Change eigen to a submodule
      
      * Update comment
      
      * Add the eigen folder
      
      * Try to fix build issues with eigen
      
      * Remove eigen files
      
      * Add eigen as submodule
      
      * Fix include paths
      
      * Exclude eigen files from Python linter
      
      * Ignore eigen folders for pydocstyle
      
      * Fix C++ linting errors
      
      * Fix docs
      
      * Fix docs
      
      * Exclude eigen directories from doxygen
      
      * Update manifest to include eigen
      
      * Update build_r to include eigen files
      
      * Fix compiler warnings
      
      * Store raw feature data as float
      
      * Use float for calculating linear coefficients
      
      * Remove eigen directory from GLOB
      
      * Don't compile linear model code when building R package
      
      * Fix doxygen issue
      
      * Fix lint issue
      
      * Fix lint issue
      
      * Remove unneeded code
      
      * Restore deleted lines
      
      * Restore deleted lines
      
      * Change return type of has_raw to bool
      
      * Update docs
      
      * Rename some variables and functions for readability
      
      * Make tree_learner parameter const in AddScore
      
      * Fix style issues
      
      * Pass vectors as const reference when setting tree properties
      
      * Make temporary storage of serial_tree_learner mutable so we can make the object's methods const
      
      * Remove get_raw_size, use num_numeric_features instead
      
      * Fix typo
      
      * Make contains_nan_ and any_nan_ properties immutable again
      
      * Remove data_has_nan_ property of tree
      
      * Remove temporary test code
      
      * Make linear_tree a dataset param
      
      * Fix lint error
      
      * Make LinearTreeLearner a separate class
      
      * Fix lint errors
      
      * Fix lint error
      
      * Add linear_tree_learner.o
      
      * Simulate omp_get_max_threads if openmp is not available
      
      * Update PushOneData to also store raw data.
      
      * Cast size to int
      
      * Fix bug in ReshapeRaw
      
      * Speed up code with multithreading
      
      * Use OMP_NUM_THREADS
      
      * Speed up with multithreading
      
      * Update to use ArrayToString
      
      * Fix tests
      
      * Fix test
      
      * Fix bug introduced in merge
      
      * Minor updates
      
      * Update docs
  3. 08 Dec, 2020 1 commit
    • Fix model locale issue and improve model R/W performance. (#3405) · 792c9303
      Alberto Ferreira authored
      * Fix LightGBM models locale sensitivity and improve R/W performance.
      
      When Java is used, the default C++ locale is broken. This is true for
      Java providers that use the C API or even Python models that require JEP.
      
      This patch solves that issue by making the model reads/writes insensitive
      to such settings.
      To achieve it, within the model read/write codebase:
       - C++ streams are imbued with the classic locale
       - Calls to functions that are dependent on the locale are replaced
       - The default locale is not changed!
      
      This approach means:
       - The user's locale is never tampered with, avoiding issues such as
          https://github.com/microsoft/LightGBM/issues/2979 with the previous
          approach https://github.com/microsoft/LightGBM/pull/2891
       - Datasets can still be read according to the user's locale
       - The model file has a single format independent of locale
      
      Changes:
       - Add CommonC namespace which provides faster locale-independent versions of Common's methods
       - Model code makes conversions through CommonC
       - Cleanup unused Common methods
       - Performance improvements. Use fast libraries for locale-agnostic conversion:
         - value->string: https://github.com/fmtlib/fmt
         - string->double: https://github.com/lemire/fast_double_parser (10x
            faster double parsing according to their benchmark)
      
      Bugfixes:
       - https://github.com/microsoft/LightGBM/issues/2500
       - https://github.com/microsoft/LightGBM/issues/2890
       - https://github.com/ninia/jep/issues/205 (as it is related to LGBM as well)
      
      * Align CommonC namespace
      
      * Add new external_libs/ to python setup
      
      * Try fast_double_parser fix #1
      
      Testing commit e09e5aad828bcb16bea7ed0ed8322e019112fdbe
      
      If it works it should fix more LGBM builds
      
      * CMake: Attempt to link fmt without explicit PUBLIC tag
      
      * Exclude external_libs from linting
      
      * Add external_libs to MANIFEST.in
      
      * Set dynamic linking option for fmt.
      
      * linting issues
      
      * Try to fix lint includes
      
      * Try to pass fPIC with static fmt lib
      
      * Try CMake P_I_C option with fmt library
      
      * [R-package] Add CMake support for R and CRAN
      
      * Cleanup CMakeLists
      
      * Try fmt hack to remove stdout
      
      * Switch to header-only mode
      
      * Add PRIVATE argument to target_link_libraries
      
      * use fmt in header-only mode
      
      * Remove CMakeLists comment
      
      * Change OpenMP to PUBLIC linking in Mac
      
      * Update fmt submodule to 7.1.2
      
      * Use fmt in header-only-mode
      
      * Remove fmt from CMakeLists.txt
      
      * Upgrade fast_double_parser to v0.2.0
      
      * Revert "Add PRIVATE argument to target_link_libraries"
      
      This reverts commit 3dd45dde7b92531b2530ab54522bb843c56227a7.
      
      * Address James Lamb's comments
      
      * Update R-package/.Rbuildignore
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Upgrade to fast_double_parser v0.3.0 - Solaris support
      
      * Use legacy code only in Solaris
      
      * Fix lint issues
      
      * Fix comment
      
      * Address StrikerRUS's comments (solaris ifdef).
      
      * Change header guards
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
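
The locale-independent read/write idea described in this commit message (imbue the C++ streams used for model I/O with the classic locale and leave the global locale alone) can be illustrated with a minimal sketch. The helper names `DoubleToStringClassic` and `StringToDoubleClassic` and the `CommaDecimal` facet are hypothetical and exist only for this illustration; they are not the `CommonC` functions, and the actual patch reportedly relies on fmt and fast_double_parser rather than string streams.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet used only to simulate a "broken" global locale in which
// the decimal separator is a comma (as in many European locales).
struct CommaDecimal : std::numpunct<char> {
  char do_decimal_point() const override { return ','; }
};

// Hypothetical helper (not LightGBM's CommonC API): formats a double with the
// classic "C" locale regardless of the current global locale.
std::string DoubleToStringClassic(double value) {
  std::ostringstream out;
  out.imbue(std::locale::classic());  // affects this stream only
  out.precision(17);                  // enough digits to round-trip a double
  out << value;
  return out.str();
}

// Hypothetical helper: parses a double that uses '.' as the decimal separator.
// Returns false if the text does not start with a number.
bool StringToDoubleClassic(const std::string& text, double* result) {
  std::istringstream in(text);
  in.imbue(std::locale::classic());  // the global locale is never modified
  in >> *result;
  return !in.fail();
}
```

Installing the simulated locale with `std::locale::global(std::locale(std::locale::classic(), new CommaDecimal))` changes what a default-constructed stream prints, while the two helpers above keep reading and writing `.` as the separator. That is the property the commit message asks for: a single model file format independent of the user's locale.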
  4. 09 Oct, 2020 1 commit
  5. 28 Jun, 2020 1 commit
    • adding sparse support to TreeSHAP in lightgbm (#3000) · 9f367d11
      Ilya Matiach authored
      * adding sparse support to TreeSHAP in lightgbm
      
      * updating based on comments
      
      * updated based on comments, used fromiter instead of frombuffer
      
      * updated based on comments
      
      * fixed limits import order
      
      * fix sparse feature contribs to work with more than int32 max rows
      
      * really fixed int64 max error and build warnings
      
      * added sparse test with >int32 max rows
      
      * fixed python side reshape check on sparse data
      
      * updated based on latest comments
      
      * fixed comments
      
      * added CSC INT32_MAX validation to test, fixed comments
  6. 23 Jun, 2020 1 commit
    • Interaction constraints (#3126) · bca2da97
      Belinda Trotta authored
      * Add interaction constraints functionality.
      
      * Minor fixes.
      
      * Minor fixes.
      
      * Change lambda to function.
      
      * Fix gpu bug, remove extra blank lines.
      
      * Fix gpu bug.
      
      * Fix style issues.
      
      * Try to fix segfault on macOS.
      
      * Fix bug.
      
      * Fix bug.
      
      * Fix bugs.
      
      * Change parameter format for R.
      
      * Fix R style issues.
      
      * Change string formatting code.
      
      * Change docs to say R package not supported.
      
      * Remove R functionality, moving to separate PR.
      
      * Keep track of branch features in tree object.
      
      * Only track branch features when feature interactions are enabled.
      
      * Fix lint error.
      
      * Update docs and simplify tests.
  7. 09 Jun, 2020 1 commit
  8. 05 Jun, 2020 1 commit
  9. 01 Jun, 2020 1 commit
  10. 13 Apr, 2020 1 commit
  11. 04 Apr, 2020 1 commit
  12. 02 Apr, 2020 1 commit
  13. 23 Mar, 2020 1 commit
    • Improving monotone constraints ("Fast" method; linked to #2305, #2717) (#2770) · a8c1e0a1
      CharlesAuguste authored
      
      
      * Add util functions.
      
      * Added monotone_constraints_method as a parameter.
      
      * Add the intermediate constraining method.
      
      * Updated tests.
      
      * Minor fixes.
      
      * Typo.
      
      * Linting.
      
      * Ran the parameter generator for the doc.
      
      * Removed usage of the FeatureMonotone function.
      
      * more fixes
      
      * Fix.
      
      * Remove duplicated code.
      
      * Add debug checks.
      
      * Typo.
      
      * Bug fix.
      
      * Disable the use of intermediate monotone constraints and feature sampling at the same time.
      
      * Added an alias for monotone constraining method.
      
      * Use the right variable to get the number of threads.
      
      * Fix DEBUG checks.
      
      * Add back check to determine if histogram is splittable.
      
      * Added forgotten override keywords.
      
      * Perform monotone constraint update only when necessary.
      
      * Small refactor of FastLeafConstraints.
      
      * Post rebase commit.
      
      * Small refactor.
      
      * Typo.
      
      * Added comment and slightly improved logic of monotone constraints.
      
      * Forgot a const.
      
      * Vectors that are to be modified need to be pointers.
      
      * Rename FastLeafConstraints to IntermediateLeafConstraints to match documentation.
      
      * Remove overload of GoUpToFindLeavesToUpdate.
      
      * Stop memory leaking.
      
      * Fix cpplint issues.
      
      * Fix checks.
      
      * Fix more cpplint issues.
      
      * Refactor config monotone constraints method.
      
      * Typos.
      
      * Remove useless empty lines.
      
      * Add new line to separate includes.
      
      * Replace unsigned ind by size_t.
      
      * Reduce number of trials in tests to decrease CI time.
      
      * Specify monotone constraints better in tests.
      
      * Removed outer loop in test of monotone constraints.
      
      * Added categorical features to the monotone constraints tests.
      
      * Add blank line.
      
      * Regenerate parameters automatically.
      
      * Speed up ShouldKeepGoingLeftRight.
      Co-authored-by: Charles Auguste <auguste@dubquantdev801.ire.susq.com>
      Co-authored-by: guolinke <guolin.ke@outlook.com>
  14. 22 Feb, 2020 1 commit
    • some code refactoring (#2769) · 3e80df7e
      Guolin Ke authored
      * some refines
      
      * more omp refactoring
      
      * format define
      
      * fix merge bug
      
      * some fixes
      
      * fix some warnings
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * remove dup codes
  15. 20 Feb, 2020 1 commit
    • Add capability to get possible max and min values for a model (#2737) · 18e7de4f
      Joan Fontanals authored
      
      
      * Add capability to get possible max and min values for a model
      
      * Change implementation to have return value in tree.cpp, change naming to upper and lower bound, move implementation to gbdt.cpp
      
      * Update include/LightGBM/c_api.h
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Change iteration to avoid potential overflow, add bindings to R and Python and a basic test
      
      * Adjust test values
      
      * Consider const correctness and multithreading protection
      
      * Update test values
      
      * Update test values
      
      * Add test to check that model is exactly the same in all platforms
      
      * Try to parse the model to get the expected values
      
      * Try to parse the model to get the expected values
      
      * Fix implementation, num_leaves can be lower than the leaf_value_ size
      
      * Do not check for num_leaves to be smaller than actual size and get back to test with hardcoded value
      
      * Change test order
      
      * Add gpu_use_dp option in test
      
      * Remove helper test method
      
      * Update src/c_api.cpp
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Update src/io/tree.cpp
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Update src/io/tree.cpp
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Update tests/python_package_test/test_basic.py
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Remove imports
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
  16. 14 Aug, 2019 1 commit
  17. 25 Jul, 2019 1 commit
  18. 24 Jul, 2019 1 commit
  19. 29 Apr, 2019 1 commit
  20. 13 Apr, 2019 1 commit
  21. 11 Apr, 2019 1 commit
  22. 01 Apr, 2019 1 commit
  23. 02 Feb, 2019 1 commit
  24. 20 May, 2018 1 commit
    • Refine config object (#1381) · dc699574
      Guolin Ke authored
      * [WIP] refine config
      
      * [wip] ready for the auto code generate
      
      * auto generate config codes
      
      * use with to open file
      
      * fix bug
      
      * fix pylint
      
      * fix bug
      
      * fix pylint
      
      * fix bugs.
      
      * tmp for failed test.
      
      * fix tests.
      
      * added nthreads alias
      
      * added new aliases from new config.h
      
      * fixed duplicated alias
      
      * refactored parameter_generator.py
      
      * added new aliases from config.h and removed remaining old names
      
      * fix bugs & some missing aliases
      
      * added aliases
      
      * add more descriptions.
      
      * add comment.
  25. 11 May, 2018 1 commit
  26. 24 Jan, 2018 1 commit
  27. 12 Dec, 2017 1 commit
  28. 26 Nov, 2017 1 commit
    • Speed up saving and loading model (#1083) · 8a5ec366
      Guolin Ke authored
      * remove protobuf
      
      * add version number
      
      * remove pmml script
      
      * use float for split gain
      
      * fix warnings
      
      * refine the read model logic of gbdt
      
      * fix compile error
      
      * improve decode speed
      
      * fix some bugs
      
      * fix double accuracy problem
      
      * fix bug
      
      * multi-thread save model
      
      * speed up save model to string
      
      * parallel save/load model
      
      * fix some warnings.
      
      * fix warnings.
      
      * fix a bug
      
      * remove debug output
      
      * fix doc
      
      * fix max_bin warning in tests.
      
      * fix max_bin warning
      
      * fix pylint
      
      * clean code for stringToArray
      
      * clean code for TToString
      
      * remove max_bin
      
      * replace "class" with typename
  29. 15 Nov, 2017 2 commits
  30. 16 Sep, 2017 1 commit
  31. 02 Sep, 2017 1 commit
  32. 29 Aug, 2017 2 commits
  33. 20 Aug, 2017 2 commits
  34. 30 Jul, 2017 1 commit
    • Better missing value handle (#747) · 00cb04a2
      Guolin Ke authored
      * finish the data loading part
      
      * allow prediction.
      
      * fix bug for decision type.
      
      * finish split finding part
      
      * fix bugs.
      
      * bug fixed. add a test.
      
      * fix pep8.
      
      * update documents.
      
      * fix test bugs.
      
      * fix a format
      
      * fix import error in python test.
      
      * disable missing handle in categorical features.
      
      * fix a bug.
      
      * add more tests.
      
      * fix pep8
      
      * fix bugs.
      
      * remove the missing handle code for categorical feature.
  35. 13 Jun, 2017 1 commit
    • [python] fix dump model with infinite threshold (#617) · f2c99ea4
      wxchan authored
      * avoid threshold inf
      
      * use __save_model_to_string for feature importance
      
      * Revert "use __save_model_to_string for feature importance"
      
      This reverts commit dca6a85fb3d89866eb56eb0c9ca103ada4d92b53.
  36. 06 Jun, 2017 1 commit
  37. 15 May, 2017 1 commit