1. 28 Feb, 2020 1 commit
  2. 22 Feb, 2020 1 commit
    • some code refactoring (#2769) · 3e80df7e
      Guolin Ke authored
      * some refines
      
      * more omp refactoring
      
      * format define
      
      * fix merge bug
      
      * some fixes
      
      * fix some warnings
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * remove dup codes
  3. 20 Feb, 2020 1 commit
  4. 19 Feb, 2020 1 commit
    • [python] [R-package] refine the parameters for Dataset (#2594) · 9f79e840
      Guolin Ke authored
      
      
      * reset
      
      * fix a bug
      
      * fix test
      
      * Update c_api.h
      
      * support not filtering features by min_data
      
      * add warning in reset config
      
      * refine warnings for override dataset's parameter
      
      * some cleans
      
      * clean code
      
      * clean code
      
      * refine C API function doxygen comments
      
      * refined new param description
      
      * refined doxygen comments for R API function
      
      * removed stuff related to int8
      
      * break long line in warning message
      
      * removed tests whose results can no longer be validated
      
      * added test for warnings about unchangeable params
      
      * write parameter from dataset to booster
      
      * consider free_raw_data.
      
      * fix params
      
      * fix bug
      
      * implementing R
      
      * fix typo
      
      * filter params in R
      
      * fix R
      
      * not min_data
      
      * refined tests
      
      * fixed linting
      
      * refine
      
      * pylint
      
      * add docstring
      
      * fix docstring
      
      * R lint
      
      * updated description for C API function
      
      * use param aliases in Python
      
      * fixed typo
      
      * fixed typo
      
      * added more params to test
      
      * removed debug print
      
      * fix dataset construct place
      
      * fix merge bug
      
      * Update feature_histogram.hpp
      
      * add is_sparse back
      
      * remove unused parameters
      
      * fix lint
      
      * add data random seed
      
      * update
      
      * [R-package] centralized Dataset parameter aliases and added tests on Dataset parameter updating (#2767)
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
  5. 02 Feb, 2020 1 commit
    • Support both row-wise and col-wise multi-threading (#2699) · 509c2e50
      Guolin Ke authored
      
      
      * commit
      
      * fix a bug
      
      * fix bug
      
      * reset to track changes
      
      * refine the auto choose logic
      
      * sort the time stats output
      
      * fix include
      
      * change multi_val_bin_sparse_threshold
      
      * add cmake
      
      * add _mm_malloc and _mm_free for cross platform
      
      * fix cmake bug
      
      * timer for split
      
      * try to fix cmake
      
      * fix tests
      
      * refactor DataPartition::Split
      
      * fix test
      
      * typo
      
      * formatting
      
      * Revert "formatting"
      
      This reverts commit 5b8de4f7fb9d975ee23701d276a66d40ee6d4222.
      
      * add document
      
      * [R-package] Added tests on use of force_col_wise and force_row_wise in training (#2719)
      
      * naming
      
      * fix gpu code
      
      * Update include/LightGBM/bin.h
      Co-Authored-By: James Lamb <jaylamb20@gmail.com>
      
      * Update src/treelearner/ocl/histogram16.cl
      
      * test: swap compilers for CI
      
      * fix omp
      
      * not avx2
      
      * no aligned for feature histogram
      
      * Revert "refactor DataPartition::Split"
      
      This reverts commit 256e6d9641ade966a1f54da1752e998a1149b6f8.
      
      * slightly refactor data partition
      
      * reduce the memory cost
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
  6. 14 Jan, 2020 1 commit
    • support most frequent bin (#2689) · c7e90393
      Guolin Ke authored
      * implement
      
      * fix warning
      
      * fix bug
      
      * fix a bug
      
      * remove unneeded function
      
      * fix data push bug
      
      * fix valid data push
      
      * fix bug for missing_type=zero
      
      * refine split
      
      * renames
      
      * typo
  7. 29 Nov, 2019 1 commit
  8. 01 Nov, 2019 1 commit
  9. 25 Oct, 2019 1 commit
  10. 15 Oct, 2019 1 commit
    • reduce the buffer when using high dimensional data in distributed mode. (#2485) · 40e56ca7
      Guolin Ke authored
      * reduce the buffer when using high dimensional data in distributed mode.
      
      * Update dataset_loader.cpp
      
      * refix
      
      * typo
      
      * fix number of bin accumulation.
      
      * avoid overflow
      
      * fix warning
      
      * efficient solution.
      
      * Update dataset.h
      
      * fix bin count output
      
      * fix warning
      
      * bug in dist number of feature check
      
      * fix possible edge case
      
      * Update dataset.cpp
      
      * possible bug fix
      
      * fix
  11. 03 Oct, 2019 1 commit
  12. 01 Oct, 2019 1 commit
  13. 28 Sep, 2019 1 commit
    • Predefined bin thresholds (#2325) · cc7a1e27
      Belinda Trotta authored
      * Fix bug where small values of max_bin cause crash.
      
      * Revert "Fix bug where small values of max_bin cause crash."
      
      This reverts commit fe5c8e2547057c1fa5750bcddd359dd7708fab4b.
      
      * Add functionality to force bin thresholds.
      
      * Fix style issues.
      
      * Use stable sort.
      
      * Minor style and doc fixes.
      
      * Add functionality to force bin thresholds.
      
      * Fix style issues.
      
      * Use stable sort.
      
      * Minor style and doc fixes.
      
      * Change binning behavior to be same as PR #2342.
      
      * Add functionality to force bin thresholds.
      
      * Fix style issues.
      
      * Use stable sort.
      
      * Minor style and doc fixes.
      
      * Add functionality to force bin thresholds.
      
      * Fix style issues.
      
      * Use stable sort.
      
      * Minor style and doc fixes.
      
      * Change binning behavior to be same as PR #2342.
      
      * Add functionality to force bin thresholds.
      
      * Fix style issues.
      
      * Minor style and doc fixes.
      
      * Add functionality to force bin thresholds.
      
      * Fix style issues.
      
      * Minor style and doc fixes.
      
      * Change binning behavior to be same as PR #2342.
      
      * Add functionality to force bin thresholds.
      
      * Fix style issues.
      
      * Use stable sort.
      
      * Minor style and doc fixes.
      
      * Add functionality to force bin thresholds.
      
      * Fix style issues.
      
      * Use stable sort.
      
      * Minor style and doc fixes.
      
      * Change binning behavior to be same as PR #2342.
      
      * Use different bin finding function for predefined bounds.
      
      * Fix style issues.
      
      * Minor refactoring, overload FindBinWithZeroAsOneBin.
      
      * Fix style issues.
      
      * Fix bug and add new test.
      
      * Add warning when using categorical features with forced bins.
      
      * Pass forced_upper_bounds by reference.
      
      * Pass container types by const reference.
      
      * Get categorical features using FeatureBinMapper.
      
      * Fix bug for small max_bin.
      
      * Move GetForcedBins to DatasetLoader.
      
      * Find forced bins in dataset_loader.
      
      * Minor fixes.
  14. 22 Sep, 2019 1 commit
  15. 08 Sep, 2019 1 commit
    • [python] Improved python tree plots (#2304) · f52be9be
      CharlesAuguste authored
      * Some basic changes to the plot of the trees to make them readable.
      
      * Squeezed the information in the nodes.
      
      * Added colouring when a dictionary mapping the features to the constraints is passed.
      
      * Fix spaces.
      
      * Added data percentage as an option in the nodes.
      
      * Squeezed the information in the leaves.
      
      * Important information is now in bold.
      
      * Added a legend for the color of monotone splits.
      
      * Changed "split_gain" to "gain" and "internal_value" to "value".
      
      * Squeezed leaves a bit more.
      
      * Changed description in the legend.
      
      * Revert "Squeezed leaves a bit more."
      
      This reverts commit dd8bf14a3ba604b0dfae3b7bb1c64b6784d15e03.
      
      * Increased the readability for the gain.
      
      * Tidied up the legend.
      
      * Added the data percentage in the leaves.
      
      * Added the monotone constraints to the dumped model.
      
      * Monotone constraints are now specified automatically when plotting trees.
      
      * Raise an exception instead of the bug that was here before.
      
      * Removed operators on the branches for a clearer design.
      
      * Small cleaning of the code.
      
      * Setting a monotone constraint on a categorical feature now returns an exception instead of doing nothing.
      
      * Fix bug when monotone constraints are empty.
      
      * Fix another bug when monotone constraints are empty.
      
      * Variable name change.
      
      * Added is / isn't on every edge of the trees.
      
      * Fix test "tree_create_digraph".
      
      * Add new test for plotting trees with monotone constraints.
      
      * Typo.
      
      * Update documentation of categorical features.
      
      * Typo.
      
      * Information in nodes more explicit.
      
      * Used regular strings instead of raw strings.
      
      * Small refactoring.
      
      * Some cleaning.
      
      * Added future statement.
      
      * Changed output for consistency.
      
      * Updated documentation.
      
      * Added comments for colors.
      
      * Changed text on edges for more clarity.
      
      * Small refactoring.
      
      * Modified text in leaves for consistency with nodes.
      
      * Updated default values and documentation for consistency.
      
      * Replaced CHECK with Log::Fatal for user-friendliness.
      
      * Updated tests.
      
      * Typo.
      
      * Simplify imports.
      
      * Swapped count and weight to improve readability of the leaves in the plotted trees.
      
      * Thresholds in bold.
      
      * Made information in nodes written in a specific order.
      
      * Added information to clarify legend.
      
      * Code cleaning.
  16. 25 Jul, 2019 1 commit
  17. 08 Jul, 2019 1 commit
    • Max bin by feature (#2190) · 291752de
      Belinda Trotta authored
      * Add parameter max_bin_by_feature.
      
      * Fix minor bug.
      
      * Fix minor bug.
      
      * Fix calculation of header size for writing binary file.
      
      * Fix style issues.
      
      * Fix python style issue.
      
      * Fix test and python style issue.
  18. 20 Jun, 2019 1 commit
  19. 13 Apr, 2019 1 commit
  20. 11 Apr, 2019 1 commit
  21. 26 Mar, 2019 1 commit
  22. 02 Feb, 2019 1 commit
  23. 31 Jan, 2019 1 commit
  24. 23 Jan, 2019 1 commit
  25. 18 Jan, 2019 1 commit
  26. 16 Jan, 2019 1 commit
    • When loading a binary file, take feature penalty and monotone constraints from... · 61527856
      remcob-gr authored
      When loading a binary file, take feature penalty and monotone constraints from config if given there. (#1881)
      
      * When loading a binary file, take feature penalty from config if given there.
      
      * When loading a binary file, take feature penalty from config if given there.
      
      * Fix crash when num_features != num_total_features and feature_contri is given.
      
      * Apply the same logic to monotone_types_.
      
      * Fix indentation
  27. 09 Oct, 2018 1 commit
    • average predictions for constant features (#1735) · c920e634
      Guolin Ke authored
      * average predictions for constant features
      
      * fix possible numerical issues in std::log.
      
      * fix pylint
      
      * fix bugs in c_api
      
      * fix styles
      
      * clean code for multi class
      
      * rewrite test
      
      * fix pylint
      
      * skip test_constant_features
      
      * refine test
      
      * fix tests
      
      * fix tests
      
      * update FAQ
      
      * fix test
      
      * Update FAQ.rst
  28. 16 Aug, 2018 1 commit
  29. 14 Jun, 2018 1 commit
  30. 20 May, 2018 1 commit
    • Refine config object (#1381) · dc699574
      Guolin Ke authored
      * [WIP] refine config
      
      * [wip] ready for the auto code generate
      
      * auto generate config codes
      
      * use with to open file
      
      * fix bug
      
      * fix pylint
      
      * fix bug
      
      * fix pylint
      
      * fix bugs.
      
      * tmp for failed test.
      
      * fix tests.
      
      * added nthreads alias
      
      * added new aliases from new config.h
      
      * fixed duplicated alias
      
      * refactored parameter_generator.py
      
      * added new aliases from config.h and removed remaining old names
      
      * fix bugs & some missing aliases
      
      * added aliases
      
      * add more descriptions.
      
      * add comment.
  31. 11 May, 2018 1 commit
  32. 18 Apr, 2018 1 commit
  33. 17 Apr, 2018 2 commits
  34. 27 Feb, 2018 1 commit
    • Experimental support for HDFS (#1243) · 7e186a57
      ebernhardson authored
      * Read and write datasets from HDFS.
      * Only enabled when cmake is run with -DUSE_HDFS:BOOL=TRUE
      * Introduces VirtualFile(Reader|Writer) to abstract VFS differences
  35. 25 Dec, 2017 1 commit
  36. 17 Dec, 2017 1 commit
  37. 15 Dec, 2017 1 commit
  38. 12 Dec, 2017 1 commit
  39. 12 Oct, 2017 1 commit