- 04 May, 2021 1 commit
-
-
Andrew Ziem authored
* Correct spelling. Most changes were in comments, and there were a few changes to literals for log output. There were no changes to variable names, function names, IDs, or functionality. * Clarify a phrase in a comment Co-authored-by: James Lamb <jaylamb20@gmail.com> * Clarify a phrase in a comment Co-authored-by: James Lamb <jaylamb20@gmail.com> * Clarify a phrase in a comment Co-authored-by: James Lamb <jaylamb20@gmail.com> * Correct spelling. Most are code comments, but one case is a literal in a logging message. There are a few grammar fixes too. Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
- 19 Feb, 2021 1 commit
-
-
James Lamb authored
* [docs] Change some 'parallel learning' references to 'distributed learning' * found a few more * one more reference
-
- 03 Feb, 2021 1 commit
-
-
Chen Yufei authored
* Add new task type: "save_binary". * Document for task "save_binary".
-
- 07 Jan, 2021 2 commits
-
-
shiyu1994 authored
-
Belinda Trotta authored
* Fix compiler warnings caused by implicit type conversion * Fix more warnings * Fix more warnings
-
- 28 Dec, 2020 1 commit
-
-
Nikita Titov authored
* small code and docs refactoring * Update CMakeLists.txt * Update .vsts-ci.yml * Update test.sh * continue * continue * revert stable sort for all-unique values
-
- 24 Dec, 2020 1 commit
-
-
Belinda Trotta authored
* Add Eigen library. * Working for simple test. * Apply changes to config params. * Handle nan data. * Update docs. * Add test. * Only load raw data if boosting=gbdt_linear * Remove unneeded code. * Minor updates. * Update to work with sk-learn interface. * Update to work with chunked datasets. * Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters. * Save raw data in binary dataset file. * Update docs and fix parameter checking. * Fix dataset loading. * Add test for regularization. * Fix bugs when saving and loading tree. * Add test for load/save linear model. * Remove unneeded code. * Fix case where not enough leaf data for linear model. * Simplify code. * Speed up code. * Speed up code. * Simplify code. * Speed up code. * Fix bugs. * Working version. * Store feature data column-wise (not fully working yet). * Fix bugs. * Speed up. * Speed up. * Remove unneeded code. * Small speedup. * Speed up. * Minor updates. * Remove unneeded code. * Fix bug. * Fix bug. * Speed up. * Speed up. * Simplify code. * Remove unneeded code. * Fix bug, add more tests. * Fix bug and add test. * Only store numerical features * Fix bug and speed up using templates. * Speed up prediction. * Fix bug with regularisation * Visual studio files. * Working version * Only check nans if necessary * Store coeff matrix as an array. * Align cache lines * Align cache lines * Preallocation coefficient calculation matrices * Small speedups * Small speedup * Reverse cache alignment changes * Change to dynamic schedule * Update docs. * Refactor so that linear tree learner is not a separate class. * Add refit capability. * Speed up * Small speedups. * Speed up add prediction to score. * Fix bug * Fix bug and speed up. * Speed up dataload. * Speed up dataload * Use vectors instead of pointers * Fix bug * Add OMP exception handling. 
* Change return type of LGBM_BoosterGetLinear to bool * Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change * Remove unused internal_parent_ property of tree * Remove unused parameter to CreateTreeLearner * Remove reference to LinearTreeLearner * Minor style issues * Remove unneeded check * Reverse temporary testing change * Fix Visual Studio project files * Restore LightGBM.vcxproj.filters * Speed up * Speed up * Simplify code * Update docs * Simplify code * Initialise storage space for max num threads * Move Eigen to include directory and delete unused files * Remove old files. * Fix so it compiles with mingw * Fix gpu tree learner * Change AddPredictionToScore back to const * Fix python lint error * Fix C++ lint errors * Change eigen to a submodule * Update comment * Add the eigen folder * Try to fix build issues with eigen * Remove eigen files * Add eigen as submodule * Fix include paths * Exclude eigen files from Python linter * Ignore eigen folders for pydocstyle * Fix C++ linting errors * Fix docs * Fix docs * Exclude eigen directories from doxygen * Update manifest to include eigen * Update build_r to include eigen files * Fix compiler warnings * Store raw feature data as float * Use float for calculating linear coefficients * Remove eigen directory from GLOB * Don't compile linear model code when building R package * Fix doxygen issue * Fix lint issue * Fix lint issue * Remove unneeded code * Restore deleted lines * Restore deleted lines * Change return type of has_raw to bool * Update docs * Rename some variables and functions for readability * Make tree_learner parameter const in AddScore * Fix style issues * Pass vectors as const reference when setting tree properties * Make temporary storage of serial_tree_learner mutable so we can make the object's methods const * Remove get_raw_size, use num_numeric_features instead * Fix typo * Make contains_nan_ and any_nan_ properties immutable again
* Remove data_has_nan_ property of tree * Remove temporary test code * Make linear_tree a dataset param * Fix lint error * Make LinearTreeLearner a separate class * Fix lint errors * Fix lint error * Add linear_tree_learner.o * Simulate omp_get_max_threads if openmp is not available * Update PushOneData to also store raw data. * Cast size to int * Fix bug in ReshapeRaw * Speed up code with multithreading * Use OMP_NUM_THREADS * Speed up with multithreading * Update to use ArrayToString * Fix tests * Fix test * Fix bug introduced in merge * Minor updates * Update docs
-
- 07 Dec, 2020 1 commit
-
-
Nikita Titov authored
-
- 05 Dec, 2020 1 commit
-
-
Chen Yufei authored
* Check max_bin, etc. match config when using binary. * Check max_bin_by_feature, bin_construct_sample_cnt matching config.
-
- 24 Nov, 2020 1 commit
-
-
shiyu1994 authored
Fix num_total_bin_ and bin_offsets_ of FeatureGroup if a dense multi val feature group with non zero most freq bin is the first feature group of the dataset.
-
- 06 Nov, 2020 1 commit
-
-
Guolin Ke authored
* better documentation for bin_construct_sample_cnt * add warnings Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
-
- 30 Sep, 2020 1 commit
-
-
Guolin Ke authored
* fix dataset binary file alignment * many fixes * fix warnings * fix bug * Update file_io.cpp * Update file_io.cpp * simplify code * Apply suggestions from code review * general * remove unneeded alignment * Update file_io.h * int32 to byte8 alignment * Apply suggestions from code review * Apply suggestions from code review
-
- 29 Jul, 2020 1 commit
-
-
Lucas David authored
* ~ Modified name of method DatasetLoader::CostructFromSampleData to DatasetLoader::ConstructFromSampleData. & Build passes for Debug, Debug_DLL, DLL and Release (not tested Debug_mpi and Release_mpi). * ~ Refactored indentations. Co-authored-by: Guolin Ke <guolin.ke@outlook.com> Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 05 Jun, 2020 1 commit
-
-
Nikita Titov authored
This reverts commit 656d2676.
-
- 01 Jun, 2020 1 commit
-
-
James Lamb authored
-
- 08 Apr, 2020 1 commit
-
-
Guolin Ke authored
* indent and constructor * fix more * fix long Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
-
- 06 Mar, 2020 1 commit
-
-
Nikita Titov authored
-
- 05 Mar, 2020 1 commit
-
-
Guolin Ke authored
* speed up for const hessian * rename template * some refactorings * refine * refine * simplify codes * fix random in feature histogram * code refine * refine * try fix * make gcc happy * remove timer * rollback some changes * more templates * fix a bug * reduce the cost of timer * fix gpu * fix bug * fix gpu
-
- 04 Mar, 2020 1 commit
-
-
Nikita Titov authored
* fixed cpplint errors * fixed more cpplint errors
-
- 02 Mar, 2020 1 commit
-
-
Nikita Titov authored
-
- 28 Feb, 2020 1 commit
-
-
Nikita Titov authored
-
- 22 Feb, 2020 1 commit
-
-
Guolin Ke authored
* some refines * more omp refactoring * format define * fix merge bug * some fixes * fix some warnings * Apply suggestions from code review * Apply suggestions from code review * remove dup codes
-
- 20 Feb, 2020 1 commit
-
-
Guolin Ke authored
* remove related cpp codes * removed more mentions of init_score_filename params Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 19 Feb, 2020 1 commit
-
-
Guolin Ke authored
* reset * fix a bug * fix test * Update c_api.h * support not filtering features by min_data * add warning in reset config * refine warnings for override dataset's parameter * some cleans * clean code * clean code * refine C API function doxygen comments * refined new param description * refined doxygen comments for R API function * removed stuff related to int8 * break long line in warning message * removed tests whose results cannot be validated anymore * added test for warnings about unchangeable params * write parameter from dataset to booster * consider free_raw_data. * fix params * fix bug * implementing R * fix typo * filter params in R * fix R * not min_data * refined tests * fixed linting * refine * pylint * add docstring * fix docstring * R lint * updated description for C API function * use param aliases in Python * fixed typo * fixed typo * added more params to test * removed debug print * fix dataset construct place * fix merge bug * Update feature_histogram.hpp * add is_sparse back * remove unused parameters * fix lint * add data random seed * update * [R-package] centralized Dataset parameter aliases and added tests on Dataset parameter updating (#2767) Co-authored-by: Nikita Titov <nekit94-08@mail.ru> Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
- 02 Feb, 2020 1 commit
-
-
Guolin Ke authored
* commit * fix a bug * fix bug * reset to track changes * refine the auto choose logic * sort the time stats output * fix include * change multi_val_bin_sparse_threshold * add cmake * add _mm_malloc and _mm_free for cross platform * fix cmake bug * timer for split * try to fix cmake * fix tests * refactor DataPartition::Split * fix test * typo * formating * Revert "formating" This reverts commit 5b8de4f7fb9d975ee23701d276a66d40ee6d4222. * add document * [R-package] Added tests on use of force_col_wise and force_row_wise in training (#2719) * naming * fix gpu code * Update include/LightGBM/bin.h Co-Authored-By: James Lamb <jaylamb20@gmail.com> * Update src/treelearner/ocl/histogram16.cl * test: swap compilers for CI * fix omp * not avx2 * no aligned for feature histogram * Revert "refactor DataPartition::Split" This reverts commit 256e6d9641ade966a1f54da1752e998a1149b6f8. * slightly refactor data partition * reduce the memory cost Co-authored-by: James Lamb <jaylamb20@gmail.com> Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 14 Jan, 2020 1 commit
-
-
Guolin Ke authored
* implement * fix warning * fix bug * fix a bug * remove unneeded function * fix data push bug * fix valid data push * fix bug for missing_type=zero * refine split * renames * typo
-
- 29 Nov, 2019 1 commit
-
-
ashok-ponnuswami-msft authored
-
- 01 Nov, 2019 1 commit
-
-
Guolin Ke authored
-
- 25 Oct, 2019 1 commit
-
-
Nikita Titov authored
-
- 15 Oct, 2019 1 commit
-
-
Guolin Ke authored
* reduce the buffer when using high dimensional data in distributed mode. * Update dataset_loader.cpp * refix * typo * fix number of bin accumulation. * avoid overflow * fix warning * efficient solution. * Update dataset.h * fix bin count output * fix warning * bug in dist number of feature check * fix possible edge case * Update dataset.cpp * possible bug fix * fix
-
- 03 Oct, 2019 1 commit
-
-
Guolin Ke authored
* check the shape for mat, csr and csc * guess from csr * support file checking * better error msg * grammar * clean code * code clean * check range for CSR * Update test_.py * Update test_.py * added tests
-
- 01 Oct, 2019 1 commit
-
-
Nikita Titov authored
-
- 28 Sep, 2019 1 commit
-
-
Belinda Trotta authored
* Fix bug where small values of max_bin cause crash. * Revert "Fix bug where small values of max_bin cause crash." This reverts commit fe5c8e2547057c1fa5750bcddd359dd7708fab4b. * Add functionality to force bin thresholds. * Fix style issues. * Use stable sort. * Minor style and doc fixes. * Add functionality to force bin thresholds. * Fix style issues. * Use stable sort. * Minor style and doc fixes. * Change binning behavior to be same as PR #2342. * Add functionality to force bin thresholds. * Fix style issues. * Use stable sort. * Minor style and doc fixes. * Add functionality to force bin thresholds. * Fix style issues. * Use stable sort. * Minor style and doc fixes. * Change binning behavior to be same as PR #2342. * Add functionality to force bin thresholds. * Fix style issues. * Minor style and doc fixes. * Add functionality to force bin thresholds. * Fix style issues. * Minor style and doc fixes. * Change binning behavior to be same as PR #2342. * Add functionality to force bin thresholds. * Fix style issues. * Use stable sort. * Minor style and doc fixes. * Add functionality to force bin thresholds. * Fix style issues. * Use stable sort. * Minor style and doc fixes. * Change binning behavior to be same as PR #2342. * Use different bin finding function for predefined bounds. * Fix style issues. * Minor refactoring, overload FindBinWithZeroAsOneBin. * Fix style issues. * Fix bug and add new test. * Add warning when using categorical features with forced bins. * Pass forced_upper_bounds by reference. * Pass container types by const reference. * Get categorical features using FeatureBinMapper. * Fix bug for small max_bin. * Move GetForcedBins to DatasetLoader. * Find forced bins in dataset_loader. * Minor fixes.
-
- 22 Sep, 2019 1 commit
-
-
Guolin Ke authored
* fix many cpp lint errors * indent * fix bug * fix more * fix gpu * more fixes
-
- 08 Sep, 2019 1 commit
-
-
CharlesAuguste authored
* Some basic changes to the plot of the trees to make them readable. * Squeezed the information in the nodes. * Added colouring when a dictionary mapping the features to the constraints is passed. * Fix spaces. * Added data percentage as an option in the nodes. * Squeezed the information in the leaves. * Important information is now in bold. * Added a legend for the color of monotone splits. * Changed "split_gain" to "gain" and "internal_value" to "value". * Sqeezed leaves a bit more. * Changed description in the legend. * Revert "Sqeezed leaves a bit more." This reverts commit dd8bf14a3ba604b0dfae3b7bb1c64b6784d15e03. * Increased the readability for the gain. * Tidied up the legend. * Added the data percentage in the leaves. * Added the monotone constraints to the dumped model. * Monotone constraints are now specified automatically when plotting trees. * Raise an exception instead of the bug that was here before. * Removed operators on the branches for a clearer design. * Small cleaning of the code. * Setting a monotone constraint on a categorical feature now returns an exception instead of doing nothing. * Fix bug when monotone constraints are empty. * Fix another bug when monotone constraints are empty. * Variable name change. * Added is / isn't on every edge of the trees. * Fix test "tree_create_digraph". * Add new test for plotting trees with monotone constraints. * Typo. * Update documentation of categorical features. * Typo. * Information in nodes more explicit. * Used regular strings instead of raw strings. * Small refactoring. * Some cleaning. * Added future statement. * Changed output for consistency. * Updated documentation. * Added comments for colors. * Changed text on edges for more clarity. * Small refactoring. * Modified text in leaves for consistency with nodes. * Updated default values and documentation for consistency. * Replaced CHECK with Log::Fatal for user-friendliness. * Updated tests. * Typo. * Simplify imports.
* Swapped count and weight to improve readability of the leaves in the plotted trees. * Thresholds in bold. * Made information in nodes written in a specific order. * Added information to clarify legend. * Code cleaning.
-
- 25 Jul, 2019 1 commit
-
-
Nikita Titov authored
-
- 08 Jul, 2019 1 commit
-
-
Belinda Trotta authored
* Add parameter max_bin_by_feature. * Fix minor bug. * Fix minor bug. * Fix calculation of header size for writing binary file. * Fix style issues. * Fix python style issue. * Fix test and python style issue.
-
- 20 Jun, 2019 1 commit
-
-
Guolin Ke authored
-
- 13 Apr, 2019 1 commit
-
-
Nikita Titov authored
-
- 11 Apr, 2019 1 commit
-
-
Nikita Titov authored
* added all necessary includes - fixed build/include_what_you_use error * fixed the order of includes (build/include_order)
-