- 04 May, 2024 1 commit
-
-
James Lamb authored
-
- 19 Mar, 2024 1 commit
-
-
James Lamb authored
-
- 10 Oct, 2023 1 commit
-
-
James Lamb authored
-
- 30 Jun, 2023 1 commit
-
-
maskedcoder1337 authored
-
- 29 May, 2022 1 commit
-
-
Nikita Titov authored
* Update tree.cpp
* Update common.h
* Update common.h
-
- 22 May, 2022 1 commit
-
-
James Lamb authored
-
- 16 Nov, 2021 1 commit
-
-
chjinche authored
* add customized parser support
* fix typo of parser_config_file description
* make delimiter a parameter of JoinedLines
-
- 18 Jun, 2021 1 commit
-
-
Chen Yufei authored
* Log a warning instead of a fatal error when float parsing underflows or overflows. For texts that resolve to infinity, underflow or overflow should be accepted.
* Remove outdated unit test.
* empty commit to trigger ci
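The behaviour described in this commit can be sketched as follows. This is only an illustration built on the standard `strtod`, not the code from the commit; the helper name `ParseDoubleLenient` and the warning text are assumptions made for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper (not LightGBM's actual code): parse a double with
// strtod and report a range error as a warning instead of aborting.
// Returns false only when no number could be parsed at all.
inline bool ParseDoubleLenient(const char* text, double* out) {
  assert(text != nullptr && out != nullptr);
  errno = 0;
  char* end = nullptr;
  const double value = std::strtod(text, &end);
  if (end == text) {
    return false;  // no characters consumed: not a number
  }
  if (errno == ERANGE) {
    // On overflow strtod returns +/-HUGE_VAL; on underflow it returns a
    // tiny or zero value. Either result is kept and only a warning is logged.
    std::fprintf(stderr, "[Warning] under/overflow while parsing \"%s\"\n", text);
  }
  *out = value;
  return true;
}
```

With this shape, a text such as `1e999` still yields a usable value (infinity) while the caller is informed through the log.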
-
- 07 May, 2021 1 commit
-
-
Chen Yufei authored
* New build option: USE_PRECISE_TEXT_PARSER. Use fast_double_parser for text file parsing. For each number, fall back to strtod in case of parse failure.
* Add benchmark for CSVParser with Atof and AtofPrecise.
* Fix lint complaint.
* Fix typo in open result error message.
* Revert "Fix lint complaint." This reverts commit 92ab0b6bce9f17d7be9eaeb20f19d4a0a36f0387.
* Revert "Add benchmark for CSVParser with Atof and AtofPrecise." This reverts commit 4f8639abd06c679d4382eb715a1793afd94df3d2.
* Use AtofPrecise in Common::__StringToTHelper.
* [option] precise_float_parser: precise float number parsing for text input.
* Remove USE_PRECISE_TEXT_PARSER compile option.
* test: add test for Common::AtofPrecise.
* test: remove ChunkedArrayTest with 0 length. This triggers Log::Fatal which aborts the test program.
* fix lint, add copyright.
* Revert "test: remove ChunkedArrayTest with 0 length." This reverts commit 346c76affe9e78b6ca2738c4a56dbb9c00f31102.
* Use LightGBM::Common::Sign
* save precise_float_parser in model file.
* Fix error checking in AtofPrecise. Add more test cases.
* Remove test case that can't pass under macOS.
* Apply suggestions from code review
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 16 Mar, 2021 1 commit
-
-
mjmckp authored
* Fix index out-of-range exception generated by BaggingHelper on small datasets. Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.
* Update goss.hpp
* Update goss.hpp
* Add API method LGBM_BoosterPredictForMats which runs prediction on a data set given as an array of pointers to rows (as opposed to the existing method LGBM_BoosterPredictForMat, which requires data given as a contiguous array)
* Fix incorrect upstream merge
* Add link to LightGBM.NET
* Fix indenting to 2 spaces
* Dummy edit to trigger CI
* Dummy edit to trigger CI
* remove duplicate functions from merge
* Fix parsing of non-finite values. Current implementation silently returns zero when input string is "inf", "-inf", or "nan" when compiled with VS2017, so instead just explicitly check for these values and fail if there is no match. No attempt to optimise string allocations in this implementation since it is rarely invoked.
* Dummy commit to trigger CI
* Also handle -nan in double parsing method
* Update include/LightGBM/utils/common.h Remove trailing whitespace to pass linting tests
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
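The explicit check for non-finite values mentioned in this commit might look roughly like the sketch below. The function name `ParseNonFinite` and the exact, case-sensitive matching are assumptions for illustration; the commit itself only states that "inf", "-inf", "nan" and "-nan" are checked explicitly and that anything else fails.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <string>

// Hypothetical helper: map the literal spellings of non-finite values to
// doubles without relying on the platform's strtod behaviour.
// Returns false when the text is not one of the recognised spellings.
inline bool ParseNonFinite(const std::string& text, double* out) {
  assert(out != nullptr);
  const double inf = std::numeric_limits<double>::infinity();
  const double nan = std::numeric_limits<double>::quiet_NaN();
  if (text == "inf") {
    *out = inf;
  } else if (text == "-inf") {
    *out = -inf;
  } else if (text == "nan" || text == "-nan") {
    *out = nan;  // the sign of NaN is not preserved in this sketch
  } else {
    return false;  // no match: let the caller report the failure
  }
  return true;
}
```

Comparing strings directly keeps the result independent of how a given compiler's runtime treats these spellings, at the cost of a few extra comparisons on a path that is rarely taken.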
-
- 08 Dec, 2020 1 commit
-
-
Alberto Ferreira authored
* Fix LightGBM models locale sensitivity and improve R/W performance.
  When Java is used, the default C++ locale is broken. This is true for Java providers that use the C API or even Python models that require JEP. This patch solves that issue by making the model reads/writes insensitive to such settings. To achieve it, within the model read/write codebase:
  - C++ streams are imbued with the classic locale
  - Calls to functions that are dependent on the locale are replaced
  - The default locale is not changed!
  This approach means:
  - The user's locale is never tampered with, avoiding issues such as https://github.com/microsoft/LightGBM/issues/2979 with the previous approach https://github.com/microsoft/LightGBM/pull/2891
  - Datasets can still be read according to the user's locale
  - The model file has a single format independent of locale
  Changes:
  - Add CommonC namespace which provides faster locale-independent versions of Common's methods
  - Model code makes conversions through CommonC
  - Cleanup unused Common methods
  - Performance improvements. Use fast libraries for locale-agnostic conversion:
    - value->string: https://github.com/fmtlib/fmt
    - string->double: https://github.com/lemire/fast_double_parser (10x faster double parsing according to their benchmark)
  Bugfixes:
  - https://github.com/microsoft/LightGBM/issues/2500
  - https://github.com/microsoft/LightGBM/issues/2890
  - https://github.com/ninia/jep/issues/205 (as it is related to LGBM as well)
* Align CommonC namespace
* Add new external_libs/ to python setup
* Try fast_double_parser fix #1 Testing commit e09e5aad828bcb16bea7ed0ed8322e019112fdbe If it works it should fix more LGBM builds
* CMake: Attempt to link fmt without explicit PUBLIC tag
* Exclude external_libs from linting
* Add external_libs to MANIFEST.in
* Set dynamic linking option for fmt.
* linting issues
* Try to fix lint includes
* Try to pass fPIC with static fmt lib
* Try CMake P_I_C option with fmt library
* [R-package] Add CMake support for R and CRAN
* Cleanup CMakeLists
* Try fmt hack to remove stdout
* Switch to header-only mode
* Add PRIVATE argument to target_link_libraries
* use fmt in header-only mode
* Remove CMakeLists comment
* Change OpenMP to PUBLIC linking in Mac
* Update fmt submodule to 7.1.2
* Use fmt in header-only-mode
* Remove fmt from CMakeLists.txt
* Upgrade fast_double_parser to v0.2.0
* Revert "Add PRIVATE argument to target_link_libraries" This reverts commit 3dd45dde7b92531b2530ab54522bb843c56227a7.
* Address James Lamb's comments
* Update R-package/.Rbuildignore
  Co-authored-by: James Lamb <jaylamb20@gmail.com>
* Upgrade to fast_double_parser v0.3.0 - Solaris support
* Use legacy code only in Solaris
* Fix lint issues
* Fix comment
* Address StrikerRUS's comments (solaris ifdef).
* Change header guards
Co-authored-by: James Lamb <jaylamb20@gmail.com>
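The first point of this commit, imbuing streams with the classic locale instead of changing the global one, can be illustrated with standard C++ streams. The function names below are hypothetical and this is not the CommonC code, which according to the commit relies on fmt and fast_double_parser for speed.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: write a double using the classic "C" locale,
// whatever the global locale currently is.
inline std::string DoubleToStringClassic(double value) {
  std::ostringstream out;
  out.imbue(std::locale::classic());  // affects this stream only
  out << value;
  return out.str();
}

// Hypothetical helper: read a double using the classic "C" locale.
// Returns false when the text does not start with a number.
inline bool StringToDoubleClassic(const std::string& text, double* out) {
  assert(out != nullptr);
  std::istringstream in(text);
  in.imbue(std::locale::classic());  // '.' is always the decimal separator
  in >> *out;
  return !in.fail();
}
```

Because only the local stream is imbued, the user's global locale stays untouched, which matches the stated goal of never tampering with it.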
-
- 29 Jul, 2020 1 commit
-
-
James Lamb authored
* [R-package] make package installable with CRAN toolchain (fixes #2960)
* Apply suggestions from code review
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* remove GPU stuff
* use wildcard to find objects to build
* use -lomp
* build configure before moving files
* using wildcard for objects
* Update .github/workflows/main.yml
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* add explicit objects back
* reduce allowed R CMD check NOTEs and catch stderr from build-cran-package on Windows
* fixing things
* pin autoconf version
* show diff
* add automake back
* run less checks
* command was in the wrong place
* fix autoconf version
* change strategy for handling configure
* fix Rbuildignore
* fix NOTEs
* fix notes about unrecognized files
* fixing extra files
* remove USE_R35
* add OpenMP check for Mac CRAN build
* run all checks
* Apply suggestions from code review
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* suggestions from code review
* undo indenting
* remove 03 from Makevars.win.in
* update language about OpenMP in configure script
* checking if configure.ac check works
* add autoconf back
* remove testing code in configure.ac
* more fixes for CI on configure script
* print git diff
* add VERSION.txt when checking configure
* fix relative paths
* remove git diff
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 08 Jul, 2020 1 commit
-
-
Hongbin Shi authored
-
- 23 Jun, 2020 1 commit
-
-
Belinda Trotta authored
* Add interaction constraints functionality.
* Minor fixes.
* Minor fixes.
* Change lambda to function.
* Fix gpu bug, remove extra blank lines.
* Fix gpu bug.
* Fix style issues.
* Try to fix segfault on MACOS.
* Fix bug.
* Fix bug.
* Fix bugs.
* Change parameter format for R.
* Fix R style issues.
* Change string formatting code.
* Change docs to say R package not supported.
* Remove R functionality, moving to separate PR.
* Keep track of branch features in tree object.
* Only track branch features when feature interactions are enabled.
* Fix lint error.
* Update docs and simplify tests.
-
- 05 Jun, 2020 2 commits
-
-
Nikita Titov authored
-
Nikita Titov authored
This reverts commit 656d2676.
-
- 01 Jun, 2020 1 commit
-
-
James Lamb authored
-
- 13 Apr, 2020 1 commit
-
-
Guolin Ke authored
* fix
* Apply suggestions from code review
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
-
- 10 Apr, 2020 1 commit
-
-
OMOTO Tsukasa authored
* Support UTF-8 characters in feature name again
  This commit reverts 0d59859c. Also see:
  - https://github.com/microsoft/LightGBM/issues/2226
  - https://github.com/microsoft/LightGBM/issues/2478
  - https://github.com/microsoft/LightGBM/pull/2229
  I reproduced the issue and, as @kidotaka gave us a great survey in #2226, I conclude that the cause is not UTF-8 but "an empty string (character)". Therefore, I revert "throw error when meet non ascii (#2229)", whose commit hash is 0d59859c, and add support for feature names as UTF-8 again.
* add tests
* fix check-docs tests
* update
* fix tests
* update .travis.yml
* fix tests
* update test_r_package.sh
* update test_r_package.sh
* update test_r_package.sh
* add a test for R-package
* update test_r_package.sh
* update test_r_package.sh
* update test_r_package.sh
* fix test for R-package
* update test_r_package.sh
* update test_r_package.sh
* update test_r_package.sh
* update test_r_package.sh
* update
* update
* update
* remove unneeded comments
-
- 23 Mar, 2020 1 commit
-
-
James Lamb authored
* [ci] changed sprintf uses to snprintf
* checked for encoding issues with snprintf
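As a rough illustration of why `snprintf` is preferred over `sprintf`, the sketch below formats into a fixed-size buffer and checks both failure modes of the return value. The helper name `FormatInt` is hypothetical and not taken from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format an int into a caller-provided buffer.
// snprintf never writes more than `size` bytes, returns a negative value on
// an encoding error, and otherwise returns the length the full output needs.
inline bool FormatInt(int value, char* buffer, std::size_t size) {
  assert(buffer != nullptr && size > 0);
  const int written = std::snprintf(buffer, size, "%d", value);
  if (written < 0) {
    return false;  // encoding error
  }
  // A result >= size means the output was truncated to size - 1 characters.
  return static_cast<std::size_t>(written) < size;
}
```

Unlike `sprintf`, an undersized buffer here results in truncated but null-terminated output and a `false` return, not a buffer overrun.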
-
- 06 Mar, 2020 1 commit
-
-
Guolin Ke authored
* only one fix
* add more
* add more
-
- 05 Mar, 2020 1 commit
-
-
Guolin Ke authored
* speed up for const hessian
* rename template
* some refactorings
* refine
* refine
* simplify codes
* fix random in feature histogram
* code refine
* refine
* try fix
* make gcc happy
* remove timer
* rollback some changes
* more templates
* fix a bug
* reduce the cost of timer
* fix gpu
* fix bug
* fix gpu
-
- 04 Mar, 2020 1 commit
-
-
Nikita Titov authored
* fixed cpplint errors
* fixed more cpplint errors
-
- 02 Mar, 2020 1 commit
-
-
Guolin Ke authored
* don't cache `num_thread`, to avoid change outside
* rename
* update document
* Update docs/Parameters.rst
* Update include/LightGBM/config.h
* Apply suggestions from code review
  Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
* Apply suggestions from code review
  Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 25 Feb, 2020 1 commit
-
-
Guolin Ke authored
* Make timer thread-safe
* Update common.h
* refines
-
- 08 Feb, 2020 1 commit
-
-
Nikita Titov authored
* various minor style, docs and cpplint improvements
* fixed typo in warning
* fix recently added cpplint errors
* move note for params upper in description for consistency
-
- 02 Feb, 2020 1 commit
-
-
Guolin Ke authored
* commit
* fix a bug
* fix bug
* reset to track changes
* refine the auto choose logic
* sort the time stats output
* fix include
* change multi_val_bin_sparse_threshold
* add cmake
* add _mm_malloc and _mm_free for cross platform
* fix cmake bug
* timer for split
* try to fix cmake
* fix tests
* refactor DataPartition::Split
* fix test
* typo
* formating
* Revert "formating" This reverts commit 5b8de4f7fb9d975ee23701d276a66d40ee6d4222.
* add document
* [R-package] Added tests on use of force_col_wise and force_row_wise in training (#2719)
* naming
* fix gpu code
* Update include/LightGBM/bin.h
  Co-Authored-By: James Lamb <jaylamb20@gmail.com>
* Update src/treelearner/ocl/histogram16.cl
* test: swap compilers for CI
* fix omp
* not avx2
* no aligned for feature histogram
* Revert "refactor DataPartition::Split" This reverts commit 256e6d9641ade966a1f54da1752e998a1149b6f8.
* slightly refactor data partition
* reduce the memory cost
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 15 Nov, 2019 1 commit
-
-
James Lamb authored
-
- 11 Nov, 2019 1 commit
-
-
Nikita Titov authored
-
- 01 Nov, 2019 1 commit
-
-
Guolin Ke authored
-
- 07 Oct, 2019 1 commit
-
-
James Lamb authored
* fixed miscellaneous typos in documentation
* fix typo introduced in typo-fixing PR
-
- 22 Sep, 2019 1 commit
-
-
Guolin Ke authored
* fix many cpp lint errors
* indent
* fix bug
* fix more
* fix gpu
* more fixes
-
- 08 Sep, 2019 1 commit
-
-
CharlesAuguste authored
* Some basic changes to the plot of the trees to make them readable.
* Squeezed the information in the nodes.
* Added colouring when a dictionary mapping the features to the constraints is passed.
* Fix spaces.
* Added data percentage as an option in the nodes.
* Squeezed the information in the leaves.
* Important information is now in bold.
* Added a legend for the color of monotone splits.
* Changed "split_gain" to "gain" and "internal_value" to "value".
* Sqeezed leaves a bit more.
* Changed description in the legend.
* Revert "Sqeezed leaves a bit more." This reverts commit dd8bf14a3ba604b0dfae3b7bb1c64b6784d15e03.
* Increased the readability for the gain.
* Tidied up the legend.
* Added the data percentage in the leaves.
* Added the monotone constraints to the dumped model.
* Monotone constraints are now specified automatically when plotting trees.
* Raise an exception instead of the bug that was here before.
* Removed operators on the branches for a clearer design.
* Small cleaning of the code.
* Setting a monotone constraint on a categorical feature now returns an exception instead of doing nothing.
* Fix bug when monotone constraints are empty.
* Fix another bug when monotone constraints are empty.
* Variable name change.
* Added is / isn't on every edge of the trees.
* Fix test "tree_create_digraph".
* Add new test for plotting trees with monotone constraints.
* Typo.
* Update documentation of categorical features.
* Typo.
* Information in nodes more explicit.
* Used regular strings instead of raw strings.
* Small refactoring.
* Some cleaning.
* Added future statement.
* Changed output for consistency.
* Updated documentation.
* Added comments for colors.
* Changed text on edges for more clarity.
* Small refactoring.
* Modified text in leaves for consistency with nodes.
* Updated default values and documentation for consistency.
* Replaced CHECK with Log::Fatal for user-friendliness.
* Updated tests.
* Typo.
* Simplify imports.
* Swapped count and weight to improve readability of the leaves in the plotted trees.
* Thresholds in bold.
* Made information in nodes written in a specific order.
* Added information to clarify legend.
* Code cleaning.
-
- 14 Aug, 2019 1 commit
-
-
Guolin Ke authored
* fix nan in tree model
* fix
-
- 18 Jul, 2019 1 commit
-
-
Guolin Ke authored
* throw error when meet non ascii
* check ascii for config strings.
-
- 13 Apr, 2019 2 commits
-
-
Nikita Titov authored
-
Nikita Titov authored
-
- 11 Apr, 2019 1 commit
-
-
Nikita Titov authored
* added all necessary includes - fixed build/include_what_you_use error
* fixed the order of includes (build/include_order)
-
- 04 Apr, 2019 1 commit
-
-
remcob-gr authored
* Add configuration parameters for CEGB.
* Add skeleton CEGB tree learner
  Like the original CEGB version, this inherits from SerialTreeLearner. Currently, it changes nothing from the original.
* Track features used in CEGB tree learner.
* Pull CEGB tradeoff and coupled feature penalty from config.
* Implement finding best splits for CEGB
  This is heavily based on the serial version, but just adds using the coupled penalties.
* Set proper defaults for cegb parameters.
* Ensure sanity checks don't switch off CEGB.
* Implement per-data-point feature penalties in CEGB.
* Implement split penalty and remove unused parameters.
* Merge changes from CEGB tree learner into serial tree learner
* Represent features_used_in_data by a bitset, to reduce the memory overhead of CEGB, and add sanity checks for the lengths of the penalty vectors.
* Fix bug where CEGB would incorrectly penalise a previously used feature
  The tree learner did not update the gains of previously computed leaf splits when splitting a leaf elsewhere in the tree. This caused it to prefer new features due to incorrectly penalising splitting on previously used features.
* Document CEGB parameters and add them to the appropriate section.
* Remove leftover reference to cegb tree learner.
* Remove outdated diff.
* Fix warnings
* Fix minor issues identified by @StrikerRUS.
* Add docs section on CEGB, including citation.
* Fix link.
* Fix CI failure.
* Add some unit tests
* Fix pylint issues.
* Fix remaining pylint issue
-
- 01 Apr, 2019 1 commit
-
-
Nikita Titov authored
-