- 19 Feb, 2021 1 commit
-
-
James Lamb authored
* [docs] Change some 'parallel learning' references to 'distributed learning' * found a few more * one more reference
-
- 13 Feb, 2021 3 commits
-
-
Belinda Trotta authored
* Update docs about linear tree and monotone constraints * Fix punctuation
-
Benjamin Sergeant authored
* openmp_wrapper.h stubs signature use __GOMP_NOTHROW

  Fix #3915. The OpenMP stubs do not use the noexcept attribute, which is present in the GCC version of OpenMP, and this triggers compilation errors as seen below. dmlc-core uses the same technique and macro.

  /usr/lib/gcc/x86_64-linux-gnu/9/include/omp.h:114:12: error: declaration of ‘int omp_get_thread_num() noexcept’ has a different exception specifier
    114 | extern int omp_get_thread_num (void) __GOMP_NOTHROW;
        |            ^~~~~~~~~~~~~~~~~~
  ...
  xxx/include/LightGBM/utils/openmp_wrapper.h:81:14: note: from previous declaration ‘int omp_get_thread_num()’
     81 | inline int omp_get_thread_num() {return 0;}
        |            ^~~~~~~~~~~~~~~~~~
* move __GOMP_NOTHROW definition in the no open mp stub conditional branch
* Update include/LightGBM/utils/openmp_wrapper.h

  Yes, makes sense, just changed it

  Co-authored-by: James Lamb <jaylamb20@gmail.com>
* add NOLINT macro to disable cpplint on a safe line of code

Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
mjmckp authored
Fix access violation exception that can occur during invocation of loop lambda function when inner_start >= inner_end in 'For' template (#3936)

* Fix index out-of-range exception generated by BaggingHelper on small datasets. Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.
* Update goss.hpp
* Update goss.hpp
* Add API method LGBM_BoosterPredictForMats, which runs prediction on a data set given as an array of pointers to rows (as opposed to the existing method LGBM_BoosterPredictForMat, which requires data given as a contiguous array)
* Fix incorrect upstream merge
* Add link to LightGBM.NET
* Fix indenting to 2 spaces
* Dummy edit to trigger CI
* Dummy edit to trigger CI
* remove duplicate functions from merge
* Fix access violation exception that can occur during invocation of loop lambda function when inner_start >= inner_end in 'For' template. In particular, this can occur in Tree::AddPredictionToScore on line 291, where the loop lambda function body (created by the PredictionFun macro) dereferences used_data_indices[start]. For reference, the particular case which triggered this exception in my case was:
  * start = 0
  * end = 93,203
  * n_block = 56
  * min_block_size = 512

  for which the BlockInfo method gave:
  * n_block = 56
  * num_inner = 1,696

  and so, for the case i=55 (i.e., the last case in the loop), we get
  * inner_start = start + num_inner * i = 93,280

  which is greater than 'end' and hence triggers the exception.
* Change formatting of proposed modification

Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
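
The numbers quoted in the message above can be reproduced with a short Python sketch. This is not the LightGBM source: `block_info` and `block_ranges` are hypothetical stand-ins, the thread count of 56 is inferred from the quoted n_block, and the rounding of the per-block size up to a multiple of 32 is an assumption chosen because it yields the quoted num_inner of 1,696. The `skip_empty` flag shows one possible guard against the empty final block.

```python
ALIGN = 32  # assumption: per-block size is rounded up to a multiple of 32


def block_info(num_threads, cnt, min_cnt_per_block):
    """Hypothetical stand-in for the BlockInfo helper named in the message."""
    # Number of blocks: at most one per thread, at least min_cnt_per_block rows each.
    n_block = min(num_threads, -(-cnt // min_cnt_per_block))
    if n_block > 1:
        per_block = -(-cnt // n_block)              # ceiling division
        num_inner = -(-per_block // ALIGN) * ALIGN  # round up to the alignment
    else:
        num_inner = cnt
    return n_block, num_inner


def block_ranges(start, end, n_block, num_inner, skip_empty):
    """Return the (inner_start, inner_end) range handed to each block."""
    ranges = []
    for i in range(n_block):
        inner_start = start + num_inner * i
        inner_end = min(end, inner_start + num_inner)
        if skip_empty and inner_start >= inner_end:
            continue  # guard: never call the loop body on an empty range
        ranges.append((inner_start, inner_end))
    return ranges


n_block, num_inner = block_info(56, 93203, 512)
unguarded = block_ranges(0, 93203, n_block, num_inner, skip_empty=False)
guarded = block_ranges(0, 93203, n_block, num_inner, skip_empty=True)
print(n_block, num_inner, unguarded[-1], len(guarded))
```

Without the guard, the last block receives the range (93280, 93203), whose start lies past `end`; with it, only 55 blocks run and the final one covers (91584, 93203).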
-
- 03 Feb, 2021 1 commit
-
-
Chen Yufei authored
* Add new task type: "save_binary". * Document for task "save_binary".
-
- 31 Jan, 2021 1 commit
-
-
Nikita Titov authored
* document CUDA version support * address review comments * collapse CUDA section in the guide * remove Clang support from CUDA docs as we have never tested it
-
- 28 Jan, 2021 1 commit
-
-
Nikita Titov authored
-
- 25 Jan, 2021 1 commit
-
-
shiyu1994 authored
-
- 21 Jan, 2021 1 commit
-
-
Nikita Titov authored
-
- 18 Jan, 2021 2 commits
-
-
James Lamb authored
* [python-package] expand documentation on 'group' for ranking task
* add R package
* update Query Data section
* Apply suggestions from code review

  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* fix typo in group example
* regenerate parameters
* Apply suggestions from code review

  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* regenerate R docs

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
James Lamb authored
* [R-package] enable use of trees with linear models at leaves (fixes #3319) * remove problematic pragmas * fix tests * try to fix build scripts * try fixing pragma check * more pragma checks * ok fix pragma stuff for real * empty commit * regenerate documentation * try skipping test * uncomment CI * add note on missing value types for R * add tests on saving and re-loading booster
-
- 11 Jan, 2021 1 commit
-
-
Chip Kerchner authored
-
- 07 Jan, 2021 1 commit
-
-
Belinda Trotta authored
* Fix compiler warnings caused by implicit type conversion * Fix more warnings * Fix more warnings
-
- 03 Jan, 2021 1 commit
-
-
sisco0 authored
Compile warnings have been fixed
-
- 29 Dec, 2020 1 commit
-
-
James Lamb authored
* [docs] add doc on min_data_in_leaf approximation (fixes #3634) * Fix capital letter Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 28 Dec, 2020 1 commit
-
-
Nikita Titov authored
* small code and docs refactoring * Update CMakeLists.txt * Update .vsts-ci.yml * Update test.sh * continue * continue * revert stable sort for all-unique values
-
- 24 Dec, 2020 1 commit
-
-
Belinda Trotta authored
* Add Eigen library. * Working for simple test. * Apply changes to config params. * Handle nan data. * Update docs. * Add test. * Only load raw data if boosting=gbdt_linear * Remove unneeded code. * Minor updates. * Update to work with sk-learn interface. * Update to work with chunked datasets. * Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters. * Save raw data in binary dataset file. * Update docs and fix parameter checking. * Fix dataset loading. * Add test for regularization. * Fix bugs when saving and loading tree. * Add test for load/save linear model. * Remove unneeded code. * Fix case where not enough leaf data for linear model. * Simplify code. * Speed up code. * Speed up code. * Simplify code. * Speed up code. * Fix bugs. * Working version. * Store feature data column-wise (not fully working yet). * Fix bugs. * Speed up. * Speed up. * Remove unneeded code. * Small speedup. * Speed up. * Minor updates. * Remove unneeded code. * Fix bug. * Fix bug. * Speed up. * Speed up. * Simplify code. * Remove unneeded code. * Fix bug, add more tests. * Fix bug and add test. * Only store numerical features * Fix bug and speed up using templates. * Speed up prediction. * Fix bug with regularisation * Visual studio files. * Working version * Only check nans if necessary * Store coeff matrix as an array. * Align cache lines * Align cache lines * Preallocation coefficient calculation matrices * Small speedups * Small speedup * Reverse cache alignment changes * Change to dynamic schedule * Update docs. * Refactor so that linear tree learner is not a separate class. * Add refit capability. * Speed up * Small speedups. * Speed up add prediction to score. * Fix bug * Fix bug and speed up. * Speed up dataload. * Speed up dataload * Use vectors instead of pointers * Fix bug * Add OMP exception handling. 
* Change return type of LGBM_BoosterGetLinear to bool * Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change * Remove unused internal_parent_ property of tree * Remove unused parameter to CreateTreeLearner * Remove reference to LinearTreeLearner * Minor style issues * Remove unneeded check * Reverse temporary testing change * Fix Visual Studio project files * Restore LightGBM.vcxproj.filters * Speed up * Speed up * Simplify code * Update docs * Simplify code * Initialise storage space for max num threads * Move Eigen to include directory and delete unused files * Remove old files. * Fix so it compiles with mingw * Fix gpu tree learner * Change AddPredictionToScore back to const * Fix python lint error * Fix C++ lint errors * Change eigen to a submodule * Update comment * Add the eigen folder * Try to fix build issues with eigen * Remove eigen files * Add eigen as submodule * Fix include paths * Exclude eigen files from Python linter * Ignore eigen folders for pydocstyle * Fix C++ linting errors * Fix docs * Fix docs * Exclude eigen directories from doxygen * Update manifest to include eigen * Update build_r to include eigen files * Fix compiler warnings * Store raw feature data as float * Use float for calculating linear coefficients * Remove eigen directory from GLOB * Don't compile linear model code when building R package * Fix doxygen issue * Fix lint issue * Fix lint issue * Remove unneeded code * Restore deleted lines * Restore deleted lines * Change return type of has_raw to bool * Update docs * Rename some variables and functions for readability * Make tree_learner parameter const in AddScore * Fix style issues * Pass vectors as const reference when setting tree properties * Make temporary storage of serial_tree_learner mutable so we can make the object's methods const * Remove get_raw_size, use num_numeric_features instead * Fix typo * Make contains_nan_ and any_nan_ properties immutable again * Remove
data_has_nan_ property of tree * Remove temporary test code * Make linear_tree a dataset param * Fix lint error * Make LinearTreeLearner a separate class * Fix lint errors * Fix lint error * Add linear_tree_learner.o * Simulate omp_get_max_threads if openmp is not available * Update PushOneData to also store raw data. * Cast size to int * Fix bug in ReshapeRaw * Speed up code with multithreading * Use OMP_NUM_THREADS * Speed up with multithreading * Update to use ArrayToString * Fix tests * Fix test * Fix bug introduced in merge * Minor updates * Update docs
-
- 11 Dec, 2020 1 commit
-
-
James Lamb authored
* [docs] Add details to docs on improving training speed
* formatting
* fix link
* fix formatting
* replace 'performance' with 'accuracy' and mention learning_rate
* Apply suggestions from code review

  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* regenerate docs from config.h

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 08 Dec, 2020 1 commit
-
-
Alberto Ferreira authored
* Fix LightGBM models locale sensitivity and improve R/W performance.

  When Java is used, the default C++ locale is broken. This is true for Java providers that use the C API or even Python models that require JEP. This patch solves that issue by making the model reads/writes insensitive to such settings.

  To achieve it, within the model read/write codebase:
  - C++ streams are imbued with the classic locale
  - Calls to functions that are dependent on the locale are replaced
  - The default locale is not changed!

  This approach means:
  - The user's locale is never tampered with, avoiding issues such as https://github.com/microsoft/LightGBM/issues/2979 with the previous approach https://github.com/microsoft/LightGBM/pull/2891
  - Datasets can still be read according to the user's locale
  - The model file has a single format independent of locale

  Changes:
  - Add CommonC namespace which provides faster locale-independent versions of Common's methods
  - Model code makes conversions through CommonC
  - Cleanup unused Common methods
  - Performance improvements. Use fast libraries for locale-agnostic conversion:
    - value->string: https://github.com/fmtlib/fmt
    - string->double: https://github.com/lemire/fast_double_parser (10x faster double parsing according to their benchmark)

  Bugfixes:
  - https://github.com/microsoft/LightGBM/issues/2500
  - https://github.com/microsoft/LightGBM/issues/2890
  - https://github.com/ninia/jep/issues/205 (as it is related to LGBM as well)
* Align CommonC namespace
* Add new external_libs/ to python setup
* Try fast_double_parser fix #1

  Testing commit e09e5aad828bcb16bea7ed0ed8322e019112fdbe
  If it works it should fix more LGBM builds
* CMake: Attempt to link fmt without explicit PUBLIC tag
* Exclude external_libs from linting
* Add external_libs to MANIFEST.in
* Set dynamic linking option for fmt.
* linting issues
* Try to fix lint includes
* Try to pass fPIC with static fmt lib
* Try CMake P_I_C option with fmt library
* [R-package] Add CMake support for R and CRAN
* Cleanup CMakeLists
* Try fmt hack to remove stdout
* Switch to header-only mode
* Add PRIVATE argument to target_link_libraries
* use fmt in header-only mode
* Remove CMakeLists comment
* Change OpenMP to PUBLIC linking in Mac
* Update fmt submodule to 7.1.2
* Use fmt in header-only-mode
* Remove fmt from CMakeLists.txt
* Upgrade fast_double_parser to v0.2.0
* Revert "Add PRIVATE argument to target_link_libraries"

  This reverts commit 3dd45dde7b92531b2530ab54522bb843c56227a7.
* Address James Lamb's comments
* Update R-package/.Rbuildignore

  Co-authored-by: James Lamb <jaylamb20@gmail.com>
* Upgrade to fast_double_parser v0.3.0 - Solaris support
* Use legacy code only in Solaris
* Fix lint issues
* Fix comment
* Address StrikerRUS's comments (solaris ifdef).
* Change header guards

Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
- 05 Dec, 2020 1 commit
-
-
Chen Yufei authored
* Check max_bin, etc. match config when using binary. * Check max_bin_by_feature, bin_construct_sample_cnt matching config.
-
- 24 Nov, 2020 1 commit
-
-
shiyu1994 authored
Fix num_total_bin_ and bin_offsets_ of FeatureGroup if a dense multi val feature group with non zero most freq bin is the first feature group of the dataset.
-
- 23 Nov, 2020 1 commit
-
-
shiyu1994 authored
* remove max_block_size_ in train states (fix #3570) * avoid zero elements per row * add min constraint for min_block_size_
-
- 15 Nov, 2020 1 commit
-
-
Nikita Titov authored
-
- 14 Nov, 2020 2 commits
- 13 Nov, 2020 1 commit
-
-
shiyu1994 authored
* store without offset in multi_val_dense_bin * fix offset bug * add comment for offset * add comment for bin type selection * faster operations for offset * keep most freq bin in histogram for multi val dense * use original feature iterators * consider 9 cases (3 x 3) for multi val bin construction * fix dense bin setting * fix bin data in multi val group * fix offset of the first feature histogram * use float hist buf * avx in histogram construction * use avx for hist construction without prefetch * vectorize bin extraction * use only 128 vec * use avx2 * use vectorization for sparse row wise * add bit size for multi val dense bin * float with no vectorization * change multithreading strategy to dynamic * remove intrinsic header * fix dense multi val col copy * remove bit size * use large enough block size when the bin number is large * calc min block size by sparsity * rescale gradients * rollback gradients scaling * single precision histogram buffer as an option * add float hist buffer with thread buffer * fix setting zero in hist data * fix hist begin pointer in tree learners * remove debug logs * remove omp simd * update Makevars of R-package * fix feature group binary storing * two row wise for double hist buffer * add subfeature for two row wise * remove useless code and fix two row wise * refactor code * grouping the dense feature groups can get sparse multi val bin * clean format problems * one thread for two blocks in sep row wise * use ordered gradients for sep row wise * fix grad ptr * ordered grad with combined block for sep row wise * fix block threading * use the same min block size * rollback share min block size * remove logs * Update src/io/dataset.cpp Co-authored-by:
Guolin Ke <guolin.ke@outlook.com> * fix parameter description * remove sep_row_wise * remove check codes * add check for empty multi val bin * fix lint error * rollback changes in config.h * Apply suggestions from code review Co-authored-by:
Ubuntu <shiyu@gbdt-04.ren3kv4wanvufliwrpy4k03lsf.xx.internal.cloudapp.net> Co-authored-by:
Guolin Ke <guolin.ke@outlook.com>
-
- 06 Nov, 2020 1 commit
-
-
Guolin Ke authored
* better document for bin_construct_sample_cnt * add warnings Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
-
- 01 Nov, 2020 1 commit
-
-
Guolin Ke authored
* implement * fix compilation * Update config.cpp * unify wordings Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
-
- 28 Oct, 2020 1 commit
-
-
Nikita Titov authored
-
- 27 Oct, 2020 3 commits
-
-
Guolin Ke authored
* rollback omp sum * remove sum reduction
-
Guolin Ke authored
* speed up multi-threading sum * Apply suggestions from code review
-
Pavel Metrikov authored
* Add support to optimize for NDCG at a given truncation level

  In order to correctly optimize for NDCG@_k_, one should exclude pairs containing both documents beyond the top-_k_ (as they don't affect NDCG@_k_ when swapped).
* Update rank_objective.hpp
* Apply suggestions from code review

  Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
* Update rank_objective.hpp

  remove the additional branching: get high_rank and low_rank by one "if".
* Update config.h

  add description to lambdarank_truncation_level parameter
* Update Parameters.rst
* Update test_sklearn.py

  update expected NDCG value for a test, as it was affected by the underlying change in the algorithm
* Update test_sklearn.py

  update NDCG@3 reference value
* fix R learning-to-rank tests
* Update rank_objective.hpp
* Update include/LightGBM/config.h

  Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
* Update Parameters.rst

Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
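
The pair-exclusion rule described in this commit can be illustrated with a small Python sketch. It is not the code from rank_objective.hpp: `dcg_at_k` and `candidate_pairs` are hypothetical helpers, the gain 2^label - 1 with a log2 position discount is the common DCG convention assumed here, and the labels are made-up illustration data. The sketch checks that swapping two documents that both sit beyond the top-k leaves DCG@k unchanged, which is why such pairs can be skipped.

```python
import math


def dcg_at_k(labels_in_rank_order, k):
    """DCG@k with the common gain 2^label - 1 and log2 discount (assumed convention)."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(labels_in_rank_order[:k]))


def candidate_pairs(n_docs, truncation_level):
    """Pairs of rank positions (i, j), i < j, keeping only those with i in the top-k.

    Because i < j, requiring i < truncation_level is the same as requiring that
    at least one document of the pair lies inside the top-k.
    """
    return [(i, j)
            for i in range(n_docs)
            for j in range(i + 1, n_docs)
            if i < truncation_level]


labels = [3, 0, 2, 1, 0]   # illustrative relevance labels, in current rank order
k = 2

# Swap positions 2 and 3 (both beyond the top-2): DCG@2 does not change.
swapped_outside = [3, 0, 1, 2, 0]
# Swap positions 1 and 2 (one inside the top-2): DCG@2 changes.
swapped_inside = [3, 2, 0, 1, 0]

pairs = candidate_pairs(len(labels), k)
print(len(pairs), dcg_at_k(labels, k) == dcg_at_k(swapped_outside, k))
```

Here 7 of the 10 possible pairs remain; the three dropped pairs, (2, 3), (2, 4) and (3, 4), are exactly those lying entirely beyond the truncation level.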
-
- 26 Oct, 2020 2 commits
-
-
Guolin Ke authored
* fix subset bug * typo * add fixme tag * bin mapper * fix test * fix add_features_from * Update dataset.cpp * fix merge bug * added Python merge code * added test for add_features * Update dataset.cpp * Update src/io/dataset.cpp * continue implementing * warn users about categorical features Co-authored-by:
StrikerRUS <nekit94-12@hotmail.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
Pengfei Shi authored
-
- 18 Oct, 2020 1 commit
-
-
James Lamb authored
* fix int64 write error * attempt * [WIP] [ci] [R-package] Add CI job that runs valgrind tests * update all-successful * install * executable * fix redirect stuff * Apply suggestions from code review Co-authored-by:
Guolin Ke <guolin.ke@outlook.com> * more flags * add mc to msvc proj * fix memory leak in mc * Update monotone_constraints.hpp * Update r_package.yml * remove R_INT64_PTR * disable openmp * Update gbdt_model_text.cpp * Update gbdt_model_text.cpp * Apply suggestions from code review * try to free vector * free more memories. * Update src/boosting/gbdt_model_text.cpp * fix using * try the UNPROTECT(1); * fix a const pointer * fix Common * reduce UNPROTECT * remove UNPROTECT(1); * fix null handle * fix predictor * use NULL after free * fix a leaking in test * try more fixes * test the effect of tests * throw exception in Fatal * add test back * Apply suggestions from code review * comment some tests * Apply suggestions from code review * Apply suggestions from code review * trying to comment out tests * Update openmp_wrapper.h * Apply suggestions from code review * Update configure * Update configure.ac * trying to uncomment * more comments * more uncommenting * more uncommenting * fix comment * more uncommenting * uncomment fully-commented out stuff * try uncommenting more dataset tests * uncommenting more tests * ok getting closer * more uncommenting * free dataset * skipping a test, more uncommenting * more skipping * re-enable OpenMP * allow on OpenMP thing * move valgrind to comment-only job * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * changes from code review * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * linting * issue comments too * remove issue_comment Co-authored-by:
Guolin Ke <guolin.ke@outlook.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 17 Oct, 2020 1 commit
-
-
Aakarsh Gopi authored
This allows for network retries, to scale well with the number of machines, and still retains the existing functionality for cases with smaller num_machines ( 500 )

Fixes #3301

Co-authored-by: Aakarsh Gopi <aakarsh@vaticlabs.com>
-
- 09 Oct, 2020 1 commit
-
-
Lucas David authored
~ Added 'noexcept' specifier and defaulted destructor. Co-authored-by: Lucas DAVID <lucas@isdom.isoft.fr>
-
- 30 Sep, 2020 2 commits
-
-
Guolin Ke authored
* Update serial_tree_learner.cpp * Update src/treelearner/serial_tree_learner.cpp * stable multi-threading reduction * Update src/treelearner/serial_tree_learner.cpp * more fixes * Apply suggestions from code review * Apply suggestions from code review * Update src/boosting/gbdt.cpp
-
Guolin Ke authored
* fix dataset binary file alignment * many fixes * fix warnings * fix bug * Update file_io.cpp * Update file_io.cpp * simplify code * Apply suggestions from code review * general * remove unneeded alignment * Update file_io.h * int32 to byte8 alignment * Apply suggestions from code review * Apply suggestions from code review
-
- 29 Sep, 2020 1 commit
-
-
Guolin Ke authored
* fix warnings * Apply suggestions from code review * Update feature_group.h * Update feature_group.h * Update src/treelearner/serial_tree_learner.cpp * Update multiclass_metric.hpp
-