Commits · 946817a505eeedf7b99d975fb7dd7294dd3d4e9d · tianlh / LightGBM-DCU

15 Nov, 2021 1 commit

[c_api] Improve ANSI compatibility by avoiding <stdbool.h> (#4697) · bfb346c1

Drew Miller authored Nov 15, 2021

* [c_api] Improve ANSI compatibility by avoiding <stdbool.h>

* fixes in response to CI linting

* inline NOLINT instead of separate test

* moving length declaration to non-ANSI C conditional

* [c_api] Align expected return type in `basic.py` with new c_api type.

bfb346c1

28 Oct, 2021 1 commit
- Reset OpenMP thread number if num_threads <= 0 (#4704) · 42914830
  Zhiyuan He authored Oct 29, 2021
```
* mock func for no openmp

* use omp_get_max_threads
Co-authored-by: hzy46 <email@example.com>
```
  42914830
23 Jul, 2021 1 commit
- [refactor] Use `CreateSampleIndices()` in `c_api.cpp` (#4478) · 3be611e7
  Chen Yufei authored Jul 23, 2021
```
This removes code duplication for creating sample indices.
```
  3be611e7
02 Jul, 2021 1 commit

[python-package] Create Dataset from multiple data files (#4089) · c359896e

Chen Yufei authored Jul 02, 2021

* [python-package] create Dataset from sampled data.

* [python-package] create Dataset from List[Sequence].

1. Use random access for data sampling
2. Support read data from multiple input files
3. Read data in batch so no need to hold all data in memory

* [python-package] example: create Dataset from multiple HDF5 file.

* fix: revert is_class implementation for seq

* fix: unwanted memory view reference for seq

* fix: seq is_class accepts sklearn matrices

* fix: requirements for example

* fix: pycode

* feat: print static code linting stage

* fix: linting: avoid shell str regex conversion

* code style: doc style

* code style: isort

* fix ci dependency: h5py on windows

* [py] remove rm files in test seq
https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623

* docs(python): init_from_sample summary

https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389



* remove dataset dump sample data debugging code.

* remove typo fix.

Create separate PR for this.

* fix typo in src/c_api.cpp
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* style(linting): py3 type hint for seq

* test(basic): os.path style path handling

* Revert "feat: print static code linting stage"

This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d.

* feat(python): sequence on validation set

* minor(python): comment

* minor(python): test option hint

* style(python): fix code linting

* style(python): add pydoc for ref_dataset

* doc(python): sequence
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* revert(python): sequence class abc

* chore(python): remove rm_files

* Remove useless static_assert.

* refactor: test_basic test for sequence.

* fix lint complaint.

* remove dataset._dump_text in sequence test.

* Fix reverting typo fix.

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Fix type hint, code and doc style.

* fix failing test_basic.

* Remove TODO about keep constant in sync with cpp.

* Install h5py only when running python-examples.

* Fix lint complaint.

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Doc fixes, remove unused params_str in __init_from_seqs.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Remove unnecessary conda install in windows ci script.

* Keep param as example in dataset_from_multi_hdf5.py

* Add _get_sample_count function to remove code duplication.

* Use batch_size parameter in generate_hdf.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Fix after applying suggestions.

* Fix test, check idx is instance of numbers.Integral.

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Expose Sequence class in Python-API doc.

* Handle Sequence object not having batch_size.

* Fix isort lint complaint.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update docstring to mention Sequence as data input.

* Remove get_one_line in test_basic.py

* Make Sequence an abstract class.

* Reduce number of tests for test_sequence.

* Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.

* empty commit to trigger ci

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.

Also rename total_nrow to num_total_row in c_api.h for consistency.

* Doc about Sequence in docs/Python-Intro.rst.

* Fix: basic.py change LGBM_SampleIndices out_len to int32.

* Add create_valid test case with Dataset from Sequence.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Willian Zhang <willian@willian.email>
Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

c359896e

26 Jun, 2021 1 commit
- fix param aliases (#4387) · aab8fc18
  Nikita Titov authored Jun 26, 2021
  
  aab8fc18
07 May, 2021 1 commit

Precise text file parsing (#4081) · f8318088

Chen Yufei authored May 07, 2021



* New build option: USE_PRECISE_TEXT_PARSER.

Use fast_double_parser for text file parsing. For each number, fallback
to strtod in case of parse failure.

* Add benchmark for CSVParser with Atof and AtofPrecise.

* Fix lint complaint.

* Fix typo in open result error message.

* Revert "Fix lint complaint."

This reverts commit 92ab0b6bce9f17d7be9eaeb20f19d4a0a36f0387.

* Revert "Add benchmark for CSVParser with Atof and AtofPrecise."

This reverts commit 4f8639abd06c679d4382eb715a1793afd94df3d2.

* Use AtofPrecise in Common::__StringToTHelper.

* [option] precise_float_parser: precise float number parsing for text input.

* Remove USE_PRECISE_TEXT_PARSER compile option.

* test: add test for Common::AtofPrecise.

* test: remove ChunkedArrayTest with 0 length.

This triggers Log::Fatal which aborts the test program.

* fix lint, add copyright.

* Revert "test: remove ChunkedArrayTest with 0 length."

This reverts commit 346c76affe9e78b6ca2738c4a56dbb9c00f31102.

* Use LightGBM::Common::Sign

* save precise_float_parser in model file.

* Fix error checking in AtofPrecise. Add more test cases.

* Remove test case that can't pass under macOS.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

f8318088

17 Feb, 2021 1 commit

Fix for CreatePredictor function and VS2017 Debug build (#3937) · 5321fef6

mjmckp authored Feb 17, 2021



* Fix index out-of-range exception generated by BaggingHelper on small datasets.

Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.

* Update goss.hpp

* Update goss.hpp

* Add API method LGBM_BoosterPredictForMats which runs prediction on a data set given as of array of pointers to rows (as opposed to existing method LGBM_BoosterPredictForMat which requires data given as contiguous array)

* Fix incorrect upstream merge

* Add link to LightGBM.NET

* Fix indenting to 2 spaces

* Dummy edit to trigger CI

* Dummy edit to trigger CI

* remove duplicate functions from merge

* Fix for CreatePredictor function: for VS2017 in Debug build, the previous version would end up giving an uninitialised prediction function that would throw access violation exceptions when invoked.
Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

5321fef6

21 Jan, 2021 1 commit

Fix thread-safety in C API's PredictSingleRow (#3771) · 4ae4abbe

Alberto Ferreira authored Jan 21, 2021

By using a unique lock instead of the shared lock the timings are very similar,
but predictions are correct.

Even so, by designing a small C++ benchmark with a very simple LGBM model,more threads on a simple model are slower than
the single-thread case. This is probably due to very small work units,
the lock contention overhead increases.

We should in the future benchmark with more complex models to see if supporting
threading on these calls is worth it in performance gains.

If not, then we could choose to not to provide thread-safety and remove
the locks altogether for maximal throughput.

See https://github.com/microsoft/LightGBM/issues/3751 for timings.

See gist for benchmark code:

https://gist.github.com/AlbertoEAF/5972db15a27c294bab65b97e1bc4c315

4ae4abbe

18 Jan, 2021 1 commit

Minor C API cleanup in predictor & SingleRowPredictor (#3777) · 35612633

Alberto Ferreira authored Jan 18, 2021



* Cleanup predictor

* Cleanup SingleRowPredictor

* Update src/application/predictor.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/application/predictor.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/application/predictor.hpp
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

35612633

03 Jan, 2021 1 commit

[R-package] allow access to params in Booster (#3662) · 532fa914

James Lamb authored Jan 03, 2021

* [R-package] allow access to params in Booster

* remove unnecessary whitespace

* fix test on resetting params

* remove pytest_cache

* Update R-package/tests/testthat/test_custom_objective.R

532fa914

28 Dec, 2020 1 commit

small code and docs refactoring (#3681) · 5a460846

Nikita Titov authored Dec 29, 2020

* small code and docs refactoring

* Update CMakeLists.txt

* Update .vsts-ci.yml

* Update test.sh

* continue

* continue

* revert stable sort for all-unique values

5a460846

24 Dec, 2020 1 commit

Trees with linear models at leaves (#3299) · fcfd4132

Belinda Trotta authored Dec 24, 2020

* Add Eigen library.

* Working for simple test.

* Apply changes to config params.

* Handle nan data.

* Update docs.

* Add test.

* Only load raw data if boosting=gbdt_linear

* Remove unneeded code.

* Minor updates.

* Update to work with sk-learn interface.

* Update to work with chunked datasets.

* Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.

* Save raw data in binary dataset file.

* Update docs and fix parameter checking.

* Fix dataset loading.

* Add test for regularization.

* Fix bugs when saving and loading tree.

* Add test for load/save linear model.

* Remove unneeded code.

* Fix case where not enough leaf data for linear model.

* Simplify code.

* Speed up code.

* Speed up code.

* Simplify code.

* Speed up code.

* Fix bugs.

* Working version.

* Store feature data column-wise (not fully working yet).

* Fix bugs.

* Speed up.

* Speed up.

* Remove unneeded code.

* Small speedup.

* Speed up.

* Minor updates.

* Remove unneeded code.

* Fix bug.

* Fix bug.

* Speed up.

* Speed up.

* Simplify code.

* Remove unneeded code.

* Fix bug, add more tests.

* Fix bug and add test.

* Only store numerical features

* Fix bug and speed up using templates.

* Speed up prediction.

* Fix bug with regularisation

* Visual studio files.

* Working version

* Only check nans if necessary

* Store coeff matrix as an array.

* Align cache lines

* Align cache lines

* Preallocation coefficient calculation matrices

* Small speedups

* Small speedup

* Reverse cache alignment changes

* Change to dynamic schedule

* Update docs.

* Refactor so that linear tree learner is not a separate class.

* Add refit capability.

* Speed up

* Small speedups.

* Speed up add prediction to score.

* Fix bug

* Fix bug and speed up.

* Speed up dataload.

* Speed up dataload

* Use vectors instead of pointers

* Fix bug

* Add OMP exception handling.

* Change return type of LGBM_BoosterGetLinear to bool

* Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change

* Remove unused internal_parent_ property of tree

* Remove unused parameter to CreateTreeLearner

* Remove reference to LinearTreeLearner

* Minor style issues

* Remove unneeded check

* Reverse temporary testing change

* Fix Visual Studio project files

* Restore LightGBM.vcxproj.filters

* Speed up

* Speed up

* Simplify code

* Update docs

* Simplify code

* Initialise storage space for max num threads

* Move Eigen to include directory and delete unused files

* Remove old files.

* Fix so it compiles with mingw

* Fix gpu tree learner

* Change AddPredictionToScore back to const

* Fix python lint error

* Fix C++ lint errors

* Change eigen to a submodule

* Update comment

* Add the eigen folder

* Try to fix build issues with eigen

* Remove eigen files

* Add eigen as submodule

* Fix include paths

* Exclude eigen files from Python linter

* Ignore eigen folders for pydocstyle

* Fix C++ linting errors

* Fix docs

* Fix docs

* Exclude eigen directories from doxygen

* Update manifest to include eigen

* Update build_r to include eigen files

* Fix compiler warnings

* Store raw feature data as float

* Use float for calculating linear coefficients

* Remove eigen directory from GLOB

* Don't compile linear model code when building R package

* Fix doxygen issue

* Fix lint issue

* Fix lint issue

* Remove uneeded code

* Restore delected lines

* Restore delected lines

* Change return type of has_raw to bool

* Update docs

* Rename some variables and functions for readability

* Make tree_learner parameter const in AddScore

* Fix style issues

* Pass vectors as const reference when setting tree properties

* Make temporary storage of serial_tree_learner mutable so we can make the object's methods const

* Remove get_raw_size, use num_numeric_features instead

* Fix typo

* Make contains_nan_ and any_nan_ properties immutable again

* Remove data_has_nan_ property of tree

* Remove temporary test code

* Make linear_tree a dataset param

* Fix lint error

* Make LinearTreeLearner a separate class

* Fix lint errors

* Fix lint error

* Add linear_tree_learner.o

* Simulate omp_get_max_threads if openmp is not available

* Update PushOneData to also store raw data.

* Cast size to int

* Fix bug in ReshapeRaw

* Speed up code with multithreading

* Use OMP_NUM_THREADS

* Speed up with multithreading

* Update to use ArrayToString

* Fix tests

* Fix test

* Fix bug introduced in merge

* Minor updates

* Update docs

fcfd4132

24 Nov, 2020 1 commit
- [refactor] Reduce code duplication in c_api.cpp (#3539) · 5e24b80b
  Alberto Ferreira authored Nov 24, 2020
```
* Refactor c_api.cpp with template code

* Further cleanup

* Fix whitespace for linter
```
  5e24b80b
21 Sep, 2020 1 commit
- fix sparse multiclass local feature contributions and add test (#3382) · eff287e9
  Ilya Matiach authored Sep 21, 2020
  
  eff287e9
20 Sep, 2020 1 commit

[GPU] Add support for CUDA-based GPU build (#3160) · f7ad9457

Chip Kerchner authored Sep 20, 2020

* Initial CUDA work

* redirect log to python console (#3090)

* redir log to python console

* fix pylint

* Apply suggestions from code review

* Update basic.py

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update c_api.h

* Apply suggestions from code review

* super-minor: better wording
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>

* re-order includes (fixes #3132) (#3133)

* Revert "re-order includes (fixes #3132) (#3133)" (#3153)

This reverts commit 656d2676

* Missing change from previous rebase

* Minor cleanup and removal of development scripts.

* Only set gpu_use_dp on by default for CUDA. Other minor change.

* Fix python lint indentation problem.

* More python lint issues.

* Big lint cleanup - more to come.

* Another large lint cleanup - more to come.

* Even more lint cleanup.

* Minor cleanup so less differences in code.

* Revert is_use_subset changes

* Another rebase from master to fix recent conflicts.

* More lint.

* Simple code cleanup - add & remove blank lines, revert unneccessary format changes, remove added dead code.

* Removed parameters added for CUDA and various bug fix.

* Yet more lint and unneccessary changes.

* Revert another change.

* Removal of unneccessary code.

* temporary appveyor.yml for building and testing

* Remove return value in ReSize

* Removal of unused variables.

* Code cleanup from reviewers suggestions.

* Removal of FIXME comments and unused defines.

* More reviewers comments cleanup.

* Fix config variables.

* Attempt to fix check-docs failure

* Update Paramster.rst for num_gpu

* Removing test appveyor.yml

* Add CUDA_RESOLVE_DEVICE_SYMBOLS to libraries to fix linking issue.

* Fixed handling of data elements less than 2K.

* More reviewers comments cleanup.

* Removal of TODO and fix printing of int64_t

* Add cuda change for CI testing and remove cuda from device_type in python.

* Missed one change form previous check-in

* Removal AdditionConfig and fix settings.

* Limit number of GPUs to one for now in CUDA.

* Update Parameters.rst for previous check-in

* Whitespace removal.

* Cleanup unused code.

* Changed uint/ushort/ulong to unsigned int/short/long to help Windows based CUDA compiler work.

* Lint change from previous check-in.

* Changes based on reviewers comments.

* More reviewer comment changes.

* Adding warning for is_sparse. Revert tmp_subset code. Only return FeatureGroupData if not is_multi_val_

* Fix so that CUDA code will compile even if you enable the SCORE_T_USE_DOUBLE define.

* Reviewer comment cleanup.

* Replace warning with Log message. Removal of some of the USE_CUDA. Fix typo and removal of pragma once.

* Remove PRINT debug for CUDA code.

* Allow to use of multiple GPUs for CUDA.

* More multi-GPUs enablement for CUDA.

* More code cleanup based on reviews comments.

* Update docs with latest config changes.
Co-authored-by: Gordon Fossum <fossum@us.ibm.com>
Co-authored-by: ChipKerchner <ckerchne@linux.vnet.ibm.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

f7ad9457

06 Aug, 2020 1 commit

[Python] / [R] add start_iteration to python predict interface (fix #3058) (#3272) · 82e2ff7a

shiyu1994 authored Aug 06, 2020



* [python] add start_iteration to python predict interface (#3058)

* Apply suggestions from code review

* Update lightgbm_R.h

* Apply suggestions from code review

* Apply suggestions from code review

* fix R interface

* update R documentation
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

82e2ff7a

05 Aug, 2020 1 commit

Fast single row predict API v2 (#3268) · b5027de3

Alberto Ferreira authored Aug 05, 2020

* Fix bug introduced in PR #2992 for Fast predict

* Faster Fast predict API

* Add const to SingleRow Fast methods

b5027de3

29 Jul, 2020 1 commit

[TYPO] DatasetLoader::ConstructFromSampleData (#3258) · 6f339d77

Lucas David authored Jul 29, 2020



* ~ Modified name of method DatasetLoader::CostructFromSampleData to DatasetLoader::ConstructFromSampleData.
& Build passes for Debug, Debug_DLL, DLL and Release (not tested Debug_mpi and Release_mpi).

* ~ Refactored indentations.
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

6f339d77

19 Jul, 2020 1 commit

Change locking strategy of Booster, allow for share and unique locks (#2760) · 1c35c3b9

Joan Fontanals authored Jul 19, 2020



* Add capability to get possible max and min values for a model

* Change implementation to have return value in tree.cpp, change naming to upper and lower bound, move implementation to gdbt.cpp

* Update include/LightGBM/c_api.h
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Change iteration to avoid potential overflow, add bindings to R and Python and a basic test

* Adjust test values

* Consider const correctness and multithreading protection

* Put everything possible as const

* Include shared_mutex, for now as unique_lock

* Update test values

* Put everything possible as const

* Include shared_mutex, for now as unique_lock

* Make PredictSingleRow const and share the lock with other reading threads

* Update test values

* Add test to check that model is exactly the same in all platforms

* Try to parse the model to get the expected values

* Try to parse the model to get the expected values

* Fix implementation, num_leaves can be lower than the leaf_value_ size

* Do not check for num_leaves to be smaller than actual size and get back to test with hardcoded value

* Change test order

* Add gpu_use_dp option in test

* Remove helper test method

* Remove TODO

* Add preprocessing option to compile with c++17

* Update python-package/setup.py
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Remove unwanted changes

* Move option

* Fix problems introduced by conflict fix

* Avoid switching to c++17 and use yamc mutex library to access shared lock functionality

* Add extra yamc include

* Change header order

* some lint fix

* change include order and remove some extra blank lines

* Further fix lint issues

* Update c_api.cpp

* Further fix lint issues

* Move yamc include files to a new yamc folder

* Use standard unique_lock

* Update windows/LightGBM.vcxproj
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

* Update windows/LightGBM.vcxproj.filters
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

* Update windows/LightGBM.vcxproj.filters
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update windows/LightGBM.vcxproj.filters
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update windows/LightGBM.vcxproj.filters
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Fix problems coming from merge conflict resolution
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: joanfontanals <jfontanals@ntent.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

1c35c3b9

16 Jul, 2020 1 commit

Upcast index to size_t in Refit (closes #3227) (#3228) · f5f27ca8

Jan Tilly authored Jul 16, 2020

In the current implementation, the index is an int32, which will segfault with large data sets and a large number of estimators.

f5f27ca8

15 Jul, 2020 2 commits

Feat/optimize single prediction (#2992) · fc79b366

Alberto Ferreira authored Jul 15, 2020

* [performance] Add Fast methods to C API for SingleRow Predictions

 * Add methods to C API to make single-row predictions faster:

   - LGBM_BoosterPredictForMatSingleRowFastInit (setup)
   - LGBM_BoosterPredictForMatSingleRowFast (predict)
   - LGBM_FastConfigFree (cleanup setup outputs)

* Code syle cleanup

* Fix lint errors

* [performance] Revert FastConfig improvement to pass data at init

This reduces optimization by 5% / 30% with this branch but makes it so it can be used for higher level wrappers in MMLSpark.
And outside it as well.

* [performance] Introduce Fast variants for SingleRow predictors.

Although this already provides performance gains by itself for any
callers, two new functions were added to Java's SWIG interfaces to
exploit that AND the GetPrimitiveArrayCritical data fetches.

* [tests/profiling] Profile Fast predict methods

Build with -DBUILD_PROFILING_TESTS=ON and copy the default
model trained on the Higgs dataset from the benchmarks repo

 https://github.com/guolinke/boosting_tree_benchmarks.git



to LightGBM repo root and run the lightgbm_profile_* binaries.

The single instance used is the first row from that dataset.

* Update comment on CMakeLists.

* Fix doxygen-introduced issue (#threads)

* Fix conflicts due to new RowFunctionFromCSR signature in master

* Change FastConfig ncol to int32_t.

* Removed profiling folder

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Doxygen: change new docstrings to double back-quote
Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

fc79b366

feature importance type in saved model file (#3220) · 87d46489

Guolin Ke authored Jul 16, 2020



* feature importance type in saved model file

* fix nullptr

* fixed formatting

* fix python/R

* Update src/c_api.cpp

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* fix c_api test

* fix swig

* minor docs improvements and added defines for importance types
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

87d46489

28 Jun, 2020 1 commit

adding sparse support to TreeSHAP in lightgbm (#3000) · 9f367d11

Ilya Matiach authored Jun 28, 2020

* adding sparse support to TreeSHAP in lightgbm

* updating based on comments

* updated based on comments, used fromiter instead of frombuffer

* updated based on comments

* fixed limits import order

* fix sparse feature contribs to work with more than int32 max rows

* really fixed int64 max error and build warnings

* added sparse test with >int32 max rows

* fixed python side reshape check on sparse data

* updated based on latest comments

* fixed comments

* added CSC INT32_MAX validation to test, fixed comments

9f367d11

11 Jun, 2020 1 commit
- refactor LGBM_DatasetGetFeatureNames (#3022) · f30e0bb3
  Nikita Titov authored Jun 11, 2020
  
  f30e0bb3
05 Jun, 2020 1 commit
- Revert "re-order includes (fixes #3132) (#3133)" (#3153) · ac5f5e56
  Nikita Titov authored Jun 05, 2020
```
This reverts commit 656d2676.
```
  ac5f5e56
01 Jun, 2020 1 commit
- re-order includes (fixes #3132) (#3133) · 656d2676
  James Lamb authored Jun 01, 2020
  
  656d2676
20 May, 2020 1 commit

redirect log to python console (#3090) · dea2391b

Guolin Ke authored May 21, 2020



* redir log to python console

* fix pylint

* Apply suggestions from code review

* Update basic.py

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update c_api.h

* Apply suggestions from code review

* Apply suggestions from code review

* super-minor: better wording
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>

dea2391b

12 Apr, 2020 1 commit
- Fix user locale settings (#2980) · 8c335352
  OMOTO Tsukasa authored Apr 12, 2020
```
This reverts commit 6b68967d (#2891) and fixes #2979.
```
  8c335352
21 Mar, 2020 1 commit
- [ci] fixed cpplint errors about namespace using-directives (#2927) · 3e540eac
  James Lamb authored Mar 21, 2020
  
  3e540eac
20 Mar, 2020 1 commit

Fix SWIG methods that return char** (#2850) · 91185c3a

Alberto Ferreira authored Mar 20, 2020



* [swig] Fix SWIG methods that return char** with StringArray.

+ [new] Add StringArray class to manage and manipulate arrays of fixed-length strings:

  This class is now used to wrap any char** parameters, manage memory and
  manipulate the strings.

  Such class is defined at swig/StringArray.hpp and wrapped in StringArray.i.

+ [API+fix] Wrap LGBM_BoosterGetFeatureNames it resulted in segfault before:

  Added wrapper LGBM_BoosterGetFeatureNamesSWIG(BoosterHandle) that
  only receives the booster handle and figures how much memory to allocate
  for strings and returns a StringArray which can be easily converted to String[].

+ [API+safety] For consistency, LGBM_BoosterGetEvalNamesSWIG was wrapped as well:

  * Refactor to detect any kind of errors and removed all the parameters
    besides the BoosterHandle (much simpler API to use in Java).
  * No assumptions are made about the required string space necessary (128 before).
  * The amount of required string memory is computed internally

+ [safety] No possibility of undefined behaviour

  The two methods wrapped above now compute the necessary string storage space
  prior to allocation, as the low-level C API calls would crash the process
  irreversibly if they write more memory than which is passed to them.

* Changes to C API and wrappers support char**

To support the latest SWIG changes that enable proper char**
return support that is safe, the C API was changed.

The respecive wrappers in R and Python were changed too.

* Cleanup indentation in new lightgbm_R.cpp code

* Adress review code-style comments.

* Update swig/StringArray.hpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update python-package/lightgbm/basic.py
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update src/lightgbm_R.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

91185c3a

17 Mar, 2020 1 commit

Fix Booster read/write locale dependency (#2891) · 6b68967d

Alberto Ferreira authored Mar 17, 2020



* Fix Booster read/write locale dependency

* Address review comments

* Move LocaleContext.h->locale_context.h
Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>

6b68967d

11 Mar, 2020 1 commit
- fixed cpplint errors and disable warning only for VS (#2888) · bd10918e
  Nikita Titov authored Mar 11, 2020
```
* fixed cpplint errors and disable warning only for VS

* wrap more pragma warning
```
  bd10918e
05 Mar, 2020 1 commit

speed up `FindBestThresholdFromHistogram` (#2867) · 77d92b7c

Guolin Ke authored Mar 05, 2020

* speed up for const hessian

* rename template

* some refactorings

* refine

* refine

* simplify codes

* fix random in feature histogram

* code refine

* refine

* try fix

* make gcc happy

* remove timer

* rollback some changes

* more templates

* fix a bug

* reduce the cost of timer

* fix gpu

* fix bug

* fix gpu

77d92b7c

04 Mar, 2020 2 commits
- fix warnings in debug mode about that not all control paths return a value (#2866) · b70636bc
  Nikita Titov authored Mar 04, 2020
  
  b70636bc
- fixed cpplint issues (#2863) · d018d30a
  Nikita Titov authored Mar 04, 2020
```
* fixed cpplint errors

* fixed more cpplint errors
```
  d018d30a
02 Mar, 2020 3 commits

speed up multi-val bin subset for bagging (#2827) · d0bec9e9

Guolin Ke authored Mar 02, 2020

* speed up multi-val bin subset for bagging

* remove the duplicated codes

* code refine

* some codes refactoring

* move `is_constant_hessian` into `TrainingShareStates`

* refine

* fix bug

* fix bug when num_groups_ < 0

* fix gpu

* fix gpu bagging

* fix gpu bug

* typo

* Update src/treelearner/serial_tree_learner.h

d0bec9e9

don't save num_thread as possible (#2839) · 0aa7bfee

Guolin Ke authored Mar 02, 2020



* don't cache `num_thread`, to avoid change outside

* rename

* update document

* Update docs/Parameters.rst

* Update include/LightGBM/config.h

* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

0aa7bfee

introduced specific CHECKs (#2849) · 5a80b788
Nikita Titov authored Mar 02, 2020

5a80b788

25 Feb, 2020 1 commit
- replace std::runtime_error with Log::Fatal (#2816) · 37ce3eb2
  Nikita Titov authored Feb 25, 2020
```
* replace std::runtime_error with Log::Fatal

* Update c_api.cpp
```
  37ce3eb2
24 Feb, 2020 1 commit
- fixed cpplint issues (#2809) · 224b8b98
  Nikita Titov authored Feb 24, 2020
```
* fixed cpplint issues

* fixed cpplint issues
```
  224b8b98