Commits · e797985227a012a837c20eddc457de6b7fc7aeaa · tianlh / LightGBM-DCU

04 Dec, 2023 1 commit
- [python-package] Allow to pass Arrow table and array as init scores (#6167) · f5b6bd60
  Oliver Borchert authored Dec 04, 2023
  
  f5b6bd60
22 Nov, 2023 1 commit
- [python-package] Allow to pass Arrow array as groups (#6166) · 516bde95
  Oliver Borchert authored Nov 22, 2023
  
  516bde95
13 Nov, 2023 1 commit
- [python-package] Allow to pass Arrow array as weights (#6164) · deb70773
  Oliver Borchert authored Nov 13, 2023
  
  deb70773
07 Nov, 2023 1 commit
- [python-package] Allow to pass Arrow array as labels (#6163) · b7f6311f
  Oliver Borchert authored Nov 07, 2023
  
  b7f6311f
01 Nov, 2023 1 commit
- [python-package] Allow to pass Arrow table as training data (#6034) · 3405ee82
  Oliver Borchert authored Nov 01, 2023
  
  3405ee82
14 Feb, 2023 1 commit

feature: Add serialization of reference dataset (#5427) · 0f7983b6

Scott Votaw authored Feb 13, 2023

* Add serialization of reference dataset

* lint and missing file

* Fixes from reviewers

* responded to comments

* revert sdk change

0f7983b6

29 Nov, 2022 1 commit
- Fix OpenMP thread allocation in Linux (#5551) · 4c5d0fbb
  Scott Votaw authored Nov 29, 2022
  
  4c5d0fbb
03 Nov, 2022 1 commit
- [docs] Improve docs: fix consistency of dots in C API and add notes about new... · 9047604b
  Nikita Titov authored Nov 04, 2022
```
[docs] Improve docs: fix consistency of dots in C API and add notes about new ``time-costs`` Python-package build option (#5554)
```
  9047604b
11 Oct, 2022 1 commit
- [python-package][R-package] load parameters from model file (fixes #2613) (#5424) · 8b720844
  José Morales authored Oct 11, 2022
  
  8b720844
10 Aug, 2022 1 commit

feature: Add true streaming APIs to reduce client-side memory usage (#5299) · 0a5c5838

Scott Votaw authored Aug 10, 2022

* Extract streaming to own PR

* small merge fixes and cleanup

* linting fixes

* fix cast warning

* Fix accidental deletion during branch transfer

* responded to initial triage comments

* Added more tests to use create-from-samples APIs

* added mutex and adjusted nclasses logic

* Fix thread-safety for pushing data to sparse bins through Push APIs

* lint and doc fixes

* Small SWIG fix

* nit fix

* Responded to StrikerRUS comments

* fix breaking change after merge with master

* Extract streaming to own PR

* small merge fixes and cleanup

* Fix accidental deletion during branch transfer

* responded to initial triage comments

* Added more tests to use create-from-samples APIs

* Fix rstcheck call in ci

* remove TODOs

* Extract streaming to own PR

* small merge fixes and cleanup

* Fix accidental deletion during branch transfer

* responded to initial triage comments

* Added more tests to use create-from-samples APIs

* Small SWIG fix

* remove ci change

* responded to shiyu1994 comments

* responded to StrikerRUS comments

* Fixes from StrikerRUS comments

0a5c5838

21 Jul, 2022 1 commit

fix: Adjust LGBM_DatasetCreateFromSampledColumn to handle distributed data (#5344) · f94050a4

Scott Votaw authored Jul 21, 2022

* Adjust LGBM_DatasetCreateFromSampledColumn to handle distributed data better

* linting fix

* switch to 1 API with breaking change

* Fix python native call

* more python test fixes

f94050a4

27 Jun, 2022 1 commit

[python-package] check feature names in predict with dataframe (fixes #812) (#4909) · bdb02e05

José Morales authored Jun 27, 2022



* check feature names and order in predict with dataframe

* slice df in predict to remove the target

* scramble features

* handle int column names

* only change column order when needed

* include validate_features param in booster and sklearn estimators

* document validate_features argument

* use all_close in preds checks and check for assertion error to compare different arrays

* perform remapping and checks in cpp

* remove extra logs

* fixes

* revert cpp

* proposal

* remove extra arg

* lint

* restore _data_from_pandas arguments

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* move data conversion to Predictor.predict

* use Vector2Ptr
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

bdb02e05

15 Mar, 2022 1 commit

[c-api][python-package][R-package] expose feature num bin (#5048) · d10372e2

José Morales authored Mar 14, 2022



* expose FeatureNumBin in C api

* parametrize min_data_in_bin and add test with max_bin_by_feature

* include feature_num_bin in R package

* add suggestion from review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update error message and lint

* lint

* add call method

* minor improvements in tests

* add suggestions from review

* lint

* rename argument to feature in python and r packages
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

d10372e2

24 Feb, 2022 1 commit

Correct documentation for sparse predictions (#4979) · 7e478047

david-cortes authored Feb 24, 2022

* Correct documentation for sparse predictions

The documentation says that the parameter `nindptr` for `LGBM_BoosterPredictSparseOutput` should be the number of rows plus one, but this is incorrect when the input type is CSC. This PR fixes it.

* Update c_api.h

* Update c_api.h

* Update c_api.h
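
To illustrate the CSR/CSC distinction this commit refers to, here is a minimal pure-Python sketch. It assumes the standard compressed sparse layouts, where the pointer array has one entry per row plus one for CSR and one entry per column plus one for CSC. The helper names `to_csr` and `to_csc` are hypothetical and are not part of the LightGBM C API.

```python
def to_csr(dense):
    """Return (indptr, indices, data) in CSR layout for a list-of-rows matrix."""
    indptr, indices, data = [0], [], []
    for row in dense:
        for j, value in enumerate(row):
            if value != 0:
                indices.append(j)
                data.append(value)
        # One pointer entry is appended per row.
        indptr.append(len(indices))
    return indptr, indices, data


def to_csc(dense):
    """Return (indptr, indices, data) in CSC layout (CSR of the transpose)."""
    columns = [list(col) for col in zip(*dense)]
    return to_csr(columns)


# Hypothetical 2 x 4 matrix used only for illustration.
dense = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0, 4.0],
]

csr_indptr, _, _ = to_csr(dense)
csc_indptr, _, _ = to_csc(dense)

print(len(csr_indptr))  # number of rows + 1
print(len(csc_indptr))  # number of columns + 1
```

For a non-square matrix the two lengths differ, which is why a description of `nindptr` as "number of rows plus one" cannot hold for CSC input.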

7e478047

30 Dec, 2021 1 commit

[python] raise an informative error instead of segfaulting when custom... · af5b40e1

Yaqub Alwan authored Dec 30, 2021


[python] raise an informative error instead of segfaulting when custom objective produces incorrect output (#4815)

* fix for bad grads causing segfault

* adjust checking criteria to properly reflect reality of multi-class classifiers

* fix styling

* Line break before operator

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* add a note to the C-API docs

* rearrange text slightly

* add some tests to python package

* Update include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* PR comments

* match argument is a regex and our expression has brackets ..

* rework tests

* isorting imports

* updating test to reflect that the python API does not take preds/labels as a fobj function
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

af5b40e1

03 Dec, 2021 1 commit

Add C API function that returns all parameter names with their aliases (#4829) · cf38071b

Nikita Titov authored Dec 03, 2021



* add C API function that returns all param names with aliases

* add C API function that returns all param names with aliases

* add R code

* test R code

* remove debug CI

* fix R lint

* refactor

* run CI

* fix R

* fix

* revert CI checks

* revert changes in docs

* Try to make function `const`
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* add `const` in cpp file

* address review comments and sync with `master`
Co-authored-by: James Lamb <jaylamb20@gmail.com>

cf38071b

15 Nov, 2021 1 commit

[c_api] Improve ANSI compatibility by avoiding <stdbool.h> (#4697) · bfb346c1

Drew Miller authored Nov 15, 2021

* [c_api] Improve ANSI compatibility by avoiding <stdbool.h>

* fixes in response to CI linting

* inline NOLINT instead of separate test

* moving length declaration to non-ANSI C conditional

* [c_api] Align expected return type in `basic.py` with new c_api type.

bfb346c1

21 Oct, 2021 1 commit
- [docs] fix C API docs rendering (#4688) · d88b4456
  Nikita Titov authored Oct 22, 2021
```
* fix C API docs rendering

* place comments before members they describe
```
  d88b4456
05 Oct, 2021 1 commit

allow inclusion in C programs (#4608) · f3037c18

Drew Miller authored Oct 05, 2021

* allow inclusion in C programs

* adding documentation to macro

* Support for ANSI C, _Thread_local where available.

* fix macro for docs

f3037c18

03 Aug, 2021 1 commit

Update c_api LGBM_SampleIndices() comment. (#4490) · 1dbf4382

Chen Yufei authored Aug 04, 2021



* Update c_api LGBM_SampleIndices() comment.

rand.Sample() now returns exactly given number of samples, thus the
comment should be fixed.

* Update include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

1dbf4382

02 Jul, 2021 1 commit

[python-package] Create Dataset from multiple data files (#4089) · c359896e

Chen Yufei authored Jul 02, 2021

* [python-package] create Dataset from sampled data.

* [python-package] create Dataset from List[Sequence].

1. Use random access for data sampling
2. Support read data from multiple input files
3. Read data in batch so no need to hold all data in memory

* [python-package] example: create Dataset from multiple HDF5 file.

* fix: revert is_class implementation for seq

* fix: unwanted memory view reference for seq

* fix: seq is_class accepts sklearn matrices

* fix: requirements for example

* fix: pycode

* feat: print static code linting stage

* fix: linting: avoid shell str regex conversion

* code style: doc style

* code style: isort

* fix ci dependency: h5py on windows

* [py] remove rm files in test seq
https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623

* docs(python): init_from_sample summary

https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389



* remove dataset dump sample data debugging code.

* remove typo fix.

Create separate PR for this.

* fix typo in src/c_api.cpp
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* style(linting): py3 type hint for seq

* test(basic): os.path style path handling

* Revert "feat: print static code linting stage"

This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d.

* feat(python): sequence on validation set

* minor(python): comment

* minor(python): test option hint

* style(python): fix code linting

* style(python): add pydoc for ref_dataset

* doc(python): sequence
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* revert(python): sequence class abc

* chore(python): remove rm_files

* Remove useless static_assert.

* refactor: test_basic test for sequence.

* fix lint complaint.

* remove dataset._dump_text in sequence test.

* Fix reverting typo fix.

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Fix type hint, code and doc style.

* fix failing test_basic.

* Remove TODO about keep constant in sync with cpp.

* Install h5py only when running python-examples.

* Fix lint complaint.

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Doc fixes, remove unused params_str in __init_from_seqs.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Remove unnecessary conda install in windows ci script.

* Keep param as example in dataset_from_multi_hdf5.py

* Add _get_sample_count function to remove code duplication.

* Use batch_size parameter in generate_hdf.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Fix after applying suggestions.

* Fix test, check idx is instance of numbers.Integral.

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Expose Sequence class in Python-API doc.

* Handle Sequence object not having batch_size.

* Fix isort lint complaint.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update docstring to mention Sequence as data input.

* Remove get_one_line in test_basic.py

* Make Sequence an abstract class.

* Reduce number of tests for test_sequence.

* Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.

* empty commit to trigger ci

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.

Also rename total_nrow to num_total_row in c_api.h for consistency.

* Doc about Sequence in docs/Python-Intro.rst.

* Fix: basic.py change LGBM_SampleIndices out_len to int32.

* Add create_valid test case with Dataset from Sequence.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Willian Zhang <willian@willian.email>
Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
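
The three points listed near the top of this commit (random access for sampling, multiple inputs, batched reads) can be sketched in plain Python as below. This is only an illustration under stated assumptions: `InMemorySequence`, `sample_rows`, and `iter_batches` are hypothetical names and are not the LightGBM implementation. The only ideas taken from the commit are that a data source supports `__len__`, `__getitem__`, and a `batch_size`.

```python
import random


class InMemorySequence:
    """Hypothetical random-access data source backed by a Python list."""

    batch_size = 2  # rows read per batch (illustrative value)

    def __init__(self, rows):
        self._rows = rows

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, idx):
        # Supports both a single integer index and a slice.
        return self._rows[idx]


def sample_rows(seq, sample_count, seed=0):
    """Pick rows by random access instead of scanning the whole source."""
    rng = random.Random(seed)
    count = min(sample_count, len(seq))
    indices = sorted(rng.sample(range(len(seq)), count))
    return [seq[i] for i in indices]


def iter_batches(seq):
    """Yield the data in slices of batch_size so it is never held all at once."""
    for start in range(0, len(seq), seq.batch_size):
        yield seq[start:start + seq.batch_size]


seq = InMemorySequence([[i, i * 2] for i in range(5)])
batches = list(iter_batches(seq))
sampled = sample_rows(seq, 3)
print([len(batch) for batch in batches])
print(len(sampled))
```

Several such sources could be processed one after another with the same two helpers, which is the sense in which multiple input files are supported in this sketch.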

c359896e

10 May, 2021 1 commit
- [docs] clarify docs for LGBM_BoosterGetEvalNames and LGBM_BoosterGetEvalCounts... · 08d1ce4b
  James Lamb authored May 10, 2021
```
[docs] clarify docs for LGBM_BoosterGetEvalNames and LGBM_BoosterGetEvalCounts (fixes #4264) (#4270)
```
  08d1ce4b
28 Dec, 2020 1 commit

small code and docs refactoring (#3681) · 5a460846

Nikita Titov authored Dec 29, 2020

* small code and docs refactoring

* Update CMakeLists.txt

* Update .vsts-ci.yml

* Update test.sh

* continue

* continue

* revert stable sort for all-unique values

5a460846

24 Dec, 2020 1 commit

Trees with linear models at leaves (#3299) · fcfd4132

Belinda Trotta authored Dec 24, 2020

* Add Eigen library.

* Working for simple test.

* Apply changes to config params.

* Handle nan data.

* Update docs.

* Add test.

* Only load raw data if boosting=gbdt_linear

* Remove unneeded code.

* Minor updates.

* Update to work with sk-learn interface.

* Update to work with chunked datasets.

* Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.

* Save raw data in binary dataset file.

* Update docs and fix parameter checking.

* Fix dataset loading.

* Add test for regularization.

* Fix bugs when saving and loading tree.

* Add test for load/save linear model.

* Remove unneeded code.

* Fix case where not enough leaf data for linear model.

* Simplify code.

* Speed up code.

* Speed up code.

* Simplify code.

* Speed up code.

* Fix bugs.

* Working version.

* Store feature data column-wise (not fully working yet).

* Fix bugs.

* Speed up.

* Speed up.

* Remove unneeded code.

* Small speedup.

* Speed up.

* Minor updates.

* Remove unneeded code.

* Fix bug.

* Fix bug.

* Speed up.

* Speed up.

* Simplify code.

* Remove unneeded code.

* Fix bug, add more tests.

* Fix bug and add test.

* Only store numerical features

* Fix bug and speed up using templates.

* Speed up prediction.

* Fix bug with regularisation

* Visual studio files.

* Working version

* Only check nans if necessary

* Store coeff matrix as an array.

* Align cache lines

* Align cache lines

* Preallocate coefficient calculation matrices

* Small speedups

* Small speedup

* Reverse cache alignment changes

* Change to dynamic schedule

* Update docs.

* Refactor so that linear tree learner is not a separate class.

* Add refit capability.

* Speed up

* Small speedups.

* Speed up add prediction to score.

* Fix bug

* Fix bug and speed up.

* Speed up dataload.

* Speed up dataload

* Use vectors instead of pointers

* Fix bug

* Add OMP exception handling.

* Change return type of LGBM_BoosterGetLinear to bool

* Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change

* Remove unused internal_parent_ property of tree

* Remove unused parameter to CreateTreeLearner

* Remove reference to LinearTreeLearner

* Minor style issues

* Remove unneeded check

* Reverse temporary testing change

* Fix Visual Studio project files

* Restore LightGBM.vcxproj.filters

* Speed up

* Speed up

* Simplify code

* Update docs

* Simplify code

* Initialise storage space for max num threads

* Move Eigen to include directory and delete unused files

* Remove old files.

* Fix so it compiles with mingw

* Fix gpu tree learner

* Change AddPredictionToScore back to const

* Fix python lint error

* Fix C++ lint errors

* Change eigen to a submodule

* Update comment

* Add the eigen folder

* Try to fix build issues with eigen

* Remove eigen files

* Add eigen as submodule

* Fix include paths

* Exclude eigen files from Python linter

* Ignore eigen folders for pydocstyle

* Fix C++ linting errors

* Fix docs

* Fix docs

* Exclude eigen directories from doxygen

* Update manifest to include eigen

* Update build_r to include eigen files

* Fix compiler warnings

* Store raw feature data as float

* Use float for calculating linear coefficients

* Remove eigen directory from GLOB

* Don't compile linear model code when building R package

* Fix doxygen issue

* Fix lint issue

* Fix lint issue

* Remove unneeded code

* Restore deleted lines

* Restore deleted lines

* Change return type of has_raw to bool

* Update docs

* Rename some variables and functions for readability

* Make tree_learner parameter const in AddScore

* Fix style issues

* Pass vectors as const reference when setting tree properties

* Make temporary storage of serial_tree_learner mutable so we can make the object's methods const

* Remove get_raw_size, use num_numeric_features instead

* Fix typo

* Make contains_nan_ and any_nan_ properties immutable again

* Remove data_has_nan_ property of tree

* Remove temporary test code

* Make linear_tree a dataset param

* Fix lint error

* Make LinearTreeLearner a separate class

* Fix lint errors

* Fix lint error

* Add linear_tree_learner.o

* Simulate omp_get_max_threads if openmp is not available

* Update PushOneData to also store raw data.

* Cast size to int

* Fix bug in ReshapeRaw

* Speed up code with multithreading

* Use OMP_NUM_THREADS

* Speed up with multithreading

* Update to use ArrayToString

* Fix tests

* Fix test

* Fix bug introduced in merge

* Minor updates

* Update docs

fcfd4132

17 Oct, 2020 1 commit

Updated network retry delay strategy to scale (#3306) · c0c65f76

Aakarsh Gopi authored Oct 17, 2020



This allows network retries to scale well with the
number of machines while retaining the existing behavior
for cases with smaller num_machines ( 500 )

Fixes #3301
Co-authored-by: Aakarsh Gopi <aakarsh@vaticlabs.com>

c0c65f76

10 Aug, 2020 1 commit
- [ci][docs] fix current master failures for docs test (#3297) · 9ae2c11b
  Nikita Titov authored Aug 10, 2020
  
  9ae2c11b
06 Aug, 2020 1 commit

[Python] / [R] add start_iteration to python predict interface (fix #3058) (#3272) · 82e2ff7a

shiyu1994 authored Aug 06, 2020



* [python] add start_iteration to python predict interface (#3058)

* Apply suggestions from code review

* Update lightgbm_R.h

* Apply suggestions from code review

* Apply suggestions from code review

* fix R interface

* update R documentation
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

82e2ff7a

05 Aug, 2020 1 commit

Fast single row predict API v2 (#3268) · b5027de3

Alberto Ferreira authored Aug 05, 2020

* Fix bug introduced in PR #2992 for Fast predict

* Faster Fast predict API

* Add const to SingleRow Fast methods

b5027de3

15 Jul, 2020 2 commits

Feat/optimize single prediction (#2992) · fc79b366

Alberto Ferreira authored Jul 15, 2020

* [performance] Add Fast methods to C API for SingleRow Predictions

 * Add methods to C API to make single-row predictions faster:

   - LGBM_BoosterPredictForMatSingleRowFastInit (setup)
   - LGBM_BoosterPredictForMatSingleRowFast (predict)
   - LGBM_FastConfigFree (cleanup setup outputs)

* Code style cleanup

* Fix lint errors

* [performance] Revert FastConfig improvement to pass data at init

This reduces optimization by 5% / 30% with this branch but makes it so it can be used for higher level wrappers in MMLSpark.
And outside it as well.

* [performance] Introduce Fast variants for SingleRow predictors.

Although this already provides performance gains by itself for any
callers, two new functions were added to Java's SWIG interfaces to
exploit that AND the GetPrimitiveArrayCritical data fetches.

* [tests/profiling] Profile Fast predict methods

Build with -DBUILD_PROFILING_TESTS=ON and copy the default
model trained on the Higgs dataset from the benchmarks repo

 https://github.com/guolinke/boosting_tree_benchmarks.git



to LightGBM repo root and run the lightgbm_profile_* binaries.

The single instance used is the first row from that dataset.

* Update comment on CMakeLists.

* Fix doxygen-introduced issue (#threads)

* Fix conflicts due to new RowFunctionFromCSR signature in master

* Change FastConfig ncol to int32_t.

* Removed profiling folder

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix doxygen typo include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Doxygen: change new docstrings to double back-quote
Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

fc79b366

feature importance type in saved model file (#3220) · 87d46489

Guolin Ke authored Jul 16, 2020



* feature importance type in saved model file

* fix nullptr

* fixed formatting

* fix python/R

* Update src/c_api.cpp

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* fix c_api test

* fix swig

* minor docs improvements and added defines for importance types
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

87d46489

28 Jun, 2020 1 commit

adding sparse support to TreeSHAP in lightgbm (#3000) · 9f367d11

Ilya Matiach authored Jun 28, 2020

* adding sparse support to TreeSHAP in lightgbm

* updating based on comments

* updated based on comments, used fromiter instead of frombuffer

* updated based on comments

* fixed limits import order

* fix sparse feature contribs to work with more than int32 max rows

* really fixed int64 max error and build warnings

* added sparse test with >int32 max rows

* fixed python side reshape check on sparse data

* updated based on latest comments

* fixed comments

* added CSC INT32_MAX validation to test, fixed comments

9f367d11

11 Jun, 2020 1 commit
- refactor LGBM_DatasetGetFeatureNames (#3022) · f30e0bb3
  Nikita Titov authored Jun 11, 2020
  
  f30e0bb3
05 Jun, 2020 1 commit
- Revert "re-order includes (fixes #3132) (#3133)" (#3153) · ac5f5e56
  Nikita Titov authored Jun 05, 2020
```
This reverts commit 656d2676.
```
  ac5f5e56
01 Jun, 2020 1 commit
- re-order includes (fixes #3132) (#3133) · 656d2676
  James Lamb authored Jun 01, 2020
  
  656d2676
20 May, 2020 1 commit

redirect log to python console (#3090) · dea2391b

Guolin Ke authored May 21, 2020



* redir log to python console

* fix pylint

* Apply suggestions from code review

* Update basic.py

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update c_api.h

* Apply suggestions from code review

* Apply suggestions from code review

* super-minor: better wording
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: StrikerRUS <nekit94-12@hotmail.com>

dea2391b

16 Apr, 2020 1 commit
- [ci] Changed use of strcpy to snprintf (fixes #1990) (#2973) · 9843506e
  James Lamb authored Apr 16, 2020
```
* [ci] Changed use of strcpy to snprintf

* fix

* fully enable cpplint
```
  9843506e
20 Mar, 2020 1 commit

Fix SWIG methods that return char** (#2850) · 91185c3a

Alberto Ferreira authored Mar 20, 2020



* [swig] Fix SWIG methods that return char** with StringArray.

+ [new] Add StringArray class to manage and manipulate arrays of fixed-length strings:

  This class is now used to wrap any char** parameters, manage memory and
  manipulate the strings.

  Such class is defined at swig/StringArray.hpp and wrapped in StringArray.i.

+ [API+fix] Wrap LGBM_BoosterGetFeatureNames, which resulted in a segfault before:

  Added wrapper LGBM_BoosterGetFeatureNamesSWIG(BoosterHandle) that
  only receives the booster handle and figures how much memory to allocate
  for strings and returns a StringArray which can be easily converted to String[].

+ [API+safety] For consistency, LGBM_BoosterGetEvalNamesSWIG was wrapped as well:

  * Refactor to detect any kind of errors and removed all the parameters
    besides the BoosterHandle (much simpler API to use in Java).
  * No assumptions are made about the required string space necessary (128 before).
  * The amount of required string memory is computed internally

+ [safety] No possibility of undefined behaviour

  The two methods wrapped above now compute the necessary string storage space
  prior to allocation, as the low-level C API calls would crash the process
  irreversibly if they write more memory than is passed to them.

* Changes to C API and wrappers support char**

To support the latest SWIG changes that enable proper char**
return support that is safe, the C API was changed.

The respective wrappers in R and Python were changed too.

* Cleanup indentation in new lightgbm_R.cpp code

* Address review code-style comments.

* Update swig/StringArray.hpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update python-package/lightgbm/basic.py
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update src/lightgbm_R.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

91185c3a

11 Mar, 2020 1 commit
- fixed cpplint errors and disable warning only for VS (#2888) · bd10918e
  Nikita Titov authored Mar 11, 2020
```
* fixed cpplint errors and disable warning only for VS

* wrap more pragma warning
```
  bd10918e
20 Feb, 2020 1 commit

Add capability to get possible max and min values for a model (#2737) · 18e7de4f

Joan Fontanals authored Feb 20, 2020



* Add capability to get possible max and min values for a model

* Change implementation to have return value in tree.cpp, change naming to upper and lower bound, move implementation to gbdt.cpp

* Update include/LightGBM/c_api.h
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Change iteration to avoid potential overflow, add bindings to R and Python and a basic test

* Adjust test values

* Consider const correctness and multithreading protection

* Update test values

* Update test values

* Add test to check that model is exactly the same in all platforms

* Try to parse the model to get the expected values

* Try to parse the model to get the expected values

* Fix implementation, num_leaves can be lower than the leaf_value_ size

* Do not check for num_leaves to be smaller than actual size and get back to test with hardcoded value

* Change test order

* Add gpu_use_dp option in test

* Remove helper test method

* Update src/c_api.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update src/io/tree.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update src/io/tree.cpp
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_basic.py
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Remove imports
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

18e7de4f

19 Feb, 2020 1 commit

[python] [R-package] refine the parameters for Dataset (#2594) · 9f79e840

Guolin Ke authored Feb 19, 2020



* reset

* fix a bug

* fix test

* Update c_api.h

* support to no filter features by min_data

* add warning in reset config

* refine warnings for override dataset's parameter

* some cleans

* clean code

* clean code

* refine C API function doxygen comments

* refined new param description

* refined doxygen comments for R API function

* removed stuff related to int8

* break long line in warning message

* removed tests whose results cannot be validated anymore

* added test for warnings about unchangeable params

* write parameter from dataset to booster

* consider free_raw_data.

* fix params

* fix bug

* implementing R

* fix typo

* filter params in R

* fix R

* not min_data

* refined tests

* fixed linting

* refine

* pylint

* add docstring

* fix docstring

* R lint

* updated description for C API function

* use param aliases in Python

* fixed typo

* fixed typo

* added more params to test

* removed debug print

* fix dataset construct place

* fix merge bug

* Update feature_histogram.hpp

* add is_sparse back

* remove unused parameters

* fix lint

* add data random seed

* update

* [R-package] centralized Dataset parameter aliases and added tests on Dataset parameter updating (#2767)
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

9f79e840