Commits · 00f87c52ec24b1f8358c230646685459e31b60b8 · tianlh / LightGBM-DCU

02 Dec, 2021 1 commit
- [python][sklearn] respect parameters for predictions in `init()` and `set_params()` methods (#4822) · f57ef6f4
  Nikita Titov authored Dec 02, 2021
```
* in predict(), respect params set via `set_params()` after fit()

* continue

* add test

* fix return name

* hotfix

* simplify
```
  f57ef6f4
30 Nov, 2021 1 commit
- [python][sklearn] remove `verbose` argument from `fit()` method (#4832) · 4072e9f7
  Nikita Titov authored Dec 01, 2021
  
  4072e9f7
20 Nov, 2021 1 commit

[python] Remove `silent` argument (#4800) · 2caf945f

Nikita Titov authored Nov 21, 2021

* Update test_plotting.py

* Update dask.py

* Update sklearn.py

* Update test_sklearn.py

* Update basic.py

* Update engine.py

* Update test_engine.py

* Update basic.py

* Update basic.py

* Update engine.py

2caf945f

10 Nov, 2021 1 commit

[python][sklearn] respect objective aliases (#4758) · 0a4d1908

Nikita Titov authored Nov 10, 2021

* respect objective aliases

* Update test_sklearn.py

* revert removal of blank lines

* add argument name which is being overwritten in warning message

0a4d1908

05 Nov, 2021 1 commit
- [python][sklearn] add `n_estimators_` and `n_iter_` post-fit attributes (#4753) · aab212a7
  Nikita Titov authored Nov 05, 2021
```
* add n_estimators_ and n_iter_ post-fit attributes

* address review comments
```
  aab212a7
29 Oct, 2021 1 commit
- [tests] [python] add test for non-serializable callback (#4741) · 798dc1d4
  Nikita Titov authored Oct 29, 2021
  
  798dc1d4
10 Sep, 2021 1 commit
- [python] [sklearn] respect `eval_at` aliases in keyword arguments (#4599) · 79463dfb
  Nikita Titov authored Sep 10, 2021
  
  79463dfb
05 Jul, 2021 1 commit

[python] minor refactoring of Python code (#4442) · 7eac5a63

Nikita Titov authored Jul 05, 2021

* Update test_sklearn.py

* Update test_basic.py

* Update dask.py

* Update basic.py

* Update basic.py

* Update basic.py

* Update basic.py

* Update callback.py

7eac5a63

04 Jul, 2021 1 commit
- [python] migrate to pathlib in python tests (#4435) · cff80442
  Nikita Titov authored Jul 04, 2021
  
  cff80442
24 Feb, 2021 1 commit

[dask][python-package] include support for column array as label (#3943) · 5dacd603

jmoralez authored Feb 24, 2021

* include support for column array as label

* remove nested ifs

* fix linting errors

* include tests for sklearn regressors

* include docstring for numpy_1d_array_to_dtype

* include . at end of docstring

* remove pandas import and test for regression, classification and ranking

* check predictions of sklearn models as well

* test training only in dask. drop pandas series tests

* use PANDAS_INSTALLED and pd_Series

* inline imports

* use col array in fit for test_dask

* include review comments

5dacd603

16 Feb, 2021 1 commit
- [ci][python] run isort in CI linting job (#3990) · d6ebd063
  Nikita Titov authored Feb 16, 2021
```
* run isort in CI linting job

* workaround conda compatibility issues
```
  d6ebd063
26 Jan, 2021 2 commits

[python][tests] minor Python tests cleanup (#3860) · 9eeac3c7

Nikita Titov authored Jan 26, 2021

* Update test_engine.py

* Update test_sklearn.py

* Update test_engine.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_engine.py

* Update .vsts-ci.yml

* Update .vsts-ci.yml

* Update test_engine.py

* Update test_dual.py

* Update test_engine.py

* Update .vsts-ci.yml

* Update .vsts-ci.yml

9eeac3c7

[python-package] migrate test_sklearn.py to pytest (#3844) · 439c721a
Thomas J. Fan authored Jan 25, 2021
```
* TST Migrates test_sklearn.py to pytest

* STY Fixes linting

* FIX Adds reason

* ENH Address comments
```
439c721a

10 Nov, 2020 1 commit

[tests][python][sklearn] make sklearn integration test compatible with 0.24 (#3533) · 2315c0d1

Guillaume Lemaitre authored Nov 10, 2020

* TST make sklearn integration test compatible with 0.24

* remove useless import

* remove outdated comment

* order import

* use parametrize_with_checks

* change the reason

* skip constructible if != 0.23

* make tests behave the same across sklearn version

* linter

* address suggestions

2315c0d1

29 Oct, 2020 1 commit

[tests][python] reduce unnecessary data loading in tests (#3486) · 03c4d455

James Lamb authored Oct 29, 2020



* [ci] [python] reduce unnecessary data loading in tests

* add profiling files to gitignore

* just use cache()

* default on cache size

* patch lru_cache on Python 2.7

* linting

* reduce duplicated code

* missing warnings

* fix imports

* fix lru_cache backport

* missing kwargs

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* reduce duplicated code

* cache in test_plotting
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

03c4d455
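
The caching idea in the commit above ("just use cache()") can be sketched with the standard library. This is a minimal illustration, not the code from the commit: `load_dataset` and `load_calls` are hypothetical names, and the loader body is a stand-in for whatever expensive data loading the tests perform.

```python
from functools import lru_cache

# Counts how often the loader body actually runs (illustration only).
load_calls = {"count": 0}


@lru_cache(maxsize=None)
def load_dataset(name):
    """Hypothetical stand-in for an expensive dataset loader."""
    load_calls["count"] += 1
    # A real test suite would read files or build a dataset here.
    return tuple(range(5)), name


first = load_dataset("regression")
second = load_dataset("regression")  # served from the cache, body not re-run
```

Because the cached object is shared between callers, a test that mutates the returned data would affect later tests; the sketch returns immutable tuples for that reason.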

27 Oct, 2020 1 commit

Add support to optimize for NDCG at a given truncation level (#3425) · ba0a1f8d

Pavel Metrikov authored Oct 27, 2020



* Add support to optimize for NDCG at a given truncation level

In order to correctly optimize for NDCG@_k_, one should exclude pairs containing both documents beyond the top-_k_ (as they don't affect NDCG@_k_ when swapped).

* Update rank_objective.hpp

* Apply suggestions from code review
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

* Update rank_objective.hpp

remove the additional branching: get high_rank and low_rank by one "if".

* Update config.h

add description to lambdarank_truncation_level parameter

* Update Parameters.rst

* Update test_sklearn.py

update expected NDCG value for a test, as it was affected by the underlying change in the algorithm

* Update test_sklearn.py

update NDCG@3 reference value

* fix R learning-to-rank tests

* Update rank_objective.hpp

* Update include/LightGBM/config.h
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

* Update Parameters.rst
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

ba0a1f8d
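
The pair-exclusion rule described in the commit above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the code in rank_objective.hpp: the function name `pairs_for_ndcg_at_k` and its arguments are hypothetical, and the sketch only enumerates candidate pairs without computing any gradients.

```python
from itertools import combinations


def pairs_for_ndcg_at_k(scores, labels, truncation_level):
    """Return (more_relevant, less_relevant) index pairs that can affect NDCG@k."""
    # Rank documents by descending score; rank 0 is the top position.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    rank_of = {doc: rank for rank, doc in enumerate(order)}
    pairs = []
    for i, j in combinations(range(len(scores)), 2):
        if labels[i] == labels[j]:
            continue  # equal relevance: swapping the two changes nothing
        if rank_of[i] >= truncation_level and rank_of[j] >= truncation_level:
            continue  # both beyond the top-k: a swap cannot change NDCG@k
        pairs.append((i, j) if labels[i] > labels[j] else (j, i))
    return pairs
```

With four documents and a truncation level of 2, the only pair dropped is the one whose two documents both sit at ranks 2 and 3.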

06 Sep, 2020 1 commit

[Python] Refactor scikit-learn API to allow a list of evaluation metrics (#3254) · afc76d2c

Germán Ramírez-Espinoza authored Sep 07, 2020



* Refactors sklearn API to allow a list of evaluation metrics in the parameter eval_metric of the class (and subclasses of) LGBMModel. Also adds unit tests for this functionality

* Simplify expression to check whether the user passed one or multiple metrics to eval_metric parameter

* Simplify new tests by using custom metrics already defined in the test file

* Update docstring to reflect the fact that the parameter "feval" from the "train" and "cv" functions can also receive a list of callables

* Remove oxford comma from docstrings

Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Use named-parameters to make sure code is compatible with future versions of scikit-learn

Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Remove throwaway return value to make code more succinct
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Move statement to group together the code related to feval

* Avoid modifying original args as it causes errors in scikit-learn tools

For details see: https://github.com/microsoft/LightGBM/pull/2619



* Consolidate multiple eval-metrics unit-tests into one test
Co-authored-by: German I Ramirez-Espinoza <gire@home>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

afc76d2c
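
The "one or multiple metrics" check mentioned in the commit above can be sketched as a small normalization step. This is a hypothetical helper for illustration, not the code in sklearn.py: `split_eval_metrics` is an assumed name, and it only separates built-in metric names (strings) from custom callables.

```python
def split_eval_metrics(eval_metric):
    """Normalize eval_metric to two lists: metric names and custom callables."""
    if eval_metric is None:
        metrics = []
    elif isinstance(eval_metric, (list, tuple)):
        metrics = list(eval_metric)
    else:
        metrics = [eval_metric]  # a single string or a single callable
    # Build new lists instead of modifying the caller's argument.
    builtin = [m for m in metrics if isinstance(m, str)]
    custom = [m for m in metrics if callable(m)]
    return builtin, custom
```

Returning fresh lists keeps the user's original argument untouched, in line with the "avoid modifying original args" item in the same commit.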

02 Sep, 2020 1 commit
- be compatible with check_is_fitted sklearn function (#3329) · ca066d49
  Nikita Titov authored Sep 02, 2020
  
  ca066d49
06 Aug, 2020 1 commit

[Python] / [R] add start_iteration to python predict interface (fix #3058) (#3272) · 82e2ff7a

shiyu1994 authored Aug 06, 2020



* [python] add start_iteration to python predict interface (#3058)

* Apply suggestions from code review

* Update lightgbm_R.h

* Apply suggestions from code review

* Apply suggestions from code review

* fix R interface

* update R documentation
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>

82e2ff7a

30 Jul, 2020 1 commit

[tests][python][scikit-learn] New unit tests and maintenance (#3253) · fed57520

Alex Wozniakowski authored Jul 31, 2020



* [python][scikit-learn] New unit tests and maintenance

* Includes multioutput tests
* Includes RandomizedSearchCV test
* Updates dataset parameters to eliminate FutureWarning

* Change to n_class in load_digits

* Fix spacing

* Changes after review

* Also updates validation split in grid and random search

* Include skipif for classes_ attr

* Updates checks for classes and order
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

fed57520

14 Jul, 2020 1 commit

[python][scikit-learn] Fixes a bug that prevented using multiple eval_metrics in LGBMClassifier (#3222) · 7b8b5151

Germán Ramírez-Espinoza authored Jul 15, 2020

* Fixes a bug that prevented using multiple eval_metrics in LGBMClassifier

* Move bug-fix test to the test_metrics unit-test

* Fix test to avoid issues with existing tests

* Fix coding-style error
Co-authored-by: German I Ramirez-Espinoza <gire@home>

7b8b5151

27 Jun, 2020 1 commit

[python][scikit-learn] new stacking tests and make number of features a property (#3173) · 72849466

Alex authored Jun 28, 2020

* modify attribute and include stacking tests

* backwards compatibility

* check sklearn version

* move stacking import

* Number of input features (#3173)

* Number of input features (#3173)

* Number of input features (#3173)

* Number of input features (#3173)

Split number of features and stacking tests.

* Number of input features (#3173)

Modify test name.

* Number of input features (#3173)

Update stacking tests for review comments.

* Number of input features (#3173)

* Number of input features (#3173)

* Number of input features (#3173)

* Number of input features (#3173)

Modify classifier test.

* Number of input features (#3173)

* Number of input features (#3173)

Check score.

72849466

30 Apr, 2020 1 commit
- Fix loss computation in rank cross-entropy objective (#3031) · 7076cb8a
  sbruch authored Apr 30, 2020
```
* Fix loss computation

* fix test
```
  7076cb8a
25 Apr, 2020 1 commit
- [python][tests] unused and missing imports (#3023) · eedc1a7f
  James Lamb authored Apr 25, 2020
  
  eedc1a7f
10 Apr, 2020 1 commit

[python] Re-enable scikit-learn 0.22+ support (#2949) · c633c6c2

Nikita Titov authored Apr 10, 2020

* Revert "specify the last supported version of scikit-learn (#2637)"

This reverts commit d1002776.

* ban scikit-learn 0.22.0 and skip broken test

* fix updated test

* fix lint test

* Revert "fix lint test"

This reverts commit 8b4db0805fe7a9e7f7eb0be3eac231f85026d196.

c633c6c2

20 Mar, 2020 1 commit

[python] handle RandomState object in Scikit-learn Api (#2904) · cf0a992e

Lukas Pfannschmidt authored Mar 20, 2020



* Add handling of RandomState object, which is standard for sklearn methods.

LightGBM expects an integer seed instead of an object.
If the passed object is a RandomState, we choose a random integer based on its state to seed the underlying low-level code.
While the chosen random integer is only in the range between 1 and 1e10, I expect it to have enough entropy (?) to not matter in practice.

* Add RandomState object to random_state docstring.

* remove blank line

* Use property to handle setting random_state.
This enables setting cloned estimators with the set_params method in sklearn.

* Add docstring to attribute.

* Fix and simplify docstring.

* Add test case.

* Use maximal int for datatype in seed derivation.

* Replace random_state property with interfacing in fit method.
Derives int seed for C code only when fitting and keeps RandomState object as param.

* Adapt unit test to property change.

* Extended test case and docstring
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Add more equality checks (feature importance, best iteration/score).

* Add equality comparison of boosters represented by strings.
Remove useless best_iteration_ comparison (we do not use early_stopping).

* fix whitespace

* Test if two subsequent fits produce different models

* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

cf0a992e
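
The seed derivation described in the commit above can be sketched as follows. This is a minimal sketch assuming NumPy is installed; `derive_int_seed` is a hypothetical helper name, and the upper bound follows the "use maximal int for datatype" item under the assumption that the datatype is a 32-bit integer.

```python
import numpy as np


def derive_int_seed(random_state):
    """Return an integer seed when given a RandomState, else pass the value through."""
    if isinstance(random_state, np.random.RandomState):
        # Draw one integer from the generator; the bound is the int32 maximum.
        return int(random_state.randint(np.iinfo(np.int32).max))
    return random_state  # already an int or None
```

Deriving the integer only at fit time, as the commit describes, lets the estimator keep the original RandomState object as its parameter, so `get_params()` and `set_params()` round-trip unchanged.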

26 Feb, 2020 1 commit

Code refactoring for ranking objective & Faster ndcg_xendcg (#2801) · e676af23

Guolin Ke authored Feb 26, 2020

* code refactoring

* update vcproject

* refine

* fix test

* Update tests/python_package_test/test_sklearn.py

* fix test

e676af23

25 Feb, 2020 1 commit
- [tests][python] fixed pandas deprecation warning in tests (#2819) · 745b54d6
  Nikita Titov authored Feb 25, 2020
```
* fixed pandas deprecation warning in tests

* support old versions of pandas
```
  745b54d6
03 Feb, 2020 1 commit
- [python][tests] fixed typo (#2732) · 85889901
  Nikita Titov authored Feb 03, 2020
```
* Update test_engine.py

* Update test_sklearn.py
```
  85889901
02 Feb, 2020 1 commit

Support both row-wise and col-wise multi-threading (#2699) · 509c2e50

Guolin Ke authored Feb 02, 2020



* commit

* fix a bug

* fix bug

* reset to track changes

* refine the auto choose logic

* sort the time stats output

* fix include

* change multi_val_bin_sparse_threshold

* add cmake

* add _mm_malloc and _mm_free for cross platform

* fix cmake bug

* timer for split

* try to fix cmake

* fix tests

* refactor DataPartition::Split

* fix test

* typo

* formating

* Revert "formating"

This reverts commit 5b8de4f7fb9d975ee23701d276a66d40ee6d4222.

* add document

* [R-package] Added tests on use of force_col_wise and force_row_wise in training (#2719)

* naming

* fix gpu code

* Update include/LightGBM/bin.h
Co-Authored-By: James Lamb <jaylamb20@gmail.com>

* Update src/treelearner/ocl/histogram16.cl

* test: swap compilers for CI

* fix omp

* not avx2

* no aligned for feature histogram

* Revert "refactor DataPartition::Split"

This reverts commit 256e6d9641ade966a1f54da1752e998a1149b6f8.

* slightly refactor data partition

* reduce the memory cost
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

509c2e50

30 Jan, 2020 1 commit

Implementation of XE_NDCG_MART for the ranking task (#2620) · 86530988

sbruch authored Jan 29, 2020

* Implementation of XE_NDCG loss function for ranking.

* Add citation

* Check in example usage for xe_ndcg loss.

* Seed the generator when a seed is provided in the config. Add unit-tests for xe_ndcg

* Update documentation

* Fix indentation

* Address issues raised by reviewers.

* Clean up include statements.

* Fix issues raised by reviewers.

* Regenerate parameters.rst

* Add a note to explain that reproducing xe_ndcg results requires num_threads to be one.

* Introduce objective_seed and use that in rank_xendcg instead of directly using seed

* Change default value of objective_seed

86530988

09 Dec, 2019 1 commit
- [python][sklearn] do not modify args in fit function and minor code cleanup (#2619) · eec60731
  Nikita Titov authored Dec 09, 2019
```
* clean code

* clean code

* do not modify args in fit function

* added test
```
  eec60731
05 Dec, 2019 2 commits

[python] Allow python sklearn interface's fit() to pass init_model to train() (#2447) · f3afe98b

aaiyer authored Dec 05, 2019

* allow python sklearn interface's fit() to pass init_model to train()

* Fix whitespace issues, and change ordering of parameters to be backward compatible

* Formatting fixes

* allow python sklearn interface's fit() to pass init_model to train()

* Fix whitespace issues, and change ordering of parameters to be backward compatible

* Formatting fixes

* Recognize LGBMModel objects for init_model

* simplified condition

* updated docstring

* added test

f3afe98b

[python][R-package] warn users about untransformed values in case of custom obj (#2611) · 69c1c330
Nikita Titov authored Dec 05, 2019

69c1c330

27 Oct, 2019 2 commits
- [tests][python] refined python tests (#2483) · 1f1dc452
  Nikita Titov authored Oct 27, 2019
```
* speed up tests

* more updates

* fixed pylint

* updated tests

* Update test_sklearn.py

* test that indices are sorted internally
```
  1f1dc452
- [python] removed unused pylint directives (#2466) · 00d1e693
  Nikita Titov authored Oct 27, 2019
  
  00d1e693
15 Sep, 2019 1 commit

[python] Bug fix for first_metric_only on early stopping. (#2209) · 84754399

kenmatsu4 authored Sep 16, 2019

* Bug fix for first_metric_only if the first metric is train metric.

* Update bug fix for feval issue.

* Disable feval for first_metric_only.

* Additional test items.

* Fix wrong assertEqual settings & formatting.

* Change dataset of test.

* Fix random seed for test.

* Modify assumed test result due to different sklearn version between CI and local.

* Remove f-string

* Applying variable assumed test result for test.

* Fix flake8 error.

* Modifying in accordance with review comments.

* Modifying for pylint.

* simplified tests

* Deleting error criteria `if eval_metric is None`.

* Delete test items of classification.

* Simplifying if condition.

* Applying first_metric_only for sklearn wrapper.

* Modifying test_sklearn to conform to python 2.x

* Fix flake8 error.

* Additional fix for sklearn and add tests.

* Bug fix and add test cases.

* some refactor

* fixed lint

* Fix duplicated metrics scores to pass the test.

* Fix the case first_metric_only not in params.

* Converting metrics aliases.

* Add comment.

* Modify comment for pylint.

* Modify comment for pydocstyle.

* Using split test set for two eval_set.

* added test case for metric aliases and length checks

* minor style fixes

* fixed rmse name and alias position

* Fix the case metric=[]

* Fix using env.model._train_data_name

* Fix wrong test condition.

* Move initial process to _init() func.

* Modify test setting for test_sklearn & training data matching on callback.py

* test_sklearn.py
-> Fixed an incorrect test case for training.

* callback.py
-> Fixed an incorrect if-statement condition for detecting the test dataset.

* Support composite name metrics.

* Remove metric check process & reduce redundant test cases.

Because #2273 already fixed the order of metrics on the C++ side, the metric check process in callback.py is removed.

* Revised according to the matters pointed out on a review.

* increased code readability

* Fix the issue of order of validation set.

* Changing to OrderedDict from default dict for score result.

* added missed check in cv function for first_metric_only and feval co-occurrence

* keep order only for metrics but not for datasets in best_score

* move OrderedDict initialization to init phase

* fixed minor printing issues

* move first metric detection to init phase and split can be performed without checks

* split only once during callback

* removed excess code

* fixed typo in variable name and squashed ifs

* use setdefault

* hotfix

* fixed failing test

* refined tests

* refined sklearn test

* Making "feval" effective on early stopping.

* allow feval and first_metric_only for cv

* removed unused code

* added tests for feval

* fixed printing

* add note about whitespaces in feval name

* Modifying final iteration process in case valid set is training data.

84754399

03 Sep, 2019 1 commit
- [ci][tests] install joblib for test directly (#2374) · df26b65d
  Nikita Titov authored Sep 03, 2019
  
  df26b65d
24 Aug, 2019 1 commit

normalize the lambdas in lambdamart objective (#2331) · 0dfda826

Guolin Ke authored Aug 25, 2019

* norm the lambda scores

* change default to false

* update doc

* typo

* Update Parameters.rst

* Update config.h

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update rank_objective.hpp

* Update Parameters.rst

* Update config.h

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_sklearn.py

0dfda826

17 Aug, 2019 1 commit

sigmoid_ in grad and hess for rank objective (#2322) · aee92f63

sbruch authored Aug 16, 2019

* Lambdas and hessians need to factor sigmoid_ into the computation. Additionally, the sigmoid function has an arbitrary factor of 2 in the exponent; it is not just non-standard but the gradients are not computed correctly anyway.

* Update unit test

* Also remove a heuristic that normalizes the gradient by the difference in scores.

* Also fix unit test after removing the heuristic

aee92f63
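
The role of the sigmoid parameter in the gradient and hessian, as described in the commit above, can be sketched for a single document pair. This is an illustrative sketch of the standard pairwise logistic loss, not the code in rank_objective.hpp: `pairwise_grad_hess` is a hypothetical name, and the NDCG-based pair weighting used by lambdarank is deliberately left out.

```python
import math


def pairwise_grad_hess(score_high, score_low, sigmoid=1.0):
    """Gradient and hessian of log(1 + exp(-sigmoid * (s_high - s_low)))
    with respect to the score of the more relevant document."""
    delta = score_high - score_low
    # No extra factor of 2 in the exponent; sigmoid alone scales the difference.
    rho = 1.0 / (1.0 + math.exp(sigmoid * delta))
    grad = -sigmoid * rho
    hess = sigmoid * sigmoid * rho * (1.0 - rho)
    return grad, hess
```

The gradient carries one factor of `sigmoid` and the hessian carries two, which is the sense in which both need to factor the parameter into the computation. For very large score differences `math.exp` can overflow, so production code would typically clamp the exponent or use a lookup table.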