Commits · 075513fa27f6541331c6765adb26a0e3d630abb7 · tianlh / LightGBM-DCU

05 May, 2020 1 commit
- [docs] updated docs about output values (#3037) · 796ba803
  Nikita Titov authored May 05, 2020
  
  796ba803
10 Apr, 2020 1 commit

[python] Re-enable scikit-learn 0.22+ support (#2949) · c633c6c2

Nikita Titov authored Apr 10, 2020

* Revert "specify the last supported version of scikit-learn (#2637)"

This reverts commit d1002776.

* ban scikit-learn 0.22.0 and skip broken test

* fix updated test

* fix lint test

* Revert "fix lint test"

This reverts commit 8b4db0805fe7a9e7f7eb0be3eac231f85026d196.

c633c6c2

20 Mar, 2020 1 commit

[python] handle RandomState object in Scikit-learn Api (#2904) · cf0a992e

Lukas Pfannschmidt authored Mar 20, 2020



* Add handling of RandomState object, which is standard for sklearn methods.

LightGBM expects an integer seed instead of an object.
If the passed object is a RandomState, we choose a random integer based on its state to seed the underlying low-level code.
Although the chosen random integer only lies in the range between 1 and 1e10, I expect it to have enough entropy (?) to not matter in practice.

* Add RandomState object to random_state docstring.

* remove blank line

* Use property to handle setting random_state.
This enables setting cloned estimators with the set_params method in sklearn.

* Add docstring to attribute.

* Fix and simplify docstring.

* Add test case.

* Use maximal int for datatype in seed derivation.

* Replace random_state property with interfacing in fit method.
Derives int seed for C code only when fitting and keeps RandomState object as param.

* Adapt unit test to property change.

* Extended test case and docstring
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Add more equality checks (feature importance, best iteration/score).

* Add equality comparison of boosters represented by strings.
Remove useless best_iteration_ comparison (we do not use early_stopping).

* fix whitespace

* Test if two subsequent fits produce different models

* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

cf0a992e
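
The seed derivation described in this commit can be sketched roughly as follows. This is an illustration only, not the code from the commit: `derive_int_seed` is a hypothetical helper, and Python's standard `random.Random` stands in for `numpy.random.RandomState` so that the sketch runs without third-party packages. The upper bound is assumed to be the maximum 32-bit signed integer, in the spirit of the "maximal int for datatype" bullet.

```python
import random

INT32_MAX = 2**31 - 1  # assumed upper bound for the derived seed


def derive_int_seed(random_state):
    """Return an int seed (or None) for code that cannot accept RNG objects.

    Hypothetical helper: ints and None pass through unchanged, while an RNG
    object is asked for one random integer drawn from its current state.
    """
    if random_state is None or isinstance(random_state, int):
        return random_state
    if isinstance(random_state, random.Random):
        # Drawing advances the generator's state, so two consecutive calls
        # with the same object generally yield different seeds.
        return random_state.randrange(INT32_MAX)
    raise TypeError("random_state must be None, an int, or an RNG object")


rng = random.Random(42)
first_seed = derive_int_seed(rng)                # derived at "fit" time
second_seed = derive_int_seed(rng)               # a later "fit" with the same object
same_start = derive_int_seed(random.Random(42))  # fresh object, same initial state
```

Keeping the object as the parameter and deriving the integer only when fitting means that a fresh generator with the same initial state reproduces the first seed, while reusing one generator across fits would be expected to give different seeds, which is what the "two subsequent fits" test bullet appears to check.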

15 Feb, 2020 1 commit
- [python] in sklearn wrapper pass cat features to Dataset constructor (#2763) · 923226b1
  Nikita Titov authored Feb 15, 2020
  
  923226b1
06 Feb, 2020 1 commit

[python] add property: feature_name_ in lgb sklearn api (#2740) · 87b6396d

zhangqibot authored Feb 07, 2020

* add property: feature_name_ in lgb sklearn api

* modify the comments

* fix linting errors and add info about new attribute: feature_name_

87b6396d

19 Dec, 2019 1 commit
- specify the last supported version of scikit-learn (#2637) · d1002776
  Nikita Titov authored Dec 19, 2019
  
  d1002776
09 Dec, 2019 1 commit
- [python][sklearn] do not modify args in fit function and minor code cleanup (#2619) · eec60731
  Nikita Titov authored Dec 09, 2019
```
* clean code

* clean code

* do not modify args in fit function

* added test
```
  eec60731
05 Dec, 2019 2 commits

[python] Allow python sklearn interface's fit() to pass init_model to train() (#2447) · f3afe98b

aaiyer authored Dec 05, 2019

* allow python sklearn interface's fit() to pass init_model to train()

* Fix whitespace issues, and change ordering of parameters to be backward
compatible

* Formatting fixes

* Recognize LGBModel objects for init_model

* simplified condition

* updated docstring

* added test

f3afe98b

[python][R-package] warn users about untransformed values in case of custom obj (#2611) · 69c1c330
Nikita Titov authored Dec 05, 2019

69c1c330

27 Oct, 2019 1 commit
- [python] removed unused pylint directives (#2466) · 00d1e693
  Nikita Titov authored Oct 27, 2019
  
  00d1e693
22 Oct, 2019 1 commit
- [python] handle params aliases centralized (#2489) · 5dcd4be9
  Nikita Titov authored Oct 22, 2019
```
* handle aliases centralized

* convert aliases dict to class
```
  5dcd4be9
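
A minimal sketch of what centralized alias handling could look like is shown below. The class name `ParamAliases`, the helper `pop_param`, and the abbreviated alias table are assumptions made for illustration; they are not taken from the commit.

```python
class ParamAliases:
    """Hypothetical central registry mapping a main parameter name to its aliases."""

    # Abbreviated, illustrative table; a real registry would list many more names.
    aliases = {
        "num_iterations": {"num_iterations", "num_boost_round", "n_estimators"},
        "early_stopping_round": {"early_stopping_round", "early_stopping_rounds"},
    }

    @classmethod
    def get(cls, *names):
        """Return the union of all known aliases for the given main names."""
        result = set()
        for name in names:
            # Unknown names fall back to a one-element set containing themselves.
            result |= cls.aliases.get(name, {name})
        return result


def pop_param(params, main_name, default=None):
    """Strip every alias of main_name from a copy of params; return (value, rest)."""
    rest = dict(params)  # shallow copy, so the caller's dict is left untouched
    value = default
    # If several aliases are present, the last one in sorted order wins.
    for alias in sorted(ParamAliases.get(main_name)):
        if alias in rest:
            value = rest.pop(alias)
    return value, rest


user_params = {"n_estimators": 50, "learning_rate": 0.1}
num_rounds, remaining = pop_param(user_params, "num_iterations", default=100)
```

With a single registry, every call site that needs to recognize a parameter under any of its names can ask the same class instead of repeating its own list of spellings.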
26 Sep, 2019 1 commit
- fixed docstrings (#2451) · a0d7313b
  Nikita Titov authored Sep 26, 2019
  
  a0d7313b
15 Sep, 2019 1 commit

[python] Bug fix for first_metric_only on earlystopping. (#2209) · 84754399

kenmatsu4 authored Sep 16, 2019

* Bug fix for first_metric_only if the first metric is train metric.

* Update bug fix for feval issue.

* Disable feval for first_metric_only.

* Additional test items.

* Fix wrong assertEqual settings & formatting.

* Change dataset of test.

* Fix random seed for test.

* Modify assumed test result due to different sklearn version between CI and local.

* Remove f-string

* Applying variable assumed test result for test.

* Fix flake8 error.

* Modifying in accordance with review comments.

* Modifying for pylint.

* simplified tests

* Deleting error criteria `if eval_metric is None`.

* Delete test items of classification.

* Simplifying if condition.

* Applying first_metric_only for sklearn wrapper.

* Modifying test_sklearn for conforming to python 2.x

* Fix flake8 error.

* Additional fix for sklearn and add tests.

* Bug fix and add test cases.

* some refactor

* fixed lint

* Fix duplicated metrics scores to pass the test.

* Fix the case first_metric_only not in params.

* Converting metrics aliases.

* Add comment.

* Modify comment for pylint.

* Modify comment for pydocstyle.

* Using split test set for two eval_set.

* added test case for metric aliases and length checks

* minor style fixes

* fixed rmse name and alias position

* Fix the case metric=[]

* Fix using env.model._train_data_name

* Fix wrong test condition.

* Move initial process to _init() func.

* Modify test setting for test_sklearn & training data matching on callback.py

* test_sklearn.py
-> A test case for training is wrong, so fixed.

* callback.py
-> A condition of if statement for detecting test dataset is wrong, so fixed.

* Support composite name metrics.

* Remove metric check process & reduce redundant test cases.

Because #2273 fixed the order of metrics in cpp, the metric check process in callback.py is removed.

* Revised according to the matters pointed out on a review.

* increased code readability

* Fix the issue of order of validation set.

* Changing to OrderedDict from default dict for score result.

* added missed check in cv function for first_metric_only and feval co-occurrence

* keep order only for metrics but not for datasets in best_score

* move OrderedDict initialization to init phase

* fixed minor printing issues

* move first metric detection to init phase and split can be performed without checks

* split only once during callback

* removed excess code

* fixed typo in variable name and squashed ifs

* use setdefault

* hotfix

* fixed failing test

* refined tests

* refined sklearn test

* Making "feval" effective on early stopping.

* allow feval and first_metric_only for cv

* removed unused code

* added tests for feval

* fixed printing

* add note about whitespaces in feval name

* Modifying final iteration process in case valid set is training data.

84754399

08 Sep, 2019 1 commit

[python] Improved python tree plots (#2304) · f52be9be

CharlesAuguste authored Sep 08, 2019

* Some basic changes to the plot of the trees to make them readable.

* Squeezed the information in the nodes.

* Added colouring when a dictionary mapping the features to the constraints is passed.

* Fix spaces.

* Added data percentage as an option in the nodes.

* Squeezed the information in the leaves.

* Important information is now in bold.

* Added a legend for the color of monotone splits.

* Changed "split_gain" to "gain" and "internal_value" to "value".

* Squeezed leaves a bit more.

* Changed description in the legend.

* Revert "Squeezed leaves a bit more."

This reverts commit dd8bf14a3ba604b0dfae3b7bb1c64b6784d15e03.

* Increased the readability for the gain.

* Tidied up the legend.

* Added the data percentage in the leaves.

* Added the monotone constraints to the dumped model.

* Monotone constraints are now specified automatically when plotting trees.

* Raise an exception instead of the bug that was here before.

* Removed operators on the branches for a clearer design.

* Small cleaning of the code.

* Setting a monotone constraint on a categorical feature now raises an exception instead of doing nothing.

* Fix bug when monotone constraints are empty.

* Fix another bug when monotone constraints are empty.

* Variable name change.

* Added is / isn't on every edge of the trees.

* Fix test "tree_create_digraph".

* Add new test for plotting trees with monotone constraints.

* Typo.

* Update documentation of categorical features.

* Typo.

* Information in nodes more explicit.

* Used regular strings instead of raw strings.

* Small refactoring.

* Some cleaning.

* Added future statement.

* Changed output for consistency.

* Updated documentation.

* Added comments for colors.

* Changed text on edges for more clarity.

* Small refactoring.

* Modified text in leaves for consistency with nodes.

* Updated default values and documentation for consistency.

* Replaced CHECK with Log::Fatal for user-friendliness.

* Updated tests.

* Typo.

* Simplify imports.

* Swapped count and weight to improve readability of the leaves in the plotted trees.

* Thresholds in bold.

* Made information in nodes written in a specific order.

* Added information to clarify legend.

* Code cleaning.

f52be9be

04 Jun, 2019 2 commits
- [python] fix class_weight (#2199) · b6f65783
  Nikita Titov authored Jun 04, 2019
```
* fixed class_weight

* fixed lint

* added test

* hotfix
```
  b6f65783
- [python] removed unused import and variable (#2213) · 7d03ced3
  Nikita Titov authored Jun 04, 2019
```
* Update sklearn.py

* Update parameter_generator.py
```
  7d03ced3
27 May, 2019 1 commit

[python] fixed picklability of sklearn models with custom obj and updated docstrings for custom obj (#2191) · 2459362a

Nikita Titov authored May 27, 2019

* refactored joblib test

* fixed picklability of sklearn models with custom obj and updated docstrings for custom obj

* pickled model should be able to predict without refitting

2459362a

15 May, 2019 2 commits
- [docs] fixing max_depth param description (#2155) · 3d8770af
  Laurae authored May 15, 2019
```
* PR #1879

* Update docs with parameter_generator.py

* Update wrapper doc for sklearn
```
  3d8770af
- [python] added ability to pass first_metric_only in params (#2175) · f91e5644
  Nikita Titov authored May 15, 2019
```
* added ability to pass first_metric_only in params

* simplified tests

* fixed test

* fixed punctuation
```
  f91e5644
06 May, 2019 1 commit
- added estimator's tags (#2150) · a47782f5
  Nikita Titov authored May 06, 2019
  
  a47782f5
28 Apr, 2019 1 commit
- fixed minor typos (#2119) · 24ad35f7
  Nikita Titov authored Apr 28, 2019
  
  24ad35f7
19 Apr, 2019 2 commits

[python] ignore pandas ordered categorical columns by default (#2115) · d115769c
Nikita Titov authored Apr 19, 2019
```
* ignore pandas ordered categorical columns by default

* fix tests

* fix tests

* added comments
```
d115769c

[docs] Update doc string for pred_contrib (#2116) · 89f2021a

Scott Lundberg authored Apr 18, 2019

* Update doc string for pred_contrib

See comments at the end of #1969

* Update basic.py

* Update basic.py

* update doc strings

* update equals sign in doc string

* strip whitespace and gen rst

* strip whitespace

89f2021a

18 Apr, 2019 1 commit
- [docs] added note about the spoiled probabilities (#2113) · beb35d56
  Nikita Titov authored Apr 18, 2019
  
  beb35d56
25 Mar, 2019 1 commit

[python] Use first_metric_only flag for early_stopping function. (#2049) · 011cc90a

kenmatsu4 authored Mar 25, 2019

* Use first_metric_only flag for early_stopping function.

To apply early stopping using only the first metric, add a first_metric_only flag to the early_stopping function.

* update comment

* Revert "update comment"

This reverts commit 1e75a1a415cc16cfbe795181e148ebfe91469be4.

* added test

* fixed docstring

* cut comment and save one line

* document new feature

011cc90a
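
The effect of a `first_metric_only` flag on early stopping can be illustrated with a small sketch. The function `early_stop_iteration` and its input format are hypothetical, and every metric is assumed to be "lower is better"; this is not the callback shipped in the package.

```python
def early_stop_iteration(history, stopping_rounds, first_metric_only=False):
    """Return the best iteration at which training would stop, or None.

    history: one list per iteration of (metric_name, score) pairs,
    where a lower score is assumed to be better for every metric.
    """
    best = {}  # metric name -> (best score, iteration where it occurred)
    for iteration, results in enumerate(history):
        if first_metric_only:
            results = results[:1]  # ignore everything but the first metric
        for name, score in results:
            if name not in best or score < best[name][0]:
                best[name] = (score, iteration)
            elif iteration - best[name][1] >= stopping_rounds:
                # This metric has not improved for stopping_rounds iterations.
                return best[name][1]
    return None


# "l2" keeps improving, while "l1" stops improving after iteration 1.
history = [
    [("l2", 1.0), ("l1", 0.90)],
    [("l2", 0.8), ("l1", 0.70)],
    [("l2", 0.7), ("l1", 0.75)],
    [("l2", 0.6), ("l1", 0.80)],
    [("l2", 0.5), ("l1", 0.85)],
]
stop_any_metric = early_stop_iteration(history, stopping_rounds=2)
stop_first_only = early_stop_iteration(history, stopping_rounds=2, first_metric_only=True)
```

In this toy history the second metric triggers a stop when all metrics are checked, whereas checking only the first metric lets training run to the end.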

04 Feb, 2019 1 commit

[python] convert datatable to numpy directly (#1970) · 2c9d3320

Guolin Ke authored Feb 05, 2019

* convert datatable to numpy directly

* fix according to comments

* updated more docstrings

* simplified isinstance check

* Update compat.py

2c9d3320

20 Dec, 2018 1 commit

[python] fix creating train_set in fit (#1916) · c9bcba44

Tsukasa OMOTO authored Dec 20, 2018

* [python] fix creating train_set in fit

https://github.com/Microsoft/LightGBM/blob/cc99f0d36ae929eb02b22a072823ab7c6d3155ab/python-package/lightgbm/sklearn.py#L519
may be False even if valid_data[0] is actually X and valid_data[1] is actually y, because `check_X_y` might return copies of X and y.
https://scikit-learn.org/0.20/modules/generated/sklearn.utils.check_X_y.html

cf. https://github.com/Microsoft/LightGBM/pull/451

* use assertIn

c9bcba44
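
The identity problem described in this commit can be reproduced in miniature. In the sketch below, `validate_copy` is a hypothetical stand-in for a validator such as scikit-learn's `check_X_y` that may return copies; plain lists replace arrays so that the example has no dependencies.

```python
def validate_copy(X, y):
    """Hypothetical validator that returns converted copies of its inputs."""
    return [list(row) for row in X], list(y)


X = [[1.0, 2.0], [3.0, 4.0]]
y = [0, 1]
eval_set = [(X, y)]  # the user passes the training data as a validation set

X_checked, y_checked = validate_copy(X, y)

# Comparing against the validated copies misses the match ...
matches_checked = eval_set[0][0] is X_checked and eval_set[0][1] is y_checked
# ... while comparing against the original arguments finds it.
matches_original = eval_set[0][0] is X and eval_set[0][1] is y
```

One way to avoid the missed match, under these assumptions, is to keep references to the original arguments and run the `is` comparison against those rather than against the validated copies.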

25 Nov, 2018 1 commit
- [python] fixed result shape in case of predict_proba with raw_score arg · cc99f0d3
  Nikita Titov authored Nov 25, 2018
  
  cc99f0d3
25 Oct, 2018 1 commit
- [docs] corrected verbose argument description in fit method (#1781) · b2a935fd
  Nikita Titov authored Oct 26, 2018
  
  b2a935fd
16 Oct, 2018 1 commit
- [docs][ci][python] added docstring style test and fixed errors in existing docstrings (#1759) · ccf2570c
  Nikita Titov authored Oct 16, 2018
```
* added docstring style test and fixed errors in existing docstrings

* hotfix

* hotfix

* fix grammar

* hotfix
```
  ccf2570c
09 Oct, 2018 1 commit
- [docs] Fixed some typos in Python API Docs (#1737) · 7949cf51
  Zafarullah Mahmood authored Oct 09, 2018
```
* Fixed some typos in Python API Docs

* FixTypo changed validation set -> sets
```
  7949cf51
03 Oct, 2018 1 commit
- do not modify users params (#1722) · 9bbfffe6
  Nikita Titov authored Oct 03, 2018
  
  9bbfffe6
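
A common way to avoid modifying a caller's parameters is to work on a deep copy, as in the sketch below. The function `train` and the parameter handling inside it are hypothetical and only illustrate the idea named in the commit title.

```python
import copy


def train(params):
    """Hypothetical training entry point that needs to adjust its parameters."""
    params = copy.deepcopy(params)  # private copy, including nested lists
    params.pop("num_boost_round", None)
    params.setdefault("verbose", -1)
    params["metric"].append("l2")  # would leak to the caller without the deep copy
    return params


user_params = {"num_boost_round": 10, "metric": ["l1"]}
effective = train(user_params)
```

A shallow copy would protect the top-level keys but not the nested `metric` list, which is why the sketch uses `copy.deepcopy`.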
28 Sep, 2018 1 commit
- [ci][python] fixes according to scikit-learn 0.20 release (#1707) · f53116af
  Nikita Titov authored Sep 28, 2018
```
* fixed FutureWarning about cv default value

* fixed according to new check_estimator API

* fixed joblib warning
```
  f53116af
25 Sep, 2018 1 commit

[python] break extremely large lines and fix classname (#1698) · 7825084f

Nikita Titov authored Sep 25, 2018

* break extremely large lines in basic.py

* break extremely large lines in callback.py

* break extremely large lines in engine.py

* break extremely large lines in sklearn.py

* hotfixes

7825084f

20 Sep, 2018 1 commit
- removed deprecated code (#1677) · 2c5409ba
  Nikita Titov authored Sep 20, 2018
  
  2c5409ba
19 Sep, 2018 1 commit
- [python][docs] fix the doc of subsample_for_bin default value (#1680) · 9f9f9106
  Chi Su authored Sep 19, 2018
  
  9f9f9106
11 Sep, 2018 1 commit
- Docs & Warning on sparse categorical features (#1636) · a58aca64
  dmitryikh authored Sep 11, 2018
```
* warning on categorical feature with sparse values

* [docs] categorical features note
```
  a58aca64
06 Sep, 2018 1 commit

[python] pass params to _InnerPredictor in train and cv and verbose fix (#1628) · bd3889f7

Nikita Titov authored Sep 06, 2018

* pass params to _InnerPredictor in train and cv

* fixed verbosity param description

* treat silent param as Fatal log level

* create Dataset in refit method silently

* do not overwrite verbose param by silent argument

bd3889f7

29 Aug, 2018 1 commit
- [docs] added note about shap package (#1620) · abd73765
  Nikita Titov authored Aug 29, 2018
  
  abd73765
27 Aug, 2018 1 commit

various improvements around metric param and early_stopping_rounds param description (#1589) · cd6d0583

Nikita Titov authored Aug 27, 2018

* bring consistency and clarity into early_stopping_rounds desc, metric desc and implementation

* hotfix

* hotfix

* used NDCG as default metric for lambdarank task

* fixed missed methods at ReadTheDocs and changed default eval_metric

* left only unique metrics

* fixed comment

cd6d0583