Commits · 55e9446ab5ae4d20700c9a57eb194c5fc60817ed · tianlh / LightGBM-DCU

05 Nov, 2021 1 commit
- [python] improve warning message about aliases in `cv()` function (#4766) · 55e9446a
  Nikita Titov authored Nov 05, 2021
  
30 Oct, 2021 2 commits
- [docs] improve docs about `nthreads` parameter (#4756) · dac0dffe
  Nikita Titov authored Oct 31, 2021
```
* in predict(), respect params set via `set_params()` after fit()

* extract docs changes
```
- [python] Make dummy classes constructible with any arguments (#4749) · 9a7562d6
  Nikita Titov authored Oct 30, 2021
```
* Make dummy classes constructible with any arguments

* Update compat.py
```
27 Oct, 2021 1 commit
- [python][sklearn] Allow non-serializable objects in callbacks argument (#4723) · 311017ae
  Nikita Titov authored Oct 27, 2021
  
26 Oct, 2021 1 commit
- [python] Improve error message for plot_metric with Booster (#4709) · 1cc84b35
  Jacob Stevenson authored Oct 26, 2021
```
* Improve error message for plot_metric with Booster

* Update error message
```
15 Oct, 2021 1 commit
- fix mypy error in engine.py (#4675) · 18a300aa
  Rakshit P authored Oct 15, 2021
  
09 Oct, 2021 1 commit
- [python][sklearn] use `__sklearn_is_fitted__()` in all estimator fitness checks (#4654) · f3987f37
  Nikita Titov authored Oct 10, 2021
  
07 Oct, 2021 1 commit

[python] add type hints to _safe_call (#4641) · 7fa07ee2

strobel authored Oct 07, 2021


Co-authored-by: strobel <thaddaeus.strobel@ai4bd.com>
Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>


05 Oct, 2021 2 commits
- [python][sklearn] add `__sklearn_is_fitted__()` method to be better compatible with scikit-learn API (#4636) · 4b140bcc
  Nikita Titov authored Oct 06, 2021
- add param aliases from scikit-learn (#4637) · e95d5ab8
  Nikita Titov authored Oct 05, 2021
  
25 Sep, 2021 1 commit
- [python] deprecate "auto" value of ylabel argument of plot_metric() function (#4624) · 74c7904b
  Nikita Titov authored Sep 25, 2021
  
23 Sep, 2021 1 commit
- [python] add placeholders to titles in plotting functions (#4614) · b78175b7
  Nikita Titov authored Sep 23, 2021
  
20 Sep, 2021 1 commit
- fix mypy error (#4615) · 702fff22
  Nikita Titov authored Sep 20, 2021
  
17 Sep, 2021 1 commit

[python-package] Support 2d collections as input for `init_score` in multiclass classification task (#4150) · f1f5ba15

José Morales authored Sep 17, 2021

* initial implementation of init_score for multiclass classification

* check for 1d or 2d collection in init_score

* remove dataset import

* initial comments

* update dask test and docstrings

* update docstrings

* move logic to set_field. reshape back on get_field

* add type hints and update docstrings for dask. fix Dataset.set_field

* revert wrong docstrings and type hints

* add extra comma for consistency

* prefix private functions with underscore

add type hints to new functions

make commas consistent in dask and basic

* add missing spaces after type hint

* remove shape condition for dataframe in is_2d_collection
Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>


15 Sep, 2021 1 commit

[python] rename `print_evaluation()` into `log_evaluation()` (#4604) · 54facc4d

Nikita Titov authored Sep 16, 2021

* Update __init__.py

* Update Python-API.rst

* Update engine.py

* Update test_utilities.py

* Update sklearn.py

* Update callback.py

* Update callback.py

* Update callback.py


12 Sep, 2021 1 commit
- [RFC][python] deprecate advanced args of `train()` and `cv()` functions and sklearn wrapper (#4574) · 86bda6f0
  Nikita Titov authored Sep 12, 2021
```
* deprecate advanced args of `train()` and `cv()`

* update Dask test

* improve deducing

* address review comments
```
10 Sep, 2021 1 commit
- [python] [sklearn] respect `eval_at` aliases in keyword arguments (#4599) · 79463dfb
  Nikita Titov authored Sep 10, 2021
  
04 Sep, 2021 1 commit
- [python] deprecate `silent` and standalone `verbose` args. Prefer global `verbose` param (#4577) · 64f15005
  Nikita Titov authored Sep 04, 2021
```
* deprecate `silent` and standalone `verbose` args. Prefer global `verbose` param

* simplify code

* Rephrase warning messages
```
01 Sep, 2021 1 commit
- add 'auto' value for `importance_type` param in plotting (#4570) · 39421265
  Nikita Titov authored Sep 01, 2021
  
30 Aug, 2021 1 commit
- [docs][python] Refer to functions as callable in docstrings (#4575) · 32445aba
  Nikita Titov authored Aug 30, 2021
  
29 Aug, 2021 1 commit
- [dask] Fixed Dask type annotation (#4558) · 053e888d
  Nikita Titov authored Aug 29, 2021
  
28 Aug, 2021 1 commit
- [python][docs] Refer to string type as `str` in docstrings (#4565) · ee5636f1
  Nikita Titov authored Aug 28, 2021
  
27 Aug, 2021 3 commits
- [python] Use double type for `init_score` array when set by predictor (#4510) · 99cc4f2f
  Nikita Titov authored Aug 27, 2021
  
- [python][docs] Refer to string type as `str` and add commas in `list of ...` types (#4557) · c6199311
  Nikita Titov authored Aug 27, 2021
```
* Refer to string type as `str` and add commas in `list of ...` types

* update `libpath.py` too
```
- [docs][python] Improve description of `eval_result` argument in `record_evaluation()` (#4559) · 2067bdc5
  Nikita Titov authored Aug 27, 2021
```
* Update callback.py

* Update engine.py
```
25 Aug, 2021 2 commits

[python] add type hints on train() in engine.py (#4544) · 13fa6d95

James Lamb authored Aug 26, 2021



* [python] add type hints on train() in engine.py

* revert dask.py and sklearn.py changes

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update docs on evals_result contents

* Update python-package/lightgbm/engine.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


[docs] Clarify the fact that predict() on a file does not support saved Datasets (fixes #4034) (#4545) · 417ba192

James Lamb authored Aug 25, 2021

* documentation changes

* add list of supported formats to error message

* add unit tests

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update per review comments

* make references consistent
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


23 Aug, 2021 1 commit

[python] add parameter object_hook to method dump_model (#4533) · 11d7608f

Xavier Dupré authored Aug 24, 2021



* add parameter object_hook to function dump_model (python API)

* eol

* fix syntax

* lint

* better documentation

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


19 Aug, 2021 1 commit
- [python] add type hints to logging functions in basic.py (#4527) · c65a2e33
  James Lamb authored Aug 19, 2021
```
* [python] add type hints to logging functions in basic.py

* add hints on wrapper
```
03 Aug, 2021 2 commits

[dask] find all needed ports in each host at once (fixes #4458) (#4498) · 5fe27d59
José Morales authored Aug 03, 2021
```
* find all needed ports in each worker at once

* lint

* better naming

* use _HostWorkers in test
```

Update c_api LGBM_SampleIndices() comment. (#4490) · 1dbf4382

Chen Yufei authored Aug 04, 2021



* Update c_api LGBM_SampleIndices() comment.

rand.Sample() now returns exactly given number of samples, thus the
comment should be fixed.

* Update include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


31 Jul, 2021 1 commit
- [python][tests] refactor tests with Sequence input (#4495) · 661bde10
  Nikita Titov authored Jul 31, 2021
  
30 Jul, 2021 1 commit

[python] support Dataset.get_data for Sequence input. (#4472) · 1d21d1ad

Chen Yufei authored Jul 31, 2021



* [python] support Dataset.get_data for Sequence input.

* Tweaks according to review comments.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Add test cases.

* fix import order in test_basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


10 Jul, 2021 1 commit
- [python] migrate to pathlib in setup.py and use `absolute()` on paths first (#4444) · 96583ab5
  Nikita Titov authored Jul 10, 2021
```
* use absolute() on paths first

* migrate to pathlib in setup.py
```
08 Jul, 2021 1 commit

[dask] determine output shape of array in predict (fixes #4285) (#4351) · f836fe0c

José Morales authored Jul 08, 2021

* call predict on one row of data to determine output shape

* make DaskLGBMRanker predict method equal to the others

* remove extra drop_axis


07 Jul, 2021 2 commits

[python] allow to pass some params as pathlib.Path objects (#4440) · 90342e92
Nikita Titov authored Jul 07, 2021
```
* allow to pass some params as pathlib.Path objects

* fix lint

* improve indentation
```

[dask] Make output of feature contribution predictions for sparse matrices match those from sklearn estimators (fixes #3881) (#4378) · b09da434

James Lamb authored Jul 07, 2021

* test_classifier working

* adding tests

* docs

* tests

* revert unnecessary changes in tests

* test output type

* linting

* linting

* use from_delayed() instead

* docstring pycodestyle is happy with

* isort

* put pytest skips back

* respect sparse return type

* fix doc

* remove unnecessary dask_array_concatenate()

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update predict_proba() docstring

* remove unnecessary np.array()

* Update python-package/lightgbm/dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix assertion

* fix test use of len()

* restore np.array() in tests

* use np.asarray() instead

* use toarray()

* remove empty functions in compat
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>


05 Jul, 2021 2 commits
- [python] minor refactoring of Python code (#4442) · 7eac5a63
  Nikita Titov authored Jul 05, 2021
```
* Update test_sklearn.py

* Update test_basic.py

* Update dask.py

* Update basic.py

* Update basic.py

* Update basic.py

* Update basic.py

* Update callback.py
```
- [docs][python] add versionadded to Sequence class in Python wrapper (#4441) · 1525cc42
  Nikita Titov authored Jul 05, 2021
  
02 Jul, 2021 1 commit

[python-package] Create Dataset from multiple data files (#4089) · c359896e

Chen Yufei authored Jul 02, 2021

* [python-package] create Dataset from sampled data.

* [python-package] create Dataset from List[Sequence].

1. Use random access for data sampling
2. Support read data from multiple input files
3. Read data in batch so no need to hold all data in memory

* [python-package] example: create Dataset from multiple HDF5 file.

* fix: revert is_class implementation for seq

* fix: unwanted memory view reference for seq

* fix: seq is_class accepts sklearn matrices

* fix: requirements for example

* fix: pycode

* feat: print static code linting stage

* fix: linting: avoid shell str regex conversion

* code style: doc style

* code style: isort

* fix ci dependency: h5py on windows

* [py] remove rm files in test seq
https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623

* docs(python): init_from_sample summary

https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389



* remove dataset dump sample data debugging code.

* remove typo fix.

Create separate PR for this.

* fix typo in src/c_api.cpp
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* style(linting): py3 type hint for seq

* test(basic): os.path style path handling

* Revert "feat: print static code linting stage"

This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d.

* feat(python): sequence on validation set

* minor(python): comment

* minor(python): test option hint

* style(python): fix code linting

* style(python): add pydoc for ref_dataset

* doc(python): sequence
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* revert(python): sequence class abc

* chore(python): remove rm_files

* Remove useless static_assert.

* refactor: test_basic test for sequence.

* fix lint complaint.

* remove dataset._dump_text in sequence test.

* Fix reverting typo fix.

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Fix type hint, code and doc style.

* fix failing test_basic.

* Remove TODO about keep constant in sync with cpp.

* Install h5py only when running python-examples.

* Fix lint complaint.

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Doc fixes, remove unused params_str in __init_from_seqs.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Remove unnecessary conda install in windows ci script.

* Keep param as example in dataset_from_multi_hdf5.py

* Add _get_sample_count function to remove code duplication.

* Use batch_size parameter in generate_hdf.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Fix after applying suggestions.

* Fix test, check idx is instance of numbers.Integral.

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Expose Sequence class in Python-API doc.

* Handle Sequence object not having batch_size.

* Fix isort lint complaint.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update docstring to mention Sequence as data input.

* Remove get_one_line in test_basic.py

* Make Sequence an abstract class.

* Reduce number of tests for test_sequence.

* Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.

* empty commit to trigger ci

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.

Also rename total_nrow to num_total_row in c_api.h for consistency.

* Doc about Sequence in docs/Python-Intro.rst.

* Fix: basic.py change LGBM_SampleIndices out_len to int32.

* Add create_valid test case with Dataset from Sequence.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Willian Zhang <willian@willian.email>
Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
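
The three points listed in the commit message above (random access for sampling, reading from multiple inputs, reading in batches) can be illustrated with a minimal, library-free sketch. `ListBackedSequence`, `sample_rows`, and `iter_batches` are hypothetical names introduced only for illustration; this is not LightGBM's `Sequence` class or its actual implementation, just one way the idea could look under those assumptions.

```python
import random


class ListBackedSequence:
    """Hypothetical random-access data source (illustration only)."""

    batch_size = 4  # assumed batch size for this sketch

    def __init__(self, rows):
        self._rows = rows

    def __getitem__(self, idx):
        # An int returns a single row; a slice returns a batch of rows.
        return self._rows[idx]

    def __len__(self):
        return len(self._rows)


def sample_rows(seqs, sample_count, seed=0):
    """Draw rows by global index across several sources (random access)."""
    total = sum(len(s) for s in seqs)
    rng = random.Random(seed)
    picked = sorted(rng.sample(range(total), min(sample_count, total)))
    out = []
    for g in picked:
        # Map the global index to the source that owns it.
        for s in seqs:
            if g < len(s):
                out.append(s[g])
                break
            g -= len(s)
    return out


def iter_batches(seqs):
    """Yield one slice at a time so only a single batch is held in memory."""
    for s in seqs:
        for start in range(0, len(s), s.batch_size):
            yield s[start:start + s.batch_size]


# Two in-memory "files" standing in for multiple input files.
part_a = ListBackedSequence([[i, i * 2] for i in range(10)])
part_b = ListBackedSequence([[i, i * 2] for i in range(10, 15)])

sampled = sample_rows([part_a, part_b], sample_count=6)
batches = list(iter_batches([part_a, part_b]))
```

In this sketch the sampling step touches only the selected rows, while the batch iterator walks every source in fixed-size slices, which mirrors the split between sampling and full reading described in the commit message.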
