1. 08 Jul, 2021 1 commit
  2. 07 Jul, 2021 1 commit
• James Lamb
      [dask] Make output of feature contribution predictions for sparse matrices... · b09da434
      James Lamb authored
      
      [dask] Make output of feature contribution predictions for sparse matrices match those from sklearn estimators (fixes #3881) (#4378)
      
      * test_classifier working
      
      * adding tests
      
      * docs
      
      * tests
      
      * revert unnecessary changes in tests
      
      * test output type
      
      * linting
      
      * linting
      
      * use from_delayed() instead
      
      * docstring pycodestyle is happy with
      
      * isort
      
      * put pytest skips back
      
      * respect sparse return type
      
      * fix doc
      
      * remove unnecessary dask_array_concatenate()
      
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * update predict_proba() docstring
      
      * remove unnecessary np.array()
      
      * Update python-package/lightgbm/dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * fix assertion
      
      * fix test use of len()
      
      * restore np.array() in tests
      
      * use np.asarray() instead
      
      * use toarray()
      
      * remove empty functions in compat
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      b09da434
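This PR makes the Dask estimators' feature-contribution (`pred_contrib`) output for sparse input match the sklearn estimators', so its tests compare sparse results by densifying with `toarray()` rather than converting through `np.array()`. A minimal sketch of that comparison pattern, with hypothetical CSR matrices standing in for the Dask and local prediction outputs:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-ins for the sparse feature-contribution outputs
# of a Dask estimator and the equivalent local sklearn estimator.
preds_dask = csr_matrix(np.array([[0.5, 0.0, -0.2], [0.0, 0.3, 0.0]]))
preds_local = csr_matrix(np.array([[0.5, 0.0, -0.2], [0.0, 0.3, 0.0]]))

# The return type itself is part of what the updated tests assert.
assert isinstance(preds_dask, csr_matrix)

# Densify via toarray() for a value comparison; np.array() on a sparse
# matrix would wrap the object rather than its contents.
np.testing.assert_allclose(preds_dask.toarray(), preds_local.toarray())
```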
  3. 05 Jul, 2021 1 commit
  4. 28 Jun, 2021 2 commits
• James Lamb
      b918b5b2
• Frank Fineis
      [dask] add support for eval sets and custom eval functions (#4101) · b5502d19
      Frank Fineis authored
      
      
      * es WiP, need to add eval_sample_weight and eval_group
      
      * add weight, group to dask es. WiP.
      
      * dask es reorg
      
      * Update python-package/lightgbm/dask.py
      
      _train_part model.fit args to lines
Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Update tests/python_package_test/test_dask.py
      
      _train_part model.fit args to lines, pt2
Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Update python-package/lightgbm/dask.py
      
      _train_part model.fit args to lines pt3
Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Update tests/python_package_test/test_dask.py
      
      dask_model.fit args to lines
Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Update tests/python_package_test/test_dask.py
Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Update python-package/lightgbm/dask.py
      
      use is instead of id()
Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
* Update python-package/lightgbm/dask.py
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_dask.py
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update python-package/lightgbm/dask.py
Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * applying changes to eval_set PR WiP
      
      * dask support for eval_names, eval_metric, eval_stopping_rounds
      
      * add evals_result checks and other eval_set attribute-related test checks. need to merge master - WiP
      
      * fix lint errors in test_dask.py
      
      * drop group_shape from _lgbmmodel_doc_fit.format for non-rankers, add support for eval_at for dask ranker
      
      * add eval_at to test_dask eval_set ranker tests
      
* add back group_shape to lgbmmodel docs, tighten tests
      
      * drop random eval weights from early stopping, probably causing training to terminate too early
      
      * add eval data templates to sklearn fit docs, add eval data docs to dask
      
      * add n_features to _create_data, eval_set tests stop w/ desirable tree counts
      
      * import alphabetically
      
      * add back get_worker for eval_set error handling
      
      * test_dask argmin typo
      
      * push forgotten eval_names bugfix
      
      * eval_stopping_rounds -> early_stopping_rounds, fix failing non-es test
      
      * change default eval_at to tuple 1-5
      
      * re-drop get_worker
      
      * drop early stopping support from eval_set commits, move eval_set worker check prior to client.submit
      
      * add eval_class_weight and eval_init_score to lightgbm/dask, WiP
      
* clean up eval_set tests, allow user to specify fewer eval_names and class weights than eval_sets
      
      * remove redundant backslash
      
      * lint fixes
      
      * fix eval_at, eval_metric duplication, let eval_at be Iterable not just Tuple
      
      * use all data_outputs for test_eval_set tests
      
      * undo newlines from first pr
      
      * add custom_eval_metric test, correct issue with eval_at and metric names
      
      * move _constant_metric outside of test
      
      * dataset reference names instead of __strings__
      
* add padding to eval_set parts so each part has the same len(eval_set)
      
      * eval set code clean up
      
      * revert n_evals to be max len eval_set across all parts on worker
      
      * pylint errors in _DatasetNames
      
      * more pylint fixes
      
      * pylinting...
      
* add back pytest.mark, mistakenly deleted during merge conflict resolution
      
      * address code review comments
      
      * add _pad_eval_names to handle nondeterministic evals_result_ valid set names
      
      * change not evaluated evals_result_ test criteria
      
      * address fit eval docs issues, switch _DatasetNames to Enum
      
* Update python-package/lightgbm/dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
* update eval_metrics, eval_at dask fit docstr to match sklearn; make tests reflect that l2 (rmse) and logloss are in evals_result_ by default
      
      * address eval_set dict keys naming in docstr and training eval_set naming issue
      
      * in test_dask check for obj-default metric names in eval_results, remove check for training key
      
      * lint fixes for _pad_eval_names
      
* remove unnecessary line break in _pad_eval_names docstring
      
      * use Enum.member syntax not Enum.member.name
      
      * remove str from supported eval_at types
      
      * add whitespace and remove DaskDataframes mention from eval_ param docstrs in _train
      
      * remove "of shape = [n_samples]" from group_shape docs
      
      * add eval_at base_doc in DaskLGBMRanker.fit
      
      * remove excess paren from eval_names docs in _train
      
      * make requested changes to test_dask.py
      
      * remove Optional() wrapper on eval_at
      
      * add _lgbmmodel_doc_custom_eval_note to dask.py fit.__doc__
      
      * fix ordering of .sklearn imports to attempt lint fix
      
      * dask custom eval note to f-string pt1
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * dask custom eval note to f-string pt 2
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * dask custom eval note to f-string pt 3
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      b5502d19
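Several commits above deal with letting users pass fewer eval_names than eval sets and padding in defaults (`_pad_eval_names`). A hypothetical re-implementation of that padding idea, assuming LightGBM's default `valid_<i>` naming for unnamed validation sets:

```python
def pad_eval_names(eval_names, n_eval_sets):
    """Sketch only: fill in default 'valid_<i>' names for any eval sets
    the user did not name, mirroring LightGBM's default naming scheme."""
    names = list(eval_names)
    for i in range(len(names), n_eval_sets):
        names.append(f"valid_{i}")
    return names

pad_eval_names(["holdout"], 3)  # -> ['holdout', 'valid_1', 'valid_2']
```

The real helper in `lightgbm/dask.py` also has to cope with eval sets being split nondeterministically across workers, which this sketch omits.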
  5. 26 Jun, 2021 1 commit
  6. 15 Jun, 2021 1 commit
  7. 15 May, 2021 1 commit
  8. 04 May, 2021 1 commit
  9. 21 Apr, 2021 1 commit
  10. 01 Apr, 2021 1 commit
• jmoralez
      [tests][dask] Add voting_parallel algorithm in tests (fixes #3834) (#4088) · d517ba12
      jmoralez authored
      * include voting_parallel tree_learner in test_regressor, test_classifier and test_ranker
      
      * remove test for warnings and test for error when using feature_parallel
      
* use real names for tree_learner in test and include test for aliases; use the error message in the test for the feature_parallel error
      
      * split all tests with rf in test_classifier
      
      * remove task parametrization for tree_learner aliases test. smaller input data from feature_parallel error
      
      * define task for tree_learner aliases
      d517ba12
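The alias test added here relies on LightGBM accepting alternative spellings of the `tree_learner` values (e.g. `voting_parallel` for `voting`). A small sketch of resolving such aliases; the mapping below is an illustrative subset written for this example, not the library's actual lookup table:

```python
# Illustrative subset of LightGBM's tree_learner values and their aliases.
TREE_LEARNER_ALIASES = {
    "serial": "serial",
    "data": "data", "data_parallel": "data",
    "feature": "feature", "feature_parallel": "feature",
    "voting": "voting", "voting_parallel": "voting",
}

def resolve_tree_learner(name: str) -> str:
    # Normalize case, then map any alias to its short canonical form.
    return TREE_LEARNER_ALIASES[name.lower()]

resolve_tree_learner("voting_parallel")  # -> 'voting'
```

Note that `feature_parallel` resolves fine here, but the commits above also test that the Dask module raises an error for it, since feature-parallel learning is not supported in distributed training.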
  11. 31 Mar, 2021 1 commit
  12. 30 Mar, 2021 1 commit
  13. 29 Mar, 2021 1 commit
  14. 27 Mar, 2021 1 commit
  15. 16 Mar, 2021 1 commit
  16. 10 Mar, 2021 1 commit
  17. 04 Mar, 2021 1 commit
• jmoralez
      [dask] Include support for init_score (#3950) · 37e98782
      jmoralez authored
      * include support for init_score
      
      * use dataframe from init_score and test difference with and without init_score in local model
      
      * revert refactoring
      
      * initial docs. test between distributed models with and without init_score
      
      * remove ranker from tests
      
      * test value for root node and change docs
      
      * comma
      
      * re-include parametrize
      
      * fix incorrect merge
      
      * use single init_score and the booster_ attribute
      
      * use np.float64 instead of float
      37e98782
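The "test value for root node" commit reflects that `init_score` is supplied in raw (margin) score space, not probability space. So for a binary objective, seeding the model with a base rate p means passing its log-odds. A minimal numpy sketch of that standard conversion (not code from this PR):

```python
import numpy as np

def logit(p):
    # Convert a base-rate probability into a raw score (log-odds),
    # the space a binary-objective init_score lives in.
    return np.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p = 0.25
raw = logit(p)
# Round-tripping back through the sigmoid recovers the base rate.
assert np.isclose(sigmoid(raw), p)
```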
  18. 24 Feb, 2021 2 commits
• jmoralez
      [dask] use random ports in network setup (#3823) · 0e576575
      jmoralez authored
      * use socket.bind with port 0 and client.run to find random open ports
      
      * include test for found ports
      
      * find random open ports as default
      
* parametrize local_listen_port. type hint to _find_random_open_port. find open ports only on workers with data.
      
      * make indentation consistent and pass list of workers to client.run
      
      * remove socket import
      
      * change random port implementation
      
      * fix test
      0e576575
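The core trick in this commit, binding to port 0 so the OS assigns a free ephemeral port, needs only the standard library. A simplified stand-in for the module's `_find_random_open_port` helper (in the PR this runs on each worker via `client.run`):

```python
import socket

def find_random_open_port() -> int:
    # Bind to port 0: the OS picks an unused ephemeral port,
    # which we read back before the socket is closed.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))
        return s.getsockname()[1]

port = find_random_open_port()
```

Because the socket is closed before the port is used, there is a small race window in which another process could claim it; the PR's test therefore checks the ports actually found on the workers.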
• Nikita Titov
      7777852a
  19. 23 Feb, 2021 1 commit
  20. 20 Feb, 2021 1 commit
  21. 16 Feb, 2021 1 commit
  22. 15 Feb, 2021 3 commits
  23. 10 Feb, 2021 1 commit
  24. 09 Feb, 2021 1 commit
  25. 07 Feb, 2021 1 commit
  26. 06 Feb, 2021 1 commit
  27. 03 Feb, 2021 4 commits
  28. 29 Jan, 2021 1 commit
  29. 28 Jan, 2021 1 commit
  30. 27 Jan, 2021 1 commit
  31. 26 Jan, 2021 3 commits