Commits · 37485fff5da92bca8eb1f2c50b1c8fde334b75d5 · tianlh / LightGBM-DCU

07 Feb, 2021 2 commits

[dask] Add support for 'pred_leaf' in Dask estimators (fixes #3792) (#3919) · 37485fff
James Lamb authored Feb 07, 2021
```
* fix tests

* fix tests

* fix test comments

* simplify tests

* Apply suggestions from code review
```
37485fff

[dask] Add unit tests that signatures are the same between Dask and... · 6f127847

GOusignu authored Feb 07, 2021

[dask] Add unit tests that signatures are the same between Dask and scikit-learn estimators (#3911)

* [dask] Add unit tests that signatures are the same between Dask and scikit-learn estimators (fixes microsoft#3907)

6f127847
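
The kind of check this commit describes can be sketched with the standard library's `inspect` module. This is only an illustration: the two classes and the helper `signature_mismatches` below are hypothetical stand-ins, not the estimators or tests from the LightGBM repository.

```python
import inspect


class SklearnLikeEstimator:
    """Hypothetical stand-in for a scikit-learn style estimator."""

    def fit(self, X, y, sample_weight=None):
        return self

    def predict(self, X, raw_score=False):
        return X


class DaskLikeEstimator:
    """Hypothetical stand-in for the Dask counterpart."""

    def fit(self, X, y, sample_weight=None):
        return self

    def predict(self, X, raw_score=False):
        return X


def signature_mismatches(reference, candidate, method_names):
    """Return the names of methods whose signatures differ between two classes."""
    mismatches = []
    for name in method_names:
        ref_sig = inspect.signature(getattr(reference, name))
        cand_sig = inspect.signature(getattr(candidate, name))
        # Signature equality covers parameter names, order, kinds and defaults.
        if ref_sig != cand_sig:
            mismatches.append(name)
    return mismatches


print(signature_mismatches(SklearnLikeEstimator, DaskLikeEstimator, ["fit", "predict"]))
```

A test built this way fails as soon as one class gains, loses, or reorders a parameter, so the two interfaces cannot drift apart without the test failing.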

06 Feb, 2021 1 commit

[dask] Support Dask dataframes with 'category' columns (fixes #3861) (#3908) · fc6b71e0

James Lamb authored Feb 06, 2021



* add support for pandas categorical columns

* remove commented code

* quotes

* syntax error

* fix shape for ranker test

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_dask.py

* trying

* fix tests

* remove unnecessary debugging stuff

* skip accuracy checks on categorical

* use category columns as categorical features
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

fc6b71e0

03 Feb, 2021 2 commits

[dask] remove unused private _client attribute (#3904) · b1e000c0

Nikita Titov authored Feb 03, 2021

* Update test_dask.py

* Update dask.py

* Update .vsts-ci.yml

* Revert "Update .vsts-ci.yml"

This reverts commit 98422be5b5095f0585de333b5b5545356776ef88.

b1e000c0

[dask] remove 'client' kwarg from fit() and predict() (fixes #3808) (#3883) · c3ac77b5

James Lamb authored Feb 02, 2021



* starting on Dask client

* more docs stuff

* fix pickling

* just copy docstrings

* fit docs

* switch test order

* linting

* use client kwarg

* remove inner set_params()

* add type hints

* fix type hints

* remove commented code

* reorder

* fix tests, add client_ property

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix tests

* linting

* simplify
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

c3ac77b5

02 Feb, 2021 1 commit
- rebalance dask.array ranker input (#3892) · a4cae37c
  Frank Fineis authored Feb 02, 2021
  
  a4cae37c
29 Jan, 2021 2 commits

increase client close timeout for Dask tests (#3879) · 217642ca
Nikita Titov authored Jan 29, 2021

217642ca

[dask] fix teardown issues in Dask tests (fixes #3829) (#3869) · 42d1633a

James Lamb authored Jan 28, 2021

* [dask] reduce teardown errors in Dask tests

* azure

* show logs

* try again

* more

* submodules

* try a bunch of sdist tests

* empty commit

* empty commit

* empty commit

* use sh-ubuntu

* 10 sdist tasks

* stuff

* empty commit

* empty commit

* empty commit

* empty commit

* try bdist

* empty commit

* empty commit

* empty commit

* empty commit

* py37

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* python 3.8

* empty commit

* empty commit

* empty commit

* empty commit

* empty commit

* cuda config

* comment out cuda again

* setting timeout

* put client close in the right place

* uncomment CI, make timeout 60

42d1633a

28 Jan, 2021 1 commit

[ci] ignore CUDA-related strings in Python logger test (#3874) · 040b1c54

Nikita Titov authored Jan 28, 2021

* Update test_utilities.py

* Update cuda.yml

* Update test_utilities.py

* Update cuda_tree_learner.cpp

* Update cuda.yml

040b1c54

27 Jan, 2021 1 commit

[dask] add tests on warnings, fix incorrect variable in log (#3865) · d4658fbb

James Lamb authored Jan 26, 2021



* [dask] add tests on warnings, fix incorrect variable in log

* Update tests/python_package_test/test_dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

d4658fbb

26 Jan, 2021 3 commits

[python][tests] minor Python tests cleanup (#3860) · 9eeac3c7

Nikita Titov authored Jan 26, 2021

* Update test_engine.py

* Update test_sklearn.py

* Update test_engine.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_sklearn.py

* Update test_engine.py

* Update .vsts-ci.yml

* Update .vsts-ci.yml

* Update test_engine.py

* Update test_dual.py

* Update test_engine.py

* Update .vsts-ci.yml

* Update .vsts-ci.yml

9eeac3c7

[python-package] respect parameter aliases for network params (#3813) · 9f70e968

James Lamb authored Jan 26, 2021



* [dask] allow parameter aliases for tree_learner and local_listen_port (fixes #3671)

* num_thread too

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* empty commit

* add _choose_param_value

* revert param order change

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update python-package/lightgbm/dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* just import deepcopy

* remove machines aliases

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

9f70e968
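
Alias handling of the sort this commit mentions might look roughly like the sketch below. The name echoes the `_choose_param_value` helper named in the commit message, but the body, the signature, and the `ALIASES` table are assumptions written for illustration, not the code from the Python package.

```python
from copy import deepcopy

# Illustrative alias table (an assumption for this sketch, not read from the library).
ALIASES = {
    "num_threads": ["num_thread", "nthread", "nthreads", "n_jobs"],
    "local_listen_port": ["local_port", "port"],
    "tree_learner": ["tree", "tree_type", "tree_learner_type"],
}


def choose_param_value(main_name, params, default_value):
    """Return a copy of params in which only the main parameter name is set.

    Precedence: the main name if present, else the first alias found,
    else the default value.
    """
    params = deepcopy(params)  # never mutate the caller's dictionary
    aliases = ALIASES.get(main_name, [])

    if main_name not in params:
        for alias in aliases:
            if alias in params:
                params[main_name] = params[alias]
                break
        else:
            params[main_name] = default_value

    # Drop every alias so downstream code sees a single, canonical key.
    for alias in aliases:
        params.pop(alias, None)
    return params


print(choose_param_value("num_threads", {"n_jobs": 4}, default_value=1))
```

Working on a deep copy keeps the caller's dictionary untouched, which matters when the same parameter dictionary is reused across estimators.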

[python-package] migrate test_sklearn.py to pytest (#3844) · 439c721a
Thomas J. Fan authored Jan 25, 2021
```
* TST Migrates test_sklearn.py to pytest

* STY Fixes linting

* FIX Adds reason

* ENH Address comments
```
439c721a

25 Jan, 2021 3 commits

[dask] merge local_predict tests into other tests (fixes #3833) (#3842) · 02e4b791

Shrill Shrestha authored Jan 25, 2021



* Merge test_<est>_local_predict and test_<est> tests for Dask module #3833

* Merge test_<est>_local_predict to test_<est> tests in dask module - refactor #3833

* Update test_classifier and rename variables resolves #3833

* rename variables resolves #3833

* manage precision error #3833
Co-authored-by: James Lamb <jaylamb20@gmail.com>

02e4b791

[dask] factor dask-ml out of tests (fixes #3796) (#3849) · 0297719c

James Lamb authored Jan 25, 2021



* [dask] factor dask-ml out of tests (fixes #3796)

* Update tests/python_package_test/test_dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

0297719c

[dask][tests] skip Dask tests when Dask is not installed and improve imports in Dask tests (#3852) · f21e0efc
Nikita Titov authored Jan 25, 2021
```
* Update test_dask.py

* Update test_dask.py
```
f21e0efc

24 Jan, 2021 3 commits
- [python] Allow to register custom logger in Python-package (#3820) · b7ccdaf0
  Nikita Titov authored Jan 24, 2021
```
* centralize Python-package logging in one place

* continue

* fix test name

* removed unused import

* enhance test

* fix lint

* hotfix test

* workaround for GPU test

* remove custom logger from Dask-package

* replace one log func with flags by multiple funcs
```
  b7ccdaf0
- [dask][tests] reduce code duplication in Dask tests (#3828) · ac706e10
  Nikita Titov authored Jan 24, 2021
  
  ac706e10
- [dask][tests] move make_ranking into utils (#3827) · da443871
  Nikita Titov authored Jan 24, 2021
```
* move make_ranking into utils

* do not cache
```
  da443871
23 Jan, 2021 1 commit
- [python][tests] transfer test_save_and_load_linear to test_engine (#3821) · e754f23a
  Nikita Titov authored Jan 23, 2021
  
  e754f23a
22 Jan, 2021 7 commits

[python-package] migrate test_plotting.py to pytest (#3811) · b6386842
Thomas J. Fan authored Jan 22, 2021
```
* TST Migrates test_plotting.py to pytest

* STY Fixes linting
```
b6386842
[dask] Address flaky test_ranker tests (#3819) · bf22a25d
Frank Fineis authored Jan 22, 2021

bf22a25d

[dask] [python-package] use keyword args for internal function calls (#3755) · b8fc476e

James Lamb authored Jan 22, 2021



* [dask] use keyword args for internal function calls

* add missing comma

* Update python-package/lightgbm/dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* revert whitespace changes

* test style
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

b8fc476e

[python][tests] use default tolerance for dual GPU+CPU test (#3810) · 477cbf37

Nikita Titov authored Jan 22, 2021

* Update test_dual.py

* Update .vsts-ci.yml

* Update .vsts-ci.yml

* Update test_dual.py

* Update .vsts-ci.yml

* Update .vsts-ci.yml

* Update .vsts-ci.yml

477cbf37

[dask] Support pred_contrib in Dask predict() methods (fixes #3713) (#3774) · d9a96c90

James Lamb authored Jan 22, 2021



* adding pred_contrib support

* add tests

* linting

* remove raw_score

* add pred kwargs

* faster tests

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* changes to tests

* Update tests/python_package_test/test_dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

d9a96c90

[python-package] [dask] Add DaskLGBMRanker (#3708) · 3c7e7e0b

Frank Fineis authored Jan 21, 2021



* ranker support wip

* fix ranker tests

* fix _make_ranking rnd gen bug, add sleep to help w stoch binding port failed exceptions

* add wait_for_workers to prevent Binding port exception

* another attempt to stabilize test_dask.py

* requested changes: docstrings, dask_ml, tuples for list_of_parts

* fix lint bug, add group param to test_ranker_local_predict

* decorator to skip tests with errors on fixture teardown

* remove gpu ranker tests, reduce make_ranking data complexity

* another attempt to silence client, decorator does not silence fixture errors

* address requested changes on 1/20/20

* skip test_dask for all GPU tasks

* address changes requested on 1/21/21

* issubclass instead of __qualname__
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* parity in group docstr with sklearn
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* _make_ranking docstr cleanup
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

3c7e7e0b

[python][tests] migrate test_engine.py to pytest (#3800) · 6dbe736e
Thomas J. Fan authored Jan 21, 2021
```
* TST Migrates test_engine.py to pytest

* ENH Apply suggestions

* ENH Uses temp path

* ENH Fixes typos
```
6dbe736e

21 Jan, 2021 2 commits
- [python][tests] remove unused import (#3806) · 9cc3777c
  Nikita Titov authored Jan 21, 2021
  
  9cc3777c
- TST Migrates test_consistency.py to pytest (#3798) · d4d05fa0
  Thomas J. Fan authored Jan 21, 2021
  
  d4d05fa0
20 Jan, 2021 1 commit

[dask] allow parameter aliases for local_listen_port, num_threads,... · d107872a

James Lamb authored Jan 20, 2021


[dask] allow parameter aliases for local_listen_port, num_threads, tree_learner (fixes #3671) (#3789)

* [dask] allow parameter aliases for tree_learner and local_listen_port (fixes #3671)

* num_thread too

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* empty commit
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

d107872a

19 Jan, 2021 1 commit
- [dask] fix Dask import order (#3788) · 4007b34f
  James Lamb authored Jan 19, 2021
  
  4007b34f
18 Jan, 2021 1 commit
- [dask] reduce test times (#3786) · c871496a
  James Lamb authored Jan 18, 2021
```
* speed up tests

* [dask] reduce test times
```
  c871496a
15 Jan, 2021 3 commits

completely remove tempfile from test_basic (#3767) · f2695dab
Nikita Titov authored Jan 15, 2021

f2695dab

[dask] [python-package] Search for available ports when setting up network (fixes #3753) (#3766) · f6d2dce4

James Lamb authored Jan 15, 2021



* starting work

* fixed port-binding issue on localhost

* minor cleanup

* updates

* getting closer

* definitely working for LocalCluster

* it works, it works

* docs

* add tests

* removing testing-only files

* linting

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* remove duplicated code

* remove unnecessary listen()
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

f6d2dce4
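
One common way to search for an available port, as this commit's title describes, is to bind a socket to port 0 and let the operating system pick a free one. The sketch below uses only the standard library; the function name `find_open_port` and the default host are assumptions for illustration and are not taken from the LightGBM source.

```python
import socket


def find_open_port(host="127.0.0.1"):
    """Ask the operating system for a currently unused TCP port on the given host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the OS to choose any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_open_port()
print(f"found open port: {port}")
```

The port is only known to be free at the moment of the call; another process could claim it before it is used, so callers typically retry on a bind failure.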

[python][tests] Migrates test_basic.py to use pytest (#3764) · 9bacf03c
Thomas J. Fan authored Jan 15, 2021
```
* TST Migrates test_basic.py to use pytest

* STY Linting

* CI Force CI to run
```
9bacf03c

04 Jan, 2021 1 commit
- [python][tests] small Python tests cleanup (#3715) · 69798c3e
  Nikita Titov authored Jan 04, 2021
  
  69798c3e
03 Jan, 2021 1 commit

[R-package] allow access to params in Booster (#3662) · 532fa914

James Lamb authored Jan 03, 2021

* [R-package] allow access to params in Booster

* remove unnecessary whitespace

* fix test on resetting params

* remove pytest_cache

* Update R-package/tests/testthat/test_custom_objective.R

532fa914

28 Dec, 2020 1 commit

small code and docs refactoring (#3681) · 5a460846

Nikita Titov authored Dec 29, 2020

* small code and docs refactoring

* Update CMakeLists.txt

* Update .vsts-ci.yml

* Update test.sh

* continue

* continue

* revert stable sort for all-unique values

5a460846

24 Dec, 2020 1 commit

Trees with linear models at leaves (#3299) · fcfd4132

Belinda Trotta authored Dec 24, 2020

* Add Eigen library.

* Working for simple test.

* Apply changes to config params.

* Handle nan data.

* Update docs.

* Add test.

* Only load raw data if boosting=gbdt_linear

* Remove unneeded code.

* Minor updates.

* Update to work with sk-learn interface.

* Update to work with chunked datasets.

* Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters.

* Save raw data in binary dataset file.

* Update docs and fix parameter checking.

* Fix dataset loading.

* Add test for regularization.

* Fix bugs when saving and loading tree.

* Add test for load/save linear model.

* Remove unneeded code.

* Fix case where not enough leaf data for linear model.

* Simplify code.

* Speed up code.

* Speed up code.

* Simplify code.

* Speed up code.

* Fix bugs.

* Working version.

* Store feature data column-wise (not fully working yet).

* Fix bugs.

* Speed up.

* Speed up.

* Remove unneeded code.

* Small speedup.

* Speed up.

* Minor updates.

* Remove unneeded code.

* Fix bug.

* Fix bug.

* Speed up.

* Speed up.

* Simplify code.

* Remove unneeded code.

* Fix bug, add more tests.

* Fix bug and add test.

* Only store numerical features

* Fix bug and speed up using templates.

* Speed up prediction.

* Fix bug with regularisation

* Visual studio files.

* Working version

* Only check nans if necessary

* Store coeff matrix as an array.

* Align cache lines

* Align cache lines

* Preallocation coefficient calculation matrices

* Small speedups

* Small speedup

* Reverse cache alignment changes

* Change to dynamic schedule

* Update docs.

* Refactor so that linear tree learner is not a separate class.

* Add refit capability.

* Speed up

* Small speedups.

* Speed up add prediction to score.

* Fix bug

* Fix bug and speed up.

* Speed up dataload.

* Speed up dataload

* Use vectors instead of pointers

* Fix bug

* Add OMP exception handling.

* Change return type of LGBM_BoosterGetLinear to bool

* Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change

* Remove unused internal_parent_ property of tree

* Remove unused parameter to CreateTreeLearner

* Remove reference to LinearTreeLearner

* Minor style issues

* Remove unneeded check

* Reverse temporary testing change

* Fix Visual Studio project files

* Restore LightGBM.vcxproj.filters

* Speed up

* Speed up

* Simplify code

* Update docs

* Simplify code

* Initialise storage space for max num threads

* Move Eigen to include directory and delete unused files

* Remove old files.

* Fix so it compiles with mingw

* Fix gpu tree learner

* Change AddPredictionToScore back to const

* Fix python lint error

* Fix C++ lint errors

* Change eigen to a submodule

* Update comment

* Add the eigen folder

* Try to fix build issues with eigen

* Remove eigen files

* Add eigen as submodule

* Fix include paths

* Exclude eigen files from Python linter

* Ignore eigen folders for pydocstyle

* Fix C++ linting errors

* Fix docs

* Fix docs

* Exclude eigen directories from doxygen

* Update manifest to include eigen

* Update build_r to include eigen files

* Fix compiler warnings

* Store raw feature data as float

* Use float for calculating linear coefficients

* Remove eigen directory from GLOB

* Don't compile linear model code when building R package

* Fix doxygen issue

* Fix lint issue

* Fix lint issue

* Remove unneeded code

* Restore deleted lines

* Restore deleted lines

* Change return type of has_raw to bool

* Update docs

* Rename some variables and functions for readability

* Make tree_learner parameter const in AddScore

* Fix style issues

* Pass vectors as const reference when setting tree properties

* Make temporary storage of serial_tree_learner mutable so we can make the object's methods const

* Remove get_raw_size, use num_numeric_features instead

* Fix typo

* Make contains_nan_ and any_nan_ properties immutable again

* Remove data_has_nan_ property of tree

* Remove temporary test code

* Make linear_tree a dataset param

* Fix lint error

* Make LinearTreeLearner a separate class

* Fix lint errors

* Fix lint error

* Add linear_tree_learner.o

* Simulate omp_get_max_threads if openmp is not available

* Update PushOneData to also store raw data.

* Cast size to int

* Fix bug in ReshapeRaw

* Speed up code with multithreading

* Use OMP_NUM_THREADS

* Speed up with multithreading

* Update to use ArrayToString

* Fix tests

* Fix test

* Fix bug introduced in merge

* Minor updates

* Update docs

fcfd4132

22 Dec, 2020 1 commit

[python] [dask] add initial dask integration (#3515) · d90a16d5

Jan Stiborek authored Dec 23, 2020

* migrated implementation from dask/dask-lightgbm

* relaxed tests

* tests skipped in case that MPI is used

* fixed python 2.7 import + tests disabled on windows

* python < 3.6 is not supported in tests

* tests enabled only for linux

* tests disabled for mpi interface

* dask version pinned to >= 2.0

* added @jameslamb as code owner

* added missing pandas dependency

* code refactoring, removed code duplication - lightgbm.dask.LGBMClassifier.fit is the same as lightgbm.dask.LGBMRegressor.fit

* fixed refactoring

* code deduplication - fit method moved into mixin class

* fixed CODEOWNERS

* removed unnecessary import

* skip the module execution on python < 3.6 and on platform different than linux.

* removed skip for python < 3.6

* review comments

* removed noqa, renamed API classes, renamed local variables

d90a16d5