- 12 Jun, 2022 1 commit
-
-
James Lamb authored
-
- 10 May, 2022 1 commit
-
-
Nikita Titov authored
* Update dataset_loader.cpp * Update gbdt.h * Update regression_objective.hpp * Update linker_topo.cpp * Update xentropy_objective.hpp * Update regression_objective.hpp * investigate inf test failure * avoid overflow in regression objective * remove `test_inf_handle` test Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
-
- 23 Mar, 2022 1 commit
-
-
shiyu1994 authored
* new cuda framework * add histogram construction kernel * before removing multi-gpu * new cuda framework * tree learner cuda kernels * single tree framework ready * single tree training framework * remove comments * boosting with cuda * optimize for best split find * data split * move boosting into cuda * parallel synchronize best split point * merge split data kernels * before code refactor * use tasks instead of features as units for split finding * refactor cuda best split finder * fix configuration error with small leaves in data split * skip histogram construction of too small leaf * skip split finding of invalid leaves stop when no leaf to split * support row wise with CUDA * copy data for split by column * copy data from host to CPU by column for data partition * add synchronize best splits for one leaf from multiple blocks * partition dense row data * fix sync best split from task blocks * add support for sparse row wise for CUDA * remove useless code * add l2 regression objective * sparse multi value bin enabled for CUDA * fix cuda ranking objective * support for number of items <= 2048 per query * speedup histogram construction by interleaving global memory access * split optimization * add cuda tree predictor * remove comma * refactor objective and score updater * before use struct * use structure for split information * use structure for leaf splits * return CUDASplitInfo directly after finding best split * split with CUDATree directly * use cuda row data in cuda histogram constructor * clean src/treelearner/cuda * gather shared cuda device functions * put shared CUDA functions into header file * change smaller leaf from <= back to < for consistent result with CPU * add tree predictor * remove useless cuda_tree_predictor * predict on CUDA with pipeline * add global sort algorithms * add global argsort for queries with many items in ranking tasks * remove limitation of maximum number of items per query in ranking * add cuda metrics * fix CUDA AUC * 
remove debug code * add regression metrics * remove useless file * don't use mask in shuffle reduce * add more regression objectives * fix cuda mape loss add cuda xentropy loss * use template for different versions of BitonicArgSortDevice * add multiclass metrics * add ndcg metric * fix cross entropy objectives and metrics * fix cross entropy and ndcg metrics * add support for customized objective in CUDA * complete multiclass ova for CUDA * separate cuda tree learner * use shuffle based prefix sum * clean up cuda_algorithms.hpp * add copy subset on CUDA * add bagging for CUDA * clean up code * copy gradients from host to device * support bagging without using subset * add support of bagging with subset for CUDAColumnData * add support of bagging with subset for dense CUDARowData * refactor copy sparse subrow * use copy subset for column subset * add reset train data and reset config for CUDA tree learner add deconstructors for cuda tree learner * add USE_CUDA ifdef to cuda tree learner files * check that dataset doesn't contain CUDA tree learner * remove printf debug information * use full new cuda tree learner only when using single GPU * disable all CUDA code when using CPU version * recover main.cpp * add cpp files for multi value bins * update LightGBM.vcxproj * update LightGBM.vcxproj fix lint errors * fix lint errors * fix lint errors * update Makevars fix lint errors * fix the case with 0 feature and 0 bin fix split finding for invalid leaves create cuda column data when loaded from bin file * fix lint errors hide GetRowWiseData when cuda is not used * recover default device type to cpu * fix na_as_missing case fix cuda feature meta information * fix UpdateDataIndexToLeafIndexKernel * create CUDA trees when needed in CUDADataPartition::UpdateTrainScore * add refit by tree for cuda tree learner * fix test_refit in test_engine.py * create set of large bin partitions in CUDARowData * add histogram construction for columns with a large number of bins * add find 
best split for categorical features on CUDA * add bitvectors for categorical split * cuda data partition split for categorical features * fix split tree with categorical feature * fix categorical feature splits * refactor cuda_data_partition.cu with multi-level templates * refactor CUDABestSplitFinder by grouping task information into struct * pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder * fix misuse of reference * remove useless changes * add support for path smoothing * virtual destructor for LightGBM::Tree * fix overlapped cat threshold in best split infos * reset histogram pointers in data partition and spllit finder in ResetConfig * comment useless parameter * fix reverse case when na is missing and default bin is zero * fix mfb_is_na and mfb_is_zero and is_single_feature_column * remove debug log * fix cat_l2 when one-hot fix gradient copy when data subset is used * switch shared histogram size according to CUDA version * gpu_use_dp=true when cuda test * revert modification in config.h * fix setting of gpu_use_dp=true in .ci/test.sh * fix linter errors * fix linter error remove useless change * recover main.cpp * separate cuda_exp and cuda * fix ci bash scripts add description for cuda_exp * add USE_CUDA_EXP flag * switch off USE_CUDA_EXP * revert changes in python-packages * more careful separation for USE_CUDA_EXP * fix CUDARowData::DivideCUDAFeatureGroups fix set fields for cuda metadata * revert config.h * fix test settings for cuda experimental version * skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version * fix lint issue by adding a blank line * fix lint errors by resorting imports * fix lint errors by resorting imports * fix lint errors by resorting imports * merge cuda.yml and cuda_exp.yml * update python version in cuda.yml * remove cuda_exp.yml * remove unrelated changes * fix compilation warnings fix cuda exp ci task name * recover task * use multi-level 
template in histogram construction check split only in debug mode * ignore NVCC related lines in parameter_generator.py * update job name for CUDA tests * apply review suggestions * Update .github/workflows/cuda.yml Co-authored-by: Nikita Titov <nekit94-08@mail.ru> * Update .github/workflows/cuda.yml Co-authored-by: Nikita Titov <nekit94-08@mail.ru> * update header * remove useless TODOs * remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062 * #include <LightGBM/utils/log.h> for USE_CUDA_EXP only * fix include order * fix include order * remove extra space * address review comments * add warning when cuda_exp is used together with deterministic * add comment about gpu_use_dp in .ci/test.sh * revert changing order of included headers Co-authored-by: Yu Shi <shiyu1994@qq.com> Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 23 Feb, 2022 1 commit
-
-
José Morales authored
[python-package] use 2d collections for predictions, grads and hess in multiclass custom objective (#4925) * reshape predictions, grad and hess in multiclass custom objective * add sklearn test. move custom obj to utils. docs for numpy * use num_model_per_iteration to get num_classes * update docs and dask multiclass custom objective test * move reshaping to __inner_predict. add test for feval * add missing note. remove extra line
-
- 12 Feb, 2022 1 commit
-
-
Nikita Titov authored
* Update test_dask.py * Update test_engine.py * Update test_sklearn.py * Update test_sklearn.py * Update test_sklearn.py * Update test_sklearn.py * Update test_sklearn.py * Update test_sklearn.py * Update test_engine.py * Update test_sklearn.py * Update test_sklearn.py * Update test_sklearn.py
-
- 20 Dec, 2021 1 commit
-
-
José Morales authored
[tests][python-package] change boston dataset to synthetic dataset in tests that don't check score (#4895) * change boston dataset to synthetic dataset in tests that don't evaluate score * format imports
-
- 18 Dec, 2021 2 commits
-
-
Nikita Titov authored
* Update test_sklearn.py * Update python_package.yml * Update python_package.yml * Update callback.py * Update callback.py
-
Nikita Titov authored
* Update sklearn.py * Update sklearn.py * Update test_sklearn.py
-
- 10 Dec, 2021 2 commits
-
-
Nikita Titov authored
-
Nikita Titov authored
-
- 02 Dec, 2021 1 commit
-
-
Nikita Titov authored
* in predict(), respect params set via `set_params()` after fit() * continue * add test * fix return name * hotfix * simplify
-
- 30 Nov, 2021 1 commit
-
-
Nikita Titov authored
-
- 20 Nov, 2021 1 commit
-
-
Nikita Titov authored
* Update test_plotting.py * Update dask.py * Update sklearn.py * Update test_sklearn.py * Update basic.py * Update engine.py * Update test_engine.py * Update basic.py * Update basic.py * Update engine.py
-
- 10 Nov, 2021 1 commit
-
-
Nikita Titov authored
* respect objective aliases * Update test_sklearn.py * revert removal of blank lines * add argument name which is being overwritten in warning message
-
- 05 Nov, 2021 1 commit
-
-
Nikita Titov authored
* add n_estimators_ and n_iter_ post-fit attributes * address review comments
-
- 29 Oct, 2021 1 commit
-
-
Nikita Titov authored
-
- 10 Sep, 2021 1 commit
-
-
Nikita Titov authored
-
- 05 Jul, 2021 1 commit
-
-
Nikita Titov authored
* Update test_sklearn.py * Update test_basic.py * Update dask.py * Update basic.py * Update basic.py * Update basic.py * Update basic.py * Update callback.py
-
- 04 Jul, 2021 1 commit
-
-
Nikita Titov authored
-
- 24 Feb, 2021 1 commit
-
-
jmoralez authored
* include support for column array as label * remove nested ifs * fix linting errors * include tests for sklearn regressors * include docstring for numpy_1d_array_to_dtype * include . at end of docstring * remove pandas import and test for regression, classification and ranking * check predictions of sklearn models as well * test training only in dask. drop pandas series tests * use PANDAS_INSTALLED and pd_Series * inline imports * use col array in fit for test_dask * include review comments
-
- 16 Feb, 2021 1 commit
-
-
Nikita Titov authored
* run isort in CI linting job * workaround conda compatibility issues
-
- 26 Jan, 2021 2 commits
-
-
Nikita Titov authored
* Update test_engine.py * Update test_sklearn.py * Update test_engine.py * Update test_sklearn.py * Update test_sklearn.py * Update test_sklearn.py * Update test_sklearn.py * Update test_engine.py * Update .vsts-ci.yml * Update .vsts-ci.yml * Update test_engine.py * Update test_dual.py * Update test_engine.py * Update .vsts-ci.yml * Update .vsts-ci.yml
-
Thomas J. Fan authored
* TST Migrates test_sklearn.py to pytest * STY Fixes linting * FIX Adds reason * ENH Address comments
-
- 10 Nov, 2020 1 commit
-
-
Guillaume Lemaitre authored
* TST make sklearn integration test compatible with 0.24 * remove useless import * remove outdated comment * order import * use parametrize_with_checks * change the reason * skip constructible if != 0.23 * make tests behave the same across sklearn version * linter * address suggestions
-
- 29 Oct, 2020 1 commit
-
-
James Lamb authored
* [ci] [python] reduce unnecessary data loading in tests * add profiling files to gitignore * just use cache() * default on cache size * patch lru_cache on Python 2.7 * linting * reduce duplicated code * missing warnings * fix imports * fix lru_cache backport * missing kwargs * Apply suggestions from code review Co-authored-by: Nikita Titov <nekit94-08@mail.ru> * reduce duplicated code * cache in test_plotting Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 27 Oct, 2020 1 commit
-
-
Pavel Metrikov authored
* Add support to optimize for NDCG at a given truncation level In order to correctly optimize for NDCG@_k_, one should exclude pairs containing both documents beyond the top-_k_ (as they don't affect NDCG@_k_ when swapped). * Update rank_objective.hpp * Apply suggestions from code review Co-authored-by: Guolin Ke <guolin.ke@outlook.com> * Update rank_objective.hpp remove the additional branching: get high_rank and low_rank by one "if". * Update config.h add description to lambdarank_truncation_level parameter * Update Parameters.rst * Update test_sklearn.py update expected NDCG value for a test, as it was affected by the underlying change in the algorithm * Update test_sklearn.py update NDCG@3 reference value * fix R learning-to-rank tests * Update rank_objective.hpp * Update include/LightGBM/config.h Co-authored-by: Guolin Ke <guolin.ke@outlook.com> * Update Parameters.rst Co-authored-by: Guolin Ke <guolin.ke@outlook.com> Co-authored-by: James Lamb <jaylamb20@gmail.com>
-
- 06 Sep, 2020 1 commit
-
-
Germán Ramírez-Espinoza authored
* Refactors sklearn API to allow a list of evaluation metrics in the parameter eval_metric of the class (and subclasses of) LGBMModel. Also adds unit tests for this functionality * Simplify expression to check whether the user passed one or multiple metrics to eval_metric parameter * Simplify new tests by using custom metrics already defined in the test file * Update docstring to reflect the fact that the parameter "feval" from the "train" and "cv" functions can also receive a list of callables * Remove oxford comma from docstrings Apply suggestions from code review Co-authored-by: Nikita Titov <nekit94-08@mail.ru> * Use named-parameters to make sure code is compatible with future versions of scikit-learn Apply suggestions from code review Co-authored-by: Nikita Titov <nekit94-08@mail.ru> * Remove throwaway return value to make code more succinct Co-authored-by: Nikita Titov <nekit94-08@mail.ru> * Move statement to group together the code related to feval * Avoid modifying original args as it causes errors in scikit-learn tools For details see: https://github.com/microsoft/LightGBM/pull/2619 * Consolidate multiple eval-metrics unit-tests into one test Co-authored-by: German I Ramirez-Espinoza <gire@home> Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 02 Sep, 2020 1 commit
-
-
Nikita Titov authored
-
- 06 Aug, 2020 1 commit
-
-
shiyu1994 authored
* [python] add start_iteration to python predict interface (#3058) * Apply suggestions from code review * Update lightgbm_R.h * Apply suggestions from code review * Apply suggestions from code review * fix R interface * update R documentation Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
-
- 30 Jul, 2020 1 commit
-
-
Alex Wozniakowski authored
* [python][scikit-learn] New unit tests and maintenance * Includes multioutput tests * Includes RandomizedSearchCV test * Updates dataset parameters to eliminate FutureWarning * Change to n_class in load_digits * Fix spacing * Changes after review * Also updates validation split in grid and random search * Include skipif for classes_ attr * Updates checks for classes and order Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 14 Jul, 2020 1 commit
-
-
Germán Ramírez-Espinoza authored
[python][scikit-learn] Fixes a bug that prevented using multiple eval_metrics in LGBMClassifier (#3222) * Fixes a bug that prevented using multiple eval_metrics in LGBMClassifier * Move bug-fix test to the test_metrics unit-test * Fix test to avoid issues with existing tests * Fix coding-style error Co-authored-by: German I Ramirez-Espinoza <gire@home>
-
- 27 Jun, 2020 1 commit
-
-
Alex authored
* modify attribute and include stacking tests * backwards compatibility * check sklearn version * move stacking import * Number of input features (#3173) * Number of input features (#3173) * Number of input features (#3173) * Number of input features (#3173) Split number of features and stacking tests. * Number of input features (#3173) Modify test name. * Number of input features (#3173) Update stacking tests for review comments. * Number of input features (#3173) * Number of input features (#3173) * Number of input features (#3173) * Number of input features (#3173) Modify classifier test. * Number of input features (#3173) * Number of input features (#3173) Check score.
-
- 30 Apr, 2020 1 commit
-
-
sbruch authored
* Fix loss computation * fix test
-
- 25 Apr, 2020 1 commit
-
-
James Lamb authored
-
- 10 Apr, 2020 1 commit
-
-
Nikita Titov authored
* Revert "specify the last supported version of scikit-learn (#2637)" This reverts commit d1002776. * ban scikit-learn 0.22.0 and skip broken test * fix updated test * fix lint test * Revert "fix lint test" This reverts commit 8b4db0805fe7a9e7f7eb0be3eac231f85026d196.
-
- 20 Mar, 2020 1 commit
-
-
Lukas Pfannschmidt authored
* Add handling of RandomState object, which is standard for sklearn methods. LightGBM expects an integer seed instead of an object. If passed object is RandomState, we choose random integer based on its state to seed the underlying low level code. While chosen random integer is only in the range between 1 and 1e10 I expect it to have enough entropy (?) to not matter in practice. * Add RandomState object to random_state docstring. * remove blank line * Use property to handle setting random_state. This enables setting cloned estimators with the set_params method in sklearn. * Add docstring to attribute. * Fix and simplify docstring. * Add test case. * Use maximal int for datatype in seed derivation. * Replace random_state property with interfacing in fit method. Derives int seed for C code only when fitting and keeps RandomState object as param. * Adapt unit test to property change. * Extended test case and docstring Co-Authored-By: Nikita Titov <nekit94-08@mail.ru> * Add more equality checks (feature importance, best iteration/score). * Add equality comparison of boosters represented by strings. Remove useless best_iteration_ comparison (we do not use early_stopping). * fix whitespace * Test if two subsequent fits produce different models * Apply suggestions from code review Co-Authored-By: Nikita Titov <nekit94-08@mail.ru> Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-
- 26 Feb, 2020 1 commit
-
-
Guolin Ke authored
* code refactoring * update vcproject * refine * fix test * Update tests/python_package_test/test_sklearn.py * fix test
-
- 25 Feb, 2020 1 commit
-
-
Nikita Titov authored
* fixed pandas deprecation warning in tests * support old versions of pandas
-
- 03 Feb, 2020 1 commit
-
-
Nikita Titov authored
* Update test_engine.py * Update test_sklearn.py
-
- 02 Feb, 2020 1 commit
-
-
Guolin Ke authored
* commit * fix a bug * fix bug * reset to track changes * refine the auto choose logic * sort the time stats output * fix include * change multi_val_bin_sparse_threshold * add cmake * add _mm_malloc and _mm_free for cross platform * fix cmake bug * timer for split * try to fix cmake * fix tests * refactor DataPartition::Split * fix test * typo * formating * Revert "formating" This reverts commit 5b8de4f7fb9d975ee23701d276a66d40ee6d4222. * add document * [R-package] Added tests on use of force_col_wise and force_row_wise in training (#2719) * naming * fix gpu code * Update include/LightGBM/bin.h Co-Authored-By: James Lamb <jaylamb20@gmail.com> * Update src/treelearner/ocl/histogram16.cl * test: swap compilers for CI * fix omp * not avx2 * no aligned for feature histogram * Revert "refactor DataPartition::Split" This reverts commit 256e6d9641ade966a1f54da1752e998a1149b6f8. * slightly refactor data partition * reduce the memory cost Co-authored-by: James Lamb <jaylamb20@gmail.com> Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
-