- 01 Dec, 2024 1 commit
  Oliver Borchert authored
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
- 13 Oct, 2024 1 commit
  Atanas Dimitrov authored
  Co-authored-by: Atanas Dimitrov <nasko119@abv.bg>
  Co-authored-by: James Lamb <jaylamb20@gmail.com>
- 03 Sep, 2024 1 commit
  vnherdeiro authored
  Co-authored-by: James Lamb <jaylamb20@gmail.com>
- 18 Mar, 2024 1 commit
  Oliver Borchert authored
- 21 Feb, 2024 1 commit
  James Lamb authored
- 12 Sep, 2023 1 commit
  shiyu1994 authored
  * fix leaf splits update after split in quantized training
  * fix preparation of ordered gradients for quantized training
  * remove force_row_wise in distributed test for quantized training
  * Update src/treelearner/leaf_splits.hpp
  Co-authored-by: James Lamb <jaylamb20@gmail.com>
- 20 Jun, 2023 1 commit
  José Morales authored
- 16 May, 2023 1 commit
  James Lamb authored
- 05 May, 2023 1 commit
  shiyu1994 authored
  * add quantized training (first stage)
  * add histogram construction functions for integer gradients
  * add stochastic rounding
  * update docs
  * fix compilation errors by adding template instantiations
  * update files for compilation
  * fix compilation of gpu version
  * initialize gradient discretizer before share states
  * add a test case for quantized training
  * add quantized training for data distributed training
  * Delete origin.pred
  * Delete ifelse.pred
  * Delete LightGBM_model.txt
  * remove useless changes
  * fix lint error
  * remove debug loggings
  * fix mismatch of vector and allocator types
  * remove changes in main.cpp
  * fix bugs with uninitialized gradient discretizer
  * initialize ordered gradients in gradient discretizer
  * disable quantized training with gpu and cuda; fix msvc compilation errors and warnings
  * fix bug in data parallel tree learner
  * make quantized training test deterministic
  * make quantized training in test case more accurate
  * refactor test_quantized_training
  * fix leaf splits initialization with quantized training
  * check distributed quantized training result
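The "stochastic rounding" commit above refers to quantizing gradients to a small integer grid while keeping them unbiased in expectation. A minimal pure-Python sketch of the idea (illustrative only; LightGBM's actual implementation is in C++/CUDA and differs in detail):

```python
import random

def stochastic_round(value, scale):
    """Quantize `value` to an integer grid with step `scale`,
    rounding down or up with probability proportional to the
    fractional remainder, so the result is unbiased in expectation."""
    scaled = value / scale
    low = int(scaled // 1)   # floor of the scaled value
    frac = scaled - low      # fractional part in [0, 1)
    return low + (1 if random.random() < frac else 0)

# Averaging many stochastic roundings recovers the true value,
# which deterministic rounding would not:
random.seed(0)
vals = [stochastic_round(0.37, 1.0) for _ in range(10_000)]
avg = sum(vals) / len(vals)  # roughly 0.37
```

Deterministic rounding of 0.37 would always give 0 and systematically bias the quantized gradient sums; the stochastic variant trades per-sample noise for an unbiased aggregate.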
- 07 Mar, 2023 1 commit
  James Lamb authored
- 25 Feb, 2023 1 commit
  James Lamb authored
- 15 Feb, 2023 1 commit
  James Lamb authored
- 01 Feb, 2023 1 commit
  James Lamb authored
  * [ci] speed up if-else, swig, and lint conda setup
  * add 'source activate'
  * python constraint
  * start removing cuda v1
  * comment out CI
  * remove more references
  * revert some unnecessary changes
  * revert a few more mistakes
  * revert another change that ignored params
  * sigh
  * remove CUDATreeLearner
  * fix tests, docs
  * fix quoting in setup.py
  * restore all CI
  * Apply suggestions from code review (Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>)
  * Apply suggestions from code review
  * completely remove cuda_exp, update docs
  Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
- 12 Jan, 2023 1 commit
  James Lamb authored
- 03 Jan, 2023 1 commit
  Jonathan Giannuzzi authored
- 25 Nov, 2022 1 commit
  Nikita Titov authored
- 21 Nov, 2022 1 commit
  José Morales authored
- 07 Oct, 2022 1 commit
  James Lamb authored
  [ci] prefer CPython in Windows test environment and use safer approach for cleaning up network (fixes #5509) (#5510)
- 23 Mar, 2022 1 commit
  shiyu1994 authored
  * new cuda framework
  * add histogram construction kernel
  * before removing multi-gpu
  * new cuda framework
  * tree learner cuda kernels
  * single tree framework ready
  * single tree training framework
  * remove comments
  * boosting with cuda
  * optimize for best split find
  * data split
  * move boosting into cuda
  * parallel synchronize best split point
  * merge split data kernels
  * before code refactor
  * use tasks instead of features as units for split finding
  * refactor cuda best split finder
  * fix configuration error with small leaves in data split
  * skip histogram construction of too small leaf
  * skip split finding of invalid leaves; stop when no leaf to split
  * support row wise with CUDA
  * copy data for split by column
  * copy data from host to CPU by column for data partition
  * add synchronize best splits for one leaf from multiple blocks
  * partition dense row data
  * fix sync best split from task blocks
  * add support for sparse row wise for CUDA
  * remove useless code
  * add l2 regression objective
  * sparse multi value bin enabled for CUDA
  * fix cuda ranking objective
  * support for number of items <= 2048 per query
  * speedup histogram construction by interleaving global memory access
  * split optimization
  * add cuda tree predictor
  * remove comma
  * refactor objective and score updater
  * before use struct
  * use structure for split information
  * use structure for leaf splits
  * return CUDASplitInfo directly after finding best split
  * split with CUDATree directly
  * use cuda row data in cuda histogram constructor
  * clean src/treelearner/cuda
  * gather shared cuda device functions
  * put shared CUDA functions into header file
  * change smaller leaf from <= back to < for consistent result with CPU
  * add tree predictor
  * remove useless cuda_tree_predictor
  * predict on CUDA with pipeline
  * add global sort algorithms
  * add global argsort for queries with many items in ranking tasks
  * remove limitation of maximum number of items per query in ranking
  * add cuda metrics
  * fix CUDA AUC
  * remove debug code
  * add regression metrics
  * remove useless file
  * don't use mask in shuffle reduce
  * add more regression objectives
  * fix cuda mape loss; add cuda xentropy loss
  * use template for different versions of BitonicArgSortDevice
  * add multiclass metrics
  * add ndcg metric
  * fix cross entropy objectives and metrics
  * fix cross entropy and ndcg metrics
  * add support for customized objective in CUDA
  * complete multiclass ova for CUDA
  * separate cuda tree learner
  * use shuffle based prefix sum
  * clean up cuda_algorithms.hpp
  * add copy subset on CUDA
  * add bagging for CUDA
  * clean up code
  * copy gradients from host to device
  * support bagging without using subset
  * add support of bagging with subset for CUDAColumnData
  * add support of bagging with subset for dense CUDARowData
  * refactor copy sparse subrow
  * use copy subset for column subset
  * add reset train data and reset config for CUDA tree learner; add deconstructors for cuda tree learner
  * add USE_CUDA ifdef to cuda tree learner files
  * check that dataset doesn't contain CUDA tree learner
  * remove printf debug information
  * use full new cuda tree learner only when using single GPU
  * disable all CUDA code when using CPU version
  * recover main.cpp
  * add cpp files for multi value bins
  * update LightGBM.vcxproj
  * update LightGBM.vcxproj; fix lint errors
  * fix lint errors
  * fix lint errors
  * update Makevars; fix lint errors
  * fix the case with 0 feature and 0 bin; fix split finding for invalid leaves; create cuda column data when loaded from bin file
  * fix lint errors; hide GetRowWiseData when cuda is not used
  * recover default device type to cpu
  * fix na_as_missing case; fix cuda feature meta information
  * fix UpdateDataIndexToLeafIndexKernel
  * create CUDA trees when needed in CUDADataPartition::UpdateTrainScore
  * add refit by tree for cuda tree learner
  * fix test_refit in test_engine.py
  * create set of large bin partitions in CUDARowData
  * add histogram construction for columns with a large number of bins
  * add find best split for categorical features on CUDA
  * add bitvectors for categorical split
  * cuda data partition split for categorical features
  * fix split tree with categorical feature
  * fix categorical feature splits
  * refactor cuda_data_partition.cu with multi-level templates
  * refactor CUDABestSplitFinder by grouping task information into struct
  * pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder
  * fix misuse of reference
  * remove useless changes
  * add support for path smoothing
  * virtual destructor for LightGBM::Tree
  * fix overlapped cat threshold in best split infos
  * reset histogram pointers in data partition and split finder in ResetConfig
  * comment useless parameter
  * fix reverse case when na is missing and default bin is zero
  * fix mfb_is_na and mfb_is_zero and is_single_feature_column
  * remove debug log
  * fix cat_l2 when one-hot; fix gradient copy when data subset is used
  * switch shared histogram size according to CUDA version
  * gpu_use_dp=true when cuda test
  * revert modification in config.h
  * fix setting of gpu_use_dp=true in .ci/test.sh
  * fix linter errors
  * fix linter error; remove useless change
  * recover main.cpp
  * separate cuda_exp and cuda
  * fix ci bash scripts; add description for cuda_exp
  * add USE_CUDA_EXP flag
  * switch off USE_CUDA_EXP
  * revert changes in python-packages
  * more careful separation for USE_CUDA_EXP
  * fix CUDARowData::DivideCUDAFeatureGroups; fix set fields for cuda metadata
  * revert config.h
  * fix test settings for cuda experimental version
  * skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version
  * fix lint issue by adding a blank line
  * fix lint errors by resorting imports
  * fix lint errors by resorting imports
  * fix lint errors by resorting imports
  * merge cuda.yml and cuda_exp.yml
  * update python version in cuda.yml
  * remove cuda_exp.yml
  * remove unrelated changes
  * fix compilation warnings; fix cuda exp ci task name
  * recover task
  * use multi-level template in histogram construction; check split only in debug mode
  * ignore NVCC related lines in parameter_generator.py
  * update job name for CUDA tests
  * apply review suggestions
  * Update .github/workflows/cuda.yml (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  * Update .github/workflows/cuda.yml (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  * update header
  * remove useless TODOs
  * remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062
  * #include <LightGBM/utils/log.h> for USE_CUDA_EXP only
  * fix include order
  * fix include order
  * remove extra space
  * address review comments
  * add warning when cuda_exp is used together with deterministic
  * add comment about gpu_use_dp in .ci/test.sh
  * revert changing order of included headers
  Co-authored-by: Yu Shi <shiyu1994@qq.com>
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
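As the commit notes describe, this experimental learner was gated behind the `USE_CUDA_EXP` build flag and selected at runtime through the device type. A minimal sketch of selecting it from the Python package in that era (parameter names as referenced in the commit messages above; whether they are accepted depends on how the wheel was built and on the LightGBM version):

```python
# Hypothetical training parameters for the experimental CUDA learner.
# "cuda_exp" was later folded back into "cuda"; on CPU-only builds
# LightGBM would reject or ignore these settings.
params = {
    "objective": "regression",
    "device_type": "cuda_exp",  # experimental single-GPU learner from this PR
    "gpu_use_dp": True,         # double precision, as forced in the CI tests above
    "num_leaves": 31,
}
```

The commits also note that `cuda_exp` emitted a warning when combined with `deterministic`, since GPU histogram accumulation order is not reproducible.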
- 17 Mar, 2022 1 commit
  Antoni Baum authored
  * Turn `early_stopping` into a Callable class
  * Fix
  * Lint
  * Remove print
  * Fix order
  * Revert "Lint" (reverts commit 7ca8b557572446888cf793c0082d9a7efd1e29a7)
  * Apply suggestion from code review
  * Nit
  * Lint
  * Move callable class outside the func for pickling
  * Move _pickle and _unpickle to tests utils
  * Add early stopping callback picklability test
  * Nit
  * Fix
  * Lint
  * Improve type hint
  * Lint
  * Lint
  * Add cloudpickle to test_windows
  * Update tests/python_package_test/test_engine.py
  * Fix
  * Apply suggestions from code review
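The motivation for "Turn `early_stopping` into a Callable class" and "Move callable class outside the func for pickling" is that a closure returned by a factory function cannot be pickled, while an instance of a module-level callable class can. A simplified sketch of the pattern (illustrative; not the actual `lightgbm.early_stopping` implementation, which tracks per-dataset metrics):

```python
import pickle

class EarlyStopping:
    """Callable class: instances carry their state as attributes and
    pickle cleanly, unlike a closure capturing local variables."""
    def __init__(self, stopping_rounds):
        self.stopping_rounds = stopping_rounds
        self.best_score = None
        self.rounds_without_improvement = 0

    def __call__(self, score):
        # Returns True once `stopping_rounds` evaluations pass with no new best.
        if self.best_score is None or score < self.best_score:
            self.best_score = score
            self.rounds_without_improvement = 0
        else:
            self.rounds_without_improvement += 1
        return self.rounds_without_improvement >= self.stopping_rounds

cb = EarlyStopping(stopping_rounds=2)
history = [0.9, 0.8, 0.85, 0.84]
stops = [cb(s) for s in history]

# The round-trip below is what the picklability test guards: it would
# raise for a callback built as a nested function.
restored = pickle.loads(pickle.dumps(cb))
```

Picklability matters for distributed training (e.g. the Dask interface), where callbacks must be shipped to worker processes.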
- 23 Feb, 2022 1 commit
  José Morales authored
  [python-package] use 2d collections for predictions, grads and hess in multiclass custom objective (#4925)
  * reshape predictions, grad and hess in multiclass custom objective
  * add sklearn test; move custom obj to utils; docs for numpy
  * use num_model_per_iteration to get num_classes
  * update docs and dask multiclass custom objective test
  * move reshaping to __inner_predict; add test for feval
  * add missing note; remove extra line
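The change above means a multiclass custom objective works with arrays of shape `(n_samples, n_classes)` rather than a single flattened vector. A hedged sketch of a softmax cross-entropy objective written in that 2d style (illustrative of the shape convention, not the exact LightGBM callback signature):

```python
import numpy as np

def softmax(raw):
    """Row-wise softmax over raw scores of shape (n_samples, n_classes)."""
    e = np.exp(raw - raw.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def multiclass_objective(raw_preds, labels):
    """Gradient and hessian of softmax cross-entropy, kept 2d
    throughout with shape (n_samples, n_classes)."""
    probs = softmax(raw_preds)
    onehot = np.eye(raw_preds.shape[1])[labels]
    grad = probs - onehot
    hess = probs * (1.0 - probs)
    return grad, hess

raw = np.zeros((4, 3))            # uninformative scores: every class prob = 1/3
labels = np.array([0, 1, 2, 0])
grad, hess = multiclass_objective(raw, labels)
```

Before this PR, user code had to reshape flat arrays of length `n_samples * n_classes` itself (and know the Fortran-vs-C ordering); keeping everything 2d removes that foot-gun.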
- 12 Feb, 2022 1 commit
  Nikita Titov authored
  * Update test_dask.py
  * Update test_engine.py
  * Update test_sklearn.py
  * Update test_sklearn.py
  * Update test_sklearn.py
  * Update test_sklearn.py
  * Update test_sklearn.py
  * Update test_sklearn.py
  * Update test_engine.py
  * Update test_sklearn.py
  * Update test_sklearn.py
  * Update test_sklearn.py
- 17 Jan, 2022 1 commit
  James Lamb authored
  * add test for custom objective with regressor
  * add test for custom binary classification objective with classifier
  * isort
  * got tests working for multiclass
  * update docs
  * train deeper model for classifier
  * Apply suggestions from code review (Co-authored-by: José Morales <jmoralz92@gmail.com>)
  * Apply suggestions from code review (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  * update multiclass tests
  * Apply suggestions from code review (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  * fix multiclass probabilities
  * linting
  Co-authored-by: José Morales <jmoralz92@gmail.com>
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
- 06 Dec, 2021 1 commit
  James Lamb authored
  * [python-package][dask] handle failures parsing worker host names
  * add tests
  * revert local testing changes
- 05 Dec, 2021 1 commit
  Nikita Titov authored
  * unify values of `best_iteration` for sklearn and standard APIs
  * update Dask test
- 02 Dec, 2021 1 commit
  Nikita Titov authored
  * fix argument types in custom eval function for Dask estimators
  * revert changes to docstrings
  * fix argument names in Dask test
- 30 Nov, 2021 1 commit
  Nikita Titov authored
- 17 Sep, 2021 1 commit
  José Morales authored
  [python-package] Support 2d collections as input for `init_score` in multiclass classification task (#4150)
  * initial implementation of init_score for multiclass classification
  * check for 1d or 2d collection in init_score
  * remove dataset import
  * initial comments
  * update dask test and docstrings
  * update docstrings
  * move logic to set_field; reshape back on get_field
  * add type hints and update docstrings for dask; fix Dataset.set_field
  * revert wrong docstrings and type hints
  * add extra comma for consistency
  * prefix private functions with underscore; add type hints to new functions; make commas consistent in dask and basic
  * add missing spaces after type hint
  * remove shape condition for dataframe in is_2d_collection
  Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>
- 09 Sep, 2021 2 commits
  José Morales authored
  Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>

  James Lamb authored
- 09 Aug, 2021 1 commit
  José Morales authored
  * reduce number of collisions tests
  * measure tests execution time
  * measure tests execution time in bdist task
  * remove durations in bdist task
- 03 Aug, 2021 1 commit
  José Morales authored
  * find all needed ports in each worker at once
  * lint
  * better naming
  * use _HostWorkers in test
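"Find all needed ports in each worker at once" avoids a race where ports found one at a time can collide. The standard trick is to bind each socket to port 0 (letting the OS pick a free port) and hold every socket open until all ports are allocated, so the same port cannot be handed out twice. A minimal sketch (illustrative; the helper in `lightgbm.dask` differs in detail):

```python
import socket
from contextlib import ExitStack

def find_free_ports(n):
    """Reserve n distinct free TCP ports by keeping every socket
    bound until all n have been allocated, then releasing them."""
    with ExitStack() as stack:
        sockets = []
        for _ in range(n):
            s = stack.enter_context(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
            s.bind(("", 0))  # port 0: the OS picks an unused ephemeral port
            sockets.append(s)
        # Read all port numbers while every socket is still bound.
        return [s.getsockname()[1] for s in sockets]

ports = find_free_ports(3)
```

There is still a small window between releasing the sockets and the caller binding them, but allocating the whole batch together removes collisions within the batch itself.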
- 10 Jul, 2021 1 commit
  Nikita Titov authored
- 07 Jul, 2021 1 commit
  James Lamb authored
  [dask] Make output of feature contribution predictions for sparse matrices match those from sklearn estimators (fixes #3881) (#4378)
  * test_classifier working
  * adding tests
  * docs
  * tests
  * revert unnecessary changes in tests
  * test output type
  * linting
  * linting
  * use from_delayed() instead
  * docstring pycodestyle is happy with
  * isort
  * put pytest skips back
  * respect sparse return type
  * fix doc
  * remove unnecessary dask_array_concatenate()
  * Apply suggestions from code review (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  * Apply suggestions from code review (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  * update predict_proba() docstring
  * remove unnecessary np.array()
  * Update python-package/lightgbm/dask.py (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  * fix assertion
  * fix test use of len()
  * restore np.array() in tests
  * use np.asarray() instead
  * use toarray()
  * remove empty functions in compat
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
- 04 Jul, 2021 1 commit
  Nikita Titov authored
- 28 Jun, 2021 1 commit
  Frank Fineis authored
  * es WiP, need to add eval_sample_weight and eval_group
  * add weight, group to dask es; WiP
  * dask es reorg
  * Update python-package/lightgbm/dask.py: _train_part model.fit args to lines (Co-authored-by: James Lamb <jaylamb20@gmail.com>)
  * Update tests/python_package_test/test_dask.py: _train_part model.fit args to lines, pt2 (Co-authored-by: James Lamb <jaylamb20@gmail.com>)
  * Update python-package/lightgbm/dask.py: _train_part model.fit args to lines pt3 (Co-authored-by: James Lamb <jaylamb20@gmail.com>)
  * Update tests/python_package_test/test_dask.py: dask_model.fit args to lines (Co-authored-by: James Lamb <jaylamb20@gmail.com>)
  * Update tests/python_package_test/test_dask.py (Co-authored-by: James Lamb <jaylamb20@gmail.com>)
  * Update python-package/lightgbm/dask.py: use is instead of id() (Co-authored-by: James Lamb <jaylamb20@gmail.com>)
  * Update python-package/lightgbm/dask.py (Co-authored-by: James Lamb <jaylamb20@gmail.com>; this review-suggestion commit repeats several times in the squashed history, duplicates omitted)
  * Update tests/python_package_test/test_dask.py (Co-authored-by: James Lamb <jaylamb20@gmail.com>)
  * applying changes to eval_set PR WiP
  * dask support for eval_names, eval_metric, eval_stopping_rounds
  * add evals_result checks and other eval_set attribute-related test checks; need to merge master, WiP
  * fix lint errors in test_dask.py
  * drop group_shape from _lgbmmodel_doc_fit.format for non-rankers; add support for eval_at for dask ranker
  * add eval_at to test_dask eval_set ranker tests
  * add back group_shape to lgbmmmodel docs; tighten tests
  * drop random eval weights from early stopping, probably causing training to terminate too early
  * add eval data templates to sklearn fit docs; add eval data docs to dask
  * add n_features to _create_data; eval_set tests stop w/ desirable tree counts
  * import alphabetically
  * add back get_worker for eval_set error handling
  * test_dask argmin typo
  * push forgotten eval_names bugfix
  * eval_stopping_rounds -> early_stopping_rounds; fix failing non-es test
  * change default eval_at to tuple 1-5
  * re-drop get_worker
  * drop early stopping support from eval_set commits; move eval_set worker check prior to client.submit
  * add eval_class_weight and eval_init_score to lightgbm/dask, WiP
  * clean up eval_set tests; allow user to specify fewer eval_names, clswghts than eval_sets
  * remove redundant backslash
  * lint fixes
  * fix eval_at, eval_metric duplication; let eval_at be Iterable not just Tuple
  * use all data_outputs for test_eval_set tests
  * undo newlines from first pr
  * add custom_eval_metric test; correct issue with eval_at and metric names
  * move _constant_metric outside of test
  * dataset reference names instead of __strings__
  * add padding to eval_set parts so each part has same len(eval_set)
  * eval set code clean up
  * revert n_evals to be max len eval_set across all parts on worker
  * pylint errors in _DatasetNames
  * more pylint fixes
  * pylinting...
  * add by pytest.mark, mistakenly deleted during merge conflict resolution
  * address code review comments
  * add _pad_eval_names to handle nondeterministic evals_result_ valid set names
  * change not-evaluated evals_result_ test criteria
  * address fit eval docs issues; switch _DatasetNames to Enum
  * Update python-package/lightgbm/dask.py (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>; this review-suggestion commit repeats roughly twenty times in the squashed history, duplicates omitted)
  * update eval_metrics, eval_at dask fit docstr to match sklearn; make tests reflect that l2 (rmse), logloss in evals_result_ by default
  * address eval_set dict keys naming in docstr and training eval_set naming issue
  * in test_dask check for obj-default metric names in eval_results; remove check for training key
  * lint fixes for _pad_eval_names
  * remove unnecessary line break in _pad_eval_names docstr
  * use Enum.member syntax not Enum.member.name
  * remove str from supported eval_at types
  * add whitespace and remove DaskDataframes mention from eval_ param docstrs in _train
  * remove "of shape = [n_samples]" from group_shape docs
  * add eval_at base_doc in DaskLGBMRanker.fit
  * remove excess paren from eval_names docs in _train
  * make requested changes to test_dask.py
  * remove Optional() wrapper on eval_at
  * add _lgbmmodel_doc_custom_eval_note to dask.py fit.__doc__
  * fix ordering of .sklearn imports to attempt lint fix
  * dask custom eval note to f-string, pt1-pt3 (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  Co-authored-by: James Lamb <jaylamb20@gmail.com>
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
- 27 Jun, 2021 1 commit
  James Lamb authored
- 26 Jun, 2021 1 commit
  James Lamb authored
  * [dask] pass predict() kwargs through when input is a Dask Array
  * add tests
  * Apply suggestions from code review (Co-authored-by: Nikita Titov <nekit94-08@mail.ru>)
  * add prediction early stopping params
  Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
- 12 Jun, 2021 1 commit
  Nikita Titov authored
- 09 Jun, 2021 1 commit
  sayantan sadhu authored
  [python] improving the syntax of the f-strings in tests/python_package_test/test_dask.py (#4358)
  * updated the old syntax with fstrings
  * Updated the strings with + catenation to fstrings
  * Updated the strings with + catenation to fstrings
  * Update tests/python_package_test/test_dask.py
  Co-authored-by: James Lamb <jaylamb20@gmail.com>
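The conversion above replaces string building with `+` and `str()` calls by f-strings, which are shorter and evaluated in one pass. A representative before/after (the variable names here are made up for illustration, not taken from test_dask.py):

```python
name, n_workers = "worker-0", 4

# before: concatenation with +, requiring explicit str() conversion
msg_old = "client has " + str(n_workers) + " workers, first is " + name

# after: a single f-string, as applied throughout test_dask.py
msg_new = f"client has {n_workers} workers, first is {name}"

assert msg_old == msg_new
```

Besides readability, f-strings avoid the `TypeError` that `+` raises when a non-string operand is concatenated without an explicit `str()`.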