- 23 Mar, 2022 1 commit
-
-
shiyu1994 authored
* new cuda framework * add histogram construction kernel * before removing multi-gpu * new cuda framework * tree learner cuda kernels * single tree framework ready * single tree training framework * remove comments * boosting with cuda * optimize for best split find * data split * move boosting into cuda * parallel synchronize best split point * merge split data kernels * before code refactor * use tasks instead of features as units for split finding * refactor cuda best split finder * fix configuration error with small leaves in data split * skip histogram construction of too small leaf * skip split finding of invalid leaves stop when no leaf to split * support row wise with CUDA * copy data for split by column * copy data from host to CPU by column for data partition * add synchronize best splits for one leaf from multiple blocks * partition dense row data * fix sync best split from task blocks * add support for sparse row wise for CUDA * remove useless code * add l2 regression objective * sparse multi value bin enabled for CUDA * fix cuda ranking objective * support for number of items <= 2048 per query * speedup histogram construction by interleaving global memory access * split optimization * add cuda tree predictor * remove comma * refactor objective and score updater * before use struct * use structure for split information * use structure for leaf splits * return CUDASplitInfo directly after finding best split * split with CUDATree directly * use cuda row data in cuda histogram constructor * clean src/treelearner/cuda * gather shared cuda device functions * put shared CUDA functions into header file * change smaller leaf from <= back to < for consistent result with CPU * add tree predictor * remove useless cuda_tree_predictor * predict on CUDA with pipeline * add global sort algorithms * add global argsort for queries with many items in ranking tasks * remove limitation of maximum number of items per query in ranking * add cuda metrics * fix CUDA AUC * 
remove debug code * add regression metrics * remove useless file * don't use mask in shuffle reduce * add more regression objectives * fix cuda mape loss add cuda xentropy loss * use template for different versions of BitonicArgSortDevice * add multiclass metrics * add ndcg metric * fix cross entropy objectives and metrics * fix cross entropy and ndcg metrics * add support for customized objective in CUDA * complete multiclass ova for CUDA * separate cuda tree learner * use shuffle based prefix sum * clean up cuda_algorithms.hpp * add copy subset on CUDA * add bagging for CUDA * clean up code * copy gradients from host to device * support bagging without using subset * add support of bagging with subset for CUDAColumnData * add support of bagging with subset for dense CUDARowData * refactor copy sparse subrow * use copy subset for column subset * add reset train data and reset config for CUDA tree learner add deconstructors for cuda tree learner * add USE_CUDA ifdef to cuda tree learner files * check that dataset doesn't contain CUDA tree learner * remove printf debug information * use full new cuda tree learner only when using single GPU * disable all CUDA code when using CPU version * recover main.cpp * add cpp files for multi value bins * update LightGBM.vcxproj * update LightGBM.vcxproj fix lint errors * fix lint errors * fix lint errors * update Makevars fix lint errors * fix the case with 0 feature and 0 bin fix split finding for invalid leaves create cuda column data when loaded from bin file * fix lint errors hide GetRowWiseData when cuda is not used * recover default device type to cpu * fix na_as_missing case fix cuda feature meta information * fix UpdateDataIndexToLeafIndexKernel * create CUDA trees when needed in CUDADataPartition::UpdateTrainScore * add refit by tree for cuda tree learner * fix test_refit in test_engine.py * create set of large bin partitions in CUDARowData * add histogram construction for columns with a large number of bins * add find 
best split for categorical features on CUDA * add bitvectors for categorical split * cuda data partition split for categorical features * fix split tree with categorical feature * fix categorical feature splits * refactor cuda_data_partition.cu with multi-level templates * refactor CUDABestSplitFinder by grouping task information into struct * pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder * fix misuse of reference * remove useless changes * add support for path smoothing * virtual destructor for LightGBM::Tree * fix overlapped cat threshold in best split infos * reset histogram pointers in data partition and spllit finder in ResetConfig * comment useless parameter * fix reverse case when na is missing and default bin is zero * fix mfb_is_na and mfb_is_zero and is_single_feature_column * remove debug log * fix cat_l2 when one-hot fix gradient copy when data subset is used * switch shared histogram size according to CUDA version * gpu_use_dp=true when cuda test * revert modification in config.h * fix setting of gpu_use_dp=true in .ci/test.sh * fix linter errors * fix linter error remove useless change * recover main.cpp * separate cuda_exp and cuda * fix ci bash scripts add description for cuda_exp * add USE_CUDA_EXP flag * switch off USE_CUDA_EXP * revert changes in python-packages * more careful separation for USE_CUDA_EXP * fix CUDARowData::DivideCUDAFeatureGroups fix set fields for cuda metadata * revert config.h * fix test settings for cuda experimental version * skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version * fix lint issue by adding a blank line * fix lint errors by resorting imports * fix lint errors by resorting imports * fix lint errors by resorting imports * merge cuda.yml and cuda_exp.yml * update python version in cuda.yml * remove cuda_exp.yml * remove unrelated changes * fix compilation warnings fix cuda exp ci task name * recover task * use multi-level 
template in histogram construction check split only in debug mode * ignore NVCC related lines in parameter_generator.py * update job name for CUDA tests * apply review suggestions * Update .github/workflows/cuda.yml Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update .github/workflows/cuda.yml Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * update header * remove useless TODOs * remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062 * #include <LightGBM/utils/log.h> for USE_CUDA_EXP only * fix include order * fix include order * remove extra space * address review comments * add warning when cuda_exp is used together with deterministic * add comment about gpu_use_dp in .ci/test.sh * revert changing order of included headers Co-authored-by:
Yu Shi <shiyu1994@qq.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 17 Mar, 2022 1 commit
-
-
Antoni Baum authored
* Turn `early_stopping` into a Callable class * Fix * Lint * Remove print * Fix order * Revert "Lint" This reverts commit 7ca8b557572446888cf793c0082d9a7efd1e29a7. * Apply suggestion from code review * Nit * Lint * Move callable class outside the func for pickling * Move _pickle and _unpickle to tests utils * Add early stopping callback picklability test * Nit * Fix * Lint * Improve type hint * Lint * Lint * Add cloudpickle to test_windows * Update tests/python_package_test/test_engine.py * Fix * Apply suggestions from code review
-
- 11 Mar, 2022 1 commit
-
-
Nikita Titov authored
* Update test_windows.ps1 * Update .appveyor.yml * Update test_windows.ps1 * Update .appveyor.yml
-
- 19 Feb, 2022 1 commit
-
-
James Lamb authored
* [ci] [docs] use mamba for readthedocs builds (fixes #4954) * update docs * simplify build script and add docs flag to gitignore * exit with non-0 if build fails * update CI job * add doxygen * remove outdated requirement_base.txt reference * use conda create instead of conda env create * fix conda create flags * add nodefaults to env.yml * Update docs/README.rst Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * try to fix check-docs CI job * additional changes * switch from mamba to miniforge * simplify docker command and fix issues in local build script * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * update docs and conda * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 12 Feb, 2022 1 commit
-
-
Nikita Titov authored
* Update dockerfile-python * Update README.md * Update dockerfile.gpu * Update dockerfile.gpu * Update .vsts-ci.yml * Update .appveyor.yml * Update test_windows.ps1
-
- 11 Feb, 2022 1 commit
-
-
James Lamb authored
* [ci] use conda-forge in CI jobs (fixes #4948) * comment out more jobs * try reverting graphviz patch, running more cuda jobs * get graphviz from PyPI and try removing some patches for r-lintr * start running appveyor again * use conda-forge if using conda * fix commands * conda install graphviz * try newer openmp * pin below openmp 11.x * focus on gpu task * trying to narrow down error * maybe gcc11 is the issue * start adding other tests back * pin openmp too * maybe need to pin to gcc less than 10.x * pin libgfortran and libstdcxx as well * pin to gcc 9.3.0 * move constraints up to initial environment * add all CI jobs back * try installing python-graphviz separately * try new lightgbm/vsts-agent image * fix typo * test if pinning gcc for linux gpu_source build is still necessary * ok yes, pinning gcc is necessary * test if Linux gpu_source works with Python 3.9.6 * no special exception for Linux gpu_source job * pin to Python 3.9.6 in Linux gpu_source * try explicitly asking for libstdcxx-ng for every linux build * swap compilers * switch compilers back * revert accidental whitespace change * comment out CI * try Linux gpu_source with different Python versions * Revert "try Linux gpu_source with different Python versions" This reverts commit f6f63cbb9b4a9cf138f3580ae4223a8acdd0e94a. * Revert "comment out CI" This reverts commit ece191f01e3650c2f325e80ff86bfc8c485fb7bc. * remove libxml2 install, change CONDA path * avoid installing conda in rchk job * empty commit 1 * empty commit 2 * empty commit 3 * empty commit 4 * add more verbose logging around installation of python-graphviz * empty commit 1 * get mamba info * get more conda info * add another mamba info call * allow for other macOS environments in GHA configuration * Revert "allow for other macOS environments in GHA configuration" This reverts commit a3c7a19926be94e3719f5ae9100fbe30e87b35da. 
* get more logs from mamba * get Build.ArtifactsStagingDirectory * get more logs and try to force re-installing everything * clean cache after every step * remove --update-all and make logs less verbose * remove more print statements and uncomment jobs * test if conda-clean issue fixes segfaults for gpu_source * pin python version for gpu_source * empty commit 1 * use miniforge instead * empty commit 1 * Apply suggestions from code review * bring workarounds back * remove duplicated graphviz system-wide installation (reverts #4095, #4097, #4238) * empty commit 1 * empty commit 2 * empty commit 3 * empty commit 4 * empty commit 5 * empty commit 6 * empty commit 7 * empty commit 8 * empty commit 9 * empty commit 10 * empty commit 10 * empty commit 10 * empty commit 10 * empty commit 11 * one more try * try to downgrade Python version for Linux GPU job * swap compilers * Revert "swap compilers" This reverts commit f04dc27b17920a69cbcba1254a8e109ce9791154. Co-authored-by:
Nikita Titov <nekit94-12@hotmail.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 01 Feb, 2022 1 commit
-
-
James Lamb authored
* [ci] manually create symlinks to R entrypoints on macOS (fixes #4988) * exclude non-R CI jobs * upgrade to R 4.1.2 * get logs for R install * pin R 4.1.x jobs to macOS-10.15 * update to R 4.1.2 on Windows * allow for non-latest macOS builds in GHA configuration * fix prefix check * fix config check * more direct check for mac version * uncomment other CIs * update R version in CI job names
-
- 23 Jan, 2022 1 commit
-
-
Nikita Titov authored
* Revert "[ci] ignore certificates for kitware apt channel in CUDA jobs (fixes #4646) (#4648)" This reverts commit 10e0edc4. * update cuda at CI
-
- 18 Dec, 2021 1 commit
-
-
Nikita Titov authored
* Update .appveyor.yml * Update .vsts-ci.yml * Update python_package.yml * Update setup.py * Update test.sh
-
- 04 Dec, 2021 1 commit
-
-
Nikita Titov authored
* Update .vsts-ci.yml * Update setup.sh * Update .vsts-ci.yml * Update test.sh * Update README.rst
-
- 18 Nov, 2021 1 commit
-
-
James Lamb authored
* [R-package] [docs] add intro vignette (#3946) * add 10 test vignettes * Revert "add 10 test vignettes" This reverts commit 40fb2e2f1982402798776ee44e4ec82fc4644d3d. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Michael Mayer <mayermichael79@gmail.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 17 Nov, 2021 1 commit
-
-
Nikita Titov authored
* Update get_workflow_status.py * Update get_workflow_status.py
-
- 14 Nov, 2021 2 commits
-
-
Nikita Titov authored
* Update test.sh * Update test.sh * Update test.sh * Update FindLibR.cmake * Update r_package.yml * Update FindLibR.cmake * Update r_package.yml
-
Nikita Titov authored
* Update CMakeLists.txt * Update CMakeLists.txt * Update static_analysis.yml * Update CMakeLists.txt * Update test.sh * Update CMakeLists.txt * Update static_analysis.yml
-
- 10 Nov, 2021 1 commit
-
-
James Lamb authored
* [R-package] parallelize compilation in CMake-based builds * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * working on adding -j * pass -j through to install.libs.R * add docs on -j * use -j4 * Update R-package/README.md Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 06 Nov, 2021 1 commit
-
-
Nikita Titov authored
This reverts commit d62378b4.
-
- 05 Nov, 2021 1 commit
-
-
Nikita Titov authored
* pin Dask version at CI * Update .vsts-ci.yml * Update .vsts-ci.yml * workaround for Python 3.6 * Update test.sh
-
- 03 Nov, 2021 1 commit
-
-
James Lamb authored
* [ci] use wch1/r-debug image in Solaris tests * no git in valgrind tests
-
- 31 Oct, 2021 1 commit
-
-
James Lamb authored
* [R-package] allow use of custom R executable building CRAN package * Update build-cran-package.sh Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 30 Oct, 2021 1 commit
-
-
Nikita Titov authored
* indicate support of Monterey and drop Catalina * Update test.sh * restore Mojave
-
- 29 Oct, 2021 1 commit
-
-
Nikita Titov authored
* Update test_windows.ps1 * Update .appveyor.yml * Update .appveyor.yml
-
- 24 Oct, 2021 1 commit
-
-
Nikita Titov authored
-
- 23 Oct, 2021 1 commit
-
-
Nikita Titov authored
-
- 05 Oct, 2021 1 commit
-
-
Nikita Titov authored
* Use the latest gcc version in macOS CI jobs * test: swap compilers * Revert "test: swap compilers" * Revert "test: swap compilers"
-
- 04 Oct, 2021 1 commit
-
-
James Lamb authored
* [ci] ignore certificates for kitware apt channel in CUDA jobs (fixes #4646) * try building from a fork * Update .gitmodules
-
- 22 Sep, 2021 1 commit
-
-
Nikita Titov authored
-
- 27 Aug, 2021 1 commit
-
-
Nikita Titov authored
* Refer to string type as `str` and add commas in `list of ...` types * update `libpath.py` too
-
- 26 Aug, 2021 1 commit
-
-
James Lamb authored
* [ci] upgrade R to 4.1.1 * try moving miktex errors
-
- 22 Aug, 2021 1 commit
-
-
Nikita Titov authored
-
- 20 Aug, 2021 1 commit
-
-
Nikita Titov authored
-
- 19 Aug, 2021 1 commit
-
-
Nikita Titov authored
-
- 14 Aug, 2021 2 commits
-
-
James Lamb authored
* [R-package] use C++ compiler for pre-compile checks on Windows * install Matrix in valgrind test * Add {Matrix} in more places in CI and docs * use CXX11 * use flags specific to C++11 * missing backtick Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>
-
James Lamb authored
-
- 10 Aug, 2021 1 commit
-
-
James Lamb authored
* [ci] move Solaris and valgrind test steps into scripts * Update .github/workflows/r_solaris.yml Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * relative paths Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 11 Jul, 2021 1 commit
-
-
Nikita Titov authored
* remove preinstalled possibly conflicting software from PATH in CI jobs * preserve pandoc
-
- 10 Jul, 2021 1 commit
-
-
James Lamb authored
* [ci] add CI job running rchk * try commenting out more stuff * ignore R internal error * pipes * remove PROTECT() * try removing testthat * revert temporary testing changes
-
- 08 Jul, 2021 1 commit
-
-
José Morales authored
* call predict on one row of data to determine output shape * make DaskLGBMRanker predict method equal to the others * remove extra drop_axis
-
- 04 Jul, 2021 1 commit
-
-
Nikita Titov authored
-
- 02 Jul, 2021 1 commit
-
-
Chen Yufei authored
* [python-package] create Dataset from sampled data. * [python-package] create Dataset from List[Sequence]. 1. Use random access for data sampling 2. Support read data from multiple input files 3. Read data in batch so no need to hold all data in memory * [python-package] example: create Dataset from multiple HDF5 file. * fix: revert is_class implementation for seq * fix: unwanted memory view reference for seq * fix: seq is_class accepts sklearn matrices * fix: requirements for example * fix: pycode * feat: print static code linting stage * fix: linting: avoid shell str regex conversion * code style: doc style * code style: isort * fix ci dependency: h5py on windows * [py] remove rm files in test seq https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623 * docs(python): init_from_sample summary https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389 * remove dataset dump sample data debugging code. * remove typo fix. Create separate PR for this. * fix typo in src/c_api.cpp Co-authored-by:
James Lamb <jaylamb20@gmail.com> * style(linting): py3 type hint for seq * test(basic): os.path style path handling * Revert "feat: print static code linting stage" This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d. * feat(python): sequence on validation set * minor(python): comment * minor(python): test option hint * style(python): fix code linting * style(python): add pydoc for ref_dataset * doc(python): sequence Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com> * revert(python): sequence class abc * chore(python): remove rm_files * Remove useless static_assert. * refactor: test_basic test for sequence. * fix lint complaint. * remove dataset._dump_text in sequence test. * Fix reverting typo fix. * Apply suggestions from code review Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Fix type hint, code and doc style. * fix failing test_basic. * Remove TODO about keep constant in sync with cpp. * Install h5py only when running python-examples. * Fix lint complaint. * Apply suggestions from code review Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Doc fixes, remove unused params_str in __init_from_seqs. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Remove unnecessary conda install in windows ci script. * Keep param as example in dataset_from_multi_hdf5.py * Add _get_sample_count function to remove code duplication. * Use batch_size parameter in generate_hdf. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Fix after applying suggestions. * Fix test, check idx is instance of numbers.Integral. * Update python-package/lightgbm/basic.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Expose Sequence class in Python-API doc. * Handle Sequence object not having batch_size. * Fix isort lint complaint. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update docstring to mention Sequence as data input. * Remove get_one_line in test_basic.py * Make Sequence an abstract class. * Reduce number of tests for test_sequence. * Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices. * empty commit to trigger ci * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t. Also rename total_nrow to num_total_row in c_api.h for consistency. * Doc about Sequence in docs/Python-Intro.rst. * Fix: basic.py change LGBM_SampleIndices out_len to int32. * Add create_valid test case with Dataset from Sequence. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Apply suggestions from code review Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com> * Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT. * Update python-package/lightgbm/basic.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Willian Zhang <willian@willian.email> Co-authored-by:
Willian Z <Willian@Willian-Zhang.com> Co-authored-by:
James Lamb <jaylamb20@gmail.com> Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 26 Jun, 2021 1 commit
-
-
Nikita Titov authored
* run cpp tests with sanitizers * re-trigger CI * continue * small cleanup * restore cpp test
-