- 07 Jul, 2021 1 commit
-
-
Nikita Titov authored
* allow to pass some params as pathlib.Path objects * fix lint * improve indentation
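A minimal sketch of what accepting `pathlib.Path` objects can look like, assuming the path is eventually handed to a C API as bytes. The helper name `path_to_c_bytes` is hypothetical and not taken from the commit.

```python
import os
from pathlib import Path
from typing import Union


def path_to_c_bytes(path: Union[str, Path]) -> bytes:
    """Hypothetical helper: accept str or pathlib.Path, return UTF-8 bytes.

    os.fspath() returns str unchanged and converts Path objects to str,
    and raises TypeError for anything that is not path-like.
    """
    return os.fspath(path).encode("utf-8")


# Both spellings produce the same bytes for the C layer.
print(path_to_c_bytes("model.txt"))
print(path_to_c_bytes(Path("model.txt")))
```

Normalizing once at the boundary keeps the rest of the code working with plain strings.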
-
- 05 Jul, 2021 2 commits
-
-
Nikita Titov authored
* Update test_sklearn.py * Update test_basic.py * Update dask.py * Update basic.py * Update basic.py * Update basic.py * Update basic.py * Update callback.py
-
Nikita Titov authored
-
- 02 Jul, 2021 1 commit
-
-
Chen Yufei authored
* [python-package] create Dataset from sampled data. * [python-package] create Dataset from List[Sequence]. 1. Use random access for data sampling 2. Support read data from multiple input files 3. Read data in batch so no need to hold all data in memory * [python-package] example: create Dataset from multiple HDF5 file. * fix: revert is_class implementation for seq * fix: unwanted memory view reference for seq * fix: seq is_class accepts sklearn matrices * fix: requirements for example * fix: pycode * feat: print static code linting stage * fix: linting: avoid shell str regex conversion * code style: doc style * code style: isort * fix ci dependency: h5py on windows * [py] remove rm files in test seq https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623 * docs(python): init_from_sample summary https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389 * remove dataset dump sample data debugging code. * remove typo fix. Create separate PR for this. * fix typo in src/c_api.cpp Co-authored-by:
James Lamb <jaylamb20@gmail.com> * style(linting): py3 type hint for seq * test(basic): os.path style path handling * Revert "feat: print static code linting stage" This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d. * feat(python): sequence on validation set * minor(python): comment * minor(python): test option hint * style(python): fix code linting * style(python): add pydoc for ref_dataset * doc(python): sequence Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com> * revert(python): sequence class abc * chore(python): remove rm_files * Remove useless static_assert. * refactor: test_basic test for sequence. * fix lint complaint. * remove dataset._dump_text in sequence test. * Fix reverting typo fix. * Apply suggestions from code review Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Fix type hint, code and doc style. * fix failing test_basic. * Remove TODO about keep constant in sync with cpp. * Install h5py only when running python-examples. * Fix lint complaint. * Apply suggestions from code review Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Doc fixes, remove unused params_str in __init_from_seqs. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Remove unnecessary conda install in windows ci script. * Keep param as example in dataset_from_multi_hdf5.py * Add _get_sample_count function to remove code duplication. * Use batch_size parameter in generate_hdf. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Fix after applying suggestions. * Fix test, check idx is instance of numbers.Integral. * Update python-package/lightgbm/basic.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Expose Sequence class in Python-API doc. * Handle Sequence object not having batch_size. * Fix isort lint complaint. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update docstring to mention Sequence as data input. * Remove get_one_line in test_basic.py * Make Sequence an abstract class. * Reduce number of tests for test_sequence. * Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices. * empty commit to trigger ci * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t. Also rename total_nrow to num_total_row in c_api.h for consistency. * Doc about Sequence in docs/Python-Intro.rst. * Fix: basic.py change LGBM_SampleIndices out_len to int32. * Add create_valid test case with Dataset from Sequence. * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Apply suggestions from code review Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com> * Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT. * Update python-package/lightgbm/basic.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Willian Zhang <willian@willian.email> Co-authored-by:
Willian Z <Willian@Willian-Zhang.com> Co-authored-by:
James Lamb <jaylamb20@gmail.com> Co-authored-by:
shiyu1994 <shiyu_k1994@qq.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
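The three ideas listed in the commit above (random access for sampling, multiple inputs, batched reads) can be sketched roughly as follows. This is a standalone illustration only: `ListBackedSequence`, `sample_rows` and `iter_batches` are hypothetical names, and the code does not use or reproduce the actual `Sequence` class of the Python package.

```python
import bisect
import random


class ListBackedSequence:
    """Hypothetical stand-in for a random-access data source."""

    def __init__(self, rows, batch_size=2):
        self._rows = rows
        self.batch_size = batch_size

    def __getitem__(self, idx):
        # Supports both integer indices and slices.
        return self._rows[idx]

    def __len__(self):
        return len(self._rows)


def sample_rows(seqs, sample_cnt, seed=0):
    """Sample rows across several sources using random access only."""
    ends, total = [], 0
    for seq in seqs:
        total += len(seq)
        ends.append(total)  # cumulative end offset of each source
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(total), min(sample_cnt, total)))
    rows = []
    for i in indices:
        k = bisect.bisect_right(ends, i)  # which source holds global row i
        start = ends[k - 1] if k > 0 else 0
        rows.append(seqs[k][i - start])
    return rows


def iter_batches(seq):
    """Read one source in batches so it never has to be held in memory at once."""
    for start in range(0, len(seq), seq.batch_size):
        yield seq[start:start + seq.batch_size]


seqs = [ListBackedSequence([[1], [2], [3]]), ListBackedSequence([[4], [5]])]
print(sample_rows(seqs, 2))
print(list(iter_batches(seqs[0])))
```

In a real setting each source would wrap a file (for example an HDF5 dataset) rather than an in-memory list.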
-
- 27 Jun, 2021 1 commit
-
-
Nikita Titov authored
-
- 26 Jun, 2021 1 commit
-
-
Nikita Titov authored
-
- 18 Jun, 2021 1 commit
-
-
Chen Yufei authored
-
- 21 May, 2021 1 commit
-
-
Nikita Titov authored
* handle arbitrary length feature names in Python-package * added tests
-
- 20 May, 2021 1 commit
-
-
Nikita Titov authored
-
- 17 May, 2021 1 commit
-
-
Nikita Titov authored
-
- 15 May, 2021 1 commit
-
-
NovusEdge authored
* added f-string * fix missing parentheses and other string formatting * remove extra trailing parenthesis * one more missing parenthesis * fix pandas categoricals * update uses of + * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
James Lamb <jaylamb20@gmail.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 10 May, 2021 1 commit
-
-
James Lamb authored
[docs] clarify docs for LGBM_BoosterGetEvalNames and LGBM_BoosterGetEvalCounts (fixes #4264) (#4270)
-
- 04 May, 2021 1 commit
-
-
Andrew Ziem authored
* Correct spelling Most changes were in comments, and there were a few changes to literals for log output. There were no changes to variable names, function names, IDs, or functionality. * Clarify a phrase in a comment Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Clarify a phrase in a comment Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Clarify a phrase in a comment Co-authored-by:
James Lamb <jaylamb20@gmail.com> * Correct spelling Most are code comments, but one case is a literal in a logging message. There are a few grammar fixes too. Co-authored-by:
James Lamb <jaylamb20@gmail.com>
-
- 02 May, 2021 1 commit
-
-
Nikita Titov authored
-
- 15 Mar, 2021 1 commit
-
-
James Lamb authored
* [python-package] add type hints on Booster.set_network() * change behavior
-
- 24 Feb, 2021 1 commit
-
-
jmoralez authored
* include support for column array as label * remove nested ifs * fix linting errors * include tests for sklearn regressors * include docstring for numpy_1d_array_to_dtype * include . at end of docstring * remove pandas import and test for regression, classification and ranking * check predictions of sklearn models as well * test training only in dask. drop pandas series tests * use PANDAS_INSTALLED and pd_Series * inline imports * use col array in fit for test_dask * include review comments
-
- 19 Feb, 2021 1 commit
-
-
James Lamb authored
* [docs] Change some 'parallel learning' references to 'distributed learning' * found a few more * one more reference
-
- 17 Feb, 2021 1 commit
-
-
Alex Ford authored
Approximately 80% of runtime when loading "low column count, high row count" DataFrames into Datasets is consumed in `np.fromiter`, called as part of the `Dataset.get_field` method. This is a particularly pernicious hotspot because, unlike other ctypes-based methods, it is a hot loop over a Python iterator and causes significant GIL contention in multi-threaded applications. Replace `np.fromiter` with a direct call to `np.ctypeslib.as_array`, which allows a single-shot `copy` of the underlying array. This reduces the load time of a ~35 million row categorical dataframe with 1 column from ~5 seconds to ~1 second, and allows multi-threaded execution.
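The difference between the two approaches can be illustrated with a small standalone sketch. The buffer here is created locally with ctypes as a stand-in for memory owned by the native library; the function names are hypothetical and the snippet is not the package's actual `get_field` code.

```python
import ctypes

import numpy as np


def read_with_fromiter(ptr, length):
    # Element-by-element: every value passes through the Python interpreter.
    return np.fromiter((ptr[i] for i in range(length)), dtype=np.float64, count=length)


def read_with_as_array(ptr, length):
    # View the memory behind the pointer as an array, then copy it in one shot
    # so the result does not depend on the lifetime of the original buffer.
    return np.ctypeslib.as_array(ptr, shape=(length,)).copy()


n = 5
buf = (ctypes.c_double * n)(*[0.5 * i for i in range(n)])  # stand-in for native memory
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_double))

slow = read_with_fromiter(ptr, n)
fast = read_with_as_array(ptr, n)
print(np.array_equal(slow, fast))
```

The `.copy()` matters: without it the array would alias memory that the native side may later free or overwrite.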
-
- 16 Feb, 2021 2 commits
-
-
Nikita Titov authored
* run isort in CI linting job * workaround conda compatibility issues
-
Zhuyi Xue authored
-
- 28 Jan, 2021 1 commit
-
-
Nikita Titov authored
-
- 26 Jan, 2021 3 commits
-
-
Nikita Titov authored
-
Nikita Titov authored
* fix Dask docstrings and mimic sklearn importing way * Update .vsts-ci.yml * revert CI checks * use import aliases for Dask classes * check Dask is installed in _predict() func * fix lint issues introduced during resolving merge conflicts * Update dask.py
-
James Lamb authored
* [dask] allow parameter aliases for tree_learner and local_listen_port (fixes #3671) * num_thread too * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * empty commit * add _choose_param_value * revert param order change * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Update python-package/lightgbm/dask.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * just import deepcopy * remove machines aliases * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
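A rough sketch of alias resolution in the spirit of the `_choose_param_value` helper mentioned above. The signature and the alias list below are assumptions made for illustration; they are not copied from the package.

```python
from copy import deepcopy


def choose_param_value(main_name, aliases, params, default):
    """Hypothetical helper: collapse aliases of one parameter into its main name.

    Precedence: the main name if present, else the first alias found, else default.
    The input dict is not modified.
    """
    params = deepcopy(params)
    found = [params.pop(alias) for alias in aliases if alias in params]
    if main_name not in params:
        params[main_name] = found[0] if found else default
    return params


user_params = {"num_thread": 4, "objective": "regression"}
resolved = choose_param_value("num_threads", ["num_thread"], user_params, default=1)
print(resolved)
```

Working on a deep copy means the caller's dictionary keeps whatever spelling the user originally passed.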
-
- 24 Jan, 2021 2 commits
-
-
Nikita Titov authored
* Update dask.py * Update basic.py * hotfix pop
-
Nikita Titov authored
* centralize Python-package logging in one place * continue * fix test name * removed unused import * enhance test * fix lint * hotfix test * workaround for GPU test * remove custom logger from Dask-package * replace one log func with flags by multiple funcs
-
- 20 Jan, 2021 1 commit
-
-
James Lamb authored
[dask] allow parameter aliases for local_listen_port, num_threads, tree_learner (fixes #3671) (#3789) * [dask] allow parameter aliases for tree_learner and local_listen_port (fixes #3671) * num_thread too * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * empty commit Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 19 Jan, 2021 1 commit
-
-
Nikita Titov authored
* fix docs * Update basic.py * Update engine.py
-
- 18 Jan, 2021 1 commit
-
-
James Lamb authored
* [python-package] expand documentation on 'group' for ranking task * add R package * update Query Data section * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * fix typo in group example * regenerate parameters * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * regenerate R docs Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 24 Dec, 2020 1 commit
-
-
Belinda Trotta authored
* Add Eigen library. * Working for simple test. * Apply changes to config params. * Handle nan data. * Update docs. * Add test. * Only load raw data if boosting=gbdt_linear * Remove unneeded code. * Minor updates. * Update to work with sk-learn interface. * Update to work with chunked datasets. * Throw error if we try to create a Booster with an already-constructed dataset having incompatible parameters. * Save raw data in binary dataset file. * Update docs and fix parameter checking. * Fix dataset loading. * Add test for regularization. * Fix bugs when saving and loading tree. * Add test for load/save linear model. * Remove unneeded code. * Fix case where not enough leaf data for linear model. * Simplify code. * Speed up code. * Speed up code. * Simplify code. * Speed up code. * Fix bugs. * Working version. * Store feature data column-wise (not fully working yet). * Fix bugs. * Speed up. * Speed up. * Remove unneeded code. * Small speedup. * Speed up. * Minor updates. * Remove unneeded code. * Fix bug. * Fix bug. * Speed up. * Speed up. * Simplify code. * Remove unneeded code. * Fix bug, add more tests. * Fix bug and add test. * Only store numerical features * Fix bug and speed up using templates. * Speed up prediction. * Fix bug with regularisation * Visual studio files. * Working version * Only check nans if necessary * Store coeff matrix as an array. * Align cache lines * Align cache lines * Preallocation coefficient calculation matrices * Small speedups * Small speedup * Reverse cache alignment changes * Change to dynamic schedule * Update docs. * Refactor so that linear tree learner is not a separate class. * Add refit capability. * Speed up * Small speedups. * Speed up add prediction to score. * Fix bug * Fix bug and speed up. * Speed up dataload. * Speed up dataload * Use vectors instead of pointers * Fix bug * Add OMP exception handling. 
* Change return type of LGBM_BoosterGetLinear to bool * Change return type of LGBM_BoosterGetLinear back to int, only parameter type needed to change * Remove unused internal_parent_ property of tree * Remove unused parameter to CreateTreeLearner * Remove reference to LinearTreeLearner * Minor style issues * Remove unneeded check * Reverse temporary testing change * Fix Visual Studio project files * Restore LightGBM.vcxproj.filters * Speed up * Speed up * Simplify code * Update docs * Simplify code * Initialise storage space for max num threads * Move Eigen to include directory and delete unused files * Remove old files. * Fix so it compiles with mingw * Fix gpu tree learner * Change AddPredictionToScore back to const * Fix python lint error * Fix C++ lint errors * Change eigen to a submodule * Update comment * Add the eigen folder * Try to fix build issues with eigen * Remove eigen files * Add eigen as submodule * Fix include paths * Exclude eigen files from Python linter * Ignore eigen folders for pydocstyle * Fix C++ linting errors * Fix docs * Fix docs * Exclude eigen directories from doxygen * Update manifest to include eigen * Update build_r to include eigen files * Fix compiler warnings * Store raw feature data as float * Use float for calculating linear coefficients * Remove eigen directory from GLOB * Don't compile linear model code when building R package * Fix doxygen issue * Fix lint issue * Fix lint issue * Remove unneeded code * Restore deleted lines * Restore deleted lines * Change return type of has_raw to bool * Update docs * Rename some variables and functions for readability * Make tree_learner parameter const in AddScore * Fix style issues * Pass vectors as const reference when setting tree properties * Make temporary storage of serial_tree_learner mutable so we can make the object's methods const * Remove get_raw_size, use num_numeric_features instead * Fix typo * Make contains_nan_ and any_nan_ properties immutable again * Remove 
data_has_nan_ property of tree * Remove temporary test code * Make linear_tree a dataset param * Fix lint error * Make LinearTreeLearner a separate class * Fix lint errors * Fix lint error * Add linear_tree_learner.o * Simulate omp_get_max_threads if openmp is not available * Update PushOneData to also store raw data. * Cast size to int * Fix bug in ReshapeRaw * Speed up code with multithreading * Use OMP_NUM_THREADS * Speed up with multithreading * Update to use ArrayToString * Fix tests * Fix test * Fix bug introduced in merge * Minor updates * Update docs
-
- 15 Dec, 2020 1 commit
-
-
penolove authored
-
- 09 Dec, 2020 1 commit
-
-
Nikita Titov authored
* Update setup.py * Update .appveyor.yml * Update .travis.yml * Update .vsts-ci.yml * Update __init__.py * Update test.sh * Update test_windows.ps1 * Update advanced_example.py * Update requirements_base.txt * Update conf.py * Update conf.py * Update test_engine.py * Update utils.py * Update dockerfile-r * Update README.md * Update dockerfile.gpu * Update test_consistency.py * Update basic.py * Update compat.py * Update engine.py * Update sklearn.py * Update sklearn.py * Update callback.py * Update setup.py * Update __init__.py * Update plotting.py * Update sklearn.py * Update engine.py * Update compat.py * Update callback.py * Update basic.py * Update compat.py * Update basic.py * Update basic.py * Update compat.py * Update compat.py * Update plotting.py * Update engine.py * Update basic.py * Update sklearn.py * Update compat.py * Update engine.py * Update engine.py * Update callback.py * Update basic.py * Update basic.py * Update basic.py * Update basic.py * Update basic.py * Update sklearn.py * Update sklearn.py * Update plotting.py * Update sklearn.py * Update compat.py * Update compat.py * Update engine.py * Update plotting.py * Update sklearn.py * Update basic.py * Update basic.py * Update basic.py * Update basic.py * Update compat.py * Update compat.py * Update compat.py * Update engine.py * Update basic.py * Update compat.py * Update basic.py * Update basic.py * Update basic.py * Update compat.py * Update compat.py * Update basic.py * Update basic.py * Update .vsts-ci.yml * Update .vsts-ci.yml * Update conf.py * Revert "Update dockerfile-r" This reverts commit 4ff6ffc7e3eeda24cc6a59a3bb0c973f02d9d71c.
-
- 07 Dec, 2020 1 commit
-
-
James Lamb authored
[python][docs] more detailed docs for trees_to_dataframe(), create_tree_digraph(), plot_tree() (#3618) * [python] more detailed docs for trees_to_dataframe(), create_tree_digraph(), plot_tree() * fixing warnings * fix warnings * undo unnecessary space * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * single line, better weight descriptions * Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * column names * Update python-package/lightgbm/plotting.py Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 26 Oct, 2020 1 commit
-
-
Guolin Ke authored
* fix subset bug * typo * add fixme tag * bin mapper * fix test * fix add_features_from * Update dataset.cpp * fix merge bug * added Python merge code * added test for add_features * Update dataset.cpp * Update src/io/dataset.cpp * continue implementing * warn users about categorical features Co-authored-by:
StrikerRUS <nekit94-12@hotmail.com> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
-
- 30 Sep, 2020 2 commits
-
-
Nikita Titov authored
-
Belinda Trotta authored
-
- 11 Sep, 2020 1 commit
-
-
James Lamb authored
-
- 06 Sep, 2020 1 commit
-
-
Germán Ramírez-Espinoza authored
* Refactors sklearn API to allow a list of evaluation metrics in the parameter eval_metric of the class (and subclasses of) LGBMModel. Also adds unit tests for this functionality * Simplify expression to check whether the user passed one or multiple metrics to eval_metric parameter * Simplify new tests by using custom metrics already defined in the test file * Update docstring to reflect the fact that the parameter "feval" from the "train" and "cv" functions can also receive a list of callables * Remove oxford comma from docstrings Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Use named-parameters to make sure code is compatible with future versions of scikit-learn Apply suggestions from code review Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Remove throwaway return value to make code more succinct Co-authored-by:
Nikita Titov <nekit94-08@mail.ru> * Move statement to group together the code related to feval * Avoid modifying original args as it causes errors in scikit-learn tools For details see: https://github.com/microsoft/LightGBM/pull/2619 * Consolidate multiple eval-metrics unit-tests into one test Co-authored-by:
German I Ramirez-Espinoza <gire@home> Co-authored-by:
Nikita Titov <nekit94-08@mail.ru>
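One way to check whether a caller passed one metric or several, as described in the commit above, is sketched below. The function name and return shape are hypothetical; the real refactoring in the sklearn wrapper may differ.

```python
def split_eval_metrics(eval_metric):
    """Hypothetical helper: accept None, a single metric, or a list of metrics.

    Returns (names, callables): built-in metric names as strings and
    custom metric functions, each in the order they were given.
    """
    if eval_metric is None:
        return [], []
    metrics = eval_metric if isinstance(eval_metric, list) else [eval_metric]
    names = [m for m in metrics if isinstance(m, str)]
    funcs = [m for m in metrics if callable(m)]
    return names, funcs


def custom_error(y_true, y_pred):
    # Placeholder custom metric used only to demonstrate the split.
    return "custom_error", 0.0, False


print(split_eval_metrics("l2"))
print(split_eval_metrics(["l2", custom_error]))
```

Wrapping a single value in a list up front lets the rest of the code treat both cases identically.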
-
- 11 Aug, 2020 1 commit
-
-
Nikita Titov authored
simplify start_iteration param for predict in Python and some code cleanup for start_iteration (#3288) * simplify start_iteration param for predict in Python and some code cleanup for start_iteration * revert docs changes about the prediction result shape
-
- 06 Aug, 2020 1 commit
-
-
shiyu1994 authored
* [python] add start_iteration to python predict interface (#3058) * Apply suggestions from code review * Update lightgbm_R.h * Apply suggestions from code review * Apply suggestions from code review * fix R interface * update R documentation Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
-