1. 23 Feb, 2022 1 commit
    • [python-package] use 2d collections for predictions, grads and hess in multiclass custom objective (#4925) · d670a4d6
      José Morales authored
      
      * reshape predictions, grad and hess in multiclass custom objective
      
      * add sklearn test. move custom obj to utils. docs for numpy
      
      * use num_model_per_iteration to get num_classes
      
      * update docs and dask multiclass custom objective test
      
      * move reshaping to __inner_predict. add test for feval
      
      * add missing note. remove extra line
  2. 20 Feb, 2022 1 commit
  3. 17 Feb, 2022 1 commit
  4. 16 Feb, 2022 2 commits
  5. 22 Jan, 2022 1 commit
    • [python-package] support customizing Dataset creation in Booster.refit() (fixes #3038) (#4894) · e6a2f716
      Miguel Trejo Marrufo authored
      * feat: refit additional kwargs for dataset and predict
      
      * test: kwargs for refit method
      
      * fix: __init__ got multiple values for argument
      
      * fix: pycodestyle E302 error
      
      * refactor: dataset_params to avoid breaking change
      
      * refactor: expose all Dataset params in refit
      
      * feat: dataset_params updates new_params
      
      * fix: remove unnecessary params to test
      
      * test: parameters input are the same
      
      * docs: address StrikeRUS changes
      
      * test: refit test changes in train dataset
      
      * test: set init_score and decay_rate to zero
  6. 30 Dec, 2021 1 commit
    • [python] raise an informative error instead of segfaulting when custom objective produces incorrect output (#4815) · af5b40e1
      Yaqub Alwan authored
      
      * fix for bad grads causing segfault
      
      * adjust checking criteria to properly reflect reality of multi-class classifiers
      
      * fix styling
      
      * Line break before operator
      
      * Update python-package/lightgbm/basic.py
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Update python-package/lightgbm/basic.py
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * add a note to the C-API docs
      
      * rearrange text slightly
      
      * add some tests to python package
      
      * Update include/LightGBM/c_api.h
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * PR comments
      
      * match argument is a regex and our expression has brackets ..
      
      * rework tests
      
      * isorting imports
      
      * updating test to reflect that the python API does not take preds/labels as a fobj function
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
  7. 11 Dec, 2021 1 commit
  8. 03 Dec, 2021 1 commit
  9. 02 Dec, 2021 1 commit
  10. 26 Nov, 2021 1 commit
  11. 23 Nov, 2021 1 commit
  12. 20 Nov, 2021 1 commit
    • [python] Remove `silent` argument (#4800) · 2caf945f
      Nikita Titov authored
      * Update test_plotting.py
      
      * Update dask.py
      
      * Update sklearn.py
      
      * Update test_sklearn.py
      
      * Update basic.py
      
      * Update engine.py
      
      * Update test_engine.py
      
      * Update basic.py
      
      * Update basic.py
      
      * Update engine.py
  13. 15 Nov, 2021 1 commit
    • [c_api] Improve ANSI compatibility by avoiding <stdbool.h> (#4697) · bfb346c1
      Drew Miller authored
      * [c_api] Improve ANSI compatibility by avoiding <stdbool.h>
      
      * fixes in response to CI linting
      
      * inline NOLINT instead of separate test
      
      * moving length declaration to non-ANSI C conditional
      
      * [c_api] Align expected return type in `basic.py` with new c_api type.
  14. 12 Nov, 2021 1 commit
    • [python] Faster categorical column names selection (#4787) · 6cbb3586
      Roman Shaptala authored
      * Faster categorical column names selection (#1)
      
      * Faster categorical column names selection
      
      Replace the slow and redundant dataframe query via select_dtypes with a list comprehension over dataframe.dtypes
      
      * Update compat with CategoricalDtype
      
      * sort imports
      
      * import CategoricalDtype from pandas.api.types
      
      * add categorical import try/except
  15. 11 Nov, 2021 1 commit
  16. 08 Nov, 2021 1 commit
  17. 07 Oct, 2021 1 commit
  18. 05 Oct, 2021 1 commit
  19. 17 Sep, 2021 1 commit
    • [python-package] Support 2d collections as input for `init_score` in multiclass classification task (#4150) · f1f5ba15
      José Morales authored
      
      * initial implementation of init_score for multiclass classification
      
      * check for 1d or 2d collection in init_score
      
      * remove dataset import
      
      * initial comments
      
      * update dask test and docstrings
      
      * update docstrings
      
      * move logic to set_field. reshape back on get_field
      
      * add type hints and update docstrings for dask. fix Dataset.set_field
      
      * revert wrong docstrings and type hints
      
      * add extra comma for consistency
      
      * prefix private functions with underscore
      
      add type hints to new functions
      
      make commas consistent in dask and basic
      
      * add missing spaces after type hint
      
      * remove shape condition for dataframe in is_2d_collection
      Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>
  20. 04 Sep, 2021 1 commit
  21. 30 Aug, 2021 1 commit
  22. 27 Aug, 2021 2 commits
  23. 25 Aug, 2021 1 commit
  24. 23 Aug, 2021 1 commit
  25. 19 Aug, 2021 1 commit
  26. 03 Aug, 2021 1 commit
  27. 31 Jul, 2021 1 commit
  28. 30 Jul, 2021 1 commit
  29. 07 Jul, 2021 1 commit
  30. 05 Jul, 2021 2 commits
  31. 02 Jul, 2021 1 commit
    • [python-package] Create Dataset from multiple data files (#4089) · c359896e
      Chen Yufei authored
      * [python-package] create Dataset from sampled data.
      
      * [python-package] create Dataset from List[Sequence].
      
      1. Use random access for data sampling
      2. Support read data from multiple input files
      3. Read data in batch so no need to hold all data in memory
      
      * [python-package] example: create Dataset from multiple HDF5 file.
      
      * fix: revert is_class implementation for seq
      
      * fix: unwanted memory view reference for seq
      
      * fix: seq is_class accepts sklearn matrices
      
      * fix: requirements for example
      
      * fix: pycode
      
      * feat: print static code linting stage
      
      * fix: linting: avoid shell str regex conversion
      
      * code style: doc style
      
      * code style: isort
      
      * fix ci dependency: h5py on windows
      
      * [py] remove rm files in test seq
      https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623
      
      * docs(python): init_from_sample summary
      
      https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389
      
      * remove dataset dump sample data debugging code.
      
      * remove typo fix.
      
      Create separate PR for this.
      
      * fix typo in src/c_api.cpp
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * style(linting): py3 type hint for seq
      
      * test(basic): os.path style path handling
      
      * Revert "feat: print static code linting stage"
      
      This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d.
      
      * feat(python): sequence on validation set
      
      * minor(python): comment
      
      * minor(python): test option hint
      
      * style(python): fix code linting
      
      * style(python): add pydoc for ref_dataset
      
      * doc(python): sequence
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      
      * revert(python): sequence class abc
      
      * chore(python): remove rm_files
      
      * Remove useless static_assert.
      
      * refactor: test_basic test for sequence.
      
      * fix lint complaint.
      
      * remove dataset._dump_text in sequence test.
      
      * Fix reverting typo fix.
      
      * Apply suggestions from code review
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Fix type hint, code and doc style.
      
      * fix failing test_basic.
      
      * Remove TODO about keep constant in sync with cpp.
      
      * Install h5py only when running python-examples.
      
      * Fix lint complaint.
      
      * Apply suggestions from code review
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Doc fixes, remove unused params_str in __init_from_seqs.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Remove unnecessary conda install in windows ci script.
      
      * Keep param as example in dataset_from_multi_hdf5.py
      
      * Add _get_sample_count function to remove code duplication.
      
      * Use batch_size parameter in generate_hdf.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Fix after applying suggestions.
      
      * Fix test, check idx is instance of numbers.Integral.
      
      * Update python-package/lightgbm/basic.py
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Expose Sequence class in Python-API doc.
      
      * Handle Sequence object not having batch_size.
      
      * Fix isort lint complaint.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Update docstring to mention Sequence as data input.
      
      * Remove get_one_line in test_basic.py
      
      * Make Sequence an abstract class.
      
      * Reduce number of tests for test_sequence.
      
      * Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.
      
      * empty commit to trigger ci
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.
      
      Also rename total_nrow to num_total_row in c_api.h for consistency.
      
      * Doc about Sequence in docs/Python-Intro.rst.
      
      * Fix: basic.py change LGBM_SampleIndices out_len to int32.
      
      * Add create_valid test case with Dataset from Sequence.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Apply suggestions from code review
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      
      * Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.
      
      * Update python-package/lightgbm/basic.py
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      Co-authored-by: Willian Zhang <willian@willian.email>
      Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
  32. 27 Jun, 2021 1 commit
  33. 26 Jun, 2021 1 commit
  34. 18 Jun, 2021 1 commit
  35. 21 May, 2021 1 commit
  36. 20 May, 2021 1 commit
  37. 17 May, 2021 1 commit