1. 02 Jul, 2021 1 commit
    • [python-package] Create Dataset from multiple data files (#4089) · c359896e
      Chen Yufei authored
      * [python-package] create Dataset from sampled data.
      
      * [python-package] create Dataset from List[Sequence].
      
      1. Use random access for data sampling
      2. Support read data from multiple input files
      3. Read data in batch so no need to hold all data in memory
      
      * [python-package] example: create Dataset from multiple HDF5 file.
      
      * fix: revert is_class implementation for seq
      
      * fix: unwanted memory view reference for seq
      
      * fix: seq is_class accepts sklearn matrices
      
      * fix: requirements for example
      
      * fix: pycode
      
      * feat: print static code linting stage
      
      * fix: linting: avoid shell str regex conversion
      
      * code style: doc style
      
      * code style: isort
      
      * fix ci dependency: h5py on windows
      
      * [py] remove rm files in test seq
      https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623
      
      * docs(python): init_from_sample summary
      
      https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389
      
      * remove dataset dump sample data debugging code.
      
      * remove typo fix.
      
      Create separate PR for this.
      
      * fix typo in src/c_api.cpp
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * style(linting): py3 type hint for seq
      
      * test(basic): os.path style path handling
      
      * Revert "feat: print static code linting stage"
      
      This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d.
      
      * feat(python): sequence on validation set
      
      * minor(python): comment
      
      * minor(python): test option hint
      
      * style(python): fix code linting
      
      * style(python): add pydoc for ref_dataset
      
      * doc(python): sequence
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      
      * revert(python): sequence class abc
      
      * chore(python): remove rm_files
      
      * Remove useless static_assert.
      
      * refactor: test_basic test for sequence.
      
      * fix lint complaint.
      
      * remove dataset._dump_text in sequence test.
      
      * Fix reverting typo fix.
      
      * Apply suggestions from code review
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Fix type hint, code and doc style.
      
      * fix failing test_basic.
      
      * Remove TODO about keep constant in sync with cpp.
      
      * Install h5py only when running python-examples.
      
      * Fix lint complaint.
      
      * Apply suggestions from code review
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Doc fixes, remove unused params_str in __init_from_seqs.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Remove unnecessary conda install in windows ci script.
      
      * Keep param as example in dataset_from_multi_hdf5.py
      
      * Add _get_sample_count function to remove code duplication.
      
      * Use batch_size parameter in generate_hdf.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Fix after applying suggestions.
      
      * Fix test, check idx is instance of numbers.Integral.
      
      * Update python-package/lightgbm/basic.py
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Expose Sequence class in Python-API doc.
      
      * Handle Sequence object not having batch_size.
      
      * Fix isort lint complaint.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Update docstring to mention Sequence as data input.
      
      * Remove get_one_line in test_basic.py
      
      * Make Sequence an abstract class.
      
      * Reduce number of tests for test_sequence.
      
      * Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.
      
      * empty commit to trigger ci
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.
      
      Also rename total_nrow to num_total_row in c_api.h for consistency.
      
      * Doc about Sequence in docs/Python-Intro.rst.
      
      * Fix: basic.py change LGBM_SampleIndices out_len to int32.
      
      * Add create_valid test case with Dataset from Sequence.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Apply suggestions from code review
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      
      * Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.
      
      * Update python-package/lightgbm/basic.py
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      Co-authored-by: Willian Zhang <willian@willian.email>
      Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
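The three points in the commit message for #4089 (random access for sampling, multiple input files, batched reads) can be illustrated with a minimal, standard-library-only sketch. The class below only mimics the interface that the commit describes for `lightgbm.Sequence` (integer or slice indexing, `__len__`, and a `batch_size` attribute); `ListBackedSequence`, `sample_rows`, and `iter_batches` are hypothetical names chosen for this illustration and are not part of LightGBM.

```python
import numbers
import random
from typing import Iterator, List

Row = List[float]


class ListBackedSequence:
    """Hypothetical stand-in for a lightgbm.Sequence subclass.

    In real use, each instance would wrap one data file (for example, one
    HDF5 file) and read rows lazily instead of holding them in a list.
    """

    def __init__(self, rows: List[Row], batch_size: int = 2) -> None:
        self._rows = rows
        self.batch_size = batch_size

    def __getitem__(self, idx):
        # An integer returns one row; a slice returns a batch of rows.
        if isinstance(idx, (numbers.Integral, slice)):
            return self._rows[idx]
        raise TypeError(f"index must be int or slice, got {type(idx).__name__}")

    def __len__(self) -> int:
        return len(self._rows)


def sample_rows(seqs: List[ListBackedSequence], sample_count: int, seed: int = 0) -> List[Row]:
    """Pick rows by random access across several sequences (assumed logic)."""
    total = sum(len(s) for s in seqs)
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(total), min(sample_count, total)))
    sampled = []
    for global_idx in indices:
        offset = global_idx
        for seq in seqs:
            # Translate the global row index into a per-sequence index.
            if offset < len(seq):
                sampled.append(seq[offset])
                break
            offset -= len(seq)
    return sampled


def iter_batches(seq: ListBackedSequence) -> Iterator[List[Row]]:
    """Yield batch_size rows at a time so the full data never sits in memory."""
    for start in range(0, len(seq), seq.batch_size):
        yield seq[start:start + seq.batch_size]
```

With the actual library, a list of `lightgbm.Sequence` subclasses would presumably be passed as the `data` argument of `lightgbm.Dataset`; the sketch above shows only the access pattern, not LightGBM's internal implementation.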
  2. 04 May, 2021 1 commit
  3. 07 Feb, 2021 1 commit
  4. 11 Jan, 2021 1 commit
  5. 11 Sep, 2020 1 commit
  6. 10 Apr, 2020 1 commit
    • [python] Re-enable scikit-learn 0.22+ support (#2949) · c633c6c2
      Nikita Titov authored
      * Revert "specify the last supported version of scikit-learn (#2637)"
      
      This reverts commit d1002776.
      
      * ban scikit-learn 0.22.0 and skip broken test
      
      * fix updated test
      
      * fix lint test
      
      * Revert "fix lint test"
      
      This reverts commit 8b4db0805fe7a9e7f7eb0be3eac231f85026d196.
  7. 19 Dec, 2019 1 commit
  8. 14 Oct, 2019 1 commit
  9. 18 May, 2019 1 commit
  10. 15 May, 2019 1 commit
  11. 08 May, 2019 1 commit
  12. 10 Apr, 2019 1 commit
  13. 02 Apr, 2019 1 commit
  14. 26 Mar, 2019 1 commit
  15. 25 Mar, 2019 1 commit
    • [python] Use first_metric_only flag for early_stopping function. (#2049) · 011cc90a
      kenmatsu4 authored
      * Use first_metric_only flag for early_stopping function.
      
      In order to apply early stopping with only first metric, applying first_metric_only flag for early_stopping function.
      
      * upcate comment
      
      * Revert "upcate comment"
      
      This reverts commit 1e75a1a415cc16cfbe795181e148ebfe91469be4.
      
      * added test
      
      * fixed docstring
      
      * cut comment and save one line
      
      * document new feature
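The effect of a `first_metric_only` flag on early stopping, as described in the commit message for #2049, can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions (lower scores are better, and stopping triggers as soon as any checked metric has not improved for `stopping_rounds` iterations); `stop_iteration` is a hypothetical helper, not LightGBM's callback code.

```python
from typing import Dict, List, Optional


def stop_iteration(
    history: Dict[str, List[float]],
    stopping_rounds: int,
    first_metric_only: bool = False,
) -> Optional[int]:
    """Return the iteration at which training would stop, or None.

    history maps each metric name to its per-iteration validation scores,
    in the order the metrics were specified (lower is assumed better).
    """
    names = list(history)
    if first_metric_only:
        # Ignore every metric except the first one.
        names = names[:1]
    n_iter = len(history[names[0]])
    best_score = {name: float("inf") for name in names}
    best_iter = {name: 0 for name in names}
    for i in range(n_iter):
        for name in names:
            score = history[name][i]
            if score < best_score[name]:
                best_score[name] = score
                best_iter[name] = i
            elif i - best_iter[name] >= stopping_rounds:
                return i
    return None


# The first metric keeps improving; the second gets worse from the start.
scores = {
    "l2": [1.0, 0.9, 0.8, 0.7, 0.6],
    "l1": [1.0, 1.1, 1.2, 1.3, 1.4],
}
stopped_all = stop_iteration(scores, stopping_rounds=2)
stopped_first = stop_iteration(scores, stopping_rounds=2, first_metric_only=True)
```

Here `stopped_all` is `2`, because the second metric stalls, while `stopped_first` is `None`, because only the steadily improving first metric is checked.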
  16. 21 Feb, 2019 1 commit
  17. 18 Feb, 2019 2 commits
  18. 04 Feb, 2019 1 commit
  19. 16 Oct, 2018 1 commit
  20. 10 Oct, 2018 1 commit
  21. 08 Sep, 2018 1 commit
    • [docs] minor docs enhancements (#1647) · 536f5dde
      Nikita Titov authored
      * added links to corresponding params in Quick-Start guide
      
      * updated description of possible input types in python
      
      * clarify list of numpy arrays input type in docs
  22. 27 Aug, 2018 1 commit
  23. 03 Jun, 2018 1 commit
  24. 26 May, 2018 1 commit
    • [docs] Edits for grammer and clarity (#1389) · af401561
      Zach Kurtz authored
      * A nitpicky grammer edit with minor clarifications added.
      
      * fix link
      
      * strike s
      
      * try a different optimal-split link, clarify experimental details
      
      * smoothing the FAQ
      
      * edit Features.rst
      
      * several minor edits throughout docs
      
      * historgram-based
  25. 24 May, 2018 1 commit
  26. 05 May, 2018 1 commit
  27. 01 Jan, 2018 1 commit
  28. 30 Dec, 2017 1 commit
  29. 12 Oct, 2017 1 commit
    • [docs] documentation improvement (#976) · 4aa32967
      Nikita Titov authored
      * fixed typos and hotfixes
      
      * converted gcc-tips.Rmd; added ref to gcc-tips
      
      * renamed files
      
      * renamed Advanced-Topics
      
      * renamed README
      
      * renamed Parameters-Tuning
      
      * renamed FAQ
      
      * fixed refs to FAQ
      
      * fixed undecodable source characters
      
      * renamed Features
      
      * renamed Quick-Start
      
      * fixed undecodable source characters in Features
      
      * renamed Python-Intro
      
      * renamed GPU-Tutorial
      
      * renamed GPU-Windows
      
      * fixed markdown
      
      * fixed undecodable source characters in GPU-Windows
      
      * renamed Parameters
      
      * fixed markdown
      
      * removed recommonmark dependence
      
      * hotfixes
      
      * added anchors to links
      
      * fixed 404
      
      * fixed typos
      
      * added more anchors
      
      * removed sphinxcontrib-napoleon dependence
      
      * removed outdated line in Travis config
      
      * fixed max-width of the ReadTheDocs theme
      
      * added horizontal align to images