  1. 15 Sep, 2021 1 commit
  2. 02 Jul, 2021 1 commit
    • [python-package] Create Dataset from multiple data files (#4089) · c359896e
      Chen Yufei authored
      * [python-package] create Dataset from sampled data.
      
      * [python-package] create Dataset from List[Sequence].
      
      1. Use random access for data sampling
      2. Support reading data from multiple input files
      3. Read data in batches, so there is no need to hold all data in memory
      
      * [python-package] example: create Dataset from multiple HDF5 files.
      
      * fix: revert is_class implementation for seq
      
      * fix: unwanted memory view reference for seq
      
      * fix: seq is_class accepts sklearn matrices
      
      * fix: requirements for example
      
      * fix: pycode
      
      * feat: print static code linting stage
      
      * fix: linting: avoid shell str regex conversion
      
      * code style: doc style
      
      * code style: isort
      
      * fix ci dependency: h5py on windows
      
      * [py] remove rm files in test seq
      https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623
      
      * docs(python): init_from_sample summary
      
      https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389
      
      
      
      * remove dataset dump sample data debugging code.
      
      * remove typo fix.
      
      Create separate PR for this.
      
      * fix typo in src/c_api.cpp
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * style(linting): py3 type hint for seq
      
      * test(basic): os.path style path handling
      
      * Revert "feat: print static code linting stage"
      
      This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d.
      
      * feat(python): sequence on validation set
      
      * minor(python): comment
      
      * minor(python): test option hint
      
      * style(python): fix code linting
      
      * style(python): add pydoc for ref_dataset
      
      * doc(python): sequence
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      
      * revert(python): sequence class abc
      
      * chore(python): remove rm_files
      
      * Remove useless static_assert.
      
      * refactor: test_basic test for sequence.
      
      * fix lint complaint.
      
      * remove dataset._dump_text in sequence test.
      
      * Fix reverting typo fix.
      
      * Apply suggestions from code review
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Fix type hint, code and doc style.
      
      * fix failing test_basic.
      
      * Remove TODO about keep constant in sync with cpp.
      
      * Install h5py only when running python-examples.
      
      * Fix lint complaint.
      
      * Apply suggestions from code review
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      
      * Doc fixes, remove unused params_str in __init_from_seqs.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Remove unnecessary conda install in windows ci script.
      
      * Keep param as example in dataset_from_multi_hdf5.py
      
      * Add _get_sample_count function to remove code duplication.
      
      * Use batch_size parameter in generate_hdf.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Fix after applying suggestions.
      
      * Fix test, check idx is instance of numbers.Integral.
      
      * Update python-package/lightgbm/basic.py
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Expose Sequence class in Python-API doc.
      
      * Handle Sequence object not having batch_size.
      
      * Fix isort lint complaint.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Update docstring to mention Sequence as data input.
      
      * Remove get_one_line in test_basic.py
      
      * Make Sequence an abstract class.
      
      * Reduce number of tests for test_sequence.
      
      * Add c_api: LGBM_SampleCount, fix potential bug in LGBM_SampleIndices.
      
      * empty commit to trigger ci
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.
      
      Also rename total_nrow to num_total_row in c_api.h for consistency.
      
      * Doc about Sequence in docs/Python-Intro.rst.
      
      * Fix: basic.py change LGBM_SampleIndices out_len to int32.
      
      * Add create_valid test case with Dataset from Sequence.
      
      * Apply suggestions from code review
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      
      * Apply suggestions from code review
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      
      * Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.
      
      * Update python-package/lightgbm/basic.py
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      Co-authored-by: Willian Zhang <willian@willian.email>
      Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
      Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
  3. 10 Feb, 2021 1 commit
  4. 25 Jan, 2021 1 commit
  5. 24 Jan, 2021 1 commit
  6. 02 Aug, 2020 1 commit
  7. 10 Apr, 2020 1 commit
    • [python] Re-enable scikit-learn 0.22+ support (#2949) · c633c6c2
      Nikita Titov authored
      * Revert "specify the last supported version of scikit-learn (#2637)"
      
      This reverts commit d1002776.
      
      * ban scikit-learn 0.22.0 and skip broken test
      
      * fix updated test
      
      * fix lint test
      
      * Revert "fix lint test"
      
      This reverts commit 8b4db0805fe7a9e7f7eb0be3eac231f85026d196.
  8. 19 Dec, 2019 1 commit
  9. 27 Jul, 2019 1 commit
    • [docs] 🎨 Sphinx Autosummary for generating Python-API documentation (#2286) · 207bb3ef
      Alexander L. Hayes authored
      * 🎨 `sphinx.ext.autosummary` for generating Python-API summaries
      
      Add `docs/.gitignore` to not track autosummary stubs
      Add `sphinx.ext.autosummary` in `docs/conf.py`
        Add 'members' and 'inherited-members' as default parameters
        Add 'autosummary = True' for setting output with `:toctree:`
      Add `.. autosummary::` tags to replace `.. autoclass::`
      
      Previously the `Python-API.rst` dumped all of the Python API onto
      a single page.
      
      This replaces the Python-API documentation with an index listing
      all modules, and paginates all functions and classes onto
      separate pages.
      
      * Corrections following feedback
      
      Drop `docs/.gitignore` to use the general `.gitignore`
      Add `show-inheritance` to `autodoc_default_flags` in `docs/conf.py`
      Fix `both` to `class` in `autoclass_content` in `docs/conf.py`
      
      * Replacing deprecated Sphinx parameter
      
      Fix deprecated `autodoc_default_flags` to `autodoc_default_options`
      
      * Adding `autodoc_default_flags` back in to support early Sphinx versions
      
      Add `autodoc_default_flags` with parameters from
        `autodoc_default_options`
  10. 01 May, 2019 1 commit
    • [python] added plot_split_value_histogram function (#2043) · 611cf5d4
      Nikita Titov authored
      * added plot_split_value_histogram function
      
      * updated init module
      
      * added plot split value histogram example
      
      * added plot_split_value_histogram to notebook
      
      * added test
      
      * fixed pylint
      
      * updated API docs
      
      * fixed grammar
      
      * set y ticks to int values in a more appropriate way
  11. 27 Aug, 2018 1 commit
  12. 07 Oct, 2017 1 commit
    • [docs] move wiki to Read the Docs (#945) · 6d34fb86
      Nikita Titov authored
      * fixed Python-API references
      
      * moved Features section to ReadTheDocs
      
      * fixed index of ReadTheDocs
      
      * moved Experiments section to ReadTheDocs
      
      * fixed capital letter
      
      * fixed citing
      
      * moved Parallel Learning section to ReadTheDocs
      
      * fixed markdown
      
      * fixed Python-API
      
      * fixed link to Quick-Start
      
      * fixed gpu docker README
      
      * moved Installation Guide from wiki to ReadTheDocs
      
      * removed references to wiki
      
      * fixed capital letters in headings
      
      * hotfixes
      
      * fixed non-Unicode symbols and reference to Python API
      
      * fixed citing references
      
      * fixed links in .md files
      
      * fixed links in .rst files
      
      * store images locally in the repo
      
      * fixed missed word
      
      * fixed indent in Experiments.rst
      
      * fixed 'Duplicate implicit target name' message, which is resolved by adding anchors
      
      * less verbose
      
      * prevented mailto: ref creation
      
      * fixed indents
      
      * fixed 404
      
      * fixed 403
      
      * fixed 301
      
      * fixed fake anchors
      
      * fixed file extensions
      
      * fixed Sphinx warnings
      
      * added StrikerRUS profile link to FAQ
      
      * added henry0312 profile link to FAQ
  13. 07 May, 2017 1 commit
  14. 05 May, 2017 1 commit