Commits · ec5492f8f915f9bab99e52417cff6808dbdb8ddd · tianlh / LightGBM-DCU

27 Apr, 2025 1 commit
- [python-package] drop support for h2o datatable (#6894) · ec5492f8
  James Lamb authored Apr 27, 2025
  
15 Oct, 2024 1 commit
- [python-package] deprecate support for H2O 'datatable' (#6670) · 668bf5da
  James Lamb authored Oct 15, 2024
```
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
```
04 Jun, 2024 1 commit
- [python-package] remove uses of deprecated NumPy random number generation... · e0cda880
  James Lamb authored Jun 03, 2024
```
[python-package] remove uses of deprecated NumPy random number generation APIs, require 'numpy>=1.17.0' (#6468)
```
26 Dec, 2021 1 commit
- [python] remove `early_stopping_rounds` argument of `train()` and `cv()` functions (#4908) · ce486e5b
  Nikita Titov authored Dec 26, 2021
  
25 Aug, 2021 1 commit

[docs] Clarify the fact that predict() on a file does not support saved... · 417ba192

James Lamb authored Aug 25, 2021


[docs] Clarify the fact that predict() on a file does not support saved Datasets (fixes #4034) (#4545)

* documentation changes

* add list of supported formats to error message

* add unit tests

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update per review comments

* make references consistent
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

02 Jul, 2021 1 commit

[python-package] Create Dataset from multiple data files (#4089) · c359896e

Chen Yufei authored Jul 02, 2021

* [python-package] create Dataset from sampled data.

* [python-package] create Dataset from List[Sequence].

1. Use random access for data sampling
2. Support read data from multiple input files
3. Read data in batch so no need to hold all data in memory

* [python-package] example: create Dataset from multiple HDF5 file.

* fix: revert is_class implementation for seq

* fix: unwanted memory view reference for seq

* fix: seq is_class accepts sklearn matrices

* fix: requirements for example

* fix: pycode

* feat: print static code linting stage

* fix: linting: avoid shell str regex conversion

* code style: doc style

* code style: isort

* fix ci dependency: h5py on windows

* [py] remove rm files in test seq
https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623

* docs(python): init_from_sample summary

https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389



* remove dataset dump sample data debugging code.

* remove typo fix.

Create separate PR for this.

* fix typo in src/c_api.cpp
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* style(linting): py3 type hint for seq

* test(basic): os.path style path handling

* Revert "feat: print static code linting stage"

This reverts commit 10bd79f7f8258bea8e61c3abb8c9c7e4456a916d.

* feat(python): sequence on validation set

* minor(python): comment

* minor(python): test option hint

* style(python): fix code linting

* style(python): add pydoc for ref_dataset

* doc(python): sequence
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* revert(python): sequence class abc

* chore(python): remove rm_files

* Remove useless static_assert.

* refactor: test_basic test for sequence.

* fix lint complaint.

* remove dataset._dump_text in sequence test.

* Fix reverting typo fix.

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Fix type hint, code and doc style.

* fix failing test_basic.

* Remove TODO about keep constant in sync with cpp.

* Install h5py only when running python-examples.

* Fix lint complaint.

* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Doc fixes, remove unused params_str in __init_from_seqs.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Remove unnecessary conda install in windows ci script.

* Keep param as example in dataset_from_multi_hdf5.py

* Add _get_sample_count function to remove code duplication.

* Use batch_size parameter in generate_hdf.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Fix after applying suggestions.

* Fix test, check idx is instance of numbers.Integral.

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Expose Sequence class in Python-API doc.

* Handle Sequence object not having batch_size.

* Fix isort lint complaint.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update docstring to mention Sequence as data input.

* Remove get_one_line in test_basic.py

* Make Sequence an abstract class.

* Reduce number of tests for test_sequence.

* Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.

* empty commit to trigger ci

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.

Also rename total_nrow to num_total_row in c_api.h for consistency.

* Doc about Sequence in docs/Python-Intro.rst.

* Fix: basic.py change LGBM_SampleIndices out_len to int32.

* Add create_valid test case with Dataset from Sequence.

* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.

* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Willian Zhang <willian@willian.email>
Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
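
The three points listed in this commit (random access for sampling, multiple input files, batched reads) can be illustrated with a minimal, standard-library-only sketch. The names `InMemoryRows`, `sample_rows`, and `iter_batches` are hypothetical and are not part of LightGBM; the class only mimics the shape of the `Sequence` interface this commit introduces (`__getitem__`, `__len__`, and a `batch_size` attribute), and a real data source would read from a file instead of a list.

```python
import random
from typing import Iterator, List, Union

Row = List[float]


class InMemoryRows:
    """Hypothetical stand-in for one data file with random access by row."""

    batch_size = 2  # rows fetched per batched read (illustrative value)

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    def __getitem__(self, idx: Union[int, slice]):
        # An int returns one row; a slice returns a list of rows.
        return self._rows[idx]

    def __len__(self) -> int:
        return len(self._rows)


def sample_rows(seqs: List[InMemoryRows], sample_cnt: int, seed: int = 0) -> List[Row]:
    """Pick rows by random access across all sources without scanning them."""
    total = sum(len(s) for s in seqs)
    rng = random.Random(seed)
    picked = sorted(rng.sample(range(total), min(sample_cnt, total)))
    out = []
    for global_idx in picked:
        # Map the global row index to (source, local index).
        for s in seqs:
            if global_idx < len(s):
                out.append(s[global_idx])
                break
            global_idx -= len(s)
    return out


def iter_batches(seqs: List[InMemoryRows]) -> Iterator[List[Row]]:
    """Yield rows batch by batch, so only one batch is held at a time."""
    for s in seqs:
        for start in range(0, len(s), s.batch_size):
            yield s[start:start + s.batch_size]


# Two "files" with 3 and 2 rows respectively.
files = [InMemoryRows([[0.0], [1.0], [2.0]]), InMemoryRows([[3.0], [4.0]])]
batches = list(iter_batches(files))
sampled = sample_rows(files, sample_cnt=3)
```

In LightGBM itself, the commit describes passing a list of `Sequence` objects as the data input of a `Dataset`; the sketch above only shows the access pattern, not the library's internals.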

04 May, 2021 1 commit

Correct spelling (#4250) · e79716e0

Andrew Ziem authored May 04, 2021



* Correct spelling

Most changes were in comments, and there were a few changes to literals for log output.

There were no changes to variable names, function names, IDs, or functionality.

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Correct spelling

Most are code comments, but one case is a literal in a logging message.

There are a few grammar fixes too.
Co-authored-by: James Lamb <jaylamb20@gmail.com>

07 Feb, 2021 1 commit

[docs] fix typo: one-hot coding should be one-hot encoding (#3898) · c10b0430

Gaurav Chopra authored Feb 08, 2021



* Update Python-Intro.rst

* Update docs/Python-Intro.rst
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

11 Jan, 2021 1 commit

[docs][python] add conda-forge install instructions (#3544) · 78d31d9a

Ray Bell authored Jan 11, 2021



* DOC: add conda-forge install instructions

* DOC: add conda-forge instructions

* DOC: fix hyperlink

* DOC: point to installation guide

* add detailed

* Update python-package/README.rst
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update python-package/README.rst
Co-authored-by: James Lamb <jaylamb20@gmail.com>

* rm characters

* add pip install

* add :

* Update python-package/README.rst
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update python-package/README.rst
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* remove pip from header

* channel
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

11 Sep, 2020 1 commit
- [docs] Simplify the python installation instruction (#3378) · 0d45ebd6
  Guolin Ke authored Sep 12, 2020
```
* Update Python-Intro.rst

* Update README.rst
```
10 Apr, 2020 1 commit

[python] Re-enable scikit-learn 0.22+ support (#2949) · c633c6c2

Nikita Titov authored Apr 10, 2020

* Revert "specify the last supported version of scikit-learn (#2637)"

This reverts commit d1002776.

* ban scikit-learn 0.22.0 and skip broken test

* fix updated test

* fix lint test

* Revert "fix lint test"

This reverts commit 8b4db0805fe7a9e7f7eb0be3eac231f85026d196.

19 Dec, 2019 1 commit
- specify the last supported version of scikit-learn (#2637) · d1002776
  Nikita Titov authored Dec 19, 2019
  
14 Oct, 2019 1 commit
- [docs] clarified support of LibSVM zero-based format files (#2504) · 4848776f
  Nikita Titov authored Oct 14, 2019
  
18 May, 2019 1 commit
- [docs] remove duplicated param in Python-Intro.rst (#2181) · 6a1a538f
  leasunhy authored May 19, 2019
```
`num_round` is redundant here because it will be overridden by `num_trees` in the `param` dictionary.
```
15 May, 2019 1 commit
- [python] added ability to pass first_metric_only in params (#2175) · f91e5644
  Nikita Titov authored May 15, 2019
```
* added ability to pass first_metric_only in params

* simplified tests

* fixed test

* fixed punctuation
```
08 May, 2019 1 commit
- [docs] updated Microsoft GitHub URL (#2152) · 94fbe5bb
  Guolin Ke authored May 08, 2019
```
* fix travis badge

* updated GitHub Microsoft URL
```
10 Apr, 2019 1 commit
- [docs] Python wrapper doesn't support params in form of list of pairs (#2078) · b3c31c40
  Nikita Titov authored Apr 10, 2019
```
* fixed Python intro

* fixed typos

* scikit-learn added support of https
```
02 Apr, 2019 1 commit
- [docs] Fix typo in Python-Intro.rst (#2074) · fe115bbb
  sheikheddy authored Apr 03, 2019
  
26 Mar, 2019 1 commit

[docs] Small aesthetic improvements to RTD docs (#2060) · 572ae400

James Lamb authored Mar 26, 2019

* Small aesthetic improvements to RTD docs

* fixed markdown table in Development-Guide

* removed unnecessary blank line in conf.py

25 Mar, 2019 1 commit

[python] Use first_metric_only flag for early_stopping function. (#2049) · 011cc90a

kenmatsu4 authored Mar 25, 2019

* Use first_metric_only flag for early_stopping function.

To apply early stopping using only the first metric, this change applies the first_metric_only flag in the early_stopping function.

* upcate comment

* Revert "upcate comment"

This reverts commit 1e75a1a415cc16cfbe795181e148ebfe91469be4.

* added test

* fixed docstring

* cut comment and save one line

* document new feature

21 Feb, 2019 1 commit
- [python] update DataTable handling (#2020) · c5cfe3e3
  Nikita Titov authored Feb 21, 2019
  
18 Feb, 2019 2 commits
- Fix wording (#2015) · 7ebf80f8
  Harry Moreno authored Feb 18, 2019
  
- Change variable name test_data > validation_data (#2018) · a777aedd
  Harry Moreno authored Feb 18, 2019
```
* It is confusing to name validation data `test_data`, especially since train/validation/test splits are standard terms in ML. Change the variable name in the Python quick start.
```
04 Feb, 2019 1 commit

[python] convert datatable to numpy directly (#1970) · 2c9d3320

Guolin Ke authored Feb 05, 2019

* convert datatable to numpy directly

* fix according to comments

* updated more docstrings

* simplified isinstance check

* Update compat.py

16 Oct, 2018 1 commit

[docs] corrected misleading note about best_iteration (#1758) · f3dce7e6

Nikita Titov authored Oct 16, 2018

* removed misleading note about best_iteration

* Update engine.py

* Update Python-Intro.rst

* Updated Engine.py

* Updated Python-Intro.rst

* add article 'the best', break huge line and remove excess empty line

10 Oct, 2018 1 commit
- [docs] fixed some typos and grammatical errors (#1738) · ac6951d3
  Alex authored Oct 10, 2018
  
08 Sep, 2018 1 commit

[docs] minor docs enhancements (#1647) · 536f5dde

Nikita Titov authored Sep 08, 2018

* added links to corresponding params in Quick-Start guide

* updated description of possible input types in python

* clarify list of numpy arrays input type in docs

27 Aug, 2018 1 commit

various improvements around metric param and early_stopping_rounds param description (#1589) · cd6d0583

Nikita Titov authored Aug 27, 2018

* bring consistency and clearness into early_stopping_rounds desc, metric desc and implementation

* hotfix

* hotfix

* used NDCG as default metric for lambdarank task

* fixed missed methods at ReadTheDocs and changed default eval_metric

* kept only unique metrics

* fixed comment

03 Jun, 2018 1 commit

[docs][python] made OS detection more reliable and little docs improvements (#1414) · a39c848e

Nikita Titov authored Jun 03, 2018

* added missed description of plot_example in python_guide folder and fixed consistency for packages naming

* more reliable OS detection

* fixed grammar

* made pylint happy

26 May, 2018 1 commit

[docs] Edits for grammer and clarity (#1389) · af401561

Zach Kurtz authored May 26, 2018

* A nitpicky grammer edit with minor clarifications added.

* fix link

* strike s

* try a different optimal-split link, clarify experimental details

* smoothing the FAQ

* edit Features.rst

* several minor edits throughout docs

* historgram-based

24 May, 2018 1 commit

[docs][python][R] early_stopping_rounds doesn't check all of eval_set (#1393) · e2a0de50

Fujii Hironori authored May 24, 2018

The documentation of `early_stopping_rounds` says it will check all of
eval_set, but this is not true: it doesn't check the dataset
specified as the training data.

This change appends an extra phrase "except the training data" to every
occurrence of the sentence "If there's more than one, will check all of
them" in the documentation.

05 May, 2018 1 commit

[python][docs] add info on adaptive learning rate in the sklearn API (#1354) · d1fd52e9

Misha Lisovyi authored May 05, 2018

* add info on adaptive learning rate in the sklearn API

* adjust learning rate documentation following the PR discussion

* fix early stopping documentation

* improve wording

* fixing trailing spaces

01 Jan, 2018 1 commit
- [docs] Typo on #119 (#1166) · 819df012
  Darío Hereñú authored Jan 01, 2018
  
30 Dec, 2017 1 commit
- fixed typos (#1155) · 968a353f
  Nikita Titov authored Dec 30, 2017
  
12 Oct, 2017 1 commit

[docs] documentation improvement (#976) · 4aa32967

Nikita Titov authored Oct 12, 2017

* fixed typos and hotfixes

* converted gcc-tips.Rmd; added ref to gcc-tips

* renamed files

* renamed Advanced-Topics

* renamed README

* renamed Parameters-Tuning

* renamed FAQ

* fixed refs to FAQ

* fixed undecodable source characters

* renamed Features

* renamed Quick-Start

* fixed undecodable source characters in Features

* renamed Python-Intro

* renamed GPU-Tutorial

* renamed GPU-Windows

* fixed markdown

* fixed undecodable source characters in GPU-Windows

* renamed Parameters

* fixed markdown

* removed recommonmark dependence

* hotfixes

* added anchors to links

* fixed 404

* fixed typos

* added more anchors

* removed sphinxcontrib-napoleon dependence

* removed outdated line in Travis config

* fixed max-width of the ReadTheDocs theme

* added horizontal align to images
