1. 05 May, 2020 1 commit
  2. 13 Apr, 2020 1 commit
  3. 10 Apr, 2020 2 commits
  4. 20 Mar, 2020 2 commits
    • Fix SWIG methods that return char** (#2850) · 91185c3a
      Alberto Ferreira authored
      
      
      * [swig] Fix SWIG methods that return char** with StringArray.
      
      + [new] Add StringArray class to manage and manipulate arrays of fixed-length strings:
      
        This class is now used to wrap any char** parameters, manage memory and
        manipulate the strings.
      
        The class is defined in swig/StringArray.hpp and wrapped in StringArray.i.
      
      + [API+fix] Wrap LGBM_BoosterGetFeatureNames, which previously resulted in a segfault:
      
        Added wrapper LGBM_BoosterGetFeatureNamesSWIG(BoosterHandle) that
        only receives the booster handle, figures out how much memory to allocate
        for the strings, and returns a StringArray which can be easily converted to String[].
      
      + [API+safety] For consistency, LGBM_BoosterGetEvalNamesSWIG was wrapped as well:
      
        * Refactored to detect any kind of error and removed all the parameters
          besides the BoosterHandle (much simpler API to use in Java).
        * No assumptions are made about the required string space (previously assumed to be 128).
        * The amount of required string memory is computed internally.
      
      + [safety] No possibility of undefined behaviour
      
        The two methods wrapped above now compute the necessary string storage space
        prior to allocation, as the low-level C API calls would crash the process
        irreversibly if they wrote more memory than was passed to them.
      
      * Changes to C API and wrappers support char**
      
      The C API was changed to support the latest SWIG changes,
      which enable safe char** returns.
      
      The respective wrappers in R and Python were changed too.
      
      * Cleanup indentation in new lightgbm_R.cpp code
      
      * Address review code-style comments.
      
      * Update swig/StringArray.hpp
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Update python-package/lightgbm/basic.py
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Update src/lightgbm_R.cpp
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      Co-authored-by: alberto.ferreira <alberto.ferreira@feedzai.com>
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
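The sizing rule described in the commit above (compute the required string storage first, allocate second, fill last) can be sketched in a few lines. This is only an illustration under stated assumptions: `required_sizes`, `fill_buffers` and `get_names` are hypothetical helpers, not part of the LightGBM C API or the SWIG wrappers, and `ctypes` buffers stand in for the memory a wrapper would hand to the C side.

```python
import ctypes


def required_sizes(names):
    """Pass 1: count the strings and find the longest one, including its NUL byte."""
    encoded = [name.encode("utf-8") for name in names]
    longest = max((len(raw) for raw in encoded), default=0) + 1
    return len(encoded), longest


def fill_buffers(names, buffers, buffer_len):
    """Pass 2: hypothetical stand-in for a low-level call writing into caller-owned memory."""
    for name, buf in zip(names, buffers):
        raw = name.encode("utf-8")
        if len(raw) + 1 > buffer_len:
            # A raw C call would overrun the buffer here instead of raising.
            raise ValueError("buffer too small")
        buf.value = raw


def get_names(names):
    """Size the storage from the data itself rather than from a fixed guess."""
    count, buffer_len = required_sizes(names)
    buffers = [ctypes.create_string_buffer(buffer_len) for _ in range(count)]
    fill_buffers(names, buffers, buffer_len)
    return [buf.value.decode("utf-8") for buf in buffers]
```

Because the buffer length comes from the first pass, a name longer than any fixed guess is still copied in full.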
    • [python] handle RandomState object in Scikit-learn Api (#2904) · cf0a992e
      Lukas Pfannschmidt authored
      
      
      * Add handling of RandomState object, which is standard for sklearn methods.
      
      LightGBM expects an integer seed instead of an object.
      If the passed object is a RandomState, we choose a random integer based on its state to seed the underlying low-level code.
      Although the chosen integer only lies in the range between 1 and 1e10, I expect it to have enough entropy (?) not to matter in practice.
      
      * Add RandomState object to random_state docstring.
      
      * remove blank line
      
      * Use property to handle setting random_state.
      This enables setting cloned estimators with the set_params method in sklearn.
      
      * Add docstring to attribute.
      
      * Fix and simplify docstring.
      
      * Add test case.
      
      * Use maximal int for datatype in seed derivation.
      
      * Replace random_state property with interfacing in fit method.
      Derives int seed for C code only when fitting and keeps RandomState object as param.
      
      * Adapt unit test to property change.
      
      * Extended test case and docstring
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Add more equality checks (feature importance, best iteration/score).
      
      * Add equality comparison of boosters represented by strings.
      Remove useless best_iteration_ comparison (we do not use early_stopping).
      
      * fix whitespace
      
      * Test if two subsequent fits produce different models
      
      * Apply suggestions from code review
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
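The final design in the commit above (keep the RandomState object as a parameter and derive an integer seed only when fitting) might look roughly like the sketch below. `derive_seed`, `SketchEstimator` and `INT32_MAX` are hypothetical names, not LightGBM's code. The generator is duck-typed through `randint(low, high)`, which both `random.Random` and `numpy.random.RandomState` provide (their upper-bound conventions differ slightly), so the standard library generator is used here to keep the sketch free of dependencies.

```python
import random

INT32_MAX = 2**31 - 1  # assumed upper bound for a seed passed to the C code


def derive_seed(random_state):
    """Return an integer seed (or None) suitable for the low-level code."""
    if random_state is None or isinstance(random_state, int):
        return random_state
    # Generator object: draw one integer, which also advances its state.
    return random_state.randint(0, INT32_MAX - 1)


class SketchEstimator:
    """Hypothetical estimator that stores random_state unchanged."""

    def __init__(self, random_state=None):
        self.random_state = random_state
        self.seed_used_ = None

    def fit(self):
        # The integer is derived only at fit time, so get_params/set_params
        # style cloning still sees the original object.
        self.seed_used_ = derive_seed(self.random_state)
        return self
```

Two estimators built from identically seeded generators get the same integer seed, while a second `fit` on the same estimator draws a new one because the generator state has advanced.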
  5. 16 Mar, 2020 2 commits
  6. 06 Mar, 2020 1 commit
  7. 26 Feb, 2020 1 commit
  8. 20 Feb, 2020 1 commit
    • Add capability to get possible max and min values for a model (#2737) · 18e7de4f
      Joan Fontanals authored
      
      
      * Add capability to get possible max and min values for a model
      
      * Change implementation to have return value in tree.cpp, change naming to upper and lower bound, move implementation to gbdt.cpp
      
      * Update include/LightGBM/c_api.h
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Change iteration to avoid potential overflow, add bindings to R and Python and a basic test
      
      * Adjust test values
      
      * Consider const correctness and multithreading protection
      
      * Update test values
      
      * Update test values
      
      * Add test to check that model is exactly the same in all platforms
      
      * Try to parse the model to get the expected values
      
      * Try to parse the model to get the expected values
      
      * Fix implementation, num_leaves can be lower than the leaf_value_ size
      
      * Do not check for num_leaves to be smaller than actual size and get back to test with hardcoded value
      
      * Change test order
      
      * Add gpu_use_dp option in test
      
      * Remove helper test method
      
      * Update src/c_api.cpp
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Update src/io/tree.cpp
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Update src/io/tree.cpp
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Update tests/python_package_test/test_basic.py
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * Remove imports
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
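One way to read "possible max and min values for a model" is as a per-tree bound summed over all trees. The sketch below assumes that the raw score is the sum of exactly one leaf value per tree and represents each tree simply as a list of its leaf values. `tree_bounds` and `model_bounds` are hypothetical names for illustration, not the functions added in tree.cpp or gbdt.cpp.

```python
def tree_bounds(leaf_values):
    """Smallest and largest output a single tree can contribute."""
    return min(leaf_values), max(leaf_values)


def model_bounds(trees):
    """Sum the per-tree bounds to bound the raw score of the whole model."""
    lower = 0.0
    upper = 0.0
    for leaf_values in trees:
        tree_lower, tree_upper = tree_bounds(leaf_values)
        lower += tree_lower
        upper += tree_upper
    return lower, upper
```

For example, `model_bounds([[-0.5, 0.25, 1.0], [0.125, -0.25], [0.5]])` returns `(-0.25, 1.625)`. The bounds are conservative: no single input necessarily reaches the extreme leaf of every tree at once.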
  9. 19 Feb, 2020 1 commit
    • [python] [R-package] refine the parameters for Dataset (#2594) · 9f79e840
      Guolin Ke authored
      
      
      * reset
      
      * fix a bug
      
      * fix test
      
      * Update c_api.h
      
      * support not filtering features by min_data
      
      * add warning in reset config
      
      * refine warnings for override dataset's parameter
      
      * some cleans
      
      * clean code
      
      * clean code
      
      * refine C API function doxygen comments
      
      * refined new param description
      
      * refined doxygen comments for R API function
      
      * removed stuff related to int8
      
      * break long line in warning message
      
      * removed tests whose results cannot be validated anymore
      
      * added test for warnings about unchangeable params
      
      * write parameter from dataset to booster
      
      * consider free_raw_data.
      
      * fix params
      
      * fix bug
      
      * implementing R
      
      * fix typo
      
      * filter params in R
      
      * fix R
      
      * not min_data
      
      * refined tests
      
      * fixed linting
      
      * refine
      
      * pylint
      
      * add docstring
      
      * fix docstring
      
      * R lint
      
      * updated description for C API function
      
      * use param aliases in Python
      
      * fixed typo
      
      * fixed typo
      
      * added more params to test
      
      * removed debug print
      
      * fix dataset construct place
      
      * fix merge bug
      
      * Update feature_histogram.hpp
      
      * add is_sparse back
      
      * remove unused parameters
      
      * fix lint
      
      * add data random seed
      
      * update
      
      * [R-package] centralized Dataset parameter aliases and added tests on Dataset parameter updating (#2767)
      Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
      Co-authored-by: James Lamb <jaylamb20@gmail.com>
  10. 15 Feb, 2020 1 commit
  11. 06 Feb, 2020 1 commit
  12. 03 Feb, 2020 2 commits
  13. 16 Jan, 2020 1 commit
  14. 14 Jan, 2020 2 commits
  15. 12 Jan, 2020 1 commit
  16. 10 Jan, 2020 1 commit
    • [python] Output model to a pandas DataFrame (#2592) · 301402c8
      Patrick Ford authored
      * trees_to_df method and unit test added. PEP 8 fixes for integration.
      
      * Post-review changes
      Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
      
      * changes from second round of reviews from striker
      
      * third round of review. formatting and added 2 more tests
      
      * replaced pandas dot attribute accessor with string attribute accessor
      
      * dealt with single tree edge case and minor refactor of tests
      
      * slight refactor for checking if tree is a single node
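The idea behind turning a model into a table can be sketched as a pre-order walk that emits one row per node, with the single-node tree handled as an edge case. The nested dictionary layout (`split_index`, `split_feature`, `threshold`, `left_child`, `right_child`, `leaf_index`, `leaf_value`), the column names and the `flatten_tree` helper are assumptions made for illustration, not the method added in this commit. The resulting list of dicts could be passed to `pandas.DataFrame`.

```python
def flatten_tree(tree_index, root):
    """Return one row (dict) per node of a nested tree, in pre-order."""
    rows = []

    def visit(node, depth, parent):
        is_split = "split_index" in node
        if is_split:
            name = "{}-S{}".format(tree_index, node["split_index"])
        else:
            # A tree that is a single node may carry no leaf_index; default to 0.
            name = "{}-L{}".format(tree_index, node.get("leaf_index", 0))
        rows.append({
            "tree_index": tree_index,
            "node_depth": depth,
            "node_index": name,
            "parent_index": parent,
            "split_feature": node.get("split_feature"),
            "threshold": node.get("threshold"),
            "value": node.get("internal_value") if is_split else node["leaf_value"],
        })
        if is_split:
            visit(node["left_child"], depth + 1, name)
            visit(node["right_child"], depth + 1, name)

    visit(root, 1, None)
    return rows
```

A stump with two leaves yields three rows, and a tree consisting of a single leaf yields exactly one row with no parent.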
  17. 02 Jan, 2020 1 commit
  18. 29 Dec, 2019 1 commit
  19. 19 Dec, 2019 1 commit
  20. 09 Dec, 2019 1 commit
  21. 08 Dec, 2019 1 commit
  22. 05 Dec, 2019 2 commits
  23. 27 Oct, 2019 1 commit
  24. 22 Oct, 2019 1 commit
  25. 21 Oct, 2019 1 commit
  26. 16 Oct, 2019 1 commit
  27. 15 Oct, 2019 1 commit
  28. 13 Oct, 2019 1 commit
  29. 01 Oct, 2019 1 commit
  30. 26 Sep, 2019 4 commits
  31. 15 Sep, 2019 1 commit
    • [python] Bug fix for first_metric_only on earlystopping. (#2209) · 84754399
      kenmatsu4 authored
      * Bug fix for first_metric_only if the first metric is train metric.
      
      * Update bug fix for feval issue.
      
      * Disable feval for first_metric_only.
      
      * Additional test items.
      
      * Fix wrong assertEqual settings & formatting.
      
      * Change dataset of test.
      
      * Fix random seed for test.
      
      * Modify assumed test result due to different sklearn versions between CI and local.
      
      * Remove f-string
      
      * Applying variable assumed test result for test.
      
      * Fix flake8 error.
      
      * Modifying in accordance with review comments.
      
      * Modifying for pylint.
      
      * simplified tests
      
      * Deleting error criteria `if eval_metric is None`.
      
      * Delete test items of classification.
      
      * Simplifying if condition.
      
      * Applying first_metric_only for sklearn wrapper.
      
      * Modifying test_sklearn for conforming to python 2.x
      
      * Fix flake8 error.
      
      * Additional fix for sklearn and add tests.
      
      * Bug fix and add test cases.
      
      * some refactor
      
      * fixed lint
      
      * fixed lint
      
      * Fix duplicated metrics scores to pass the test.
      
      * Fix the case first_metric_only not in params.
      
      * Converting metrics aliases.
      
      * Add comment.
      
      * Modify comment for pylint.
      
      * Modify comment for pydocstyle.
      
      * Using split test set for two eval_set.
      
      * added test case for metric aliases and length checks
      
      * minor style fixes
      
      * fixed rmse name and alias position
      
      * Fix the case metric=[]
      
      * Fix using env.model._train_data_name
      
      * Fix wrong test condition.
      
      * Move initial process to _init() func.
      
      * Modify test setting for test_sklearn & training data matching on callback.py
      
      * test_sklearn.py
      -> A test case for training is wrong, so fixed.
      
      * callback.py
      -> A condition of if statement for detecting test dataset is wrong, so fixed.
      
      * Support composite name metrics.
      
      * Remove metric check process & reduce redundant test cases.
      
      Because #2273 fixed the order of metrics in cpp, the metric check process in callback.py is removed.
      
      * Revised according to the matters pointed out on a review.
      
      * increased code readability
      
      * Fix the issue of order of validation set.
      
      * Changing to OrderedDict from default dict for score result.
      
      * added missed check in cv function for first_metric_only and feval co-occurrence
      
      * keep order only for metrics but not for datasets in best_score
      
      * move OrderedDict initialization to init phase
      
      * fixed minor printing issues
      
      * move first metric detection to init phase and split can be performed without checks
      
      * split only once during callback
      
      * removed excess code
      
      * fixed typo in variable name and squashed ifs
      
      * use setdefault
      
      * hotfix
      
      * fixed failing test
      
      * refined tests
      
      * refined sklearn test
      
      * Making "feval" effective on early stopping.
      
      * allow feval and first_metric_only for cv
      
      * removed unused code
      
      * added tests for feval
      
      * fixed printing
      
      * add note about whitespaces in feval name
      
      * Modifying final iteration process in case valid set is training data.
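The behaviour that first_metric_only is meant to give can be illustrated with a small simulation: with the flag set, only the first metric in evaluation order may trigger early stopping. This is a simplified sketch, not the code in callback.py. `find_stop` and the `(name, score, higher_is_better)` tuples are hypothetical, and a single validation set is assumed.

```python
def find_stop(history, stopping_rounds, first_metric_only=False):
    """Return (metric_name, best_iteration) when a metric stalls, else None.

    history holds one list per iteration of (name, score, higher_is_better).
    """
    best = {}  # metric name -> (best score, iteration where it was reached)
    first_metric = history[0][0][0] if history and history[0] else None
    for iteration, results in enumerate(history):
        for name, score, higher_is_better in results:
            if first_metric_only and name != first_metric:
                continue  # other metrics are ignored for stopping
            previous = best.get(name)
            if previous is None:
                improved = True
            elif higher_is_better:
                improved = score > previous[0]
            else:
                improved = score < previous[0]
            if improved:
                best[name] = (score, iteration)
            elif iteration - best[name][1] >= stopping_rounds:
                return name, best[name][1]
    return None
```

With a history in which `l2` keeps improving while `l1` gets worse, the default mode stops on `l1`, whereas `first_metric_only=True` keeps training because only `l2` is checked. The order of the metrics therefore matters, which is why several of the items above deal with preserving metric order.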