1. 10 Mar, 2021 5 commits
  2. 09 Mar, 2021 2 commits
  3. 05 Mar, 2021 1 commit
  4. 04 Mar, 2021 2 commits
  5. 03 Mar, 2021 1 commit
  6. 02 Mar, 2021 2 commits
  7. 24 Feb, 2021 6 commits
  8. 23 Feb, 2021 2 commits
  9. 22 Feb, 2021 1 commit
  10. 21 Feb, 2021 2 commits
    • Fix evaluation of linear trees with a single leaf. (#3987) · 605c97b5
      mjmckp authored
      
      
      * Fix index out-of-range exception generated by BaggingHelper on small datasets.
      
      Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.
      
      * Update goss.hpp
      
      * Update goss.hpp
      
      * Add API method LGBM_BoosterPredictForMats, which runs prediction on a data set given as an array of pointers to rows (as opposed to the existing method LGBM_BoosterPredictForMat, which requires data given as a contiguous array)
      
      * Fix incorrect upstream merge
      
      * Add link to LightGBM.NET
      
      * Fix indenting to 2 spaces
      
      * Dummy edit to trigger CI
      
      * Dummy edit to trigger CI
      
      * remove duplicate functions from merge
      
      * Fix evaluation of linear trees with a single leaf.
      
      Note that trees without linear models at the leaf always handle num_leaves = 1 as a special case and directly output the leaf value.  Linear trees were missing this special case handling, and hence would have the following issues:
       * Calling Tree::Predict or Tree::PredictByMap would cause an access violation exception attempting to access the first value of the empty split_feature_ array in GetLeaf.
       * PredictionFunLinear would either cause an access violation or go into an infinite loop when attempting to do the equivalent of GetLeaf.
      
      Note also that PredictionFun does not need the same changes as PredictionFunLinear, since both are only called by Tree::AddPredictionToScore, which has a special case for (!is_linear_ && num_leaves_ <= 1) that precludes calling PredictionFun.
      Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
      Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
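The single-leaf problem described in this commit message can be illustrated with a small Python sketch. This is not LightGBM's C++ code: `TinyLinearTree` and all of its fields are hypothetical names, and the choice to evaluate leaf 0's linear model for a one-leaf tree is an assumption made for illustration. The point is only that leaf lookup must be skipped when there are no splits to walk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TinyLinearTree:
    """Hypothetical, simplified stand-in for a tree with linear leaf models."""
    num_leaves: int
    split_feature: List[int] = field(default_factory=list)
    threshold: List[float] = field(default_factory=list)
    # A negative child index encodes a leaf as ~leaf_index.
    left_child: List[int] = field(default_factory=list)
    right_child: List[int] = field(default_factory=list)
    leaf_const: List[float] = field(default_factory=list)
    leaf_features: List[List[int]] = field(default_factory=list)
    leaf_coeff: List[List[float]] = field(default_factory=list)

    def get_leaf(self, row):
        # Walks the split arrays; fails on a single-leaf tree because
        # split_feature is empty there.
        node = 0
        while node >= 0:
            if row[self.split_feature[node]] <= self.threshold[node]:
                node = self.left_child[node]
            else:
                node = self.right_child[node]
        return ~node

    def predict(self, row):
        # Special case: with one leaf there is nothing to traverse.
        leaf = 0 if self.num_leaves <= 1 else self.get_leaf(row)
        out = self.leaf_const[leaf]
        for feat, coeff in zip(self.leaf_features[leaf], self.leaf_coeff[leaf]):
            out += coeff * row[feat]
        return out


single = TinyLinearTree(num_leaves=1, leaf_const=[1.0],
                        leaf_features=[[0]], leaf_coeff=[[2.0]])
two = TinyLinearTree(num_leaves=2, split_feature=[0], threshold=[0.5],
                     left_child=[~0], right_child=[~1],
                     leaf_const=[0.0, 10.0],
                     leaf_features=[[0], [0]], leaf_coeff=[[1.0], [0.0]])
```

Calling `single.get_leaf(...)` directly raises `IndexError` in this sketch, which is the Python analogue of reading the first element of an empty `split_feature_` array; `single.predict(...)` avoids that path.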
    • [ci] prefer older binary to new source for R packages on Mac builds (fixes #4008) (#4010) · b1d382ee
      James Lamb authored
      * [ci] prefer older binary to new source for R packages
      
      * back to binary
      
      * preserve choice on Linux
  11. 20 Feb, 2021 1 commit
  12. 19 Feb, 2021 3 commits
    • Use high precision conversion from double to string in Tree::ToString() for... · 7f91dc66
      mjmckp authored
      
      Use high precision conversion from double to string in Tree::ToString() for new linear tree members (#3938)
      
      * Fix index out-of-range exception generated by BaggingHelper on small datasets.
      
      Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.
      
      * Update goss.hpp
      
      * Update goss.hpp
      
      * Add API method LGBM_BoosterPredictForMats, which runs prediction on a data set given as an array of pointers to rows (as opposed to the existing method LGBM_BoosterPredictForMat, which requires data given as a contiguous array)
      
      * Fix incorrect upstream merge
      
      * Add link to LightGBM.NET
      
      * Fix indenting to 2 spaces
      
      * Dummy edit to trigger CI
      
      * Dummy edit to trigger CI
      
      * remove duplicate functions from merge
      
      * In the Tree::ToString() method, print double values for linear tree models with high precision, so that the tree can be reproduced accurately elsewhere (LightGBM.NET in particular)
      
      * Use the more precise StringToArray instead of StringToArrayFast when parsing double-valued arrays for linear trees, so that models round-trip correctly via string or file.
      Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
      Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
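Why precision matters for round-tripping can be shown with a short Python sketch. It does not use LightGBM's `Tree::ToString()` or `StringToArray`; the helpers `to_text` and `from_text` are hypothetical, and the digit counts (6 and 17 significant digits) are assumptions chosen for illustration. Seventeen significant digits are enough to recover any IEEE 754 double exactly, while a short format generally is not.

```python
def to_text(values, digits):
    """Serialize floats as space-separated text with the given significant digits."""
    return " ".join(format(v, f".{digits}g") for v in values)


def from_text(text):
    """Parse space-separated floats back into a list."""
    return [float(token) for token in text.split()]


# Illustrative coefficients, not taken from any real model.
coeffs = [0.1 + 0.2, 1.0 / 3.0, 2.5]

low_precision = from_text(to_text(coeffs, 6))     # lossy for most values
high_precision = from_text(to_text(coeffs, 17))   # exact round trip

print(high_precision == coeffs)
print(low_precision == coeffs)
```

In this sketch only the 17-digit text reproduces the original coefficients bit for bit, which is the property a model file needs if another implementation is to rebuild the same linear leaf models.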
    • [docs] Change some 'parallel learning' references to 'distributed learning' (#4000) · 7880b79f
      James Lamb authored
      * [docs] Change some 'parallel learning' references to 'distributed learning'
      
      * found a few more
      
      * one more reference
  13. 18 Feb, 2021 2 commits
  14. 17 Feb, 2021 2 commits
    • Fix for CreatePredictor function and VS2017 Debug build (#3937) · 5321fef6
      mjmckp authored
      
      
      * Fix index out-of-range exception generated by BaggingHelper on small datasets.
      
      Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.
      
      * Update goss.hpp
      
      * Update goss.hpp
      
      * Add API method LGBM_BoosterPredictForMats, which runs prediction on a data set given as an array of pointers to rows (as opposed to the existing method LGBM_BoosterPredictForMat, which requires data given as a contiguous array)
      
      * Fix incorrect upstream merge
      
      * Add link to LightGBM.NET
      
      * Fix indenting to 2 spaces
      
      * Dummy edit to trigger CI
      
      * Dummy edit to trigger CI
      
      * remove duplicate functions from merge
      
      * Fix for CreatePredictor function: in a VS2017 Debug build, the previous version would return an uninitialised prediction function that threw access violation exceptions when invoked.
      Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
      Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
    • Optimize array-from-ctypes in basic.py (#3927) · de8c6105
      Alex Ford authored
      Approximately 80% of runtime when loading "low column count, high row
      count" DataFrames into Datasets is consumed in `np.fromiter`, called
      as part of the `Dataset.get_field` method.
      
      This is a particularly pernicious hotspot: unlike other ctypes-based
      methods, it is a hot loop over a Python iterator and causes
      significant GIL contention in multi-threaded applications.
      
      Replace `np.fromiter` with a direct call to `np.ctypeslib.as_array`,
      which allows a single-shot `copy` of the underlying array.
      
      This reduces the load time of a ~35 million row categorical dataframe
      with 1 column from ~5 seconds to ~1 second, and allows multi-threaded
      execution.
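The two approaches named in this commit message can be compared in a minimal sketch. The buffer below is a stand-in created with `ctypes` for illustration (in LightGBM the pointer comes from the C API), and the size `n` is arbitrary. Both calls produce the same array, but `np.fromiter` pulls one element at a time through Python, while `np.ctypeslib.as_array` wraps the memory as a view that is then copied in one shot.

```python
import ctypes

import numpy as np

n = 1000  # illustrative size only

# Stand-in for a buffer owned by native code.
buf = (ctypes.c_float * n)(*(float(i) for i in range(n)))
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_float))

# Element-by-element: each value is fetched through the Python iterator protocol.
slow = np.fromiter(ptr, dtype=np.float32, count=n)

# Single shot: view the memory as an ndarray, then copy it so the result
# no longer depends on the lifetime of the native buffer.
fast = np.ctypeslib.as_array(ptr, shape=(n,)).copy()

print(np.array_equal(slow, fast))
```

The `.copy()` matters: without it the array is only a view, and later writes to (or freeing of) the underlying buffer would be visible through it.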
  15. 16 Feb, 2021 8 commits