@@ -38,9 +38,7 @@ How to Achieve Good Speedup on GPU
Also make sure your system is idle (especially when using a shared computer) to get accurate performance measurements.
#. GPU works best on large-scale, dense datasets. If the dataset is too small, training on GPU is inefficient because the data transfer overhead can be significant.
For datasets with a mixture of sparse and dense features, you can control the ``sparse_threshold`` parameter to make sure there are enough dense features to process on the GPU.
If you have categorical features, use the ``categorical_column`` option and feed them to LightGBM directly; do not convert them into one-hot variables.
Make sure to check the run log and verify the reported numbers of sparse and dense features.
#. To get a good speedup on GPU, it is suggested to use a smaller number of bins.
Setting ``max_bin=63`` is recommended, as it usually does not noticeably affect training accuracy on large datasets, while GPU training can be significantly faster than with the default bin size of 255.
A configuration sketch combining the settings above follows this list.
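As a rough illustration of the advice in this list, the relevant settings can be combined in a single configuration file. The following is a minimal sketch rather than a tuned recommendation; the ``sparse_threshold`` value and the categorical column indices are illustrative:

::

    # minimal sketch of a GPU-oriented configuration; values are illustrative
    device = gpu
    # a smaller bin count usually gives a larger GPU speedup with little accuracy loss
    max_bin = 63
    # illustrative threshold controlling how many features are treated as dense
    sparse_threshold = 0.8
    # hypothetical column indices; pass categorical features directly, without one-hot encoding
    categorical_column = 0,1,2

Checking the run log for such a configuration shows how many features were actually treated as dense and processed on the GPU.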
...
@@ -119,14 +117,13 @@ The following shows the training configuration we used:
min_data_in_leaf = 1
min_sum_hessian_in_leaf = 100
ndcg_eval_at = 1,3,5,10
-sparse_threshold=1.0
device = gpu
gpu_platform_id = 0
gpu_device_id = 0
num_thread = 28
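Assuming the configuration above is saved to a file, a run can be launched from the LightGBM CLI as sketched below. The file name is hypothetical; parameters passed on the command line take priority over those in the config file, which is one way to vary the bin count between runs:

::

    # hypothetical file name; command-line parameters override the config file
    ./lightgbm config=gpu_benchmark.conf max_bin=15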
We use the configuration shown above, except that for the Bosch dataset we use a smaller ``learning_rate=0.015`` and set ``min_sum_hessian_in_leaf=5``.
-For all GPU training we set ``sparse_threshold=1``, and vary the max number of bins (255, 63 and 15).
+For all GPU training we vary the max number of bins (255, 63 and 15).
The GPU implementation is from commit `0bb4a82`_ of LightGBM, when GPU support had just been merged in.
The following table lists the test-set accuracy that the CPU and GPU learners achieve after 500 iterations.