[docs] fixed some typos and grammatical errors (#1738)

Commit ac6951d3 authored Oct 10, 2018 by Alex; committed by Guolin Ke, Oct 10, 2018
7 changed files
--- a/docs/FAQ.rst
+++ b/docs/FAQ.rst
@@ -56,7 +56,7 @@ LightGBM

 --------------

--  **Question 2**: On datasets with million of features, training does not start (or starts after a very long time).
+-  **Question 2**: On datasets with millions of features, training does not start (or starts after a very long time).

 -  **Solution 2**: Use a smaller value for ``bin_construct_sample_cnt`` and a larger value for ``min_data``.


--- a/docs/Features.rst
+++ b/docs/Features.rst
@@ -93,7 +93,7 @@ Feature parallel aims to parallelize the "Find Best Split" in the decision tree.

 4. Worker with best split to perform split, then send the split result of data to other workers.

-5. Other workers split data according received data.
+5. Other workers split data according to received data.

 The shortcomings of traditional feature parallel:


--- a/docs/GPU-Windows.rst
+++ b/docs/GPU-Windows.rst
@@ -75,7 +75,7 @@ OpenCL SDK Installation
 -----------------------

 Installing the appropriate OpenCL SDK requires you to download the correct vendor source SDK.
-You need to know on what you are going to use LightGBM!:
+You need to know what you are going to use LightGBM!:

 -  For running on Intel, get `Intel SDK for OpenCL`_ (NOT RECOMMENDED)


--- a/docs/Parameters.rst
+++ b/docs/Parameters.rst
@@ -485,7 +485,7 @@ IO Parameters

 -  ``sparse_threshold`` :raw-html:`<a id="sparse_threshold" title="Permalink to this parameter" href="#sparse_threshold">&#x1F517;&#xFE0E;</a>`, default = ``0.8``, type = double, constraints: ``0.0 < sparse_threshold <= 1.0``

-   -  the threshold of zero elements precentage for treating a feature as a sparse one
+   -  the threshold of zero elements percentage for treating a feature as a sparse one

 -  ``use_missing`` :raw-html:`<a id="use_missing" title="Permalink to this parameter" href="#use_missing">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool

@@ -493,7 +493,7 @@ IO Parameters

 -  ``zero_as_missing`` :raw-html:`<a id="zero_as_missing" title="Permalink to this parameter" href="#zero_as_missing">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool

-   -  set this to ``true`` to treat all zero as missing values (including the unshown values in libsvm/sparse matrics)
+   -  set this to ``true`` to treat all zero as missing values (including the unshown values in libsvm/sparse matrices)

    -  set this to ``false`` to use ``na`` for representing missing values

@@ -573,7 +573,7 @@ IO Parameters

    -  **Note**: all values should be less than ``Int32.MaxValue`` (2147483647)

-   -  **Note**: using large values could be memory consuming. Tree decision rule works best when categorical features are presented by consecutive integers started from zero
+   -  **Note**: using large values could be memory consuming. Tree decision rule works best when categorical features are presented by consecutive integers starting from zero

    -  **Note**: all negative values will be treated as **missing values**

@@ -656,7 +656,7 @@ Objective Parameters

    -  used only in ``binary`` application

-   -  set this to ``true`` if training data are unbalance
+   -  set this to ``true`` if training data are unbalanced

    -  **Note**: this parameter cannot be used at the same time with ``scale_pos_weight``, choose only **one** of them

@@ -872,7 +872,7 @@ It means the initial score of the first data row is ``0.5``, second is ``-0.1``,
 The initial score file corresponds with data file line by line, and has per score per line.

 And if the name of data file is ``train.txt``, the initial score file should be named as ``train.txt.init`` and in the same folder as the data file.
-In this case LightGBM will auto load initial score file if it exists.
+In this case, LightGBM will auto load initial score file if it exists.

 Otherwise, you should specify the path to the custom named file with initial scores by the ``initscore_filename`` `parameter <#initscore_filename>`__.

@@ -892,7 +892,7 @@ It means the weight of the first data row is ``1.0``, second is ``0.5``, and so
 The weight file corresponds with data file line by line, and has per weight per line.

 And if the name of data file is ``train.txt``, the weight file should be named as ``train.txt.weight`` and placed in the same folder as the data file.
-In this case LightGBM will load the weight file automatically if it exists.
+In this case, LightGBM will load the weight file automatically if it exists.

 Also, you can include weight column in your data file. Please refer to the ``weight_column`` `parameter <#weight_column>`__ in above.

@@ -914,7 +914,7 @@ It means first ``27`` lines samples belong to one query and next ``18`` lines be
 **Note**: data should be ordered by the query.

 If the name of data file is ``train.txt``, the query file should be named as ``train.txt.query`` and placed in the same folder as the data file.
-In this case LightGBM will load the query file automatically if it exists.
+In this case, LightGBM will load the query file automatically if it exists.

 Also, you can include query/group id column in your data file. Please refer to the ``group_column`` `parameter <#group_column>`__ in above.


--- a/docs/Python-Intro.rst
+++ b/docs/Python-Intro.rst
@@ -112,11 +112,11 @@ or

 And you can use ``Dataset.set_init_score()`` to set initial score, and ``Dataset.set_group()`` to set group/query data for ranking tasks.

-**Memory efficent usage:**
+**Memory efficient usage:**

 The ``Dataset`` object in LightGBM is very memory-efficient, due to it only need to save discrete bins.
 However, Numpy/Array/Pandas object is memory cost.
-If you concern about your memory consumption, you can save memory according to following:
+If you concern about your memory consumption, you can save memory according to the following:

 1. Let ``free_raw_data=True`` (default is ``True``) when constructing the ``Dataset``

@@ -204,7 +204,7 @@ Note that if you specify more than one evaluation metric, all of them will be us
 Prediction
 ----------

-A model that has been trained or loaded can perform predictions on data sets:
+A model that has been trained or loaded can perform predictions on datasets:

 .. code:: python


--- a/docs/Quick-Start.rst
+++ b/docs/Quick-Start.rst
@@ -61,8 +61,8 @@ Run LightGBM

     "./lightgbm" config=your_config_file other_args ...

-Parameters can be set both in config file and command line, and the parameters in command line have higher priority than in config file.
-For example, the following command line will keep ``num_trees=10`` and ignore the same parameter in config file.
+Parameters can be set both in the config file and command line, and the parameters in command line have higher priority than in the config file.
+For example, the following command line will keep ``num_trees=10`` and ignore the same parameter in the config file.

 ::


--- a/include/LightGBM/config.h
+++ b/include/LightGBM/config.h
@@ -470,13 +470,13 @@ public:

   // check = >0.0
   // check = <=1.0
-  // desc = the threshold of zero elements precentage for treating a feature as a sparse one
+  // desc = the threshold of zero elements percentage for treating a feature as a sparse one
   double sparse_threshold = 0.8;

   // desc = set this to ``false`` to disable the special handle of missing value
   bool use_missing = true;

-  // desc = set this to ``true`` to treat all zero as missing values (including the unshown values in libsvm/sparse matrics)
+  // desc = set this to ``true`` to treat all zero as missing values (including the unshown values in libsvm/sparse matrices)
   // desc = set this to ``false`` to use ``na`` for representing missing values
   bool zero_as_missing = false;

@@ -539,7 +539,7 @@ public:
   // desc = **Note**: only supports categorical with ``int`` type
   // desc = **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``
   // desc = **Note**: all values should be less than ``Int32.MaxValue`` (2147483647)
-  // desc = **Note**: using large values could be memory consuming. Tree decision rule works best when categorical features are presented by consecutive integers started from zero
+  // desc = **Note**: using large values could be memory consuming. Tree decision rule works best when categorical features are presented by consecutive integers starting from zero
   // desc = **Note**: all negative values will be treated as **missing values**
   std::string categorical_feature = "";

@@ -601,7 +601,7 @@ public:

   // alias = unbalance, unbalanced_sets
   // desc = used only in ``binary`` application
-  // desc = set this to ``true`` if training data are unbalance
+  // desc = set this to ``true`` if training data are unbalanced
   // desc = **Note**: this parameter cannot be used at the same time with ``scale_pos_weight``, choose only **one** of them
   bool is_unbalance = false;
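
Note on the side-file convention touched in the docs/Parameters.rst hunks: the context lines say that for a data file named ``train.txt``, the initial score, weight, and query files are named ``train.txt.init``, ``train.txt.weight``, and ``train.txt.query``, sit in the same folder, and hold one value per line. A minimal sketch of that layout follows. The helper names ``side_file_paths`` and ``write_lines`` are hypothetical and not part of LightGBM, and the sketch only writes the files; it does not call LightGBM.

```python
import os
import tempfile


def side_file_paths(data_path):
    """Return the side-file names the docs describe for a given data file.

    Hypothetical helper for illustration; LightGBM does not ship this function.
    """
    return {
        "init_score": data_path + ".init",
        "weight": data_path + ".weight",
        "query": data_path + ".query",
    }


def write_lines(path, values):
    """Write one value per line, matching the per-line layout in the docs."""
    with open(path, "w") as f:
        for value in values:
            f.write(f"{value}\n")


# Use a throwaway folder so the side files land next to the data file path.
workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "train.txt")
paths = side_file_paths(data_path)

# Values are the examples quoted in the hunk context lines. They are
# truncated for illustration: real init score and weight files need one
# line per data row, and the query counts must sum to the number of rows.
write_lines(paths["init_score"], [0.5, -0.1])
write_lines(paths["weight"], [1.0, 0.5])
write_lines(paths["query"], [27, 18])  # 27 rows in the first query, 18 in the next
```

If a file uses a custom name instead, the docs point to the ``initscore_filename`` parameter for initial scores and to ``weight_column`` / ``group_column`` for values stored inside the data file itself.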