Commit f2afb2cd authored by James Lamb, committed by Nikita Titov

[R-package][docs] made roxygen2 tags explicit and cleaned up documentation (#2688)



* [R-package] made roxygen2 tags explicit and cleaned up documentation

* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* Update R-package/man/lightgbm.Rd
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>

* [R-package] moved @name to the top of roxygen blocks and removed some inaccurate information in documentation on parameters
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
parent c7ae833e
@@ -11,15 +11,13 @@ data(agaricus.test)
 }
 \description{
 This data set is originally from the Mushroom data set,
-UCI Machine Learning Repository.
-}
-\details{
-This data set includes the following fields:
+UCI Machine Learning Repository.
+This data set includes the following fields:
 
-\itemize{
-\item \code{label} the label for each record
-\item \code{data} a sparse Matrix of \code{dgCMatrix} class, with 126 columns.
-}
+\itemize{
+\item{\code{label}: the label for each record}
+\item{\code{data}: a sparse Matrix of \code{dgCMatrix} class, with 126 columns.}
+}
 }
 \references{
 https://archive.ics.uci.edu/ml/datasets/Mushroom
@@ -11,15 +11,13 @@ data(agaricus.train)
 }
 \description{
 This data set is originally from the Mushroom data set,
-UCI Machine Learning Repository.
-}
-\details{
-This data set includes the following fields:
+UCI Machine Learning Repository.
+This data set includes the following fields:
 
-\itemize{
-\item \code{label} the label for each record
-\item \code{data} a sparse Matrix of \code{dgCMatrix} class, with 126 columns.
-}
+\itemize{
+\item{\code{label}: the label for each record}
+\item{\code{data}: a sparse Matrix of \code{dgCMatrix} class, with 126 columns.}
+}
 }
 \references{
 https://archive.ics.uci.edu/ml/datasets/Mushroom
@@ -10,11 +10,10 @@ data(bank)
 }
 \description{
 This data set is originally from the Bank Marketing data set,
-UCI Machine Learning Repository.
-}
-\details{
-It contains only the following: bank.csv with 10% of the examples and 17 inputs,
-randomly selected from 3 (older version of this dataset with less inputs).
+UCI Machine Learning Repository.
+
+It contains only the following: bank.csv with 10% of the examples and 17 inputs,
+randomly selected from 3 (older version of this dataset with less inputs).
 }
 \references{
 http://archive.ics.uci.edu/ml/datasets/Bank+Marketing
@@ -17,7 +17,7 @@ and the second one is column names}
 }
 \description{
 Only column names are supported for \code{lgb.Dataset}, thus setting of
-row names would have no effect and returned row names would be NULL.
+row names would have no effect and returned row names would be NULL.
 }
 \details{
 Generic \code{dimnames} methods are used by \code{colnames}.
@@ -20,7 +20,7 @@ getinfo(dataset, ...)
 info data
 }
 \description{
-Get information of an \code{lgb.Dataset} object
+Get one attribute of a \code{lgb.Dataset}
 }
 \details{
 The \code{name} field can be one of the following:
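The reworded description above ("Get one attribute of a \code{lgb.Dataset}") can be exercised as follows. This is a sketch, not part of the commit, and assumes the lightgbm R package (with its bundled agaricus data) is installed:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
train <- agaricus.train

# build the Dataset and construct it so its attributes are materialized
dtrain <- lgb.Dataset(train$data, label = train$label)
lgb.Dataset.construct(dtrain)

# getinfo() fetches exactly one attribute by name, e.g. the label vector
labels <- getinfo(dtrain, "label")
```

Per the details section, `name` here could also be `"weight"`, `"init_score"`, or `"group"` where those attributes have been set.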
@@ -37,7 +37,7 @@ constructed dataset
 }
 \description{
 Construct \code{lgb.Dataset} object from dense matrix, sparse matrix
-or local file (that was created previously by saving an \code{lgb.Dataset}).
+or local file (that was created previously by saving an \code{lgb.Dataset}).
 }
 \examples{
 library(lightgbm)
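The three inputs named in the description (dense matrix, sparse matrix, or a previously saved file) might look like this in practice; a minimal sketch assuming the lightgbm and Matrix packages are installed:

```r
library(lightgbm)
library(Matrix)

set.seed(42L)
X <- matrix(rnorm(100L * 5L), nrow = 100L)
y <- sample(c(0L, 1L), 100L, replace = TRUE)

# dense matrix input
dtrain_dense <- lgb.Dataset(X, label = y)

# sparse (dgCMatrix) input
dtrain_sparse <- lgb.Dataset(Matrix(X, sparse = TRUE), label = y)

# a Dataset previously written with lgb.Dataset.save() can be read back
# by passing the file path instead of a matrix
bin_file <- tempfile(fileext = ".bin")
lgb.Dataset.save(dtrain_dense, bin_file)
dtrain_file <- lgb.Dataset(bin_file)
```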
@@ -16,7 +16,7 @@ passed dataset
 }
 \description{
 Please note that \code{init_score} is not saved in binary file.
-If you need it, please set it again after loading Dataset.
+If you need it, please set it again after loading Dataset.
 }
 \examples{
 library(lightgbm)
@@ -24,5 +24,4 @@ data(agaricus.train, package = "lightgbm")
 train <- agaricus.train
 dtrain <- lgb.Dataset(train$data, label = train$label)
 lgb.Dataset.save(dtrain, "data.bin")
 }
-
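The caveat in this help page (that \code{init_score} is not stored in the binary file) suggests a pattern like the following. A sketch under the assumption that lightgbm is installed and that `setinfo()`/`getinfo()` behave as documented in this package version:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
train <- agaricus.train

dtrain <- lgb.Dataset(train$data, label = train$label)
lgb.Dataset.construct(dtrain)
setinfo(dtrain, "init_score", rep(0.0, length(train$label)))

bin_file <- tempfile(fileext = ".bin")
lgb.Dataset.save(dtrain, bin_file)

# the reloaded Dataset has no init_score, so it must be set again
dtrain2 <- lgb.Dataset(bin_file)
lgb.Dataset.construct(dtrain2)
setinfo(dtrain2, "init_score", rep(0.0, length(train$label)))
```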
@@ -9,13 +9,16 @@ lgb.Dataset.set.categorical(dataset, categorical_feature)
 \arguments{
 \item{dataset}{object of class \code{lgb.Dataset}}
 
-\item{categorical_feature}{categorical features}
+\item{categorical_feature}{categorical features. This can either be a character vector of feature
+names or an integer vector with the indices of the features (e.g.
+\code{c(1L, 10L)} to say "the first and tenth columns").}
 }
 \value{
 passed dataset
 }
 \description{
-Set categorical feature of \code{lgb.Dataset}
+Set the categorical features of an \code{lgb.Dataset} object. Use this function
+to tell LightGBM which features should be treated as categorical.
 }
 \examples{
 library(lightgbm)
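The expanded parameter description above can be demonstrated directly; a sketch assuming the lightgbm package and its bundled agaricus data:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
train <- agaricus.train
dtrain <- lgb.Dataset(train$data, label = train$label)

# by position: mark the first and tenth columns as categorical
dtrain <- lgb.Dataset.set.categorical(dtrain, c(1L, 10L))

# equivalently, a character vector of feature names taken from
# colnames(train$data) could be passed instead of integer indices
```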
@@ -66,9 +66,9 @@ the \code{nfold} and \code{stratified} parameters are ignored.}
 
 \item{colnames}{feature names, if not null, will use this to overwrite the names in dataset}
 
-\item{categorical_feature}{list of str or int
-type int represents index,
-type str represents feature names}
+\item{categorical_feature}{categorical features. This can either be a character vector of feature
+names or an integer vector with the indices of the features (e.g.
+\code{c(1L, 10L)} to say "the first and tenth columns").}
 
 \item{early_stopping_rounds}{int. Activates early stopping. Requires at least one validation data
 and one metric. If there's more than one, will check all of them
@@ -82,11 +82,11 @@ into a predictor model which frees up memory and the original datasets}
 
 \item{...}{other parameters, see Parameters.rst for more information. A few key parameters:
 \itemize{
-\item{boosting}{Boosting type. \code{"gbdt"} or \code{"dart"}}
-\item{num_leaves}{number of leaves in one tree. defaults to 127}
-\item{max_depth}{Limit the max depth for tree model. This is used to deal with
+\item{\code{boosting}: Boosting type. \code{"gbdt"}, \code{"rf"}, \code{"dart"} or \code{"goss"}.}
+\item{\code{num_leaves}: Maximum number of leaves in one tree.}
+\item{\code{max_depth}: Limit the max depth for tree model. This is used to deal with
 overfit when #data is small. Tree still grow by leaf-wise.}
-\item{num_threads}{Number of threads for LightGBM. For the best speed, set this to
+\item{\code{num_threads}: Number of threads for LightGBM. For the best speed, set this to
 the number of real CPU cores, not the number of threads (most
 CPU using hyper-threading to generate 2 threads per CPU core).}
 }}
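The key parameters documented in this `lgb.cv` hunk could be passed like so; a sketch assuming the lightgbm package, with small values chosen only to keep the run fast:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)

params <- list(
    objective = "binary"
    , boosting = "gbdt"   # one of "gbdt", "rf", "dart", "goss"
    , num_leaves = 31L    # maximum number of leaves in one tree
    , max_depth = 6L      # cap tree depth to guard against overfitting small data
    , num_threads = 2L    # ideally the number of physical CPU cores
)

cv_result <- lgb.cv(
    params = params
    , data = dtrain
    , nrounds = 5L
    , nfold = 3L
    , verbose = -1L
)
```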
@@ -14,10 +14,10 @@ lgb.importance(model, percentage = TRUE)
 \value{
 For a tree model, a \code{data.table} with the following columns:
 \itemize{
-\item \code{Feature} Feature names in the model.
-\item \code{Gain} The total gain of this feature's splits.
-\item \code{Cover} The number of observation related to this feature.
-\item \code{Frequency} The number of times a feature splited in trees.
+\item{\code{Feature}: Feature names in the model.}
+\item{\code{Gain}: The total gain of this feature's splits.}
+\item{\code{Cover}: The number of observation related to this feature.}
+\item{\code{Frequency}: The number of times a feature splited in trees.}
 }
 }
 \description{
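The return value documented above (a `data.table` with Feature, Gain, Cover and Frequency columns) can be produced as follows; a sketch assuming the lightgbm package is installed:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)

model <- lgb.train(
    params = list(objective = "binary")
    , data = dtrain
    , nrounds = 5L
    , verbose = -1L
)

# one row per feature used by the model
imp <- lgb.importance(model, percentage = TRUE)
```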
@@ -19,8 +19,8 @@ lgb.interprete(model, data, idxset, num_iteration = NULL)
 For regression, binary classification and lambdarank model, a \code{list} of \code{data.table}
 with the following columns:
 \itemize{
-\item \code{Feature} Feature names in the model.
-\item \code{Contribution} The total contribution of this feature's splits.
+\item{\code{Feature}: Feature names in the model.}
+\item{\code{Contribution}: The total contribution of this feature's splits.}
 }
 For multiclass classification, a \code{list} of \code{data.table} with the Feature column and
 Contribution columns to each class.
@@ -15,9 +15,8 @@ lgb.load(filename = NULL, model_str = NULL)
 lgb.Booster
 }
 \description{
-Load LightGBM model from saved model file or string
-Load LightGBM takes in either a file path or model string
-If both are provided, Load will default to loading from file
+Load LightGBM takes in either a file path or model string.
+If both are provided, Load will default to loading from file
 }
 \examples{
 library(lightgbm)
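Both loading paths named in the description (file path or model string) can be shown with a round-trip; a sketch assuming the lightgbm package is installed:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)
model <- lgb.train(
    params = list(objective = "binary")
    , data = dtrain
    , nrounds = 3L
    , verbose = -1L
)

# round-trip through a file...
model_file <- tempfile(fileext = ".txt")
lgb.save(model, model_file)
from_file <- lgb.load(filename = model_file)

# ...or through a model string; if both arguments were supplied,
# the file path would win
model_str <- paste0(readLines(model_file), collapse = "\n")
from_string <- lgb.load(model_str = model_str)
```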
@@ -18,21 +18,21 @@ A \code{data.table} with detailed information about model trees' nodes and leafs
 
 The columns of the \code{data.table} are:
 \itemize{
-\item \code{tree_index}: ID of a tree in a model (integer)
-\item \code{split_index}: ID of a node in a tree (integer)
-\item \code{split_feature}: for a node, it's a feature name (character);
-for a leaf, it simply labels it as \code{"NA"}
-\item \code{node_parent}: ID of the parent node for current node (integer)
-\item \code{leaf_index}: ID of a leaf in a tree (integer)
-\item \code{leaf_parent}: ID of the parent node for current leaf (integer)
-\item \code{split_gain}: Split gain of a node
-\item \code{threshold}: Splitting threshold value of a node
-\item \code{decision_type}: Decision type of a node
-\item \code{default_left}: Determine how to handle NA value, TRUE -> Left, FALSE -> Right
-\item \code{internal_value}: Node value
-\item \code{internal_count}: The number of observation collected by a node
-\item \code{leaf_value}: Leaf value
-\item \code{leaf_count}: The number of observation collected by a leaf
+\item{\code{tree_index}: ID of a tree in a model (integer)}
+\item{\code{split_index}: ID of a node in a tree (integer)}
+\item{\code{split_feature}: for a node, it's a feature name (character);
+for a leaf, it simply labels it as \code{"NA"}}
+\item{\code{node_parent}: ID of the parent node for current node (integer)}
+\item{\code{leaf_index}: ID of a leaf in a tree (integer)}
+\item{\code{leaf_parent}: ID of the parent node for current leaf (integer)}
+\item{\code{split_gain}: Split gain of a node}
+\item{\code{threshold}: Splitting threshold value of a node}
+\item{\code{decision_type}: Decision type of a node}
+\item{\code{default_left}: Determine how to handle NA value, TRUE -> Left, FALSE -> Right}
+\item{\code{internal_value}: Node value}
+\item{\code{internal_count}: The number of observation collected by a node}
+\item{\code{leaf_value}: Leaf value}
+\item{\code{leaf_count}: The number of observation collected by a leaf}
 }
 }
 \description{
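The per-node table documented above comes from `lgb.model.dt.tree`; a sketch assuming the lightgbm package is installed:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)
model <- lgb.train(
    params = list(objective = "binary")
    , data = dtrain
    , nrounds = 2L
    , verbose = -1L
)

# one row per node or leaf: split rows populate split_feature/split_gain,
# leaf rows populate leaf_value/leaf_count
tree_dt <- lgb.model.dt.tree(model)
```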
@@ -15,8 +15,8 @@ The cleaned dataset. It must be converted to a matrix format (\code{as.matrix})
 }
 \description{
 Attempts to prepare a clean dataset to prepare to put in a \code{lgb.Dataset}.
-Factors and characters are converted to numeric without integers. Please use
-\code{lgb.prepare_rules} if you want to apply this transformation to other datasets.
+Factors and characters are converted to numeric without integers. Please use
+\code{\link{lgb.prepare_rules}} if you want to apply this transformation to other datasets.
 }
 \examples{
 library(lightgbm)
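The conversion this help page describes (factor and character columns become numeric, then the result is matrix-converted) might look like this; a sketch assuming the lightgbm package version of this commit, where `lgb.prepare` is still exported:

```r
library(lightgbm)

df <- data.frame(
    num = c(1.5, 2.5, 3.5, 4.5)
    , fac = factor(c("a", "b", "a", "c"))
    , chr = c("low", "high", "low", "high")
    , stringsAsFactors = FALSE
)

clean <- lgb.prepare(df)   # factor/character columns become numeric
mat <- as.matrix(clean)    # as the \value section requires
```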
@@ -15,11 +15,11 @@ The cleaned dataset. It must be converted to a matrix format (\code{as.matrix})
 }
 \description{
 Attempts to prepare a clean dataset to prepare to put in a \code{lgb.Dataset}.
-Factors and characters are converted to numeric (specifically: integer).
-Please use \code{lgb.prepare_rules2} if you want to apply this transformation to other datasets.
-This is useful if you have a specific need for integer dataset instead of numeric dataset.
-Note that there are programs which do not support integer-only input. Consider this as a half
-memory technique which is dangerous, especially for LightGBM.
+Factors and characters are converted to numeric (specifically: integer).
+Please use \code{\link{lgb.prepare_rules2}} if you want to apply this transformation to
+other datasets. This is useful if you have a specific need for integer dataset instead
+of numeric dataset. Note that there are programs which do not support integer-only
+input. Consider this as a half memory technique which is dangerous, especially for LightGBM.
 }
 \examples{
 library(lightgbm)
@@ -18,8 +18,8 @@ A list with the cleaned dataset (\code{data}) and the rules (\code{rules}).
 }
 \description{
 Attempts to prepare a clean dataset to prepare to put in a \code{lgb.Dataset}.
-Factors and characters are converted to numeric. In addition, keeps rules created
-so you can convert other datasets using this converter.
+Factors and characters are converted to numeric. In addition, keeps rules created
+so you can convert other datasets using this converter.
 }
 \examples{
 library(lightgbm)
@@ -18,11 +18,11 @@ A list with the cleaned dataset (\code{data}) and the rules (\code{rules}).
 }
 \description{
 Attempts to prepare a clean dataset to prepare to put in a \code{lgb.Dataset}.
-Factors and characters are converted to numeric (specifically: integer).
-In addition, keeps rules created so you can convert other datasets using this converter.
-This is useful if you have a specific need for integer dataset instead of numeric dataset.
-Note that there are programs which do not support integer-only input.
-Consider this as a half memory technique which is dangerous, especially for LightGBM.
+Factors and characters are converted to numeric (specifically: integer).
+In addition, keeps rules created so you can convert other datasets using this converter.
+This is useful if you have a specific need for integer dataset instead of numeric dataset.
+Note that there are programs which do not support integer-only input.
+Consider this as a half memory technique which is dangerous, especially for LightGBM.
 }
 \examples{
 library(lightgbm)
@@ -39,5 +39,4 @@ model <- lgb.train(
 , early_stopping_rounds = 5L
 )
 lgb.save(model, "model.txt")
 }
-
@@ -65,11 +65,11 @@ original datasets}
 
 \item{...}{other parameters, see Parameters.rst for more information. A few key parameters:
 \itemize{
-\item{boosting}{Boosting type. \code{"gbdt"} or \code{"dart"}}
-\item{num_leaves}{number of leaves in one tree. defaults to 127}
-\item{max_depth}{Limit the max depth for tree model. This is used to deal with
+\item{\code{boosting}: Boosting type. \code{"gbdt"}, \code{"rf"}, \code{"dart"} or \code{"goss"}.}
+\item{\code{num_leaves}: Maximum number of leaves in one tree.}
+\item{\code{max_depth}: Limit the max depth for tree model. This is used to deal with
 overfit when #data is small. Tree still grow by leaf-wise.}
-\item{num_threads}{Number of threads for LightGBM. For the best speed, set this to
+\item{\code{num_threads}: Number of threads for LightGBM. For the best speed, set this to
 the number of real CPU cores, not the number of threads (most
 CPU using hyper-threading to generate 2 threads per CPU core).}
 }}
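The same key parameters appear in `lgb.train`'s `...` docs; a sketch assuming the lightgbm package, here using the non-default `"dart"` boosting named in the updated list:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)

model <- lgb.train(
    params = list(
        objective = "binary"
        , boosting = "dart"   # instead of the default "gbdt"
        , num_leaves = 63L
        , max_depth = -1L     # a negative value means no depth limit
        , num_threads = 2L
    )
    , data = dtrain
    , nrounds = 5L
    , verbose = -1L
)

preds <- predict(model, agaricus.train$data)
```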
@@ -21,9 +21,9 @@ environment. Defaults to \code{FALSE} which means to not remove them.}
 NULL invisibly.
 }
 \description{
-Attempts to unload LightGBM packages so you can remove objects cleanly without having to restart R.
-This is useful for instance if an object becomes stuck for no apparent reason and you do not want
-to restart R to fix the lost object.
+Attempts to unload LightGBM packages so you can remove objects cleanly without
+having to restart R. This is useful for instance if an object becomes stuck for no
+apparent reason and you do not want to restart R to fix the lost object.
 }
 \examples{
 library(lightgbm)
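A call matching this help page might look like the following; a sketch assuming the lightgbm package and the `lgb.unloader(restore, wipe, envir)` signature of this package version:

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)

# unload the package without wiping lightgbm objects from the global
# environment; per the \value section above, this returns NULL invisibly
res <- lgb.unloader(restore = TRUE, wipe = FALSE, envir = .GlobalEnv)
```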