Commit 6bb61ea3 authored by James Lamb's avatar James Lamb Committed by Guolin Ke

Fixed misc CRAN issues (#1260)

* Fixed misc CRAN issues

* Added additional details to R-package DESCRIPTION
parent ebb07f01
^build_package.R$
@@ -5,13 +5,16 @@ Version: 2.1.0
 Date: 2018-01-25
 Author: Guolin Ke <guolin.ke@microsoft.com>
 Maintainer: Guolin Ke <guolin.ke@microsoft.com>
-Description: LightGBM is a gradient boosting framework that uses tree based learning algorithms.
+Description: Tree based algorithms can be improved by introducing boosting frameworks. LightGBM is one such framework, and this package offers an R interface to work with it.
     It is designed to be distributed and efficient with the following advantages:
     1. Faster training speed and higher efficiency.
     2. Lower memory usage.
     3. Better accuracy.
     4. Parallel learning supported.
     5. Capable of handling large-scale data.
+    In recognition of these advantages, LightGBM has been widely used in many winning solutions of machine learning competitions.
+    Comparison experiments on public datasets suggest that LightGBM can outperform existing boosting frameworks on both efficiency and accuracy, with significantly lower memory consumption. In addition, parallel experiments suggest that in certain circumstances, LightGBM can achieve a linear speed-up in training time by using multiple machines.
 License: MIT + file LICENSE
 URL: https://github.com/Microsoft/LightGBM
 BugReports: https://github.com/Microsoft/LightGBM/issues
@@ -30,6 +33,7 @@ Depends:
     R (>= 3.0),
     R6 (>= 2.0)
 Imports:
+    graphics,
     methods,
     Matrix (>= 1.1-0),
     data.table (>= 1.9.6),
...
@@ -38,6 +38,9 @@ export(slice)
 import(methods)
 importFrom(R6,R6Class)
 importFrom(data.table,":=")
+importFrom(data.table,set)
+importFrom(graphics,barplot)
+importFrom(graphics,par)
 importFrom(magrittr,"%>%")
 importFrom(magrittr,"%T>%")
 useDynLib(lib_lightgbm)
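For context, `importFrom(...)` entries like these are normally not written by hand: roxygen2 generates them from `@importFrom` tags in the R source. A minimal sketch of how such a tag and a namespaced call fit together (the function name `my_importance_plot` is hypothetical, not part of the package):

```r
# Hypothetical roxygen block; running roxygen2::roxygenise() on a package
# containing it would emit importFrom(graphics,barplot) and
# importFrom(graphics,par) into NAMESPACE.
#' @importFrom graphics barplot par
#' @export
my_importance_plot <- function(heights) {
  # Namespaced call: works whether or not the graphics package is attached
  graphics::barplot(heights, horiz = TRUE, las = 1)
}
```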
@@ -619,7 +619,8 @@ Booster <- R6Class(
 #' @param header only used for prediction for text file. True if text file has header
 #' @param reshape whether to reshape the vector of predictions to a matrix form when there are several
 #'        prediction outputs per case.
+#' @param ... Additional named arguments passed to the \code{predict()} method of
+#'        the \code{lgb.Booster} object passed to \code{object}.
 #' @return
 #'     For regression or binary classification, it returns a vector of length \code{nrows(data)}.
 #'     For multiclass classification, either a \code{num_class * nrows(data)} vector or
...
@@ -16,10 +16,8 @@ CVBooster <- R6Class(
   )
 )
-#' Main CV logic for LightGBM
-#'
-#' Main CV logic for LightGBM
-#'
+#' @title Main CV logic for LightGBM
+#' @name lgb.cv
 #' @param params List of parameters
 #' @param data a \code{lgb.Dataset} object, used for CV
 #' @param nrounds number of CV rounds
...
@@ -3,6 +3,8 @@
 #' Parse a LightGBM model json dump into a \code{data.table} structure.
 #'
 #' @param model object of class \code{lgb.Booster}
+#' @param num_iteration number of iterations you want to predict with. NULL or
+#'     <= 0 means use best iteration
 #'
 #' @return
 #' A \code{data.table} with detailed information about model trees' nodes and leaves.
...
@@ -31,7 +31,7 @@
 #' tree_imp <- lgb.importance(model, percentage = TRUE)
 #' lgb.plot.importance(tree_imp, top_n = 10, measure = "Gain")
 #' }
-#'
+#' @importFrom graphics barplot par
 #' @export
 lgb.plot.importance <- function(tree_imp,
                                 top_n = 10,
@@ -54,22 +54,24 @@ lgb.plot.importance <- function(tree_imp,
   }
   # Refresh plot
-  op <- par(no.readonly = TRUE)
-  on.exit(par(op))
+  op <- graphics::par(no.readonly = TRUE)
+  on.exit(graphics::par(op))
   # Do some magic plotting
-  par(mar = op$mar %>% magrittr::inset(., 2, left_margin))
+  graphics::par(mar = op$mar %>% magrittr::inset(., 2, left_margin))
   # Do plot
   tree_imp[.N:1,
-           barplot(height = get(measure),
+           graphics::barplot(
+             height = get(measure),
              names.arg = Feature,
              horiz = TRUE,
              border = NA,
             main = "Feature Importance",
             xlab = measure,
             cex.names = cex,
-            las = 1)]
+            las = 1
+           )]
   # Return invisibly
   invisible(tree_imp)
...
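The `op <- par(no.readonly = TRUE)` / `on.exit(par(op))` idiom in the hunk above snapshots the user's graphics settings and restores them when the function exits, even on error, so the plot's margin changes do not leak into the caller's session. A standalone sketch of the pattern (the function and its arguments are illustrative, not package code):

```r
plot_with_wide_margin <- function(values, labels) {
  op <- graphics::par(no.readonly = TRUE)  # snapshot all settable graphics parameters
  on.exit(graphics::par(op))               # restore them on exit, even if plotting errors
  graphics::par(mar = c(5, 10, 4, 2))      # widen the left margin for long labels
  graphics::barplot(values, names.arg = labels, horiz = TRUE, las = 1)
  invisible(values)
}
```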
@@ -36,7 +36,7 @@
 #' tree_interpretation <- lgb.interprete(model, test$data, 1:5)
 #' lgb.plot.interpretation(tree_interpretation[[1]], top_n = 10)
 #' }
-#'
+#' @importFrom graphics barplot par
 #' @export
 lgb.plot.interpretation <- function(tree_interpretation_dt,
                                     top_n = 10,
@@ -48,11 +48,11 @@ lgb.plot.interpretation <- function(tree_interpretation_dt,
   num_class <- ncol(tree_interpretation_dt) - 1
   # Refresh plot
-  op <- par(no.readonly = TRUE)
-  on.exit(par(op))
+  op <- graphics::par(no.readonly = TRUE)
+  on.exit(graphics::par(op))
   # Do some magic plotting
-  par(mar = op$mar %>% magrittr::inset(., 1:3, c(3, left_margin, 2)))
+  graphics::par(mar = op$mar %>% magrittr::inset(., 1:3, c(3, left_margin, 2)))
   # Check for number of classes
   if (num_class == 1) {
@@ -70,7 +70,7 @@ lgb.plot.interpretation <- function(tree_interpretation_dt,
                       ncol = cols, nrow = ceiling(num_class / cols))
   # Shape output
-  par(mfcol = c(nrow(layout_mat), ncol(layout_mat)))
+  graphics::par(mfcol = c(nrow(layout_mat), ncol(layout_mat)))
   # Loop throughout all classes
   for (i in seq_len(num_class)) {
@@ -102,14 +102,16 @@ multiple.tree.plot.interpretation <- function(tree_interpretation,
   # Do plot
   tree_interpretation[.N:1,
-                      barplot(height = Contribution,
+                      graphics::barplot(
+                        height = Contribution,
                         names.arg = Feature,
                         horiz = TRUE,
                         col = ifelse(Contribution > 0, "firebrick", "steelblue"),
                         border = NA,
                         main = title,
                         cex.names = cex,
-                        las = 1)]
+                        las = 1
+                      )]
   # Return invisibly
   invisible(NULL)
...
-#' Main training logic for LightGBM
-#'
+#' @title Main training logic for LightGBM
+#' @name lgb.train
 #' @param params List of parameters
 #' @param data a \code{lgb.Dataset} object, used for training
 #' @param nrounds number of training rounds
...
@@ -2,7 +2,7 @@
 #'
 #' Attempts to unload LightGBM packages so you can remove objects cleanly without having to restart R. This is useful for instance if an object becomes stuck for no apparent reason and you do not want to restart R to fix the lost object.
 #'
-#' @param restart Whether to reload \code{LightGBM} immediately after detaching from R. Defaults to \code{TRUE} which means automatically reload \code{LightGBM} once unloading is performed.
+#' @param restore Whether to reload \code{LightGBM} immediately after detaching from R. Defaults to \code{TRUE} which means automatically reload \code{LightGBM} once unloading is performed.
 #' @param wipe Whether to wipe all \code{lgb.Dataset} and \code{lgb.Booster} from the global environment. Defaults to \code{FALSE} which means to not remove them.
 #' @param envir The environment to perform wiping on if \code{wipe == TRUE}. Defaults to \code{.GlobalEnv} which is the global environment.
 #'
...
@@ -122,5 +122,28 @@ NULL
 # Various imports
 #' @import methods
 #' @importFrom R6 R6Class
-#' @useDynLib lightgbm
+#' @useDynLib lib_lightgbm
 NULL
+
+# Suppress false positive warnings from R CMD CHECK about
+# "unrecognized global variable"
+globalVariables(c(
+    "."
+    , ".N"
+    , ".SD"
+    , "Contribution"
+    , "Cover"
+    , "Feature"
+    , "Frequency"
+    , "Gain"
+    , "internal_count"
+    , "internal_value"
+    , "leaf_index"
+    , "leaf_parent"
+    , "leaf_value"
+    , "node_parent"
+    , "split_feature"
+    , "split_gain"
+    , "split_index"
+    , "tree_index"
+))
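The `globalVariables()` call added above is needed because R CMD check performs static analysis: it cannot see that names like `Gain` or `Feature` are data.table columns resolved by non-standard evaluation at run time, so it reports them as undefined globals. Base R's `subset()` triggers the same NOTE; a minimal illustration (the function and data frame are hypothetical, not package code):

```r
# Inside a package, this declaration would sit at the top level of an R file
# to silence "no visible binding for global variable 'Gain'":
#   utils::globalVariables("Gain")

high_gain_features <- function(importance_df, threshold = 0.1) {
  # 'Gain' is looked up inside importance_df at run time, not in the
  # global environment -- which is exactly what confuses R CMD check.
  subset(importance_df, Gain > threshold)
}
```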
@@ -8,6 +8,9 @@
 lgb.model.dt.tree(model, num_iteration = NULL)
 }
 \arguments{
 \item{model}{object of class \code{lgb.Booster}}
+
+\item{num_iteration}{number of iterations you want to predict with. NULL or
+<= 0 means use best iteration}
 }
 \value{
 A \code{data.table} with detailed information about model trees' nodes and leaves.
@@ -25,6 +28,7 @@ The columns of the \code{data.table} are:
 \item \code{split_gain}: Split gain of a node
 \item \code{threshold}: Splitting threshold value of a node
 \item \code{decision_type}: Decision type of a node
+\item \code{default_left}: Determines how NA values are handled: TRUE -> left, FALSE -> right
 \item \code{internal_value}: Node value
 \item \code{internal_count}: The number of observations collected by a node
 \item \code{leaf_value}: Leaf value
...
@@ -44,5 +44,4 @@ model <- lgb.train(params, dtrain, 20)
 tree_imp <- lgb.importance(model, percentage = TRUE)
 lgb.plot.importance(tree_imp, top_n = 10, measure = "Gain")
 }
-
 }
@@ -49,5 +49,4 @@ model <- lgb.train(params, dtrain, 20)
 tree_interpretation <- lgb.interprete(model, test$data, 1:5)
 lgb.plot.interpretation(tree_interpretation[[1]], top_n = 10)
 }
-
 }
@@ -75,7 +75,7 @@ If early stopping occurs, the model will have 'best_iter' field}
 \item{callbacks}{list of callback functions
 List of callback functions that are applied at each iteration.}
-\item{...}{other parameters, see parameters.md for more information}
+\item{...}{other parameters, see Parameters.rst for more information}
 \item{valids}{a list of \code{lgb.Dataset} objects, used for validation}
@@ -135,7 +135,7 @@ If early stopping occurs, the model will have 'best_iter' field}
 \item{callbacks}{list of callback functions
 List of callback functions that are applied at each iteration.}
-\item{...}{other parameters, see parameters.md for more information}
+\item{...}{other parameters, see Parameters.rst for more information}
 }
 \value{
 a trained model \code{lgb.CVBooster}.
@@ -143,10 +143,6 @@ a trained model \code{lgb.CVBooster}.
 a trained booster model \code{lgb.Booster}.
 }
 \description{
-Main CV logic for LightGBM
-
-Main training logic for LightGBM
-
 Simple interface for training a lightgbm model.
 Its documentation is combined with lgb.train.
 }
...
@@ -7,11 +7,11 @@
 lgb.unloader(restore = TRUE, wipe = FALSE, envir = .GlobalEnv)
 }
 \arguments{
+\item{restore}{Whether to reload \code{LightGBM} immediately after detaching from R. Defaults to \code{TRUE} which means automatically reload \code{LightGBM} once unloading is performed.}
 \item{wipe}{Whether to wipe all \code{lgb.Dataset} and \code{lgb.Booster} from the global environment. Defaults to \code{FALSE} which means to not remove them.}
 \item{envir}{The environment to perform wiping on if \code{wipe == TRUE}. Defaults to \code{.GlobalEnv} which is the global environment.}
-\item{restart}{Whether to reload \code{LightGBM} immediately after detaching from R. Defaults to \code{TRUE} which means automatically reload \code{LightGBM} once unloading is performed.}
 }
 \value{
 NULL invisibly.
...
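The `restore` behaviour documented above amounts to a detach-then-reattach cycle on the package. A generic sketch of that mechanism with a hypothetical helper (this is not the package's actual `lgb.unloader` implementation, which also handles wiping objects):

```r
unload_package <- function(pkg, restore = TRUE) {
  pkg_env <- paste0("package:", pkg)
  if (pkg_env %in% search()) {
    # unload = TRUE also unloads the namespace and any DLLs it registered,
    # which is the part that matters for a package with compiled code
    detach(pkg_env, character.only = TRUE, unload = TRUE)
  }
  if (restore) {
    library(pkg, character.only = TRUE)
  }
  invisible(NULL)
}
```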