% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/lgb.Booster.R
\name{lgb.configure_fast_predict}
\alias{lgb.configure_fast_predict}
\title{Configure Fast Single-Row Predictions}
\usage{
lgb.configure_fast_predict(
  model,
  csr = FALSE,
  start_iteration = NULL,
  num_iteration = NULL,
  type = "response",
  params = list()
)
}
\arguments{
\item{model}{LightGBM model object (class \code{lgb.Booster}).

             \bold{The object will be modified in-place}.}

\item{csr}{Whether the prediction function is going to be called on sparse CSR inputs.
If \code{FALSE}, it is assumed that predictions are going to be made on single-row
regular R matrices.}

\item{start_iteration}{int or \code{NULL}, optional (default=\code{NULL}).
Start index of the iteration to predict.
If \code{NULL} or <= 0, starts from the first iteration.}

\item{num_iteration}{int or \code{NULL}, optional (default=\code{NULL}).
Limits the number of iterations used in the prediction.
If \code{NULL}, and the best iteration exists and \code{start_iteration} is \code{NULL} or <= 0,
the best iteration is used; otherwise, all iterations from \code{start_iteration} are used.
If <= 0, all iterations from \code{start_iteration} are used (no limits).}

\item{type}{Type of prediction to output. Allowed types are:\itemize{
            \item \code{"response"}: will output the predicted score according to the objective function being
                  optimized (depending on the link function that the objective uses), after applying any necessary
                  transformations - for example, for \code{objective="binary"}, it will output class probabilities.
            \item \code{"class"}: for classification objectives, will output the class with the highest predicted
                  probability. For other objectives, will output the same as "response". Note that \code{"class"} is
                  not a supported type for this function (see the Details section for more information).
            \item \code{"raw"}: will output the non-transformed numbers (sum of predictions from boosting iterations'
                  results) from which the "response" number is produced for a given objective function - for example,
                  for \code{objective="binary"}, this corresponds to log-odds. For many objectives such as
                  "regression", since no transformation is applied, the output will be the same as for "response".
            \item \code{"leaf"}: will output the index of the terminal node / leaf at which each observation falls
                  in each tree in the model, output as integers, with one column per tree.
            \item \code{"contrib"}: will return the per-feature contributions for each prediction, including an
                  intercept (each feature will produce one column).
            }

            Note that, if using custom objectives, types "class" and "response" will not be available and will
            default towards using "raw" instead.

            If the model was fit through function \link{lightgbm} and it was passed a factor as labels,
            passing the prediction type through \code{params} instead of through this argument might
            result in factor levels for classification objectives not being applied correctly to the
            resulting output.}

\item{params}{a list of additional named parameters. See
\href{https://lightgbm.readthedocs.io/en/latest/Parameters.html#predict-parameters}{
the "Predict Parameters" section of the documentation} for a list of parameters and
valid values. Where these conflict with the values of keyword arguments to this function,
the values in \code{params} take precedence.}
}
\value{
The same \code{model} that was passed as input, invisibly, with the desired
        configuration stored inside it and available to be used in future calls to
        \link{predict.lgb.Booster}.
}
\description{
Pre-configures a LightGBM model object to produce fast single-row predictions
             for a given input data type, prediction type, and parameters.
}
\details{
Calling this function multiple times with different parameters might not override
         the previous configuration and might trigger undefined behavior.

         Any saved configuration for fast predictions might be lost after making a single-row
         prediction of a different type than what was configured (except for types "response" and
         "class", which can be switched between each other at any time without losing the configuration).

         In some situations, setting a fast prediction configuration for one type of prediction
         might cause the prediction function to keep using that configuration for single-row
         predictions even if the requested type of prediction is different from what was configured.

         Note that this function will not accept argument \code{type="class"} - for such cases, one
         can pass \code{type="response"} to this function and then \code{type="class"} to the
         \code{predict} function - the fast configuration will not be lost or altered if the switch
         is between "response" and "class".

         The configuration does not survive de-serializations, so it has to be generated
         anew in every R process that is going to use it (e.g. if loading a model object
         through \code{readRDS}, whatever configuration was there previously will be lost).

         Requesting a different prediction type or passing parameters to \link{predict.lgb.Booster}
         will cause it to ignore the fast-predict configuration and take the slow route instead
         (but be aware that an existing configuration might not always be overridden by supplying
         different parameters or prediction type, so make sure to check that the output is what
         was expected when a prediction is to be made on a single row for something different than
         what is configured).

         Note that, if configuring a non-default prediction type (such as leaf indices),
         then that type must also be passed in the call to \link{predict.lgb.Booster} in
         order for it to use the configuration. This also applies to \code{start_iteration}
         and \code{num_iteration}, but \bold{the \code{params} list must be empty} in the call to \code{predict}.

         Predictions about feature contributions do not allow a fast route for CSR inputs,
         and as such, this function will produce an error if passing \code{csr=TRUE} and
         \code{type = "contrib"} together.
}
\examples{
\donttest{
library(lightgbm)
data(mtcars)
X <- as.matrix(mtcars[, -1L])
y <- mtcars[, 1L]
dtrain <- lgb.Dataset(X, label = y, params = list(max_bin = 5L))
params <- list(min_data_in_leaf = 2L)
model <- lgb.train(
  params = params
 , data = dtrain
 , obj = "regression"
 , nrounds = 5L
 , verbose = -1L
)
lgb.configure_fast_predict(model)

x_single <- X[11L, , drop = FALSE]
predict(model, x_single)

# Will not use it if the prediction to be made
# is different from what was configured
predict(model, x_single, type = "leaf")
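
# A further sketch, not part of the original example: configuring a
# non-default prediction type. Following the details above, the same type
# is passed to predict() so that the stored configuration can be used.
# A second model is trained here because re-configuring an already
# configured model might not override the earlier configuration.
# The names 'model_leaf' and 'leaf_single' are illustrative only.
model_leaf <- lgb.train(
  params = params
 , data = dtrain
 , obj = "regression"
 , nrounds = 5L
 , verbose = -1L
)
lgb.configure_fast_predict(model_leaf, type = "leaf")

# Leaf index reached by the single row in each tree of the model
leaf_single <- predict(model_leaf, x_single, type = "leaf")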
}
}