Unverified Commit 99ac1ef8 authored by James Lamb's avatar James Lamb Committed by GitHub

[docs] add versionadded notes for v4.0.0 features (#5948)

parent d9c7c72a
@@ -843,6 +843,9 @@ Booster <- R6::R6Class(
#' passing the prediction type through \code{params} instead of through this argument might
#' result in factor levels for classification objectives not being applied correctly to the
#' resulting output.
#'
#' \emph{New in version 4.0.0}
#'
#' @param start_iteration int or None, optional (default=None)
#' Start index of the iteration to predict.
#' If None or <= 0, starts from the first iteration.
@@ -861,6 +864,9 @@ NULL
#' @name predict.lgb.Booster
#' @title Predict method for LightGBM model
#' @description Predicted values based on class \code{lgb.Booster}
#'
#' \emph{New in version 4.0.0}
#'
#' @details If the model object has been configured for fast single-row predictions through
#' \link{lgb.configure_fast_predict}, this function will use the prediction parameters
#' that were configured for it - as such, extra prediction parameters should not be passed
@@ -878,6 +884,9 @@ NULL
#' If single-row predictions are going to be performed frequently, it is recommended to
#' pre-configure the model object for fast single-row sparse predictions through function
#' \link{lgb.configure_fast_predict}.
#'
#' \emph{Changed from 'data', in version 4.0.0}
#'
#' @param header only used for prediction on text files. True if the text file has a header
#' @param ... ignored
#' @return For prediction types that are meant to always return one output per observation (e.g. when predicting
@@ -1137,6 +1146,9 @@ lgb.configure_fast_predict <- function(model,
#' @name print.lgb.Booster
#' @title Print method for LightGBM model
#' @description Show summary information about a LightGBM model object (same as \code{summary}).
#'
#' \emph{New in version 4.0.0}
#'
#' @param x Object of class \code{lgb.Booster}
#' @param ... Not used
#' @return The same input \code{x}, returned as invisible.
@@ -1186,6 +1198,9 @@ print.lgb.Booster <- function(x, ...) {
#' @name summary.lgb.Booster
#' @title Summary method for LightGBM model
#' @description Show summary information about a LightGBM model object (same as \code{print}).
#'
#' \emph{New in version 4.0.0}
#'
#' @param object Object of class \code{lgb.Booster}
#' @param ... Not used
#' @return The same input \code{object}, returned as invisible.
...
@@ -4,6 +4,9 @@
#' a copy of the underlying C++ object as raw bytes, which can be used to reconstruct such object after getting
#' serialized and de-serialized, but at the cost of extra memory usage. If these raw bytes are not needed anymore,
#' they can be dropped through this function in order to save memory. Note that the object will be modified in-place.
#'
#' \emph{New in version 4.0.0}
#'
#' @param model \code{lgb.Booster} object which was produced with `serializable=TRUE`.
#'
#' @return \code{lgb.Booster} (the same `model` object that was passed as input, as invisible).
...
@@ -4,6 +4,9 @@
#' be serializable (e.g. cannot save and load with \code{saveRDS} and \code{readRDS}) as it will lack the raw bytes
#' needed to reconstruct its underlying C++ object. This function can be used to forcibly produce those serialized
#' raw bytes and make the object serializable. Note that the object will be modified in-place.
#'
#' \emph{New in version 4.0.0}
#'
#' @param model \code{lgb.Booster} object which was produced with `serializable=FALSE`.
#'
#' @return \code{lgb.Booster} (the same `model` object that was passed as input, as invisible).
...
@@ -5,6 +5,8 @@
#' object is restored automatically when calling functions such as \code{predict}, but this function can be
#' used to forcibly restore it beforehand. Note that the object will be modified in-place.
#'
#' \emph{New in version 4.0.0}
#'
#' @details Be aware that fast single-row prediction configurations are not restored through this
#' function. If you wish to make fast single-row predictions using a \code{lgb.Booster} loaded this way,
#' call \link{lgb.configure_fast_predict} on the loaded \code{lgb.Booster} object.
...
@@ -84,6 +84,9 @@
#' Producing and keeping these raw bytes however uses extra memory, and if they are not required,
#' it is possible to avoid producing them by passing `serializable=FALSE`. In such cases, these raw
#' bytes can be added to the model on demand through function \link{lgb.make_serializable}.
#'
#' \emph{New in version 4.0.0}
#'
#' @keywords internal
NULL
@@ -99,6 +102,9 @@ NULL
#' @param label Vector of labels, used if \code{data} is not an \code{\link{lgb.Dataset}}
#' @param weights Sample / observation weights for rows in the input data. If \code{NULL}, will assume that all
#' observations / rows have the same importance / weight.
#'
#' \emph{Changed from 'weight', in version 4.0.0}
#'
#' @param objective Optimization objective (e.g. `"regression"`, `"binary"`, etc.).
#' For a list of accepted objectives, see
#' \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html#objective}{
@@ -112,7 +118,13 @@ NULL
#' \code{label}).
#' \item Otherwise, will use objective \code{"regression"}.
#' }
#'
#' \emph{New in version 4.0.0}
#'
#' @param init_score initial score is the base prediction lightgbm will boost from
#'
#' \emph{New in version 4.0.0}
#'
#' @param num_threads Number of parallel threads to use. For best speed, this should be set to the number of
#' physical cores in the CPU - in a typical x86-64 machine, this corresponds to half the
#' number of maximum threads.
@@ -129,6 +141,9 @@ NULL
#'
#' This parameter gets overridden by \code{num_threads} and its aliases under \code{params}
#' if passed there.
#'
#' \emph{New in version 4.0.0}
#'
#' @param ... Additional arguments passed to \code{\link{lgb.train}}. For example
#' \itemize{
#' \item{\code{valids}: a list of \code{lgb.Dataset} objects, used for validation}
...
@@ -56,7 +56,9 @@ If <= 0, all iterations from start_iteration are used (no limits).}
If the model was fit through function \link{lightgbm} and it was passed a factor as labels,
passing the prediction type through \code{params} instead of through this argument might
result in factor levels for classification objectives not being applied correctly to the
resulting output.
\emph{New in version 4.0.0}}
\item{params}{a list of additional named parameters. See
\href{https://lightgbm.readthedocs.io/en/latest/Parameters.html#predict-parameters}{
...
@@ -17,6 +17,8 @@ If a LightGBM model object was produced with argument `serializable=TRUE`, the R
a copy of the underlying C++ object as raw bytes, which can be used to reconstruct such object after getting
serialized and de-serialized, but at the cost of extra memory usage. If these raw bytes are not needed anymore,
they can be dropped through this function in order to save memory. Note that the object will be modified in-place.
\emph{New in version 4.0.0}
}
\seealso{
\link{lgb.restore_handle}, \link{lgb.make_serializable}.
...
@@ -17,6 +17,8 @@ If a LightGBM model object was produced with argument `serializable=FALSE`, the
be serializable (e.g. cannot save and load with \code{saveRDS} and \code{readRDS}) as it will lack the raw bytes
needed to reconstruct its underlying C++ object. This function can be used to forcibly produce those serialized
raw bytes and make the object serializable. Note that the object will be modified in-place.
\emph{New in version 4.0.0}
}
\seealso{
\link{lgb.restore_handle}, \link{lgb.drop_serialized}.
...
@@ -18,6 +18,8 @@ After a LightGBM model object is de-serialized through functions such as \code{s
\code{saveRDS}, its underlying C++ object will be blank and needs to be restored to be able to use it. Such
object is restored automatically when calling functions such as \code{predict}, but this function can be
used to forcibly restore it beforehand. Note that the object will be modified in-place.
\emph{New in version 4.0.0}
}
\details{
Be aware that fast single-row prediction configurations are not restored through this
...
@@ -105,6 +105,8 @@ Parameter docs shared by \code{lgb.train}, \code{lgb.cv}, and \code{lightgbm}
Producing and keeping these raw bytes however uses extra memory, and if they are not required,
it is possible to avoid producing them by passing `serializable=FALSE`. In such cases, these raw
bytes can be added to the model on demand through function \link{lgb.make_serializable}.
\emph{New in version 4.0.0}
}
\keyword{internal}
@@ -30,7 +30,9 @@ may allow you to pass other types of data like \code{matrix} and then separately
\item{label}{Vector of labels, used if \code{data} is not an \code{\link{lgb.Dataset}}}
\item{weights}{Sample / observation weights for rows in the input data. If \code{NULL}, will assume that all
observations / rows have the same importance / weight.
\emph{Changed from 'weight', in version 4.0.0}}
\item{params}{a list of parameters. See \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html}{
the "Parameters" section of the documentation} for a list of parameters and valid values.}
@@ -67,9 +69,13 @@ set to the iteration number of the best iteration.}
(note that parameter \code{num_class} in this case will also be determined automatically from
\code{label}).
\item Otherwise, will use objective \code{"regression"}.
}
\emph{New in version 4.0.0}}
\item{init_score}{initial score is the base prediction lightgbm will boost from
\emph{New in version 4.0.0}}
\item{num_threads}{Number of parallel threads to use. For best speed, this should be set to the number of
physical cores in the CPU - in a typical x86-64 machine, this corresponds to half the
@@ -86,7 +92,9 @@ set to the iteration number of the best iteration.}
\code{RhpcBLASctl} to be installed.
This parameter gets overridden by \code{num_threads} and its aliases under \code{params}
if passed there.
\emph{New in version 4.0.0}}
\item{...}{Additional arguments passed to \code{\link{lgb.train}}. For example
\itemize{
...
@@ -28,7 +28,9 @@
If single-row predictions are going to be performed frequently, it is recommended to
pre-configure the model object for fast single-row sparse predictions through function
\link{lgb.configure_fast_predict}.
\emph{Changed from 'data', in version 4.0.0}}
\item{type}{Type of prediction to output. Allowed types are:\itemize{
\item \code{"response"}: will output the predicted score according to the objective function being
@@ -54,7 +56,9 @@
If the model was fit through function \link{lightgbm} and it was passed a factor as labels,
passing the prediction type through \code{params} instead of through this argument might
result in factor levels for classification objectives not being applied correctly to the
resulting output.
\emph{New in version 4.0.0}}
\item{start_iteration}{int or None, optional (default=None)
Start index of the iteration to predict.
@@ -106,6 +110,8 @@ For prediction types that are meant to always return one output per observation
}
\description{
Predicted values based on class \code{lgb.Booster}
\emph{New in version 4.0.0}
}
\details{
If the model object has been configured for fast single-row predictions through
...
@@ -16,4 +16,6 @@ The same input \code{x}, returned as invisible.
}
\description{
Show summary information about a LightGBM model object (same as \code{summary}).
\emph{New in version 4.0.0}
}
@@ -16,4 +16,6 @@ The same input \code{object}, returned as invisible.
}
\description{
Show summary information about a LightGBM model object (same as \code{print}).
\emph{New in version 4.0.0}
}
@@ -233,6 +233,8 @@ You could edit your firewall rules to allow communication between any of the wor
Using Custom Objective Functions with Dask
******************************************
.. versionadded:: 4.0.0
It is possible to customize the boosting process by providing a custom objective function written in Python.
See the Dask API's documentation for details on how to implement such functions.
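As an illustrative sketch (not taken verbatim from LightGBM's docs), a custom objective in the form LightGBM accepts is a callable returning per-sample gradients and hessians of the loss with respect to the raw predictions. Here for squared error; the function name and use of plain lists are the author's assumptions:

```python
def squared_error_objective(y_true, y_pred):
    """Per-sample gradient and hessian of 0.5 * (pred - true)**2."""
    grad = [p - t for t, p in zip(y_true, y_pred)]  # d loss / d pred
    hess = [1.0] * len(y_true)                      # second derivative is constant
    return grad, hess
```

Such a callable would then be passed as the ``objective`` parameter of the Dask estimators, per the API documentation referenced above.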
...
@@ -145,6 +145,8 @@ Core Parameters
- ``goss``, Gradient-based One-Side Sampling
- *New in 4.0.0*
- ``data`` :raw-html:`<a id="data" title="Permalink to this parameter" href="#data">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``train``, ``train_data``, ``train_data_file``, ``data_filename``
- path of training data, LightGBM will train from this data
@@ -670,6 +672,8 @@ Learning Control Parameters
- **Note**: can be used only with ``device_type = cpu``
- *New in version 4.0.0*
- ``num_grad_quant_bins`` :raw-html:`<a id="num_grad_quant_bins" title="Permalink to this parameter" href="#num_grad_quant_bins">&#x1F517;&#xFE0E;</a>`, default = ``4``, type = int
- number of bins for quantizing gradients and hessians
@@ -678,6 +682,8 @@ Learning Control Parameters
- **Note**: can be used only with ``device_type = cpu``
- *New in 4.0.0*
- ``quant_train_renew_leaf`` :raw-html:`<a id="quant_train_renew_leaf" title="Permalink to this parameter" href="#quant_train_renew_leaf">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool
- whether to renew the leaf values with original gradients when quantized training
@@ -686,10 +692,14 @@ Learning Control Parameters
- **Note**: can be used only with ``device_type = cpu``
- *New in 4.0.0*
- ``stochastic_rounding`` :raw-html:`<a id="stochastic_rounding" title="Permalink to this parameter" href="#stochastic_rounding">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool
- whether to use stochastic rounding in gradient quantization
- *New in 4.0.0*
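To illustrate what stochastic rounding does during gradient quantization (a pure-Python sketch, not LightGBM's internal C++ implementation), a gradient scaled into bin units is rounded up with probability equal to its fractional part, so the quantized value is unbiased in expectation:

```python
import math
import random

def stochastic_round(x, rng=random.random):
    """Round x to floor(x) or floor(x) + 1, choosing the upper value with
    probability equal to the fractional part, so E[result] == x."""
    lower = math.floor(x)
    return lower + (1 if rng() < x - lower else 0)

def quantize_gradient(grad, scale, rng=random.random):
    """Map a gradient into integer bins of width ``scale`` (illustrative)."""
    return stochastic_round(grad / scale, rng)
```

Deterministic rounding would always pick the nearest bin; the stochastic variant keeps the expected sum of quantized gradients equal to the true sum.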
IO Parameters
-------------
@@ -908,6 +918,8 @@ Dataset Parameters
- **Note**: ``lightgbm-transform`` is not maintained by LightGBM's maintainers. Bug reports or feature requests should go to `issues page <https://github.com/microsoft/lightgbm-transform/issues>`__
- *New in 4.0.0*
Predict Parameters
~~~~~~~~~~~~~~~~~~
...
@@ -166,6 +166,7 @@ struct Config {
// desc = ``bagging``, Randomly Bagging Sampling
// descl2 = **Note**: ``bagging`` is only effective when ``bagging_freq > 0`` and ``bagging_fraction < 1.0``
// desc = ``goss``, Gradient-based One-Side Sampling
// desc = *New in 4.0.0*
std::string data_sample_strategy = "bagging";
// alias = train, train_data, train_data_file, data_filename
@@ -598,22 +599,26 @@ struct Config {
// desc = with quantized training, most arithmetic in the training process will use integer operations
// desc = gradient quantization can accelerate training, with little accuracy drop in most cases
// desc = **Note**: can be used only with ``device_type = cpu``
// desc = *New in version 4.0.0*
bool use_quantized_grad = false;
// [no-save]
// desc = number of bins for quantizing gradients and hessians
// desc = with more bins, the quantized training will be closer to full precision training
// desc = **Note**: can be used only with ``device_type = cpu``
// desc = *New in 4.0.0*
int num_grad_quant_bins = 4;
// [no-save]
// desc = whether to renew the leaf values with original gradients when quantized training
// desc = renewing is very helpful for good quantized training accuracy for ranking objectives
// desc = **Note**: can be used only with ``device_type = cpu``
// desc = *New in 4.0.0*
bool quant_train_renew_leaf = false;
// [no-save]
// desc = whether to use stochastic rounding in gradient quantization
// desc = *New in 4.0.0*
bool stochastic_rounding = true;
#ifndef __NVCC__
@@ -777,6 +782,7 @@ struct Config {
// desc = path to a ``.json`` file that specifies customized parser initialized configuration
// desc = see `lightgbm-transform <https://github.com/microsoft/lightgbm-transform>`__ for usage examples
// desc = **Note**: ``lightgbm-transform`` is not maintained by LightGBM's maintainers. Bug reports or feature requests should go to `issues page <https://github.com/microsoft/lightgbm-transform/issues>`__
// desc = *New in 4.0.0*
std::string parser_config_file = "";
#ifndef __NVCC__
...
@@ -932,6 +932,8 @@ class _InnerPredictor:
If True, ensure that the features used to predict match the ones used to train.
Used only if data is pandas DataFrame.
.. versionadded:: 4.0.0
Returns
-------
result : numpy array, scipy.sparse or list of scipy.sparse
@@ -2841,6 +2843,8 @@ class Dataset:
def feature_num_bin(self, feature: Union[int, str]) -> int:
    """Get the number of bins for a feature.
.. versionadded:: 4.0.0
Parameters
----------
feature : int or str
@@ -4150,19 +4154,34 @@ class Booster:
will use ``leaf_output = decay_rate * old_leaf_output + (1.0 - decay_rate) * new_leaf_output`` to refit trees.
reference : Dataset or None, optional (default=None)
Reference for ``data``.
.. versionadded:: 4.0.0
weight : list, numpy 1-D array, pandas Series or None, optional (default=None)
Weight for each ``data`` instance. Weights should be non-negative.
.. versionadded:: 4.0.0
group : list, numpy 1-D array, pandas Series or None, optional (default=None) group : list, numpy 1-D array, pandas Series or None, optional (default=None)
Group/query size for ``data``. Group/query size for ``data``.
Only used in the learning-to-rank task. Only used in the learning-to-rank task.
sum(group) = n_samples. sum(group) = n_samples.
For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups, For example, if you have a 100-document dataset with ``group = [10, 20, 40, 10, 10, 10]``, that means that you have 6 groups,
where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc. where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
.. versionadded:: 4.0.0
init_score : list, list of lists (for multi-class task), numpy array, pandas Series, pandas DataFrame (for multi-class task), or None, optional (default=None) init_score : list, list of lists (for multi-class task), numpy array, pandas Series, pandas DataFrame (for multi-class task), or None, optional (default=None)
Init score for ``data``. Init score for ``data``.
.. versionadded:: 4.0.0
feature_name : list of str, or 'auto', optional (default="auto") feature_name : list of str, or 'auto', optional (default="auto")
Feature names for ``data``. Feature names for ``data``.
If 'auto' and data is pandas DataFrame, data columns names are used. If 'auto' and data is pandas DataFrame, data columns names are used.
.. versionadded:: 4.0.0
categorical_feature : list of str or int, or 'auto', optional (default="auto") categorical_feature : list of str or int, or 'auto', optional (default="auto")
Categorical features for ``data``. Categorical features for ``data``.
If list of int, interpreted as indices. If list of int, interpreted as indices.
...@@ -4173,13 +4192,25 @@ class Booster: ...@@ -4173,13 +4192,25 @@ class Booster:
All negative values in categorical features will be treated as missing values. All negative values in categorical features will be treated as missing values.
The output cannot be monotonically constrained with respect to a categorical feature. The output cannot be monotonically constrained with respect to a categorical feature.
Floating point numbers in categorical features will be rounded towards 0. Floating point numbers in categorical features will be rounded towards 0.
.. versionadded:: 4.0.0
dataset_params : dict or None, optional (default=None) dataset_params : dict or None, optional (default=None)
Other parameters for Dataset ``data``. Other parameters for Dataset ``data``.
.. versionadded:: 4.0.0
free_raw_data : bool, optional (default=True) free_raw_data : bool, optional (default=True)
If True, raw data is freed after constructing inner Dataset for ``data``. If True, raw data is freed after constructing inner Dataset for ``data``.
.. versionadded:: 4.0.0
validate_features : bool, optional (default=False) validate_features : bool, optional (default=False)
If True, ensure that the features used to refit the model match the original ones. If True, ensure that the features used to refit the model match the original ones.
Used only if data is pandas DataFrame. Used only if data is pandas DataFrame.
.. versionadded:: 4.0.0
**kwargs **kwargs
Other parameters for refit. Other parameters for refit.
These parameters will be passed to ``predict`` method. These parameters will be passed to ``predict`` method.
...@@ -4271,6 +4302,8 @@ class Booster:
) -> 'Booster':
"""Set the output of a leaf.
.. versionadded:: 4.0.0
Parameters
----------
tree_id : int
...
...@@ -407,6 +407,8 @@ def early_stopping(stopping_rounds: int, first_metric_only: bool = False, verbos
If float, this single value is used for all metrics.
If list, its length should match the total number of metrics.
.. versionadded:: 4.0.0
Returns
-------
callback : _EarlyStoppingCallback
...
...@@ -656,6 +656,9 @@ def create_tree_digraph(
example_case : numpy 2-D array, pandas DataFrame or None, optional (default=None)
Single row with the same structure as the training data.
If not None, the plot will highlight the path that sample takes through the tree.
.. versionadded:: 4.0.0
max_category_values : int, optional (default=10)
The maximum number of category values to display in tree nodes, if the number of thresholds is greater than this value, thresholds will be collapsed and displayed on the label tooltip instead.
...@@ -672,6 +675,8 @@ def create_tree_digraph(
graph = lgb.create_tree_digraph(clf, max_category_values=5)
HTML(graph._repr_image_svg_xml())
.. versionadded:: 4.0.0
**kwargs
Other parameters passed to ``Digraph`` constructor.
Check https://graphviz.readthedocs.io/en/stable/api.html#digraph for the full list of supported parameters.
...@@ -792,6 +797,9 @@ def plot_tree(
example_case : numpy 2-D array, pandas DataFrame or None, optional (default=None)
Single row with the same structure as the training data.
If not None, the plot will highlight the path that sample takes through the tree.
.. versionadded:: 4.0.0
**kwargs
Other parameters passed to ``Digraph`` constructor.
Check https://graphviz.readthedocs.io/en/stable/api.html#digraph for the full list of supported parameters.
...