% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/lgb.Dataset.R
\name{lgb.Dataset}
\alias{lgb.Dataset}
\title{Construct \code{lgb.Dataset} object}
\usage{
lgb.Dataset(
  data,
  params = list(),
  reference = NULL,
  colnames = NULL,
  categorical_feature = NULL,
  free_raw_data = TRUE,
  info = list(),
  ...
)
}
\arguments{
\item{data}{a \code{matrix} object, a \code{dgCMatrix} object,
a character representing a path to a text file (CSV, TSV, or LibSVM),
or a character representing a path to a binary \code{lgb.Dataset} file}

\item{params}{a list of parameters. See
\href{https://lightgbm.readthedocs.io/en/latest/Parameters.html#dataset-parameters}{
The "Dataset Parameters" section of the documentation} for a list of parameters
and valid values.}

\item{reference}{reference dataset. When LightGBM creates a Dataset, it does some preprocessing like binning
continuous features into histograms. If you want to apply the same bin boundaries from an existing
dataset to new \code{data}, pass that existing Dataset to this argument.}

\item{colnames}{character vector of column (feature) names}

\item{categorical_feature}{categorical features. This can either be a character vector of feature
names or an integer vector with the indices of the features (e.g.
\code{c(1L, 10L)} to say "the first and tenth columns").}

\item{free_raw_data}{LightGBM constructs its data format, called a "Dataset", from tabular data.
By default, that Dataset object on the R side does not keep a copy of the raw data.
This reduces LightGBM's memory consumption, but it means that the Dataset object
cannot be changed after it has been constructed. If you'd prefer to be able to
change the Dataset object after construction, set \code{free_raw_data = FALSE}.}

\item{info}{a list of additional information to attach to the \code{lgb.Dataset} object
(for example, \code{label})}

\item{...}{other information to pass to \code{info}, or parameters to pass to \code{params}}
}
\value{
constructed dataset
}
\description{
Construct an \code{lgb.Dataset} object from a dense matrix, a sparse matrix,
or a local file (one created previously by saving an \code{lgb.Dataset}).
}
\examples{
\donttest{
# Build a Dataset from the sparse matrix in the bundled agaricus training data
data(agaricus.train, package = "lightgbm")
train <- agaricus.train
dtrain <- lgb.Dataset(train$data, label = train$label)
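
# A minimal sketch of the reference argument: reuse the bin boundaries of
# dtrain for held-out data. It assumes the agaricus.test data shipped with
# the package; the names test and dtest are illustrative only.
data(agaricus.test, package = "lightgbm")
test <- agaricus.test
dtest <- lgb.Dataset(test$data, label = test$label, reference = dtrain)
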
# Save the Dataset to a binary file, then load it back from that file
data_file <- tempfile(fileext = ".data")
lgb.Dataset.save(dtrain, data_file)
dtrain <- lgb.Dataset(data_file)
# Explicitly construct the Dataset that was loaded from the file
lgb.Dataset.construct(dtrain)
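
# A minimal sketch of free_raw_data = FALSE: keep the raw data on the R side
# so that the Dataset can still be changed after construction. The name
# dtrain_raw is illustrative only.
dtrain_raw <- lgb.Dataset(train$data, label = train$label, free_raw_data = FALSE)
lgb.Dataset.construct(dtrain_raw)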
}
}