- by default, LightGBM maps the data file to memory and loads features from memory. This speeds up data loading, but may cause an out-of-memory error when the data file is very big
- **Note**: works only in case of loading data directly from file
- **Note**: works only in case of loading data directly from text file
- ``header`` :raw-html:`<a id="header" title="Permalink to this parameter" href="#header">🔗︎</a>`, default = ``false``, type = bool, aliases: ``has_header``
- set this to ``true`` if input data has header
- **Note**: works only in case of loading data directly from file
- **Note**: works only in case of loading data directly from text file
- ``label_column`` :raw-html:`<a id="label_column" title="Permalink to this parameter" href="#label_column">🔗︎</a>`, default = ``""``, type = int or string, aliases: ``label``
@@ -764,7 +764,7 @@ Dataset Parameters
- if omitted, the first column in the training data is used as the label
- **Note**: works only in case of loading data directly from file
- **Note**: works only in case of loading data directly from text file
- ``weight_column`` :raw-html:`<a id="weight_column" title="Permalink to this parameter" href="#weight_column">🔗︎</a>`, default = ``""``, type = int or string, aliases: ``weight``
@@ -774,7 +774,7 @@ Dataset Parameters
- add a prefix ``name:`` for column name, e.g. ``weight=name:weight``
- **Note**: works only in case of loading data directly from file
- **Note**: works only in case of loading data directly from text file
- **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``
@@ -786,7 +786,7 @@ Dataset Parameters
- add a prefix ``name:`` for column name, e.g. ``query=name:query_id``
- **Note**: works only in case of loading data directly from file
- **Note**: works only in case of loading data directly from text file
- **Note**: data should be grouped by query\_id, for more information, see `Query Data <#query-data>`__
@@ -800,7 +800,7 @@ Dataset Parameters
- add a prefix ``name:`` for column name, e.g. ``ignore_column=name:c1,c2,c3`` means c1, c2 and c3 will be ignored
- **Note**: works only in case of loading data directly from file
- **Note**: works only in case of loading data directly from text file
- **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``
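The index rule in the notes above is easy to misread, so here is a minimal Python sketch of it. The helper name `param_index` is hypothetical and not part of LightGBM, and the handling of columns located before the label is an assumption inferred from the phrase "doesn't count the label column".

```python
def param_index(file_column: int, label_column: int = 0) -> int:
    """Convert a 0-based column position in the file into the integer passed
    to weight/query/ignore_column, which does not count the label column."""
    if file_column < 0 or label_column < 0:
        raise ValueError("column positions must be non-negative")
    if file_column == label_column:
        raise ValueError("the label column has no index of its own")
    # Columns after the label shift down by one.
    # Columns before the label keep their position (assumption).
    return file_column - 1 if file_column > label_column else file_column


# Example from the notes: label is column_0 and weight is column_1,
# so the parameter to pass is weight=0.
print(f"weight={param_index(1)}")
```

Passing columns by name with the ``name:`` prefix avoids this arithmetic entirely.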
// alias = two_round_loading, use_two_round_loading
// desc = set this to ``true`` if data file is too big to fit in memory
// desc = by default, LightGBM maps the data file to memory and loads features from memory. This speeds up data loading, but may cause an out-of-memory error when the data file is very big
// desc = **Note**: works only in case of loading data directly from file
// desc = **Note**: works only in case of loading data directly from text file
bool two_round = false;
// alias = has_header
// desc = set this to ``true`` if input data has header
// desc = **Note**: works only in case of loading data directly from file
// desc = **Note**: works only in case of loading data directly from text file
bool header = false;
// type = int or string
@@ -659,7 +659,7 @@ struct Config {
// desc = use number for index, e.g. ``label=0`` means column\_0 is the label
// desc = add a prefix ``name:`` for column name, e.g. ``label=name:is_click``
// desc = if omitted, the first column in the training data is used as the label
// desc = **Note**: works only in case of loading data directly from file
// desc = **Note**: works only in case of loading data directly from text file
std::string label_column = "";
// type = int or string
@@ -667,7 +667,7 @@ struct Config {
// desc = used to specify the weight column
// desc = use number for index, e.g. ``weight=0`` means column\_0 is the weight
// desc = add a prefix ``name:`` for column name, e.g. ``weight=name:weight``
// desc = **Note**: works only in case of loading data directly from file
// desc = **Note**: works only in case of loading data directly from text file
// desc = **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``
std::string weight_column = "";
@@ -676,7 +676,7 @@ struct Config {
// desc = used to specify the query/group id column
// desc = use number for index, e.g. ``query=0`` means column\_0 is the query id
// desc = add a prefix ``name:`` for column name, e.g. ``query=name:query_id``
// desc = **Note**: works only in case of loading data directly from file
// desc = **Note**: works only in case of loading data directly from text file
// desc = **Note**: data should be grouped by query\_id, for more information, see `Query Data <#query-data>`__
// desc = **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is ``query=0``
std::string group_column = "";
@@ -686,7 +686,7 @@ struct Config {
// desc = used to specify columns to ignore during training
// desc = use number for index, e.g. ``ignore_column=0,1,2`` means column\_0, column\_1 and column\_2 will be ignored
// desc = add a prefix ``name:`` for column name, e.g. ``ignore_column=name:c1,c2,c3`` means c1, c2 and c3 will be ignored
// desc = **Note**: works only in case of loading data directly from file
// desc = **Note**: works only in case of loading data directly from text file
// desc = **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``
// desc = **Note**: despite the fact that specified columns will be completely ignored during the training, they still should have a valid format allowing LightGBM to load file successfully
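To illustrate how the name-based forms of these parameters fit together, the following Python sketch renders them as `key=value` lines for a configuration file. The helper `render_config` and the file name `train.csv` are hypothetical. The column names come from the examples in the descriptions above, and `header=true` is set on the assumption that looking up columns by name needs a header row.

```python
def render_config(params: dict) -> str:
    """Render parameters as key=value lines, one per line."""
    lines = []
    for key, value in params.items():
        # Booleans are written in lowercase, matching the documented defaults.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}={value}")
    return "\n".join(lines)


params = {
    "data": "train.csv",              # hypothetical text data file
    "header": True,                   # assumed necessary for name: lookups
    "label_column": "name:is_click",
    "weight_column": "name:weight",
    "group_column": "name:query_id",
    "ignore_column": "name:c1,c2,c3",
}

print(render_config(params))
```

As the notes state, these settings take effect only when LightGBM loads data directly from a text file.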