Commit f9ab5f58 authored by Nikita Titov, committed by Guolin Ke

[docs] added notes about params usage when data is provided via path and removed unused param (#2024)

* added notes about params usage when data is provided via path

* fixed init score and valid init score params note

* fixed binary params description
parent c5cfe3e3
......@@ -451,7 +451,7 @@ IO Parameters
- if ``""``, will use ``train_data_file`` + ``.init`` (if exists)
- **Note**: can be used only in CLI version
- **Note**: works only in case of loading data directly from file
- ``valid_data_initscores`` :raw-html:`<a id="valid_data_initscores" title="Permalink to this parameter" href="#valid_data_initscores">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``valid_data_init_scores``, ``valid_init_score_file``, ``valid_init_score``
......@@ -461,7 +461,7 @@ IO Parameters
- separate by ``,`` for multi-validation data
- **Note**: can be used only in CLI version
- **Note**: works only in case of loading data directly from file
- ``pre_partition`` :raw-html:`<a id="pre_partition" title="Permalink to this parameter" href="#pre_partition">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_pre_partition``
......@@ -507,20 +507,20 @@ IO Parameters
- by default, LightGBM will map data file to memory and load features from memory. This will provide faster data loading speed, but may cause run out of memory error when the data file is very big
- **Note**: works only in case of loading data directly from file
- ``save_binary`` :raw-html:`<a id="save_binary" title="Permalink to this parameter" href="#save_binary">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_save_binary``, ``is_save_binary_file``
- if ``true``, LightGBM will save the dataset (including validation data) to a binary file. This speed ups the data loading for the next time
- ``enable_load_from_binary_file`` :raw-html:`<a id="enable_load_from_binary_file" title="Permalink to this parameter" href="#enable_load_from_binary_file">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool, aliases: ``load_from_binary_file``, ``binary_load``, ``load_binary``
- set this to ``true`` to enable autoloading from previous saved binary datasets
- set this to ``false`` to ignore binary datasets
- **Note**: can be used only in CLI version; for language-specific packages you can use the correspondent function
- ``header`` :raw-html:`<a id="header" title="Permalink to this parameter" href="#header">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``has_header``
- set this to ``true`` if input data has header
- **Note**: works only in case of loading data directly from file
- ``label_column`` :raw-html:`<a id="label_column" title="Permalink to this parameter" href="#label_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``label``
- used to specify the label column
......@@ -529,6 +529,8 @@ IO Parameters
- add a prefix ``name:`` for column name, e.g. ``label=name:is_click``
- **Note**: works only in case of loading data directly from file
- ``weight_column`` :raw-html:`<a id="weight_column" title="Permalink to this parameter" href="#weight_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``weight``
- used to specify the weight column
......@@ -537,6 +539,8 @@ IO Parameters
- add a prefix ``name:`` for column name, e.g. ``weight=name:weight``
- **Note**: works only in case of loading data directly from file
- **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``
- ``group_column`` :raw-html:`<a id="group_column" title="Permalink to this parameter" href="#group_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``group``, ``group_id``, ``query_column``, ``query``, ``query_id``
......@@ -547,6 +551,8 @@ IO Parameters
- add a prefix ``name:`` for column name, e.g. ``query=name:query_id``
- **Note**: works only in case of loading data directly from file
- **Note**: data should be grouped by query\_id
- **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is ``query=0``
......
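The index notes for ``weight_column`` and ``group_column`` above are easy to misread, so here is a minimal C++ sketch of the arithmetic. It assumes that "doesn't count the label column" means columns located after the label shift down by one while columns before it keep their position. The documented example (label is column\_0, weight is column\_1, giving ``weight=0``) only confirms the first case. ``ParamIndexSkippingLabel`` and ``MakeColumnParam`` are hypothetical helpers written for illustration and are not part of LightGBM.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of LightGBM): converts a 0-based column
// position in the raw data file into the integer expected by weight_column
// or group_column, which does not count the label column.
// Assumption: only columns positioned after the label are shifted.
int ParamIndexSkippingLabel(int raw_column, int label_column) {
  assert(raw_column >= 0 && label_column >= 0);
  assert(raw_column != label_column);  // the label itself has no such index
  // Columns after the label shift down by one; earlier columns are unchanged.
  return raw_column > label_column ? raw_column - 1 : raw_column;
}

// Hypothetical helper: builds a "key=value" setting such as "weight=0".
std::string MakeColumnParam(const std::string& key, int raw_column,
                            int label_column) {
  return key + "=" +
         std::to_string(ParamIndexSkippingLabel(raw_column, label_column));
}
```

With the label in column\_0 and the weight in column\_1, ``MakeColumnParam("weight", 1, 0)`` yields ``weight=0``, matching the note in the documentation. Using the ``name:`` prefix form avoids this arithmetic when the file has a header.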
......@@ -439,7 +439,7 @@ struct Config {
// alias = init_score_filename, init_score_file, init_score, input_init_score
// desc = path of file with training initial scores
// desc = if ``""``, will use ``train_data_file`` + ``.init`` (if exists)
// desc = **Note**: can be used only in CLI version
// desc = **Note**: works only in case of loading data directly from file
std::string initscore_filename = "";
// alias = valid_data_init_scores, valid_init_score_file, valid_init_score
......@@ -447,7 +447,7 @@ struct Config {
// desc = path(s) of file(s) with validation initial scores
// desc = if ``""``, will use ``valid_data_file`` + ``.init`` (if exists)
// desc = separate by ``,`` for multi-validation data
// desc = **Note**: can be used only in CLI version
// desc = **Note**: works only in case of loading data directly from file
std::vector<std::string> valid_data_initscores;
// alias = is_pre_partition
......@@ -486,19 +486,17 @@ struct Config {
// alias = two_round_loading, use_two_round_loading
// desc = set this to ``true`` if data file is too big to fit in memory
// desc = by default, LightGBM will map data file to memory and load features from memory. This will provide faster data loading speed, but may cause run out of memory error when the data file is very big
// desc = **Note**: works only in case of loading data directly from file
bool two_round = false;
// alias = is_save_binary, is_save_binary_file
// desc = if ``true``, LightGBM will save the dataset (including validation data) to a binary file. This speed ups the data loading for the next time
// desc = **Note**: can be used only in CLI version; for language-specific packages you can use the correspondent function
bool save_binary = false;
// alias = load_from_binary_file, binary_load, load_binary
// desc = set this to ``true`` to enable autoloading from previous saved binary datasets
// desc = set this to ``false`` to ignore binary datasets
bool enable_load_from_binary_file = true;
// alias = has_header
// desc = set this to ``true`` if input data has header
// desc = **Note**: works only in case of loading data directly from file
bool header = false;
// type = int or string
......@@ -506,6 +504,7 @@ struct Config {
// desc = used to specify the label column
// desc = use number for index, e.g. ``label=0`` means column\_0 is the label
// desc = add a prefix ``name:`` for column name, e.g. ``label=name:is_click``
// desc = **Note**: works only in case of loading data directly from file
std::string label_column = "";
// type = int or string
......@@ -513,6 +512,7 @@ struct Config {
// desc = used to specify the weight column
// desc = use number for index, e.g. ``weight=0`` means column\_0 is the weight
// desc = add a prefix ``name:`` for column name, e.g. ``weight=name:weight``
// desc = **Note**: works only in case of loading data directly from file
// desc = **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``
std::string weight_column = "";
......@@ -521,6 +521,7 @@ struct Config {
// desc = used to specify the query/group id column
// desc = use number for index, e.g. ``query=0`` means column\_0 is the query id
// desc = add a prefix ``name:`` for column name, e.g. ``query=name:query_id``
// desc = **Note**: works only in case of loading data directly from file
// desc = **Note**: data should be grouped by query\_id
// desc = **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is ``query=0``
std::string group_column = "";
......
......@@ -108,9 +108,6 @@ std::unordered_map<std::string, std::string> Config::alias_table({
{"use_two_round_loading", "two_round"},
{"is_save_binary", "save_binary"},
{"is_save_binary_file", "save_binary"},
{"load_from_binary_file", "enable_load_from_binary_file"},
{"binary_load", "enable_load_from_binary_file"},
{"load_binary", "enable_load_from_binary_file"},
{"has_header", "header"},
{"label", "label_column"},
{"weight", "weight_column"},
......@@ -221,7 +218,6 @@ std::unordered_set<std::string> Config::parameter_set({
"zero_as_missing",
"two_round",
"save_binary",
"enable_load_from_binary_file",
"header",
"label_column",
"weight_column",
......@@ -424,8 +420,6 @@ void Config::GetMembersFromString(const std::unordered_map<std::string, std::str
GetBool(params, "save_binary", &save_binary);
GetBool(params, "enable_load_from_binary_file", &enable_load_from_binary_file);
GetBool(params, "header", &header);
GetString(params, "label_column", &label_column);
......@@ -581,7 +575,6 @@ std::string Config::SaveMembersToString() const {
str_buf << "[zero_as_missing: " << zero_as_missing << "]\n";
str_buf << "[two_round: " << two_round << "]\n";
str_buf << "[save_binary: " << save_binary << "]\n";
str_buf << "[enable_load_from_binary_file: " << enable_load_from_binary_file << "]\n";
str_buf << "[header: " << header << "]\n";
str_buf << "[label_column: " << label_column << "]\n";
str_buf << "[weight_column: " << weight_column << "]\n";
......
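The ``alias_table`` hunk above maps each alias to its canonical parameter name. As a rough illustration of why removing a parameter also means removing its alias entries, the following sketch resolves a key through such a table. It is an assumption-laden simplification, not LightGBM's actual lookup code: ``kAliasTable`` holds only a few entries visible in the diff, and ``ResolveAlias`` is a hypothetical helper.

```cpp
#include <string>
#include <unordered_map>

// Illustrative subset of an alias table, using entries visible in the diff.
// The aliases of the removed enable_load_from_binary_file are left out.
const std::unordered_map<std::string, std::string> kAliasTable = {
    {"is_save_binary", "save_binary"},
    {"is_save_binary_file", "save_binary"},
    {"has_header", "header"},
    {"label", "label_column"},
    {"weight", "weight_column"},
};

// Hypothetical helper: returns the canonical name for an alias, or the key
// itself when it is not a known alias.
std::string ResolveAlias(const std::string& key) {
  const auto it = kAliasTable.find(key);
  return it == kAliasTable.end() ? key : it->second;
}
```

In this sketch ``ResolveAlias("has_header")`` returns ``header``, while a key such as ``load_binary`` is returned unchanged because it no longer appears in the table. How the real library treats such unrecognized keys is not shown in this diff.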