Commit f9ab5f58 authored by Nikita Titov, committed by Guolin Ke

[docs] added notes about params usage when data is provided via path and removed unused param (#2024)

* added notes about params usage when data is provided via path

* fixed init score and valid init score params note

* fixed binary params description
parent c5cfe3e3
@@ -451,7 +451,7 @@ IO Parameters
  - if ``""``, will use ``train_data_file`` + ``.init`` (if exists)
- - **Note**: can be used only in CLI version
+ - **Note**: works only in case of loading data directly from file
  - ``valid_data_initscores`` :raw-html:`<a id="valid_data_initscores" title="Permalink to this parameter" href="#valid_data_initscores">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``valid_data_init_scores``, ``valid_init_score_file``, ``valid_init_score``
@@ -461,7 +461,7 @@ IO Parameters
  - separate by ``,`` for multi-validation data
- - **Note**: can be used only in CLI version
+ - **Note**: works only in case of loading data directly from file
  - ``pre_partition`` :raw-html:`<a id="pre_partition" title="Permalink to this parameter" href="#pre_partition">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_pre_partition``
@@ -507,20 +507,20 @@ IO Parameters
  - by default, LightGBM will map data file to memory and load features from memory. This will provide faster data loading speed, but may cause run out of memory error when the data file is very big
+ - **Note**: works only in case of loading data directly from file
  - ``save_binary`` :raw-html:`<a id="save_binary" title="Permalink to this parameter" href="#save_binary">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_save_binary``, ``is_save_binary_file``
  - if ``true``, LightGBM will save the dataset (including validation data) to a binary file. This speed ups the data loading for the next time
- - ``enable_load_from_binary_file`` :raw-html:`<a id="enable_load_from_binary_file" title="Permalink to this parameter" href="#enable_load_from_binary_file">&#x1F517;&#xFE0E;</a>`, default = ``true``, type = bool, aliases: ``load_from_binary_file``, ``binary_load``, ``load_binary``
- - set this to ``true`` to enable autoloading from previous saved binary datasets
- - set this to ``false`` to ignore binary datasets
+ - **Note**: can be used only in CLI version; for language-specific packages you can use the correspondent function
  - ``header`` :raw-html:`<a id="header" title="Permalink to this parameter" href="#header">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``has_header``
  - set this to ``true`` if input data has header
+ - **Note**: works only in case of loading data directly from file
  - ``label_column`` :raw-html:`<a id="label_column" title="Permalink to this parameter" href="#label_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``label``
  - used to specify the label column
@@ -529,6 +529,8 @@ IO Parameters
  - add a prefix ``name:`` for column name, e.g. ``label=name:is_click``
+ - **Note**: works only in case of loading data directly from file
  - ``weight_column`` :raw-html:`<a id="weight_column" title="Permalink to this parameter" href="#weight_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``weight``
  - used to specify the weight column
@@ -537,6 +539,8 @@ IO Parameters
  - add a prefix ``name:`` for column name, e.g. ``weight=name:weight``
+ - **Note**: works only in case of loading data directly from file
  - **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``
  - ``group_column`` :raw-html:`<a id="group_column" title="Permalink to this parameter" href="#group_column">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = int or string, aliases: ``group``, ``group_id``, ``query_column``, ``query``, ``query_id``
@@ -547,6 +551,8 @@ IO Parameters
  - add a prefix ``name:`` for column name, e.g. ``query=name:query_id``
+ - **Note**: works only in case of loading data directly from file
  - **Note**: data should be grouped by query\_id
  - **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is ``query=0``
......
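The index notes on ``weight_column`` and ``group_column`` are easy to misread, so here is a minimal sketch of the rule as the notes describe it: integer indices are 0-based and skip the label column. The helper name `ParamIndexSkippingLabel` is hypothetical and is not part of LightGBM; it only restates the documented convention.

```cpp
#include <cassert>

// Hypothetical helper (not a LightGBM function): converts a 0-based column
// position in the raw data file into the integer value the notes say to pass
// for ``weight_column`` / ``group_column``, which does not count the label.
int ParamIndexSkippingLabel(int file_column, int label_column) {
  // The label column itself cannot double as a weight or query column.
  assert(file_column != label_column);
  // Columns after the label shift down by one; columns before it keep
  // their position.
  return file_column > label_column ? file_column - 1 : file_column;
}
```

With the label in column\_0 and the weight in column\_1, `ParamIndexSkippingLabel(1, 0)` gives `0`, matching the ``weight=0`` example in the note.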
@@ -439,7 +439,7 @@ struct Config {
  // alias = init_score_filename, init_score_file, init_score, input_init_score
  // desc = path of file with training initial scores
  // desc = if ``""``, will use ``train_data_file`` + ``.init`` (if exists)
- // desc = **Note**: can be used only in CLI version
+ // desc = **Note**: works only in case of loading data directly from file
  std::string initscore_filename = "";
  // alias = valid_data_init_scores, valid_init_score_file, valid_init_score
@@ -447,7 +447,7 @@ struct Config {
  // desc = path(s) of file(s) with validation initial scores
  // desc = if ``""``, will use ``valid_data_file`` + ``.init`` (if exists)
  // desc = separate by ``,`` for multi-validation data
- // desc = **Note**: can be used only in CLI version
+ // desc = **Note**: works only in case of loading data directly from file
  std::vector<std::string> valid_data_initscores;
  // alias = is_pre_partition
@@ -486,19 +486,17 @@ struct Config {
  // alias = two_round_loading, use_two_round_loading
  // desc = set this to ``true`` if data file is too big to fit in memory
  // desc = by default, LightGBM will map data file to memory and load features from memory. This will provide faster data loading speed, but may cause run out of memory error when the data file is very big
+ // desc = **Note**: works only in case of loading data directly from file
  bool two_round = false;
  // alias = is_save_binary, is_save_binary_file
  // desc = if ``true``, LightGBM will save the dataset (including validation data) to a binary file. This speed ups the data loading for the next time
+ // desc = **Note**: can be used only in CLI version; for language-specific packages you can use the correspondent function
  bool save_binary = false;
- // alias = load_from_binary_file, binary_load, load_binary
- // desc = set this to ``true`` to enable autoloading from previous saved binary datasets
- // desc = set this to ``false`` to ignore binary datasets
- bool enable_load_from_binary_file = true;
  // alias = has_header
  // desc = set this to ``true`` if input data has header
+ // desc = **Note**: works only in case of loading data directly from file
  bool header = false;
  // type = int or string
@@ -506,6 +504,7 @@ struct Config {
  // desc = used to specify the label column
  // desc = use number for index, e.g. ``label=0`` means column\_0 is the label
  // desc = add a prefix ``name:`` for column name, e.g. ``label=name:is_click``
+ // desc = **Note**: works only in case of loading data directly from file
  std::string label_column = "";
  // type = int or string
@@ -513,6 +512,7 @@ struct Config {
  // desc = used to specify the weight column
  // desc = use number for index, e.g. ``weight=0`` means column\_0 is the weight
  // desc = add a prefix ``name:`` for column name, e.g. ``weight=name:weight``
+ // desc = **Note**: works only in case of loading data directly from file
  // desc = **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0, and weight is column\_1, the correct parameter is ``weight=0``
  std::string weight_column = "";
@@ -521,6 +521,7 @@ struct Config {
  // desc = used to specify the query/group id column
  // desc = use number for index, e.g. ``query=0`` means column\_0 is the query id
  // desc = add a prefix ``name:`` for column name, e.g. ``query=name:query_id``
+ // desc = **Note**: works only in case of loading data directly from file
  // desc = **Note**: data should be grouped by query\_id
  // desc = **Note**: index starts from ``0`` and it doesn't count the label column when passing type is ``int``, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is ``query=0``
  std::string group_column = "";
......
@@ -108,9 +108,6 @@ std::unordered_map<std::string, std::string> Config::alias_table({
  {"use_two_round_loading", "two_round"},
  {"is_save_binary", "save_binary"},
  {"is_save_binary_file", "save_binary"},
- {"load_from_binary_file", "enable_load_from_binary_file"},
- {"binary_load", "enable_load_from_binary_file"},
- {"load_binary", "enable_load_from_binary_file"},
  {"has_header", "header"},
  {"label", "label_column"},
  {"weight", "weight_column"},
@@ -221,7 +218,6 @@ std::unordered_set<std::string> Config::parameter_set({
  "zero_as_missing",
  "two_round",
  "save_binary",
- "enable_load_from_binary_file",
  "header",
  "label_column",
  "weight_column",
@@ -424,8 +420,6 @@ void Config::GetMembersFromString(const std::unordered_map<std::string, std::str
  GetBool(params, "save_binary", &save_binary);
- GetBool(params, "enable_load_from_binary_file", &enable_load_from_binary_file);
  GetBool(params, "header", &header);
  GetString(params, "label_column", &label_column);
@@ -581,7 +575,6 @@ std::string Config::SaveMembersToString() const {
  str_buf << "[zero_as_missing: " << zero_as_missing << "]\n";
  str_buf << "[two_round: " << two_round << "]\n";
  str_buf << "[save_binary: " << save_binary << "]\n";
- str_buf << "[enable_load_from_binary_file: " << enable_load_from_binary_file << "]\n";
  str_buf << "[header: " << header << "]\n";
  str_buf << "[label_column: " << label_column << "]\n";
  str_buf << "[weight_column: " << weight_column << "]\n";
......
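The alias table entries above map each alias to one canonical parameter name. As a rough illustration of what such a table does, the sketch below resolves a key through a small subset of the entries that remain after this commit. `kAliasTable` and `ResolveAlias` are hypothetical names, and returning unknown keys unchanged is an assumption made for the example; the diff does not show how LightGBM itself handles them.

```cpp
#include <string>
#include <unordered_map>

// Illustrative subset of the alias -> canonical-name pairs kept by this commit.
const std::unordered_map<std::string, std::string> kAliasTable = {
  {"use_two_round_loading", "two_round"},
  {"is_save_binary", "save_binary"},
  {"is_save_binary_file", "save_binary"},
  {"has_header", "header"},
  {"label", "label_column"},
  {"weight", "weight_column"},
};

// Hypothetical helper: returns the canonical name for a known alias and
// passes any other key through unchanged (an assumption of this sketch).
std::string ResolveAlias(const std::string& key) {
  auto it = kAliasTable.find(key);
  return it != kAliasTable.end() ? it->second : key;
}
```

In this sketch `ResolveAlias("has_header")` yields `"header"`, while a removed alias such as `"load_binary"` no longer maps to any parameter and comes back as-is.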