Unverified Commit dc699574 authored by Guolin Ke's avatar Guolin Ke Committed by GitHub

Refine config object (#1381)

* [WIP] refine config

* [wip] ready for the auto code generate

* auto generate config codes

* use with to open file

* fix bug

* fix pylint

* fix bug

* fix pylint

* fix bugs.

* tmp for failed test.

* fix tests.

* added nthreads alias

* added new aliases from new config.h

* fixed duplicated alias

* refactored parameter_generator.py

* added new aliases from config.h and removed remaining old names

* fix bugs & some miss alias

* added aliases

* add more descriptions.

* add comment.
parent 497e60ed
......@@ -32,7 +32,7 @@ Core Parameters
- **Note**: can be used only in CLI version
- ``task``, default=\ ``train``, type=enum, options=\ ``train``, ``predict``, ``convert_model``, ``refit``
- ``task``, default=\ ``train``, type=enum, options=\ ``train``, ``predict``, ``convert_model``, ``refit``, alias=\ ``task_type``
- ``train``, alias=\ ``training``, for training
......@@ -47,7 +47,7 @@ Core Parameters
- ``application``, default=\ ``regression``, type=enum,
options=\ ``regression``, ``regression_l1``, ``huber``, ``fair``, ``poisson``, ``quantile``, ``mape``, ``gamma``, ``tweedie``,
``binary``, ``multiclass``, ``multiclassova``, ``xentropy``, ``xentlambda``, ``lambdarank``,
alias=\ ``objective``, ``app``
alias=\ ``app``, ``objective``, ``objective_type``
- regression application
......@@ -107,11 +107,11 @@ Core Parameters
- ``goss``, Gradient-based One-Side Sampling
- ``data``, default=\ ``""``, type=string, alias=\ ``train``, ``train_data``
- ``data``, default=\ ``""``, type=string, alias=\ ``train``, ``train_data``, ``data_filename``
- training data, LightGBM will train from this data
- ``valid``, default=\ ``""``, type=multi-string, alias=\ ``test``, ``valid_data``, ``test_data``
- ``valid``, default=\ ``""``, type=multi-string, alias=\ ``test``, ``valid_data``, ``test_data``, ``valid_filenames``
- validation/test data, LightGBM will output metrics for these data
......@@ -137,7 +137,7 @@ Core Parameters
- number of leaves in one tree
- ``tree_learner``, default=\ ``serial``, type=enum, options=\ ``serial``, ``feature``, ``data``, ``voting``, alias=\ ``tree``
- ``tree_learner``, default=\ ``serial``, type=enum, options=\ ``serial``, ``feature``, ``data``, ``voting``, alias=\ ``tree``, ``tree_learner_type``
- ``serial``, single machine tree learner
......@@ -149,7 +149,7 @@ Core Parameters
- refer to `Parallel Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details
- ``num_threads``, default=\ ``OpenMP_default``, type=int, alias=\ ``num_thread``, ``nthread``
- ``num_threads``, default=\ ``OpenMP_default``, type=int, alias=\ ``num_thread``, ``nthread``, ``nthreads``
- number of threads for LightGBM
......@@ -204,7 +204,7 @@ Learning Control Parameters
- random seed for ``feature_fraction``
- ``bagging_fraction``, default=\ ``1.0``, type=double, ``0.0 < bagging_fraction <= 1.0``, alias=\ ``sub_row``, ``subsample``
- ``bagging_fraction``, default=\ ``1.0``, type=double, ``0.0 < bagging_fraction <= 1.0``, alias=\ ``sub_row``, ``subsample``, ``bagging``
- like ``feature_fraction``, but this will randomly select part of data without resampling
......@@ -312,7 +312,7 @@ Learning Control Parameters
- set this to a larger value for a more accurate result, but it will slow down the training speed
- ``monotone_constraint``, default=\ ``None``, type=multi-int, alias=\ ``mc``
- ``monotone_constraint``, default=\ ``None``, type=multi-int, alias=\ ``mc``, ``monotone_constraints``
- used for constraints of monotonic features
......@@ -443,7 +443,7 @@ IO Parameters
- **Note**: the negative values will be treated as **missing values**
- ``predict_raw_score``, default=\ ``false``, type=bool, alias=\ ``raw_score``, ``is_predict_raw_score``
- ``predict_raw_score``, default=\ ``false``, type=bool, alias=\ ``raw_score``, ``is_predict_raw_score``, ``predict_rawscore``
- only used in ``prediction`` task
......@@ -501,17 +501,17 @@ IO Parameters
- set to ``false`` to use ``na`` to represent missing values
- ``init_score_file``, default=\ ``""``, type=string
- ``init_score_file``, default=\ ``""``, type=string, alias=\ ``init_score_filename``, ``initscore_filename``, ``init_score``
- path to training initial score file, ``""`` will use ``train_data_file`` + ``.init`` (if exists)
- ``valid_init_score_file``, default=\ ``""``, type=multi-string
- ``valid_init_score_file``, default=\ ``""``, type=multi-string, alias=\ ``valid_data_initscores``, ``valid_data_init_scores``, ``valid_init_score``
- path to validation initial score file, ``""`` will use ``valid_data_file`` + ``.init`` (if exists)
- separate by ``,`` for multi-validation data
- ``forced_splits``, default=\ ``""``, type=string
- ``forced_splits``, default=\ ``""``, type=string, alias=\ ``forced_splits_file``, ``forcedsplits_filename``, ``forced_splits_filename``
- path to a ``.json`` file that specifies splits to force at the top of every decision tree before best-first learning commences
......@@ -593,7 +593,7 @@ Objective Parameters
Metric Parameters
-----------------
- ``metric``, default=\ ``''``, type=multi-enum
- ``metric``, default=\ ``''``, type=multi-enum, alias=\ ``metric_types``
- metric to be evaluated on the evaluation sets **in addition** to what is provided in the training arguments
......@@ -650,7 +650,7 @@ Metric Parameters
- frequency for metric output
- ``train_metric``, default=\ ``false``, type=bool, alias=\ ``training_metric``, ``is_training_metric``
- ``train_metric``, default=\ ``false``, type=bool, alias=\ ``training_metric``, ``is_training_metric``, ``is_provide_training_metric``
- set this to ``true`` if you need to output metric result of training
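With the alias additions above, a CLI config file can use either the canonical name or any alias. A hypothetical ``train.conf`` sketch (parameter values are illustrative only):

```
# hypothetical train.conf; each alias on the left resolves to a canonical
# parameter via the new alias table (app -> objective,
# nthreads -> num_threads, bagging -> bagging_fraction)
task = train
app = binary
nthreads = 4
bagging = 0.8
bagging_freq = 5
```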
......
import os


def GetParameterInfos(config_hpp):
    """Parse the parameter sections of config.h into (keys, member_infos)."""
    is_inparameter = False
    cur_key = None
    cur_info = {}
    keys = []
    member_infos = []
    with open(config_hpp) as config_hpp_file:
        for line in config_hpp_file:
            if "#pragma region Parameters" in line:
                is_inparameter = True
            elif "#pragma region" in line and "Parameters" in line:
                cur_key = line.split("region")[1].strip()
                keys.append(cur_key)
                member_infos.append([])
            elif '#pragma endregion' in line:
                if cur_key is not None:
                    cur_key = None
                elif is_inparameter:
                    is_inparameter = False
            elif cur_key is not None:
                line = line.strip()
                if line.startswith("//"):
                    tokens = line.split("//")[1].split("=")
                    key = tokens[0].strip()
                    val = '='.join(tokens[1:]).strip()
                    if key not in cur_info:
                        if key == "descl2":
                            # descl2 entries share the "desc" list; only create
                            # it if no plain desc line was seen first, so earlier
                            # entries are not wiped
                            if "desc" not in cur_info:
                                cur_info["desc"] = []
                        else:
                            cur_info[key] = []
                    if key == "desc":
                        cur_info["desc"].append(["l1", val])
                    elif key == "descl2":
                        cur_info["desc"].append(["l2", val])
                    else:
                        cur_info[key].append(val)
                elif line:
                    has_eqsgn = False
                    tokens = line.split("=")
                    if len(tokens) == 2:
                        if "default" not in cur_info:
                            cur_info["default"] = [tokens[1][:-1].strip()]
                        has_eqsgn = True
                    tokens = line.split()
                    cur_info["inner_type"] = [tokens[0].strip()]
                    if "name" not in cur_info:
                        if has_eqsgn:
                            cur_info["name"] = [tokens[1].strip()]
                        else:
                            cur_info["name"] = [tokens[1][:-1].strip()]
                    member_infos[-1].append(cur_info)
                    cur_info = {}
    return (keys, member_infos)
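The annotation convention the parser consumes can be shown with a self-contained sketch (the snippet and parsing loop here are a hypothetical simplification, not the real generator): leading `// key=value` comment lines carry metadata, and the declaration line supplies the inner type, name and default value.

```python
# Hypothetical mini-parser for the "// key=value" annotation convention.
snippet = [
    "// alias=num_thread,nthread,nthreads",
    "// desc=number of threads for LightGBM",
    "int num_threads = 0;",
]
info = {}
for line in snippet:
    if line.startswith("//"):
        # metadata line: split off the key before the first '='
        key, _, val = line[2:].partition("=")
        info.setdefault(key.strip(), []).append(val.strip())
    else:
        # declaration line: "<type> <name> = <default>;"
        tokens = line.split()
        info["inner_type"] = [tokens[0]]
        info["name"] = [tokens[1]]
        info["default"] = [line.split("=")[1][:-1].strip()]
print(info)
```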
def GetNames(infos):
    """Collect the canonical name of every parameter."""
    names = []
    for x in infos:
        for y in x:
            names.append(y["name"][0])
    return names


def GetAlias(infos):
    """Collect (alias, canonical_name) pairs from the alias annotations."""
    pairs = []
    for x in infos:
        for y in x:
            if "alias" in y:
                name = y["name"][0]
                alias = y["alias"][0].split(',')
                for name2 in alias:
                    pairs.append([name2.strip(), name])
    return pairs
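The pairs collected by GetAlias become the generated `Config::alias_table`, an alias-to-canonical-name map used to normalize user-supplied parameter keys. A hypothetical, trimmed-down Python analogue:

```python
# Trimmed-down, hypothetical stand-in for the generated Config::alias_table.
alias_table = {
    "nthreads": "num_threads",
    "sub_row": "bagging_fraction",
    "app": "objective",
}

def canonicalize(params):
    """Rewrite each key to its canonical name; unknown keys pass through."""
    return {alias_table.get(key, key): value for key, value in params.items()}

print(canonicalize({"nthreads": "8", "app": "binary", "num_leaves": "63"}))
# → {'num_threads': '8', 'objective': 'binary', 'num_leaves': '63'}
```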
def SetOneVarFromString(name, param_type, checks):
    """Emit the C++ statements that read one member from the params map."""
    ret = ""
    univar_mapper = {"int": "GetInt", "double": "GetDouble", "bool": "GetBool", "std::string": "GetString"}
    if "vector" not in param_type:
        ret += "  %s(params, \"%s\", &%s);\n" % (univar_mapper[param_type], name, name)
        if len(checks) > 0:
            for check in checks:
                ret += "  CHECK(%s %s);\n" % (name, check)
        ret += "\n"
    else:
        ret += "  if (GetString(params, \"%s\", &tmp_str)) {\n" % (name)
        inner_type = param_type.split("<")[1][:-1]
        if inner_type == "std::string":
            ret += "    %s = Common::Split(tmp_str.c_str(), ',');\n" % (name)
        else:
            ret += "    %s = Common::StringToArray<%s>(tmp_str, ',');\n" % (name, inner_type)
        ret += "  }\n\n"
    return ret
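For a scalar parameter the emitted C++ is a typed `Get*` lookup from the params map plus one `CHECK` per declared constraint. A standalone sketch (a simplification of SetOneVarFromString, not the function itself):

```python
# Simplified, hypothetical emitter for one scalar parameter.
def set_scalar_from_string(name, cpp_type, checks):
    getter = {"int": "GetInt", "double": "GetDouble",
              "bool": "GetBool", "std::string": "GetString"}[cpp_type]
    # one typed lookup, then one CHECK per constraint annotation
    lines = ['  %s(params, "%s", &%s);' % (getter, name, name)]
    lines += ["  CHECK(%s %s);" % (name, check) for check in checks]
    return "\n".join(lines)

print(set_scalar_from_string("bagging_fraction", "double", [">0.0", "<=1.0"]))
```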
def GenParameterCode(config_hpp, config_out_cpp):
    """Generate src/io/config_auto.cpp from the annotations in config.h."""
    keys, infos = GetParameterInfos(config_hpp)
    names = GetNames(infos)
    alias = GetAlias(infos)
    str_to_write = "/// This file is auto generated by LightGBM\\helper\\parameter_generator.py\n"
    str_to_write += "#include<LightGBM/config.h>\nnamespace LightGBM {\n"
    # alias table
    str_to_write += "std::unordered_map<std::string, std::string> Config::alias_table({\n"
    for pair in alias:
        str_to_write += "  {\"%s\", \"%s\"}, \n" % (pair[0], pair[1])
    str_to_write += "});\n\n"
    # names
    str_to_write += "std::unordered_set<std::string> Config::parameter_set({\n"
    for name in names:
        str_to_write += "  \"%s\", \n" % (name)
    str_to_write += "});\n\n"
    # from strings
    str_to_write += "void Config::GetMembersFromString(const std::unordered_map<std::string, std::string>& params) {\n"
    str_to_write += "  std::string tmp_str = \"\";\n"
    for x in infos:
        for y in x:
            if "[doc-only]" in y:
                continue
            param_type = y["inner_type"][0]
            name = y["name"][0]
            checks = []
            if "check" in y:
                checks = y["check"]
            tmp = SetOneVarFromString(name, param_type, checks)
            str_to_write += tmp
    # tails
    str_to_write += "}\n\n"
    str_to_write += "std::string Config::SaveMembersToString() const {\n"
    str_to_write += "  std::stringstream str_buf;\n"
    for x in infos:
        for y in x:
            if "[doc-only]" in y:
                continue
            param_type = y["inner_type"][0]
            name = y["name"][0]
            if "vector" in param_type:
                if "int8" in param_type:
                    str_to_write += "  str_buf << \"[%s: \" << Common::Join(Common::ArrayCast<int8_t, int>(%s),\",\") << \"]\\n\";\n" % (name, name)
                else:
                    str_to_write += "  str_buf << \"[%s: \" << Common::Join(%s,\",\") << \"]\\n\";\n" % (name, name)
            else:
                str_to_write += "  str_buf << \"[%s: \" << %s << \"]\\n\";\n" % (name, name)
    # tails
    str_to_write += "  return str_buf.str();\n"
    str_to_write += "}\n\n"
    str_to_write += "}\n"
    with open(config_out_cpp, "w") as config_out_cpp_file:
        config_out_cpp_file.write(str_to_write)


if __name__ == "__main__":
    config_hpp = os.path.join(os.path.pardir, 'include', 'LightGBM', 'config.h')
    config_out_cpp = os.path.join(os.path.pardir, 'src', 'io', 'config_auto.cpp')
    GenParameterCode(config_hpp, config_out_cpp)
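The vector branch of the generator relies on `Common::StringToArray` / `Common::Split` to parse comma-separated values at runtime. A Python analogue of the assumed semantics (split on `,` and convert each token):

```python
# Python analogue of Common::StringToArray<T>(tmp_str, ','); the exact
# edge-case handling of the C++ helper is assumed, not verified.
def string_to_array(tmp_str, conv=int):
    return [conv(tok) for tok in tmp_str.split(",")] if tmp_str else []

# e.g. monotone constraints passed as "mc=-1,0,1" on the command line
print(string_to_array("-1,0,1"))  # → [-1, 0, 1]
```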
......@@ -56,7 +56,7 @@ private:
void ConvertModel();
/*! \brief All configs */
OverallConfig config_;
Config config_;
/*! \brief Training data */
std::unique_ptr<Dataset> train_data_;
/*! \brief Validation data */
......@@ -73,10 +73,10 @@ private:
inline void Application::Run() {
if (config_.task_type == TaskType::kPredict || config_.task_type == TaskType::KRefitTree) {
if (config_.task == TaskType::kPredict || config_.task == TaskType::KRefitTree) {
InitPredict();
Predict();
} else if (config_.task_type == TaskType::kConvertModel) {
} else if (config_.task == TaskType::kConvertModel) {
ConvertModel();
} else {
InitTrain();
......
......@@ -32,7 +32,7 @@ public:
* \param training_metrics Training metric
*/
virtual void Init(
const BoostingConfig* config,
const Config* config,
const Dataset* train_data,
const ObjectiveFunction* objective_function,
const std::vector<const Metric*>& training_metrics) = 0;
......@@ -47,7 +47,7 @@ public:
virtual void ResetTrainingData(const Dataset* train_data, const ObjectiveFunction* objective_function,
const std::vector<const Metric*>& training_metrics) = 0;
virtual void ResetConfig(const BoostingConfig* config) = 0;
virtual void ResetConfig(const Config* config) = 0;
......
......@@ -16,27 +16,15 @@
namespace LightGBM {
const std::string kDefaultTreeLearnerType = "serial";
const std::string kDefaultDevice = "cpu";
const std::string kDefaultBoostingType = "gbdt";
const std::string kDefaultObjectiveType = "regression";
/*! \brief Types of tasks */
enum TaskType {
kTrain, kPredict, kConvertModel, KRefitTree
};
const int kDefaultNumLeaves = 31;
/*!
* \brief The interface for Config
*/
struct ConfigBase {
struct Config {
public:
/*! \brief virtual destructor */
virtual ~ConfigBase() {}
/*!
* \brief Set current config object by params
* \param params Store the key and value for params
*/
virtual void Set(
const std::unordered_map<std::string, std::string>& params) = 0;
std::string ToString() const;
/*!
* \brief Get string value by specific name of key
* \param params Store the key and value for params
......@@ -83,230 +71,627 @@ public:
static void KV2Map(std::unordered_map<std::string, std::string>& params, const char* kv);
static std::unordered_map<std::string, std::string> Str2Map(const char* parameters);
};
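The `Str2Map` / `KV2Map` helpers declared above turn a parameter string into the key-value map consumed by `GetMembersFromString`. A Python sketch of the assumed behaviour (whitespace-separated `key=value` tokens; this sketch keeps the first occurrence of a duplicated key, which is an assumption, not verified against the C++ implementation):

```python
# Hypothetical Python sketch of Config::Str2Map.
def str2map(parameters):
    params = {}
    for kv in parameters.split():
        key, sep, value = kv.partition("=")
        if sep:
            # first occurrence wins for duplicated keys (assumed behaviour)
            params.setdefault(key.strip(), value.strip())
    return params

print(str2map("task=train objective=binary num_leaves=63"))
```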
/*! \brief Types of tasks */
enum TaskType {
kTrain, kPredict, kConvertModel, KRefitTree
};
#pragma region Parameters
#pragma region Core Parameters
// [doc-only]
// alias=config_file
// desc=path of config file
// desc=**Note**: can be used only in CLI version
std::string config = "";
// [doc-only]
// type=enum
// default=train
// options=train,predict,convert_model,refit
// alias=task_type
// desc=``train``, alias=\ ``training``, for training
// desc=``predict``, alias=\ ``prediction``, ``test``, for prediction
// desc=``convert_model``, for converting model file into if-else format, see more information in `Convert model parameters <#convert-model-parameters>`__
// desc=``refit``, alias = \ ``refit_tree``, refit existing models with new data
// desc=**Note**: can be used only in CLI version
TaskType task = TaskType::kTrain;
// [doc-only]
// type=enum
// options=regression,regression_l1,huber,fair,poisson,quantile,mape,gamma,tweedie,binary,multiclass,multiclassova,xentropy,xentlambda,lambdarank
// alias=application,app,objective_type
// desc=regression application
// descl2=``regression_l2``, L2 loss, alias=\ ``regression``, ``mean_squared_error``, ``mse``, ``l2_root``, ``root_mean_squared_error``, ``rmse``
// descl2=``regression_l1``, L1 loss, alias=\ ``mean_absolute_error``, ``mae``
// descl2=``huber``, `Huber loss`_
// descl2=``fair``, `Fair loss`_
// descl2=``poisson``, `Poisson regression`_
// descl2=``quantile``, `Quantile regression`_
// descl2=``mape``, `MAPE loss`_, alias=\ ``mean_absolute_percentage_error``
// descl2=``gamma``, Gamma regression with log-link. It might be useful, e.g., for modeling insurance claims severity, or for any target that might be `gamma-distributed`_
// descl2=``tweedie``, Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any target that might be `tweedie-distributed`_
// desc=``binary``, binary `log loss`_ classification application
// desc=multi-class classification application
// descl2=``multiclass``, `softmax`_ objective function, alias=\ ``softmax``
// descl2=``multiclassova``, `One-vs-All`_ binary objective function, alias=\ ``multiclass_ova``, ``ova``, ``ovr``
// descl2=``num_class`` should be set as well
// desc=cross-entropy application
// descl2=``xentropy``, objective function for cross-entropy (with optional linear weights), alias=\ ``cross_entropy``
// descl2=``xentlambda``, alternative parameterization of cross-entropy, alias=\ ``cross_entropy_lambda``
// descl2=the label is anything in interval [0, 1]
// desc=``lambdarank``, `lambdarank`_ application
// descl2=the label should be ``int`` type in lambdarank tasks, and larger number represent the higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect)
// descl2=`label_gain <#objective-parameters>`__ can be used to set the gain(weight) of ``int`` label
// descl2=all values in ``label`` must be smaller than number of elements in ``label_gain``
std::string objective = "regression";
// [doc-only]
// type=enum
// alias=boosting_type,boost
// options=gbdt,rf,dart,goss
// desc=``gbdt``, traditional Gradient Boosting Decision Tree
// desc=``rf``, Random Forest
// desc=``dart``, `Dropouts meet Multiple Additive Regression Trees`_
// desc=``goss``, Gradient-based One-Side Sampling
std::string boosting = "gbdt";
// alias=train,train_data,data_filename
// desc=training data, LightGBM will train from this data
std::string data = "";
// alias=test,valid_data,test_data,valid_filenames
// desc=validation/test data, LightGBM will output metrics for these data
// desc=support multi validation data, separate by ``,``
std::vector<std::string> valid;
// alias=num_iteration,num_tree,num_trees,num_round,num_rounds,num_boost_round,n_estimators
// check=>=0
// desc=number of boosting iterations
// desc=**Note**: for Python/R package, **this parameter is ignored**, use num_boost_round (Python) or nrounds (R) input arguments of train and cv methods instead
// desc=**Note**: internally, LightGBM constructs num_class * num_iterations trees for multiclass problems
int num_iterations = 100;
/*! \brief Config for input and output files */
struct IOConfig: public ConfigBase {
public:
// alias=shrinkage_rate
// check=>0
// desc=shrinkage rate
// desc=in dart, it also affects the normalization weights of dropped trees
double learning_rate = 0.1;
// default=31
// alias = num_leaf
// check=>1
// desc=max number of leaves in one tree
int num_leaves = kDefaultNumLeaves;
// [doc-only]
// type=enum
// options=serial, feature, data, voting
// alias = tree, tree_learner_type
// desc=serial, single machine tree learner
// desc=feature, alias=feature_parallel, feature parallel tree learner
// desc=data, alias=data_parallel, data parallel tree learner
// desc=voting, alias=voting_parallel, voting parallel tree learner
// desc=refer to `Parallel Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details
std::string tree_learner = "serial";
// default=OpenMP_default
// alias = num_thread, nthread, nthreads
// desc = number of threads for LightGBM
// desc=for the best speed, set this to the number of **real CPU cores**,
// not the number of threads (most CPUs use `hyper-threading`_ to generate 2 threads per CPU core)
// desc=do not set it too large if your dataset is small (do not use 64 threads for a dataset with 10,000 rows for instance)
// desc=be aware a task manager or any similar CPU monitoring tool might report cores not being fully utilized. **This is normal**
// desc=for parallel learning, do not use all CPU cores, since this will cause poor performance for the network
int num_threads = 0;
// [doc-only]
// options=cpu,gpu
// desc=choose device for the tree learning, you can use GPU to achieve faster learning
// desc=**Note**: it is recommended to use a smaller max_bin (e.g. 63) to get a better speedup
// desc=**Note**: for faster speed, GPU uses 32-bit floating point to sum up by default, which may affect the accuracy for some tasks
// desc=you can set gpu_use_dp=true to enable 64-bit floating point, but it will slow down the training
// desc=**Note**: refer to `Installation Guide <./Installation-Guide.rst#build-gpu-version>`__ to build with GPU
std::string device_type = "cpu";
// [doc-only]
// alias=random_seed
// desc=this seed is used to generate other seeds, e.g. data_random_seed
// desc=it will be overridden if other seeds are set explicitly
// default=none
int seed = 0;
#pragma endregion
#pragma region Learning Control Parameters
// desc=limit the max depth for tree model. This is used to deal with over-fitting when #data is small. The tree still grows leaf-wise
// desc=< 0 means no limit
int max_depth = -1;
// alias = min_data_per_leaf, min_data, min_child_samples
// check=>=0
// desc=minimal number of data in one leaf. Can be used to deal with over-fitting
int min_data_in_leaf = 20;
// alias=min_sum_hessian_per_leaf,min_sum_hessian,min_hessian,min_child_weight
// check=>=0
// desc=minimal sum hessian in one leaf. Like min_data_in_leaf, it can be used to deal with over-fitting
double min_sum_hessian_in_leaf = 1e-3;
// alias=sub_row,subsample,bagging
// check=>0
// check=<=1.0
// desc = like feature_fraction, but this will randomly select part of data without resampling
// desc=can be used to speed up training
// desc=can be used to deal with over-fitting
// desc=**Note**: to enable bagging, bagging_freq should be set to a non-zero value as well
double bagging_fraction = 1.0;
// alias=subsample_freq
// desc=frequency for bagging, 0 means disable bagging; k means perform bagging at every k iterations
// desc=**Note**: to enable bagging, bagging_fraction should be set as well
int bagging_freq = 0;
// alias = bagging_fraction_seed
// desc = random seed for bagging
int bagging_seed = 3;
// alias = sub_feature, colsample_bytree
// check=>0
// check=<=1.0
// desc=LightGBM will randomly select part of the features on each iteration if feature_fraction is smaller than 1.0. For example, if you set it to 0.8, LightGBM will select 80% of features before training each tree
// desc=can be used to speed up training
// desc=can be used to deal with over-fitting
double feature_fraction = 1.0;
// desc=random seed for feature_fraction
int feature_fraction_seed = 2;
// alias=early_stopping_rounds,early_stopping
// desc=will stop training if one metric of one validation data doesn't improve in last early_stopping_round rounds
// desc=enable when greater than 0
int early_stopping_round = 0;
// alias=max_tree_output,max_leaf_output
// desc=used to limit the max output of tree leaves
// desc=when <= 0, there is no constraint
// desc=the final max output of leaves is learning_rate * max_delta_step
double max_delta_step = 0.0;
// alias=reg_alpha
// check=>=0
// desc=L1 regularization
double lambda_l1 = 0.0;
// alias = reg_lambda
// check=>=0
// desc = L2 regularization
double lambda_l2 = 0.0;
// alias=min_split_gain
// desc=the minimal gain to perform split
double min_gain_to_split = 0.0;
// check=>=0
// check=<=1.0
// desc=only used in dart
double drop_rate = 0.1;
// desc=only used in dart, max number of dropped trees on one iteration
// desc=<=0 means no limit
int max_drop = 50;
// check=>=0
// check=<=1.0
// desc=only used in dart, probability of skipping drop
double skip_drop = 0.5;
// desc=only used in dart, set this to true if you want to use xgboost dart mode
bool xgboost_dart_mode = false;
// desc=only used in dart, set this to true if you want to use uniform drop
bool uniform_drop = false;
// desc=only used in dart, random seed to choose dropping models
int drop_seed = 4;
// check=>=0
// check=<=1.0
// desc=only used in goss, the retain ratio of large gradient data
double top_rate = 0.2;
// check=>=0
// check=<=1.0
// desc=only used in goss, the retain ratio of small gradient data
double other_rate = 0.1;
// check=>0
// desc=min number of data per categorical group
int min_data_per_group = 100;
// check=>0
// desc=used for the categorical features
// desc=limit the max threshold points in categorical features
int max_cat_threshold = 32;
// check=>=0
// desc=L2 regularization in categorical split
double cat_l2 = 10;
// check=>=0
// desc=used for the categorical features
// desc=this can reduce the effect of noise in categorical features, especially for categories with few data
double cat_smooth = 10;
// check=>0
// desc=when the number of categories of one feature is smaller than or equal to max_cat_to_onehot, the one-vs-other split algorithm will be used
int max_cat_to_onehot = 4;
// alias = topk
// desc=used in `Voting parallel <./Parallel-Learning-Guide.rst#choose-appropriate-parallel-algorithm>`__
// desc=set this to a larger value for a more accurate result, but it will slow down the training speed
int top_k = 20;
// type = multi-int
// alias = mc,monotone_constraint
// default=none
// desc=used for constraints of monotonic features
// desc=1 means increasing, -1 means decreasing, 0 means no constraint
// desc=you need to specify all features in order. For example, mc=-1,0,1 means decreasing for the 1st feature, no constraint for the 2nd feature and increasing for the 3rd feature
std::vector<int8_t> monotone_constraints;
// alias=forced_splits_filename,forced_splits_file,forced_splits
// desc=path to a .json file that specifies splits to force at the top of every decision tree before best-first learning commences
// desc=the .json file can be arbitrarily nested, and each split contains feature, threshold fields, as well as left and right fields representing subsplits. Categorical splits are forced in a one-hot fashion, with left representing the split containing the feature value and right representing other values
// desc=see `this file <https://github.com/Microsoft/LightGBM/tree/master/examples/binary_classification/forced_splits.json>`__ as an example
std::string forcedsplits_filename = "";
#pragma endregion
#pragma region IO Parameters
// check=>1
// desc=max number of bins that feature values will be bucketed in.
// desc=a small number of bins may reduce training accuracy but may increase generalization power (deal with over-fitting)
// desc=LightGBM will auto compress memory according to max_bin.
// desc=For example, LightGBM will use uint8_t for feature value if max_bin = 255
int max_bin = 255;
int num_class = 1;
// check=>0
// desc=minimal number of data inside one bin, use this to avoid one-data-one-bin (which may cause over-fitting)
int min_data_in_bin = 3;
// desc=random seed for data partition in parallel learning (not include feature parallel)
int data_random_seed = 1;
std::string data_filename = "";
std::string initscore_filename = "";
std::vector<std::string> valid_data_filenames;
std::vector<std::string> valid_data_initscores;
int snapshot_freq = -1;
// alias=model_output,model_out
// desc=file name of output model in training
std::string output_model = "LightGBM_model.txt";
std::string output_result = "LightGBM_predict_result.txt";
std::string convert_model = "gbdt_prediction.cpp";
// alias = model_input, model_in
// desc=file name of input model
// desc=for prediction task, this model will be used for prediction data
// desc=for train task, training will be continued from this model
std::string input_model = "";
int verbosity = 1;
int num_iteration_predict = -1;
bool is_pre_partition = false;
// alias=predict_result,prediction_result
// desc=file name of prediction result in prediction task
std::string output_result = "LightGBM_predict_result.txt";
// alias = is_pre_partition
// desc=used for parallel learning (not include feature parallel)
// desc=true if training data are pre-partitioned, and different machines use different partitions
bool pre_partition = false;
// alias = is_sparse, enable_sparse
// desc=used to enable/disable sparse optimization. Set to false to disable sparse optimization
bool is_enable_sparse = true;
/*! \brief The threshold of zero elements percentage for treating a feature as a sparse feature.
* Default is 0.8, where a feature is treated as a sparse feature when there are over 80% zeros.
* When setting to 1.0, all features are processed as dense features.
*/
// check=>0
// check=<=1
// desc=the threshold of zero elements percentage for treating a feature as a sparse feature
double sparse_threshold = 0.8;
bool use_two_round_loading = false;
bool is_save_binary_file = false;
bool enable_load_from_binary_file = true;
int bin_construct_sample_cnt = 200000;
bool is_predict_leaf_index = false;
bool is_predict_contrib = false;
bool is_predict_raw_score = false;
int min_data_in_leaf = 20;
int min_data_in_bin = 3;
double max_conflict_rate = 0.0;
bool enable_bundle = true;
bool has_header = false;
std::vector<int8_t> monotone_constraints;
/*! \brief Index or column name of label, default is the first column
* And add an prefix "name:" while using column name */
// alias=two_round_loading,use_two_round_loading
// desc=by default, LightGBM will map the data file to memory and load features from memory
// desc=this will provide faster data loading speed, but it may run out of memory when the data file is very big
// desc=set this to true if the data file is too big to fit in memory
bool two_round = false;
// alias = is_save_binary, is_save_binary_file
// desc=if true, LightGBM will save the dataset (including validation data) to a binary file
// desc=this speeds up data loading for the next time
bool save_binary = false;
// alias=verbose
// desc=<0 = Fatal, =0 = Error (Warning), >0 = Info
int verbosity = 1;
// alias = has_header
// desc=set this to true if input data has header
bool header = false;
// alias=label
// desc=specify the label column
// desc=use number for index, e.g. label=0 means column\_0 is the label
// desc=add a prefix name: for column name, e.g. label=name:is_click
std::string label_column = "";
/*! \brief Index or column name of weight, < 0 means not used
* And add an prefix "name:" while using column name
* Note: when using Index, it doesn't count the label index */
// alias=weight
// desc=specify the weight column
// desc=use number for index, e.g. weight=0 means column\_0 is the weight
// desc=add a prefix name: for column name, e.g. weight=name:weight
// desc=**Note**: index starts from 0. And it doesn't count the label column when the passing type is Index, e.g. when label is column\_0, and weight is column\_1, the correct parameter is weight=0
std::string weight_column = "";
/*! \brief Index or column name of group/query id, < 0 means not used
* And add an prefix "name:" while using column name
* Note: when using Index, it doesn't count the label index */
// alias = query_column, group, query
// desc=specify the query/group id column
// desc=use number for index, e.g. query=0 means column\_0 is the query id
// desc=add a prefix name: for column name, e.g. query=name:query_id
// desc=**Note**: data should be grouped by query\_id. Index starts from 0. And it doesn't count the label column when the passing type is Index, e.g. when label is column\_0 and query\_id is column\_1, the correct parameter is query=0
std::string group_column = "";
/*! \brief ignored features, separate by ','
* And add an prefix "name:" while using column name
* Note: when using Index, it doesn't count the label index */
// alias = ignore_feature, blacklist
// desc=specify some ignoring columns in training
// desc=use number for index, e.g. ignore_column=0,1,2 means column\_0, column\_1 and column\_2 will be ignored
// desc=add a prefix name: for column name, e.g. ignore_column=name:c1,c2,c3 means c1, c2 and c3 will be ignored
// desc=**Note**: works only in case of loading data directly from file
// desc=**Note**: index starts from 0. And it doesn't count the label column
std::string ignore_column = "";
/*! \brief specific categorical columns, Note:only support for integer type categorical
* And add an prefix "name:" while using column name
* Note: when using Index, it doesn't count the label index */
std::string categorical_column = "";
std::string device_type = kDefaultDevice;
/*! \brief Set to true if want to use early stop for the prediction */
// alias=categorical_column,cat_feature,cat_column
// desc=specify categorical features
// desc=use number for index, e.g. categorical_feature=0,1,2 means column\_0, column\_1 and column\_2 are categorical features
// desc=add a prefix name: for column name, e.g. categorical_feature=name:c1,c2,c3 means c1, c2 and c3 are categorical features
// desc=**Note**: only supports categorical with int type. Index starts from 0. And it doesn't count the label column
// desc=**Note**: the negative values will be treated as **missing values**
std::string categorical_feature = "";
// alias=raw_score,is_predict_raw_score,predict_rawscore
// desc=only used in prediction task
// desc=set to true to predict only the raw scores
// desc=set to false to predict transformed scores
bool predict_raw_score = false;
// alias=leaf_index,is_predict_leaf_index
// desc=only used in prediction task
// desc=set to true to predict with leaf index of all trees
bool predict_leaf_index = false;
// alias=contrib,is_predict_contrib
// desc=only used in prediction task
// desc=set to true to estimate `SHAP values`_, which represent how each feature contributes to each prediction
// desc=produces number of features + 1 values where the last value is the expected value of the model output over the training data
bool predict_contrib = false;
// desc=only used in prediction task
// desc=use to specify how many trained iterations will be used in prediction
// desc=<= 0 means no limit
int num_iteration_predict = -1;
// desc=if true will use early-stopping to speed up the prediction. May affect the accuracy
bool pred_early_stop = false;
/*! \brief Frequency of checking the pred_early_stop */
// desc=the frequency of checking early-stopping prediction
int pred_early_stop_freq = 10;
/*! \brief Threshold of margin of pred_early_stop */
// desc=the threshold of margin in early-stopping prediction
double pred_early_stop_margin = 10.0;
bool zero_as_missing = false;
// alias=subsample_for_bin
// check=>0
// desc=number of data points sampled to construct histogram bins
// desc=setting this larger will give a better training result,but will increase data loading time
// desc=set this to a larger value if the data is very sparse
int bin_construct_sample_cnt = 200000;
// desc=set to false to disable the special handle of missing value
bool use_missing = true;
LIGHTGBM_EXPORT void Set(const std::unordered_map<std::string, std::string>& params) override;
};
/*! \brief Config for objective function */
struct ObjectiveConfig: public ConfigBase {
public:
virtual ~ObjectiveConfig() {}
// desc=set to true to treat all zeros as missing values (including the unshown values in libsvm/sparse matrices)
// desc=set to false to use na to represent missing values
bool zero_as_missing = false;
// alias=init_score_filename,init_score_file,init_score
// desc=path to training initial score file, "" will use train_data_file + .init (if exists)
std::string initscore_filename = "";
// alias=valid_data_init_scores,valid_init_score_file,valid_init_score
// desc=path to validation initial score file,"" will use valid_data_file + .init (if exists)
// desc=separated by , for multiple validation data
std::vector<std::string> valid_data_initscores;
// desc=max cache size (unit: MB) for historical histogram. < 0 means no limit
double histogram_pool_size = -1.0;
// desc=set to true to enable auto loading from previous saved binary datasets
// desc=set to false will ignore the binary datasets
bool enable_load_from_binary_file = true;
// desc=set to false to disable Exclusive Feature Bundling (EFB), which is described in LightGBM NIPS2017 paper
// desc=disabling this may cause slow training speed for sparse datasets
bool enable_bundle = true;
// check=>=0
// check=<1
// desc=max conflict rate for bundles in EFB
// desc=setting this to zero disallows conflicts and provides more accurate results
// desc=training may be faster if set to a larger value
double max_conflict_rate = 0.0;
// desc=frequency of saving model file snapshot
// desc=set to a positive number to enable this function
// desc=for example, the model file will be snapshotted at each iteration if set to 1
int snapshot_freq = -1;
// desc=only cpp is supported yet
// desc=if convert_model_language is set when task is set to train,the model will also be converted
std::string convert_model_language = "";
// desc=output file name of converted model
std::string convert_model = "gbdt_prediction.cpp";
#pragma endregion
#pragma region Objective Parameters
// alias=num_classes
// desc=need to specify this in multi-class classification
int num_class = 1;
// check=>0
// desc=parameter for sigmoid function. Will be used in binary and multiclassova classification and in lambdarank
double sigmoid = 1.0;
// desc=parameter for `Huber loss`_ and `Quantile regression`_. Will be used in regression task
double alpha = 0.9;
// desc=parameter for `Fair loss`_. Will be used in regression task
double fair_c = 1.0;
// desc=parameter for `Poisson regression`_ to safeguard optimization
double poisson_max_delta_step = 0.7;
// for lambdarank
std::vector<double> label_gain;
// for lambdarank
int max_position = 20;
// for binary
// desc=only used in regression task
// desc=adjust initial score to the mean of labels for faster convergence
bool boost_from_average = true;
// alias=unbalanced_sets
// desc=used in binary classification
// desc=set this to true if training data are unbalanced
bool is_unbalance = false;
// for multiclass
int num_class = 1;
// Balancing of positive and negative weights
// check=>0
// desc=weight of positive class in binary classification task
double scale_pos_weight = 1.0;
// set to true to fit on sqrt(label)
// desc=only used in regression, usually works better for labels with a large range
// desc=will fit sqrt(label) instead, and the prediction result will be automatically converted back via pow2(prediction)
bool reg_sqrt = false;
double alpha = 0.9;
double tweedie_variance_power = 1.5;
LIGHTGBM_EXPORT void Set(const std::unordered_map<std::string, std::string>& params) override;
};
/*! \brief Config for metrics interface*/
struct MetricConfig: public ConfigBase {
public:
virtual ~MetricConfig() {}
int num_class = 1;
double sigmoid = 1.0;
double fair_c = 1.0;
double alpha = 0.9;
// desc=only used in tweedie regression
// desc=controls the variance of the tweedie distribution
// desc=set closer to 2 to shift towards a gamma distribution
// desc=set closer to 1 to shift towards a poisson distribution
double tweedie_variance_power = 1.5;
std::vector<double> label_gain;
std::vector<int> eval_at;
LIGHTGBM_EXPORT void Set(const std::unordered_map<std::string, std::string>& params) override;
};
// default=0,1,3,7,15,31,63,...,2^30-1
// desc=used in lambdarank
// desc=relevance gain for labels. For example,the gain of label 2 is 3 when using the default label gains
// desc=separate by ,
std::vector<double> label_gain;
/*! \brief Config for tree model */
struct TreeConfig: public ConfigBase {
public:
int min_data_in_leaf = 20;
double min_sum_hessian_in_leaf = 1e-3;
double max_delta_step = 0.0;
double lambda_l1 = 0.0;
double lambda_l2 = 0.0;
double min_gain_to_split = 0.0;
// should > 1
int num_leaves = kDefaultNumLeaves;
int feature_fraction_seed = 2;
double feature_fraction = 1.0;
// max cache size (unit: MB) for historical histogram. < 0 means no limit
double histogram_pool_size = -1.0;
// max depth of the tree model.
// The tree still grows leaf-wise, but the max depth is limited to avoid over-fitting.
// The max number of leaves will be min(num_leaves, pow(2, max_depth)).
// max_depth < 0 means no limit
int max_depth = -1;
int top_k = 20;
/*! \brief OpenCL platform ID. Usually each GPU vendor exposes one OpenCL platform.
* Default value is -1, using the system-wide default platform
*/
int gpu_platform_id = -1;
/*! \brief OpenCL device ID in the specified platform. Each GPU in the selected platform has a
* unique device ID. Default value is -1, using the default device in the selected platform
*/
int gpu_device_id = -1;
/*! \brief Set to true to use double precision math on GPU (default using single precision) */
bool gpu_use_dp = false;
int min_data_per_group = 100;
int max_cat_threshold = 32;
double cat_l2 = 10;
double cat_smooth = 10;
int max_cat_to_onehot = 4;
LIGHTGBM_EXPORT void Set(const std::unordered_map<std::string, std::string>& params) override;
};
// check=>0
// desc=used in lambdarank
// desc=will optimize `NDCG`_ at this position
int max_position = 20;
/*! \brief Config for Boosting */
struct BoostingConfig: public ConfigBase {
public:
virtual ~BoostingConfig() {}
int output_freq = 1;
#pragma endregion
#pragma region Metric Parameters
// [doc-only]
// alias=metric_types
// default=''
// type=multi-enum
// desc=metric to be evaluated on the evaluation sets **in addition** to what is provided in the training arguments
// descl2='' (empty string or not specified),metric corresponding to specified objective will be used (this is possible only for pre-defined objective functions, otherwise no evaluation metric will be added)
// descl2='None' (string,**not** a None value),no metric registered,alias=na
// descl2=l1,absolute loss,alias=mean_absolute_error,mae,regression_l1
// descl2=l2,square loss,alias=mean_squared_error,mse,regression_l2,regression
// descl2=l2_root,root square loss,alias=root_mean_squared_error,rmse
// descl2=quantile,`Quantile regression`_
// descl2=mape,`MAPE loss`_,alias=mean_absolute_percentage_error
// descl2=huber,`Huber loss`_
// descl2=fair,`Fair loss`_
// descl2=poisson,negative log-likelihood for `Poisson regression`_
// descl2=gamma,negative log-likelihood for Gamma regression
// descl2=gamma_deviance,residual deviance for Gamma regression
// descl2=tweedie,negative log-likelihood for Tweedie regression
// descl2=ndcg,`NDCG`_
// descl2=map,`MAP`_,alias=mean_average_precision
// descl2=auc,`AUC`_
// descl2=binary_logloss,`log loss`_,alias=binary
// descl2=binary_error,for one sample: 0 for correct classification,1 for error classification
// descl2=multi_logloss,log loss for multi-class classification,alias=multiclass,softmax,multiclassova,multiclass_ova,ova,ovr
// descl2=multi_error,error rate for multi-class classification
// descl2=xentropy,cross-entropy (with optional linear weights),alias=cross_entropy
// descl2=xentlambda,"intensity-weighted" cross-entropy,alias=cross_entropy_lambda
// descl2=kldiv,`Kullback-Leibler divergence`_,alias=kullback_leibler
// desc=support multiple metrics,separated by ,
std::vector<std::string> metric;
// check=>0
// alias=output_freq
// desc=frequency for metric output
int metric_freq = 1;
// alias=training_metric,is_training_metric,train_metric
// desc=set this to true if you need to output metric result over training dataset
bool is_provide_training_metric = false;
int num_iterations = 100;
double learning_rate = 0.1;
double bagging_fraction = 1.0;
int bagging_seed = 3;
int bagging_freq = 0;
int early_stopping_round = 0;
int num_class = 1;
double drop_rate = 0.1;
int max_drop = 50;
double skip_drop = 0.5;
bool xgboost_dart_mode = false;
bool uniform_drop = false;
int drop_seed = 4;
double top_rate = 0.2;
double other_rate = 0.1;
// only used in regression. Will boost from the average of labels.
bool boost_from_average = true;
std::string tree_learner_type = kDefaultTreeLearnerType;
std::string device_type = kDefaultDevice;
TreeConfig tree_config;
LIGHTGBM_EXPORT void Set(const std::unordered_map<std::string, std::string>& params) override;
/* filename of forced splits */
std::string forcedsplits_filename = "";
};
// default=1,2,3,4,5
// alias=ndcg_eval_at,ndcg_at
// desc=`NDCG`_ evaluation positions,separated by ,
std::vector<int> eval_at;
/*! \brief Config for Network */
struct NetworkConfig: public ConfigBase {
public:
#pragma endregion
#pragma region Network Parameters
// alias=num_machine
// desc=used for parallel learning,the number of machines for parallel learning application
// desc=need to set this in both socket and mpi versions
int num_machines = 1;
// alias=local_port
// desc=TCP listen port for local machines
// desc=you should allow this port in firewall settings before training
int local_listen_port = 12400;
// desc=socket time-out in minutes
int time_out = 120; // in minutes
// alias=mlist
// desc=file that lists machines for this parallel learning application
// desc=each line contains one IP and one port for one machine. The format is ip port, separated by space
std::string machine_list_filename = "";
// alias=workers,nodes
// desc=list of machines, format: ip1:port1,ip2:port2
std::string machines = "";
LIGHTGBM_EXPORT void Set(const std::unordered_map<std::string, std::string>& params) override;
};
#pragma endregion
#pragma region GPU Parameters
// desc=OpenCL platform ID. Usually each GPU vendor exposes one OpenCL platform
// desc=default value is -1, meaning the system-wide default platform
int gpu_platform_id = -1;
// desc=OpenCL device ID in the specified platform. Each GPU in the selected platform has a unique device ID
// desc=default value is -1, meaning the default device in the selected platform
int gpu_device_id = -1;
// desc=set to true to use double precision math on GPU (default using single precision)
bool gpu_use_dp = false;
#pragma endregion
#pragma endregion
/*! \brief Overall config, all configs will put on this class */
struct OverallConfig: public ConfigBase {
public:
TaskType task_type = TaskType::kTrain;
NetworkConfig network_config;
int seed = 0;
int num_threads = 0;
bool is_parallel = false;
bool is_parallel_find_bin = false;
IOConfig io_config;
std::string boosting_type = kDefaultBoostingType;
BoostingConfig boosting_config;
std::string objective_type = kDefaultObjectiveType;
ObjectiveConfig objective_config;
std::vector<std::string> metric_types;
MetricConfig metric_config;
std::string convert_model_language = "";
LIGHTGBM_EXPORT void Set(const std::unordered_map<std::string, std::string>& params) override;
LIGHTGBM_EXPORT void Set(const std::unordered_map<std::string, std::string>& params);
static std::unordered_map<std::string, std::string> alias_table;
static std::unordered_set<std::string> parameter_set;
private:
void CheckParamConflict();
void GetMembersFromString(const std::unordered_map<std::string, std::string>& params);
std::string SaveMembersToString() const;
};
inline bool ConfigBase::GetString(
inline bool Config::GetString(
const std::unordered_map<std::string, std::string>& params,
const std::string& name, std::string* out) {
if (params.count(name) > 0) {
......@@ -316,33 +701,33 @@ inline bool ConfigBase::GetString(
return false;
}
inline bool ConfigBase::GetInt(
inline bool Config::GetInt(
const std::unordered_map<std::string, std::string>& params,
const std::string& name, int* out) {
if (params.count(name) > 0) {
if (!Common::AtoiAndCheck(params.at(name).c_str(), out)) {
Log::Fatal("Parameter %s should be of type int, got \"%s\"",
name.c_str(), params.at(name).c_str());
name.c_str(), params.at(name).c_str());
}
return true;
}
return false;
}
inline bool ConfigBase::GetDouble(
inline bool Config::GetDouble(
const std::unordered_map<std::string, std::string>& params,
const std::string& name, double* out) {
if (params.count(name) > 0) {
if (!Common::AtofAndCheck(params.at(name).c_str(), out)) {
Log::Fatal("Parameter %s should be of type double, got \"%s\"",
name.c_str(), params.at(name).c_str());
name.c_str(), params.at(name).c_str());
}
return true;
}
return false;
}
inline bool ConfigBase::GetBool(
inline bool Config::GetBool(
const std::unordered_map<std::string, std::string>& params,
const std::string& name, bool* out) {
if (params.count(name) > 0) {
......@@ -354,7 +739,7 @@ inline bool ConfigBase::GetBool(
*out = true;
} else {
Log::Fatal("Parameter %s should be \"true\"/\"+\" or \"false\"/\"-\", got \"%s\"",
name.c_str(), params.at(name).c_str());
name.c_str(), params.at(name).c_str());
}
return true;
}
......@@ -363,154 +748,28 @@ inline bool ConfigBase::GetBool(
struct ParameterAlias {
static void KeyAliasTransform(std::unordered_map<std::string, std::string>* params) {
const std::unordered_map<std::string, std::string> alias_table(
{
{ "config", "config_file" },
{ "nthread", "num_threads" },
{ "num_thread", "num_threads" },
{ "random_seed", "seed" },
{ "boosting", "boosting_type" },
{ "boost", "boosting_type" },
{ "application", "objective" },
{ "app", "objective" },
{ "train_data", "data" },
{ "train", "data" },
{ "model_output", "output_model" },
{ "model_out", "output_model" },
{ "model_input", "input_model" },
{ "model_in", "input_model" },
{ "predict_result", "output_result" },
{ "prediction_result", "output_result" },
{ "valid", "valid_data" },
{ "test_data", "valid_data" },
{ "test", "valid_data" },
{ "is_sparse", "is_enable_sparse" },
{ "enable_sparse", "is_enable_sparse" },
{ "pre_partition", "is_pre_partition" },
{ "training_metric", "is_training_metric" },
{ "train_metric", "is_training_metric" },
{ "ndcg_at", "ndcg_eval_at" },
{ "eval_at", "ndcg_eval_at" },
{ "min_data_per_leaf", "min_data_in_leaf" },
{ "min_data", "min_data_in_leaf" },
{ "min_child_samples", "min_data_in_leaf" },
{ "min_sum_hessian_per_leaf", "min_sum_hessian_in_leaf" },
{ "min_sum_hessian", "min_sum_hessian_in_leaf" },
{ "min_hessian", "min_sum_hessian_in_leaf" },
{ "min_child_weight", "min_sum_hessian_in_leaf" },
{ "num_leaf", "num_leaves" },
{ "sub_feature", "feature_fraction" },
{ "colsample_bytree", "feature_fraction" },
{ "num_iteration", "num_iterations" },
{ "num_tree", "num_iterations" },
{ "num_round", "num_iterations" },
{ "num_trees", "num_iterations" },
{ "num_rounds", "num_iterations" },
{ "num_boost_round", "num_iterations" },
{ "n_estimators", "num_iterations"},
{ "sub_row", "bagging_fraction" },
{ "subsample", "bagging_fraction" },
{ "subsample_freq", "bagging_freq" },
{ "shrinkage_rate", "learning_rate" },
{ "tree", "tree_learner" },
{ "num_machine", "num_machines" },
{ "local_port", "local_listen_port" },
{ "two_round_loading", "use_two_round_loading"},
{ "two_round", "use_two_round_loading" },
{ "mlist", "machine_list_file" },
{ "is_save_binary", "is_save_binary_file" },
{ "save_binary", "is_save_binary_file" },
{ "early_stopping_rounds", "early_stopping_round"},
{ "early_stopping", "early_stopping_round"},
{ "verbosity", "verbose" },
{ "header", "has_header" },
{ "label", "label_column" },
{ "weight", "weight_column" },
{ "group", "group_column" },
{ "query", "group_column" },
{ "query_column", "group_column" },
{ "ignore_feature", "ignore_column" },
{ "blacklist", "ignore_column" },
{ "categorical_feature", "categorical_column" },
{ "cat_column", "categorical_column" },
{ "cat_feature", "categorical_column" },
{ "predict_raw_score", "is_predict_raw_score" },
{ "raw_score", "is_predict_raw_score" },
{ "leaf_index", "is_predict_leaf_index" },
{ "predict_leaf_index", "is_predict_leaf_index" },
{ "contrib", "is_predict_contrib" },
{ "predict_contrib", "is_predict_contrib" },
{ "min_split_gain", "min_gain_to_split" },
{ "topk", "top_k" },
{ "reg_alpha", "lambda_l1" },
{ "reg_lambda", "lambda_l2" },
{ "num_classes", "num_class" },
{ "unbalanced_sets", "is_unbalance" },
{ "bagging_fraction_seed", "bagging_seed" },
{ "workers", "machines" },
{ "nodes", "machines" },
{ "subsample_for_bin", "bin_construct_sample_cnt" },
{ "metric_freq", "output_freq" },
{ "mc", "monotone_constraints" },
{ "max_tree_output", "max_delta_step" },
{ "max_leaf_output", "max_delta_step" }
});
const std::unordered_set<std::string> parameter_set({
"config", "config_file", "task", "device",
"num_threads", "seed", "boosting_type", "objective", "data",
"output_model", "input_model", "output_result", "valid_data",
"is_enable_sparse", "is_pre_partition", "is_training_metric",
"ndcg_eval_at", "min_data_in_leaf", "min_sum_hessian_in_leaf",
"num_leaves", "feature_fraction", "num_iterations",
"bagging_fraction", "bagging_freq", "learning_rate", "tree_learner",
"num_machines", "local_listen_port", "use_two_round_loading",
"machine_list_file", "is_save_binary_file", "early_stopping_round",
"verbose", "has_header", "label_column", "weight_column", "group_column",
"ignore_column", "categorical_column", "is_predict_raw_score",
"is_predict_leaf_index", "min_gain_to_split", "top_k",
"lambda_l1", "lambda_l2", "num_class", "is_unbalance",
"max_depth", "max_bin", "bagging_seed",
"drop_rate", "skip_drop", "max_drop", "uniform_drop",
"xgboost_dart_mode", "drop_seed", "top_rate", "other_rate",
"min_data_in_bin", "data_random_seed", "bin_construct_sample_cnt",
"num_iteration_predict", "pred_early_stop", "pred_early_stop_freq",
"pred_early_stop_margin", "use_missing", "sigmoid",
"fair_c", "scale_pos_weight",
"boost_from_average", "max_position", "label_gain",
"metric", "output_freq", "time_out",
"gpu_platform_id", "gpu_device_id", "gpu_use_dp",
"convert_model", "convert_model_language",
"feature_fraction_seed", "enable_bundle", "data_filename", "valid_data_filenames",
"snapshot_freq", "verbosity", "sparse_threshold", "enable_load_from_binary_file",
"max_conflict_rate", "poisson_max_delta_step",
"histogram_pool_size", "is_provide_training_metric", "machine_list_filename", "machines",
"zero_as_missing", "init_score_file", "valid_init_score_file", "is_predict_contrib",
"max_cat_threshold", "cat_smooth", "min_data_per_group", "cat_l2", "max_cat_to_onehot",
"alpha", "reg_sqrt", "tweedie_variance_power", "monotone_constraints", "max_delta_step",
"forced_splits"
});
std::unordered_map<std::string, std::string> tmp_map;
for (const auto& pair : *params) {
auto alias = alias_table.find(pair.first);
if (alias != alias_table.end()) { // found alias
auto alias_set = tmp_map.find(alias->second);
auto alias = Config::alias_table.find(pair.first);
if (alias != Config::alias_table.end()) { // found alias
auto alias_set = tmp_map.find(alias->second);
if (alias_set != tmp_map.end()) { // alias already set
// set priority by length & alphabetically to ensure reproducible behavior
// set priority by length & alphabetically to ensure reproducible behavior
if (alias_set->second.size() < pair.first.size() ||
(alias_set->second.size() == pair.first.size() && alias_set->second < pair.first)) {
Log::Warning("%s is set with %s=%s, %s=%s will be ignored. Current value: %s=%s",
alias->second.c_str(), alias_set->second.c_str(), params->at(alias_set->second).c_str(),
pair.first.c_str(), pair.second.c_str(), alias->second.c_str(), params->at(alias_set->second).c_str());
alias->second.c_str(), alias_set->second.c_str(), params->at(alias_set->second).c_str(),
pair.first.c_str(), pair.second.c_str(), alias->second.c_str(), params->at(alias_set->second).c_str());
} else {
Log::Warning("%s is set with %s=%s, will be overridden by %s=%s. Current value: %s=%s",
alias->second.c_str(), alias_set->second.c_str(), params->at(alias_set->second).c_str(),
pair.first.c_str(), pair.second.c_str(), alias->second.c_str(), pair.second.c_str());
alias->second.c_str(), alias_set->second.c_str(), params->at(alias_set->second).c_str(),
pair.first.c_str(), pair.second.c_str(), alias->second.c_str(), pair.second.c_str());
tmp_map[alias->second] = pair.first;
}
} else { // alias not set
tmp_map.emplace(alias->second, pair.first);
}
} else if (parameter_set.find(pair.first) == parameter_set.end()) {
} else if (Config::parameter_set.find(pair.first) == Config::parameter_set.end()) {
Log::Warning("Unknown parameter: %s", pair.first.c_str());
}
}
......@@ -520,9 +779,9 @@ struct ParameterAlias {
params->emplace(pair.first, params->at(pair.second));
params->erase(pair.second);
} else {
Log::Warning("%s is set=%s, %s=%s will be ignored. Current value: %s=%s",
pair.first.c_str(), alias->second.c_str(), pair.second.c_str(), params->at(pair.second).c_str(),
pair.first.c_str(), alias->second.c_str());
Log::Warning("%s is set=%s, %s=%s will be ignored. Current value: %s=%s",
pair.first.c_str(), alias->second.c_str(), pair.second.c_str(), params->at(pair.second).c_str(),
pair.first.c_str(), alias->second.c_str());
}
}
}
......
......@@ -273,7 +273,7 @@ public:
* \param label_idx index of label column
* \return Object of parser
*/
static Parser* CreateParser(const char* filename, bool has_header, int num_features, int label_idx);
static Parser* CreateParser(const char* filename, bool header, int num_features, int label_idx);
};
/*! \brief The main class of data set,
......@@ -292,7 +292,7 @@ public:
int** sample_non_zero_indices,
const int* num_per_col,
size_t total_sample_cnt,
const IOConfig& io_config);
const Config& io_config);
/*! \brief Destructor */
LIGHTGBM_EXPORT ~Dataset();
......
......@@ -8,7 +8,7 @@ namespace LightGBM {
class DatasetLoader {
public:
LIGHTGBM_EXPORT DatasetLoader(const IOConfig& io_config, const PredictFunction& predict_fun, int num_class, const char* filename);
LIGHTGBM_EXPORT DatasetLoader(const Config& io_config, const PredictFunction& predict_fun, int num_class, const char* filename);
LIGHTGBM_EXPORT ~DatasetLoader();
......@@ -54,7 +54,7 @@ private:
/*! \brief Check can load from binary file */
std::string CheckCanLoadFromBin(const char* filename);
const IOConfig& io_config_;
const Config& config_;
/*! \brief Random generator*/
Random random_;
/*! \brief prediction function for initial model */
......
......@@ -47,7 +47,7 @@ public:
* \param type Specific type of metric
* \param config Config for metric
*/
LIGHTGBM_EXPORT static Metric* CreateMetric(const std::string& type, const MetricConfig& config);
LIGHTGBM_EXPORT static Metric* CreateMetric(const std::string& type, const Config& config);
};
......@@ -56,11 +56,14 @@ public:
*/
class DCGCalculator {
public:
static void DefaultEvalAt(std::vector<int>* eval_at);
static void DefaultLabelGain(std::vector<double>* label_gain);
/*!
* \brief Initial logic
* \param label_gain Gain for labels, default is 2^i - 1
*/
static void Init(std::vector<double> label_gain);
static void Init(const std::vector<double>& label_gain);
/*!
* \brief Calculate the DCG score at position k
......
......@@ -89,7 +89,7 @@ public:
* \brief Initialize
* \param config Config of network setting
*/
static void Init(NetworkConfig config);
static void Init(Config config);
/*!
* \brief Initialize
*/
......
......@@ -71,7 +71,7 @@ public:
* \param config Config for objective function
*/
LIGHTGBM_EXPORT static ObjectiveFunction* CreateObjectiveFunction(const std::string& type,
const ObjectiveConfig& config);
const Config& config);
/*!
* \brief Load objective function from string object
......
......@@ -170,7 +170,7 @@ public:
std::string ToJSON() const;
/*! \brief Serialize this object to if-else statement*/
std::string ToIfElse(int index, bool is_predict_leaf_index) const;
std::string ToIfElse(int index, bool predict_leaf_index) const;
inline static bool IsZero(double fval) {
if (fval > -kZeroThreshold && fval <= kZeroThreshold) {
......@@ -307,9 +307,9 @@ private:
std::string NodeToJSON(int index) const;
/*! \brief Serialize one node to if-else statement*/
std::string NodeToIfElse(int index, bool is_predict_leaf_index) const;
std::string NodeToIfElse(int index, bool predict_leaf_index) const;
std::string NodeToIfElseByMap(int index, bool is_predict_leaf_index) const;
std::string NodeToIfElseByMap(int index, bool predict_leaf_index) const;
double ExpectedValue() const;
......
......@@ -36,9 +36,9 @@ public:
/*!
* \brief Reset tree configs
* \param tree_config config of tree
* \param config config of tree
*/
virtual void ResetConfig(const TreeConfig* tree_config) = 0;
virtual void ResetConfig(const Config* config) = 0;
/*!
* \brief training tree model on dataset
......@@ -85,11 +85,11 @@ public:
* \brief Create object of tree learner
* \param learner_type Type of tree learner
* \param device_type Type of tree learner
* \param tree_config config of tree
* \param config config of tree
*/
static TreeLearner* CreateTreeLearner(const std::string& learner_type,
const std::string& device_type,
const TreeConfig* tree_config);
const Config* config);
};
} // namespace LightGBM
......
......@@ -33,7 +33,7 @@ Application::Application(int argc, char** argv) {
if (config_.num_threads > 0) {
omp_set_num_threads(config_.num_threads);
}
if (config_.io_config.data_filename.size() == 0 && config_.task_type != TaskType::kConvertModel) {
if (config_.data.size() == 0 && config_.task != TaskType::kConvertModel) {
Log::Fatal("No training/prediction data, application quit");
}
omp_set_nested(0);
......@@ -48,13 +48,13 @@ Application::~Application() {
void Application::LoadParameters(int argc, char** argv) {
std::unordered_map<std::string, std::string> params;
for (int i = 1; i < argc; ++i) {
ConfigBase::KV2Map(params, argv[i]);
Config::KV2Map(params, argv[i]);
}
// check for alias
ParameterAlias::KeyAliasTransform(&params);
// read parameters from config file
if (params.count("config_file") > 0) {
TextReader<size_t> config_reader(params["config_file"].c_str(), false);
if (params.count("config") > 0) {
TextReader<size_t> config_reader(params["config"].c_str(), false);
config_reader.ReadAllLines();
if (!config_reader.Lines().empty()) {
for (auto& line : config_reader.Lines()) {
......@@ -66,11 +66,11 @@ void Application::LoadParameters(int argc, char** argv) {
if (line.size() == 0) {
continue;
}
ConfigBase::KV2Map(params, line.c_str());
Config::KV2Map(params, line.c_str());
}
} else {
Log::Warning("Config file %s doesn't exist, will ignore",
params["config_file"].c_str());
params["config"].c_str());
}
}
// check for alias again
......@@ -87,37 +87,37 @@ void Application::LoadData() {
PredictFunction predict_fun = nullptr;
PredictionEarlyStopInstance pred_early_stop = CreatePredictionEarlyStopInstance("none", LightGBM::PredictionEarlyStopConfig());
// need to continue training
if (boosting_->NumberOfTotalModel() > 0 && config_.task_type != TaskType::KRefitTree) {
if (boosting_->NumberOfTotalModel() > 0 && config_.task != TaskType::KRefitTree) {
predictor.reset(new Predictor(boosting_.get(), -1, true, false, false, false, -1, -1));
predict_fun = predictor->GetPredictFunction();
}
// sync up random seed for data partition
if (config_.is_parallel_find_bin) {
config_.io_config.data_random_seed = Network::GlobalSyncUpByMin(config_.io_config.data_random_seed);
config_.data_random_seed = Network::GlobalSyncUpByMin(config_.data_random_seed);
}
DatasetLoader dataset_loader(config_.io_config, predict_fun,
config_.boosting_config.num_class, config_.io_config.data_filename.c_str());
DatasetLoader dataset_loader(config_, predict_fun,
config_.num_class, config_.data.c_str());
// load Training data
if (config_.is_parallel_find_bin) {
// load data for parallel training
train_data_.reset(dataset_loader.LoadFromFile(config_.io_config.data_filename.c_str(),
config_.io_config.initscore_filename.c_str(),
train_data_.reset(dataset_loader.LoadFromFile(config_.data.c_str(),
config_.initscore_filename.c_str(),
Network::rank(), Network::num_machines()));
} else {
// load data for single machine
train_data_.reset(dataset_loader.LoadFromFile(config_.io_config.data_filename.c_str(), config_.io_config.initscore_filename.c_str(),
train_data_.reset(dataset_loader.LoadFromFile(config_.data.c_str(), config_.initscore_filename.c_str(),
0, 1));
}
// need save binary file
if (config_.io_config.is_save_binary_file) {
if (config_.save_binary) {
train_data_->SaveBinaryFile(nullptr);
}
// create training metric
if (config_.boosting_config.is_provide_training_metric) {
for (auto metric_type : config_.metric_types) {
auto metric = std::unique_ptr<Metric>(Metric::CreateMetric(metric_type, config_.metric_config));
if (config_.is_provide_training_metric) {
for (auto metric_type : config_.metric) {
auto metric = std::unique_ptr<Metric>(Metric::CreateMetric(metric_type, config_));
if (metric == nullptr) { continue; }
metric->Init(train_data_->metadata(), train_data_->num_data());
train_metric_.push_back(std::move(metric));
......@@ -126,28 +126,28 @@ void Application::LoadData() {
train_metric_.shrink_to_fit();
if (!config_.metric_types.empty()) {
if (!config_.metric.empty()) {
// only when have metrics then need to construct validation data
// Add validation data, if it exists
for (size_t i = 0; i < config_.io_config.valid_data_filenames.size(); ++i) {
for (size_t i = 0; i < config_.valid.size(); ++i) {
// add
auto new_dataset = std::unique_ptr<Dataset>(
dataset_loader.LoadFromFileAlignWithOtherDataset(
config_.io_config.valid_data_filenames[i].c_str(),
config_.io_config.valid_data_initscores[i].c_str(),
config_.valid[i].c_str(),
config_.valid_data_initscores[i].c_str(),
train_data_.get())
);
valid_datas_.push_back(std::move(new_dataset));
// need save binary file
if (config_.io_config.is_save_binary_file) {
if (config_.save_binary) {
valid_datas_.back()->SaveBinaryFile(nullptr);
}
// add metric for validation data
valid_metrics_.emplace_back();
for (auto metric_type : config_.metric_types) {
auto metric = std::unique_ptr<Metric>(Metric::CreateMetric(metric_type, config_.metric_config));
for (auto metric_type : config_.metric) {
auto metric = std::unique_ptr<Metric>(Metric::CreateMetric(metric_type, config_));
if (metric == nullptr) { continue; }
metric->Init(valid_datas_.back()->metadata(),
valid_datas_.back()->num_data());
......@@ -167,30 +167,30 @@ void Application::LoadData() {
void Application::InitTrain() {
if (config_.is_parallel) {
// need init network
Network::Init(config_.network_config);
Network::Init(config_);
Log::Info("Finished initializing network");
config_.boosting_config.tree_config.feature_fraction_seed =
Network::GlobalSyncUpByMin(config_.boosting_config.tree_config.feature_fraction_seed);
config_.boosting_config.tree_config.feature_fraction =
Network::GlobalSyncUpByMin(config_.boosting_config.tree_config.feature_fraction);
config_.boosting_config.drop_seed =
Network::GlobalSyncUpByMin(config_.boosting_config.drop_seed);
config_.feature_fraction_seed =
Network::GlobalSyncUpByMin(config_.feature_fraction_seed);
config_.feature_fraction =
Network::GlobalSyncUpByMin(config_.feature_fraction);
config_.drop_seed =
Network::GlobalSyncUpByMin(config_.drop_seed);
}
// create boosting
boosting_.reset(
Boosting::CreateBoosting(config_.boosting_type,
config_.io_config.input_model.c_str()));
Boosting::CreateBoosting(config_.boosting,
config_.input_model.c_str()));
// create objective function
objective_fun_.reset(
ObjectiveFunction::CreateObjectiveFunction(config_.objective_type,
config_.objective_config));
ObjectiveFunction::CreateObjectiveFunction(config_.objective,
config_));
// load training data
LoadData();
// initialize the objective function
objective_fun_->Init(train_data_->metadata(), train_data_->num_data());
// initialize the boosting
boosting_->Init(&config_.boosting_config, train_data_.get(), objective_fun_.get(),
boosting_->Init(&config_, train_data_.get(), objective_fun_.get(),
Common::ConstPtrInVectorWrapper<Metric>(train_metric_));
// add validation data into boosting
for (size_t i = 0; i < valid_datas_.size(); ++i) {
......@@ -202,22 +202,22 @@ void Application::InitTrain() {
void Application::Train() {
Log::Info("Started training...");
boosting_->Train(config_.io_config.snapshot_freq, config_.io_config.output_model);
boosting_->SaveModelToFile(-1, config_.io_config.output_model.c_str());
boosting_->Train(config_.snapshot_freq, config_.output_model);
boosting_->SaveModelToFile(-1, config_.output_model.c_str());
// convert model to if-else statement code
if (config_.convert_model_language == std::string("cpp")) {
boosting_->SaveModelToIfElse(-1, config_.io_config.convert_model.c_str());
boosting_->SaveModelToIfElse(-1, config_.convert_model.c_str());
}
Log::Info("Finished training");
}
void Application::Predict() {
if (config_.task_type == TaskType::KRefitTree) {
if (config_.task == TaskType::KRefitTree) {
// create predictor
Predictor predictor(boosting_.get(), -1, false, true, false, false, 1, 1);
predictor.Predict(config_.io_config.data_filename.c_str(), config_.io_config.output_result.c_str(), config_.io_config.has_header);
TextReader<int> result_reader(config_.io_config.output_result.c_str(), false);
predictor.Predict(config_.data.c_str(), config_.output_result.c_str(), config_.header);
TextReader<int> result_reader(config_.output_result.c_str(), false);
result_reader.ReadAllLines();
std::vector<std::vector<int>> pred_leaf(result_reader.Lines().size());
#pragma omp parallel for schedule(static)
......@@ -226,41 +226,41 @@ void Application::Predict() {
// Free memory
result_reader.Lines()[i].clear();
}
DatasetLoader dataset_loader(config_.io_config, nullptr,
config_.boosting_config.num_class, config_.io_config.data_filename.c_str());
train_data_.reset(dataset_loader.LoadFromFile(config_.io_config.data_filename.c_str(), config_.io_config.initscore_filename.c_str(),
DatasetLoader dataset_loader(config_, nullptr,
config_.num_class, config_.data.c_str());
train_data_.reset(dataset_loader.LoadFromFile(config_.data.c_str(), config_.initscore_filename.c_str(),
0, 1));
train_metric_.clear();
objective_fun_.reset(ObjectiveFunction::CreateObjectiveFunction(config_.objective_type,
config_.objective_config));
objective_fun_.reset(ObjectiveFunction::CreateObjectiveFunction(config_.objective,
config_));
objective_fun_->Init(train_data_->metadata(), train_data_->num_data());
boosting_->Init(&config_.boosting_config, train_data_.get(), objective_fun_.get(),
boosting_->Init(&config_, train_data_.get(), objective_fun_.get(),
Common::ConstPtrInVectorWrapper<Metric>(train_metric_));
boosting_->RefitTree(pred_leaf);
boosting_->SaveModelToFile(-1, config_.io_config.output_model.c_str());
boosting_->SaveModelToFile(-1, config_.output_model.c_str());
Log::Info("Finished RefitTree");
} else {
// create predictor
Predictor predictor(boosting_.get(), config_.io_config.num_iteration_predict, config_.io_config.is_predict_raw_score,
config_.io_config.is_predict_leaf_index, config_.io_config.is_predict_contrib,
config_.io_config.pred_early_stop, config_.io_config.pred_early_stop_freq,
config_.io_config.pred_early_stop_margin);
predictor.Predict(config_.io_config.data_filename.c_str(),
config_.io_config.output_result.c_str(), config_.io_config.has_header);
Predictor predictor(boosting_.get(), config_.num_iteration_predict, config_.predict_raw_score,
config_.predict_leaf_index, config_.predict_contrib,
config_.pred_early_stop, config_.pred_early_stop_freq,
config_.pred_early_stop_margin);
predictor.Predict(config_.data.c_str(),
config_.output_result.c_str(), config_.header);
Log::Info("Finished prediction");
}
}
void Application::InitPredict() {
boosting_.reset(
Boosting::CreateBoosting("gbdt", config_.io_config.input_model.c_str()));
Boosting::CreateBoosting("gbdt", config_.input_model.c_str()));
Log::Info("Finished initializing prediction, total used %d iterations", boosting_->GetCurrentIteration());
}
void Application::ConvertModel() {
boosting_.reset(
Boosting::CreateBoosting(config_.boosting_type, config_.io_config.input_model.c_str()));
boosting_->SaveModelToIfElse(-1, config_.io_config.convert_model.c_str());
Boosting::CreateBoosting(config_.boosting, config_.input_model.c_str()));
boosting_->SaveModelToIfElse(-1, config_.convert_model.c_str());
}
......
......@@ -29,11 +29,11 @@ public:
* \param boosting Input boosting model
* \param num_iteration Number of boosting round
* \param is_raw_score True if need to predict result with raw score
* \param is_predict_leaf_index True to output leaf index instead of prediction score
* \param is_predict_contrib True to output feature contributions instead of prediction score
* \param predict_leaf_index True to output leaf index instead of prediction score
* \param predict_contrib True to output feature contributions instead of prediction score
*/
Predictor(Boosting* boosting, int num_iteration,
bool is_raw_score, bool is_predict_leaf_index, bool is_predict_contrib,
bool is_raw_score, bool predict_leaf_index, bool predict_contrib,
bool early_stop, int early_stop_freq, double early_stop_margin) {
early_stop_ = CreatePredictionEarlyStopInstance("none", LightGBM::PredictionEarlyStopConfig());
......@@ -55,14 +55,14 @@ public:
{
num_threads_ = omp_get_num_threads();
}
boosting->InitPredict(num_iteration, is_predict_contrib);
boosting->InitPredict(num_iteration, predict_contrib);
boosting_ = boosting;
num_pred_one_row_ = boosting_->NumPredictOneRow(num_iteration, is_predict_leaf_index, is_predict_contrib);
num_pred_one_row_ = boosting_->NumPredictOneRow(num_iteration, predict_leaf_index, predict_contrib);
num_feature_ = boosting_->MaxFeatureIdx() + 1;
predict_buf_ = std::vector<std::vector<double>>(num_threads_, std::vector<double>(num_feature_, 0.0f));
const int kFeatureThreshold = 100000;
const size_t KSparseThreshold = static_cast<size_t>(0.01 * num_feature_);
if (is_predict_leaf_index) {
if (predict_leaf_index) {
predict_fun_ = [this, kFeatureThreshold, KSparseThreshold](const std::vector<std::pair<int, double>>& features, double* output) {
int tid = omp_get_thread_num();
if (num_feature_ > kFeatureThreshold && features.size() < KSparseThreshold) {
......@@ -75,7 +75,7 @@ public:
ClearPredictBuffer(predict_buf_[tid].data(), predict_buf_[tid].size(), features);
}
};
} else if (is_predict_contrib) {
} else if (predict_contrib) {
predict_fun_ = [this](const std::vector<std::pair<int, double>>& features, double* output) {
int tid = omp_get_thread_num();
CopyToPredictBuffer(predict_buf_[tid].data(), features);
......@@ -127,27 +127,27 @@ public:
* \param data_filename Filename of data
* \param result_filename Filename of output result
*/
void Predict(const char* data_filename, const char* result_filename, bool has_header) {
void Predict(const char* data_filename, const char* result_filename, bool header) {
auto writer = VirtualFileWriter::Make(result_filename);
if (!writer->Init()) {
Log::Fatal("Prediction results file %s cannot be found", result_filename);
}
auto parser = std::unique_ptr<Parser>(Parser::CreateParser(data_filename, has_header, boosting_->MaxFeatureIdx() + 1, boosting_->LabelIdx()));
auto parser = std::unique_ptr<Parser>(Parser::CreateParser(data_filename, header, boosting_->MaxFeatureIdx() + 1, boosting_->LabelIdx()));
if (parser == nullptr) {
Log::Fatal("Could not recognize the data format of data file %s", data_filename);
}
TextReader<data_size_t> predict_data_reader(data_filename, has_header);
TextReader<data_size_t> predict_data_reader(data_filename, header);
std::unordered_map<int, int> feature_names_map_;
bool need_adjust = false;
if (has_header) {
if (header) {
std::string first_line = predict_data_reader.first_line();
std::vector<std::string> header = Common::Split(first_line.c_str(), "\t,");
header.erase(header.begin() + boosting_->LabelIdx());
for (int i = 0; i < static_cast<int>(header.size()); ++i) {
std::vector<std::string> header_words = Common::Split(first_line.c_str(), "\t,");
header_words.erase(header_words.begin() + boosting_->LabelIdx());
for (int i = 0; i < static_cast<int>(header_words.size()); ++i) {
for (int j = 0; j < static_cast<int>(boosting_->FeatureNames().size()); ++j) {
if (header[i] == boosting_->FeatureNames()[j]) {
if (header_words[i] == boosting_->FeatureNames()[j]) {
feature_names_map_[i] = j;
break;
}
......
......@@ -32,17 +32,17 @@ public:
* \param training_metrics Training metrics
* \param output_model_filename Filename of output model
*/
void Init(const BoostingConfig* config, const Dataset* train_data,
void Init(const Config* config, const Dataset* train_data,
const ObjectiveFunction* objective_function,
const std::vector<const Metric*>& training_metrics) override {
GBDT::Init(config, train_data, objective_function, training_metrics);
random_for_drop_ = Random(gbdt_config_->drop_seed);
random_for_drop_ = Random(config_->drop_seed);
sum_weight_ = 0.0f;
}
void ResetConfig(const BoostingConfig* config) override {
void ResetConfig(const Config* config) override {
GBDT::ResetConfig(config);
random_for_drop_ = Random(gbdt_config_->drop_seed);
random_for_drop_ = Random(config_->drop_seed);
sum_weight_ = 0.0f;
}
......@@ -57,7 +57,7 @@ public:
}
// normalize
Normalize();
if (!gbdt_config_->uniform_drop) {
if (!config_->uniform_drop) {
tree_weight_.push_back(shrinkage_rate_);
sum_weight_ += shrinkage_rate_;
}
......@@ -85,31 +85,31 @@ private:
*/
void DroppingTrees() {
drop_index_.clear();
bool is_skip = random_for_drop_.NextFloat() < gbdt_config_->skip_drop;
bool is_skip = random_for_drop_.NextFloat() < config_->skip_drop;
// select dropping tree indices based on drop_rate and tree weights
if (!is_skip) {
double drop_rate = gbdt_config_->drop_rate;
if (!gbdt_config_->uniform_drop) {
double drop_rate = config_->drop_rate;
if (!config_->uniform_drop) {
double inv_average_weight = static_cast<double>(tree_weight_.size()) / sum_weight_;
if (gbdt_config_->max_drop > 0) {
drop_rate = std::min(drop_rate, gbdt_config_->max_drop * inv_average_weight / sum_weight_);
if (config_->max_drop > 0) {
drop_rate = std::min(drop_rate, config_->max_drop * inv_average_weight / sum_weight_);
}
for (int i = 0; i < iter_; ++i) {
if (random_for_drop_.NextFloat() < drop_rate * tree_weight_[i] * inv_average_weight) {
drop_index_.push_back(num_init_iteration_ + i);
if (drop_index_.size() >= static_cast<size_t>(gbdt_config_->max_drop)) {
if (drop_index_.size() >= static_cast<size_t>(config_->max_drop)) {
break;
}
}
}
} else {
if (gbdt_config_->max_drop > 0) {
drop_rate = std::min(drop_rate, gbdt_config_->max_drop / static_cast<double>(iter_));
if (config_->max_drop > 0) {
drop_rate = std::min(drop_rate, config_->max_drop / static_cast<double>(iter_));
}
for (int i = 0; i < iter_; ++i) {
if (random_for_drop_.NextFloat() < drop_rate) {
drop_index_.push_back(num_init_iteration_ + i);
if (drop_index_.size() >= static_cast<size_t>(gbdt_config_->max_drop)) {
if (drop_index_.size() >= static_cast<size_t>(config_->max_drop)) {
break;
}
}
......@@ -124,13 +124,13 @@ private:
train_score_updater_->AddScore(models_[curr_tree].get(), cur_tree_id);
}
}
if (!gbdt_config_->xgboost_dart_mode) {
shrinkage_rate_ = gbdt_config_->learning_rate / (1.0f + static_cast<double>(drop_index_.size()));
if (!config_->xgboost_dart_mode) {
shrinkage_rate_ = config_->learning_rate / (1.0f + static_cast<double>(drop_index_.size()));
} else {
if (drop_index_.empty()) {
shrinkage_rate_ = gbdt_config_->learning_rate;
shrinkage_rate_ = config_->learning_rate;
} else {
shrinkage_rate_ = gbdt_config_->learning_rate / (gbdt_config_->learning_rate + static_cast<double>(drop_index_.size()));
shrinkage_rate_ = config_->learning_rate / (config_->learning_rate + static_cast<double>(drop_index_.size()));
}
}
}
......@@ -146,7 +146,7 @@ private:
*/
void Normalize() {
double k = static_cast<double>(drop_index_.size());
if (!gbdt_config_->xgboost_dart_mode) {
if (!config_->xgboost_dart_mode) {
for (auto i : drop_index_) {
for (int cur_tree_id = 0; cur_tree_id < num_tree_per_iteration_; ++cur_tree_id) {
auto curr_tree = i * num_tree_per_iteration_ + cur_tree_id;
......@@ -159,7 +159,7 @@ private:
models_[curr_tree]->Shrinkage(-k);
train_score_updater_->AddScore(models_[curr_tree].get(), cur_tree_id);
}
if (!gbdt_config_->uniform_drop) {
if (!config_->uniform_drop) {
sum_weight_ -= tree_weight_[i] * (1.0f / (k + 1.0f));
tree_weight_[i] *= (k / (k + 1.0f));
}
......@@ -174,12 +174,12 @@ private:
score_updater->AddScore(models_[curr_tree].get(), cur_tree_id);
}
// update training score
models_[curr_tree]->Shrinkage(-k / gbdt_config_->learning_rate);
models_[curr_tree]->Shrinkage(-k / config_->learning_rate);
train_score_updater_->AddScore(models_[curr_tree].get(), cur_tree_id);
}
if (!gbdt_config_->uniform_drop) {
sum_weight_ -= tree_weight_[i] * (1.0f / (k + gbdt_config_->learning_rate));;
tree_weight_[i] *= (k / (k + gbdt_config_->learning_rate));
if (!config_->uniform_drop) {
sum_weight_ -= tree_weight_[i] * (1.0f / (k + config_->learning_rate));
tree_weight_[i] *= (k / (k + config_->learning_rate));
}
}
}
......
......@@ -61,7 +61,7 @@ GBDT::~GBDT() {
#endif
}
void GBDT::Init(const BoostingConfig* config, const Dataset* train_data, const ObjectiveFunction* objective_function,
void GBDT::Init(const Config* config, const Dataset* train_data, const ObjectiveFunction* objective_function,
const std::vector<const Metric*>& training_metrics) {
CHECK(train_data != nullptr);
CHECK(train_data->num_features() > 0);
......@@ -70,9 +70,9 @@ void GBDT::Init(const BoostingConfig* config, const Dataset* train_data, const O
num_iteration_for_pred_ = 0;
max_feature_idx_ = 0;
num_class_ = config->num_class;
gbdt_config_ = std::unique_ptr<BoostingConfig>(new BoostingConfig(*config));
early_stopping_round_ = gbdt_config_->early_stopping_round;
shrinkage_rate_ = gbdt_config_->learning_rate;
config_ = std::unique_ptr<Config>(new Config(*config));
early_stopping_round_ = config_->early_stopping_round;
shrinkage_rate_ = config_->learning_rate;
std::string forced_splits_path = config->forcedsplits_filename;
// load forced_splits file
......@@ -93,7 +93,7 @@ void GBDT::Init(const BoostingConfig* config, const Dataset* train_data, const O
is_constant_hessian_ = false;
}
tree_learner_ = std::unique_ptr<TreeLearner>(TreeLearner::CreateTreeLearner(gbdt_config_->tree_learner_type, gbdt_config_->device_type, &gbdt_config_->tree_config));
tree_learner_ = std::unique_ptr<TreeLearner>(TreeLearner::CreateTreeLearner(config_->tree_learner, config_->device_type, config_.get()));
// init tree learner
tree_learner_->Init(train_data_, is_constant_hessian_);
......@@ -123,7 +123,7 @@ void GBDT::Init(const BoostingConfig* config, const Dataset* train_data, const O
feature_infos_ = train_data_->feature_infos();
// if need bagging, create buffer
ResetBaggingConfig(gbdt_config_.get(), true);
ResetBaggingConfig(config_.get(), true);
// reset config for tree learner
class_need_train_ = std::vector<bool>(num_tree_per_iteration_, true);
......@@ -214,7 +214,7 @@ data_size_t GBDT::BaggingHelper(Random& cur_rand, data_size_t start, data_size_t
if (cnt <= 0) {
return 0;
}
data_size_t bag_data_cnt = static_cast<data_size_t>(gbdt_config_->bagging_fraction * cnt);
data_size_t bag_data_cnt = static_cast<data_size_t>(config_->bagging_fraction * cnt);
data_size_t cur_left_cnt = 0;
data_size_t cur_right_cnt = 0;
auto right_buffer = buffer + bag_data_cnt;
......@@ -233,7 +233,7 @@ data_size_t GBDT::BaggingHelper(Random& cur_rand, data_size_t start, data_size_t
void GBDT::Bagging(int iter) {
// if need bagging
if ((bag_data_cnt_ < num_data_ && iter % gbdt_config_->bagging_freq == 0)
if ((bag_data_cnt_ < num_data_ && iter % config_->bagging_freq == 0)
|| need_re_bagging_) {
need_re_bagging_ = false;
const data_size_t min_inner_size = 1000;
......@@ -249,7 +249,7 @@ void GBDT::Bagging(int iter) {
if (cur_start > num_data_) { continue; }
data_size_t cur_cnt = inner_size;
if (cur_start + cur_cnt > num_data_) { cur_cnt = num_data_ - cur_start; }
Random cur_rand(gbdt_config_->bagging_seed + iter * num_threads_ + i);
Random cur_rand(config_->bagging_seed + iter * num_threads_ + i);
data_size_t cur_left_count = BaggingHelper(cur_rand, cur_start, cur_cnt, tmp_indices_.data() + cur_start);
offsets_buf_[i] = cur_start;
left_cnts_buf_[i] = cur_left_count;
......@@ -318,7 +318,7 @@ double ObtainAutomaticInitialScore(const ObjectiveFunction* fobj) {
void GBDT::Train(int snapshot_freq, const std::string& model_output_path) {
bool is_finished = false;
auto start_time = std::chrono::steady_clock::now();
for (int iter = 0; iter < gbdt_config_->num_iterations && !is_finished; ++iter) {
for (int iter = 0; iter < config_->num_iterations && !is_finished; ++iter) {
is_finished = TrainOneIter(nullptr, nullptr);
if (!is_finished) {
is_finished = EvalAndCheckEarlyStopping();
......@@ -364,7 +364,7 @@ double GBDT::BoostFromAverage() {
if (models_.empty() && !train_score_updater_->has_init_score()
&& num_class_ <= 1
&& objective_function_ != nullptr) {
if (gbdt_config_->boost_from_average) {
if (config_->boost_from_average) {
double init_score = ObtainAutomaticInitialScore(objective_function_);
if (std::fabs(init_score) > kEpsilon) {
train_score_updater_->AddScore(init_score, 0);
......@@ -580,7 +580,7 @@ std::vector<double> GBDT::EvalOneMetric(const Metric* metric, const double* scor
}
std::string GBDT::OutputMetric(int iter) {
bool need_output = (iter % gbdt_config_->output_freq) == 0;
bool need_output = (iter % config_->metric_freq) == 0;
std::string ret = "";
std::stringstream msg_buf;
std::vector<std::pair<size_t, size_t>> meet_early_stopping_pairs;
......@@ -777,24 +777,24 @@ void GBDT::ResetTrainingData(const Dataset* train_data, const ObjectiveFunction*
feature_infos_ = train_data_->feature_infos();
tree_learner_->ResetTrainingData(train_data);
ResetBaggingConfig(gbdt_config_.get(), true);
ResetBaggingConfig(config_.get(), true);
}
}
void GBDT::ResetConfig(const BoostingConfig* config) {
auto new_config = std::unique_ptr<BoostingConfig>(new BoostingConfig(*config));
void GBDT::ResetConfig(const Config* config) {
auto new_config = std::unique_ptr<Config>(new Config(*config));
early_stopping_round_ = new_config->early_stopping_round;
shrinkage_rate_ = new_config->learning_rate;
if (tree_learner_ != nullptr) {
tree_learner_->ResetConfig(&new_config->tree_config);
tree_learner_->ResetConfig(new_config.get());
}
if (train_data_ != nullptr) {
ResetBaggingConfig(new_config.get(), false);
}
gbdt_config_.reset(new_config.release());
config_.reset(new_config.release());
}
void GBDT::ResetBaggingConfig(const BoostingConfig* config, bool is_change_dataset) {
void GBDT::ResetBaggingConfig(const Config* config, bool is_change_dataset) {
// if need bagging, create buffer
if (config->bagging_fraction < 1.0 && config->bagging_freq > 0) {
bag_data_cnt_ =
......
......@@ -43,7 +43,7 @@ public:
* \param objective_function Training objective function
* \param training_metrics Training metrics
*/
void Init(const BoostingConfig* gbdt_config, const Dataset* train_data,
void Init(const Config* gbdt_config, const Dataset* train_data,
const ObjectiveFunction* objective_function,
const std::vector<const Metric*>& training_metrics) override;
......@@ -83,7 +83,7 @@ public:
* \brief Reset Boosting Config
* \param gbdt_config Config for boosting
*/
void ResetConfig(const BoostingConfig* gbdt_config) override;
void ResetConfig(const Config* gbdt_config) override;
/*!
* \brief Adding a validation dataset
......@@ -335,7 +335,7 @@ protected:
/*!
* \brief reset config for bagging
*/
void ResetBaggingConfig(const BoostingConfig* config, bool is_change_dataset);
void ResetBaggingConfig(const Config* config, bool is_change_dataset);
/*!
* \brief Implement bagging logic
......@@ -384,7 +384,7 @@ protected:
/*! \brief Pointer to training data */
const Dataset* train_data_;
/*! \brief Config of gbdt */
std::unique_ptr<BoostingConfig> gbdt_config_;
std::unique_ptr<Config> config_;
/*! \brief Tree learner, will use this class to learn trees */
std::unique_ptr<TreeLearner> tree_learner_;
/*! \brief Objective function */
......
......@@ -300,6 +300,10 @@ std::string GBDT::SaveModelToString(int num_iteration) const {
for (size_t i = 0; i < pairs.size(); ++i) {
ss << pairs[i].second << "=" << std::to_string(pairs[i].first) << '\n';
}
if (config_ != nullptr) {
ss << "parameters:" << '\n';
ss << config_->ToString() << "\n";
}
return ss.str();
}
......
......@@ -39,7 +39,7 @@ public:
#endif
}
void Init(const BoostingConfig* config, const Dataset* train_data, const ObjectiveFunction* objective_function,
void Init(const Config* config, const Dataset* train_data, const ObjectiveFunction* objective_function,
const std::vector<const Metric*>& training_metrics) override {
GBDT::Init(config, train_data, objective_function, training_metrics);
ResetGoss();
......@@ -51,15 +51,15 @@ public:
ResetGoss();
}
void ResetConfig(const BoostingConfig* config) override {
void ResetConfig(const Config* config) override {
GBDT::ResetConfig(config);
ResetGoss();
}
void ResetGoss() {
CHECK(gbdt_config_->top_rate + gbdt_config_->other_rate <= 1.0f);
CHECK(gbdt_config_->top_rate > 0.0f && gbdt_config_->other_rate > 0.0f);
if (gbdt_config_->bagging_freq > 0 && gbdt_config_->bagging_fraction != 1.0f) {
CHECK(config_->top_rate + config_->other_rate <= 1.0f);
CHECK(config_->top_rate > 0.0f && config_->other_rate > 0.0f);
if (config_->bagging_freq > 0 && config_->bagging_fraction != 1.0f) {
Log::Fatal("Cannot use bagging in GOSS");
}
Log::Info("Using GOSS");
......@@ -74,8 +74,8 @@ public:
right_write_pos_buf_.resize(num_threads_);
is_use_subset_ = false;
if (gbdt_config_->top_rate + gbdt_config_->other_rate <= 0.5) {
auto bag_data_cnt = static_cast<data_size_t>((gbdt_config_->top_rate + gbdt_config_->other_rate) * num_data_);
if (config_->top_rate + config_->other_rate <= 0.5) {
auto bag_data_cnt = static_cast<data_size_t>((config_->top_rate + config_->other_rate) * num_data_);
bag_data_cnt = std::max(1, bag_data_cnt);
tmp_subset_.reset(new Dataset(bag_data_cnt));
tmp_subset_->CopyFeatureMapperFrom(train_data_);
......@@ -93,8 +93,8 @@ public:
tmp_gradients[i] += std::fabs(gradients_[idx] * hessians_[idx]);
}
}
data_size_t top_k = static_cast<data_size_t>(cnt * gbdt_config_->top_rate);
data_size_t other_k = static_cast<data_size_t>(cnt * gbdt_config_->other_rate);
data_size_t top_k = static_cast<data_size_t>(cnt * config_->top_rate);
data_size_t other_k = static_cast<data_size_t>(cnt * config_->other_rate);
top_k = std::max(1, top_k);
ArrayArgs<score_t>::ArgMaxAtK(&tmp_gradients, 0, static_cast<int>(tmp_gradients.size()), top_k - 1);
score_t threshold = tmp_gradients[top_k - 1];
......@@ -135,7 +135,7 @@ public:
void Bagging(int iter) override {
bag_data_cnt_ = num_data_;
// not subsample for first iterations
if (iter < static_cast<int>(1.0f / gbdt_config_->learning_rate)) { return; }
if (iter < static_cast<int>(1.0f / config_->learning_rate)) { return; }
const data_size_t min_inner_size = 100;
data_size_t inner_size = (num_data_ + num_threads_ - 1) / num_threads_;
......@@ -150,7 +150,7 @@ public:
if (cur_start > num_data_) { continue; }
data_size_t cur_cnt = inner_size;
if (cur_start + cur_cnt > num_data_) { cur_cnt = num_data_ - cur_start; }
Random cur_rand(gbdt_config_->bagging_seed + iter * num_threads_ + i);
Random cur_rand(config_->bagging_seed + iter * num_threads_ + i);
data_size_t cur_left_count = BaggingHelper(cur_rand, cur_start, cur_cnt,
tmp_indices_.data() + cur_start, tmp_indice_right_.data() + cur_start);
offsets_buf_[i] = cur_start;
......
......@@ -24,10 +24,10 @@ public:
~RF() {}
void Init(const BoostingConfig* config, const Dataset* train_data, const ObjectiveFunction* objective_function,
void Init(const Config* config, const Dataset* train_data, const ObjectiveFunction* objective_function,
const std::vector<const Metric*>& training_metrics) override {
CHECK(config->bagging_freq > 0 && config->bagging_fraction < 1.0f && config->bagging_fraction > 0.0f);
CHECK(config->tree_config.feature_fraction < 1.0f && config->tree_config.feature_fraction > 0.0f);
CHECK(config->feature_fraction < 1.0f && config->feature_fraction > 0.0f);
GBDT::Init(config, train_data, objective_function, training_metrics);
if (num_init_iteration_ > 0) {
......@@ -50,9 +50,9 @@ public:
}
}
void ResetConfig(const BoostingConfig* config) override {
void ResetConfig(const Config* config) override {
CHECK(config->bagging_freq > 0 && config->bagging_fraction < 1.0f && config->bagging_fraction > 0.0f);
CHECK(config->tree_config.feature_fraction < 1.0f && config->tree_config.feature_fraction > 0.0f);
CHECK(config->feature_fraction < 1.0f && config->feature_fraction > 0.0f);
GBDT::ResetConfig(config);
// not shrinkage rate for the RF
shrinkage_rate_ = 1.0f;
......