"git@developer.sourcefind.cn:tianlh/lightgbm-dcu.git" did not exist on "2ce2b096c857c302ced732870334cb5b8e9cb321"
Unverified commit 7880b79f, authored by James Lamb and committed by GitHub

[docs] Change some 'parallel learning' references to 'distributed learning' (#4000)

* [docs] Change some 'parallel learning' references to 'distributed learning'

* found a few more

* one more reference
parent 0ee4d37f
-OPTION(USE_MPI "Enable MPI-based parallel learning" OFF)
+OPTION(USE_MPI "Enable MPI-based distributed learning" OFF)
OPTION(USE_OPENMP "Enable OpenMP" ON)
OPTION(USE_GPU "Enable GPU-accelerated training" OFF)
OPTION(USE_SWIG "Enable SWIG to generate Java API" OFF)
...
@@ -21,14 +21,14 @@ LightGBM is a gradient boosting framework that uses tree based learning algorith
- Faster training speed and higher efficiency.
- Lower memory usage.
- Better accuracy.
-- Support of parallel and GPU learning.
+- Support of parallel, distributed, and GPU learning.
- Capable of handling large-scale data.
For further details, please refer to [Features](https://github.com/microsoft/LightGBM/blob/master/docs/Features.rst).
Benefitting from these advantages, LightGBM is being widely-used in many [winning solutions](https://github.com/microsoft/LightGBM/blob/master/examples/README.md#machine-learning-challenge-winning-solutions) of machine learning competitions.
-[Comparison experiments](https://github.com/microsoft/LightGBM/blob/master/docs/Experiments.rst#comparison-experiment) on public datasets show that LightGBM can outperform existing boosting frameworks on both efficiency and accuracy, with significantly lower memory consumption. What's more, [parallel experiments](https://github.com/microsoft/LightGBM/blob/master/docs/Experiments.rst#parallel-experiment) show that LightGBM can achieve a linear speed-up by using multiple machines for training in specific settings.
+[Comparison experiments](https://github.com/microsoft/LightGBM/blob/master/docs/Experiments.rst#comparison-experiment) on public datasets show that LightGBM can outperform existing boosting frameworks on both efficiency and accuracy, with significantly lower memory consumption. What's more, [distributed learning experiments](https://github.com/microsoft/LightGBM/blob/master/docs/Experiments.rst#parallel-experiment) show that LightGBM can achieve a linear speed-up by using multiple machines for training in specific settings.
Get Started and Documentation
-----------------------------
@@ -40,7 +40,7 @@ Next you may want to read:
- [**Examples**](https://github.com/microsoft/LightGBM/tree/master/examples) showing command line usage of common tasks.
- [**Features**](https://github.com/microsoft/LightGBM/blob/master/docs/Features.rst) and algorithms supported by LightGBM.
- [**Parameters**](https://github.com/microsoft/LightGBM/blob/master/docs/Parameters.rst) is an exhaustive list of customization you can make.
-- [**Parallel Learning**](https://github.com/microsoft/LightGBM/blob/master/docs/Parallel-Learning-Guide.rst) and [**GPU Learning**](https://github.com/microsoft/LightGBM/blob/master/docs/GPU-Tutorial.rst) can speed up computation.
+- [**Distributed Learning**](https://github.com/microsoft/LightGBM/blob/master/docs/Parallel-Learning-Guide.rst) and [**GPU Learning**](https://github.com/microsoft/LightGBM/blob/master/docs/GPU-Tutorial.rst) can speed up computation.
- [**Laurae++ interactive documentation**](https://sites.google.com/view/lauraepp/parameters) is a detailed guide for hyperparameters.
- [**Optuna Hyperparameter Tuner**](https://medium.com/optuna/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization-8b7095e99258) provides automated tuning for LightGBM hyperparameters ([code examples](https://github.com/optuna/optuna/blob/master/examples/)).
...
@@ -62,7 +62,7 @@ Parameters Tuning
Parallel Learning
-----------------
-- Refer to `Parallel Learning Guide <./Parallel-Learning-Guide.rst>`__.
+- Refer to `Distributed Learning Guide <./Parallel-Learning-Guide.rst>`__.
GPU Support
-----------
...
@@ -239,7 +239,7 @@ Results
| 16       | 42 s          | 11GB                      |
+----------+---------------+---------------------------+
-The results show that LightGBM achieves a linear speedup with parallel learning.
+The results show that LightGBM achieves a linear speedup with distributed learning.
GPU Experiments
---------------
...
@@ -28,7 +28,7 @@ LightGBM uses histogram-based algorithms\ `[4, 5, 6] <#references>`__, which buc
- No need to store additional information for pre-sorting feature values
-- **Reduce communication cost for parallel learning**
+- **Reduce communication cost for distributed learning**
Sparse Optimization
-------------------
@@ -68,14 +68,14 @@ More specifically, LightGBM sorts the histogram (for a categorical feature) acco
Optimization in Network Communication
-------------------------------------
-It only needs to use some collective communication algorithms, like "All reduce", "All gather" and "Reduce scatter", in parallel learning of LightGBM.
+It only needs to use some collective communication algorithms, like "All reduce", "All gather" and "Reduce scatter", in distributed learning of LightGBM.
LightGBM implements state-of-art algorithms\ `[9] <#references>`__.
These collective communication algorithms can provide much better performance than point-to-point communication.
Optimization in Parallel Learning
---------------------------------
-LightGBM provides the following parallel learning algorithms.
+LightGBM provides the following distributed learning algorithms.
Feature Parallel
~~~~~~~~~~~~~~~~
...
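Side note: the distributed learning algorithms named in the hunk above are selected through the ``tree_learner`` parameter that appears later in this commit. A minimal config sketch for the CLI version, assuming the socket build and two machines (data file, objective, port, and file names are placeholders):

```
# hedged sketch of a training config that picks the data-parallel learner
task = train
data = train.txt
objective = binary

# one of: feature (feature_parallel), data (data_parallel), voting (voting_parallel)
tree_learner = data

# socket-version network settings, matching the parameters shown later in this commit
num_machines = 2
local_listen_port = 12400
machine_list_file = mlist.txt
```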
@@ -348,7 +348,7 @@ Build MPI Version
The default build version of LightGBM is based on socket. LightGBM also supports MPI.
`MPI`_ is a high performance communication approach with `RDMA`_ support.
-If you need to run a parallel learning application with high performance communication, you can build the LightGBM with MPI support.
+If you need to run a distributed learning application with high performance communication, you can build the LightGBM with MPI support.
Windows
^^^^^^^
...
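For context, the ``USE_MPI`` option changed at the top of this diff is what the MPI build described above toggles. A minimal sketch of a Linux build with MPI enabled, assuming CMake and an MPI implementation such as Open MPI are already installed (see the Installation Guide for the authoritative, platform-specific steps):

```
# hedged sketch: out-of-source CMake build with MPI-based distributed learning enabled
git clone --recursive https://github.com/microsoft/LightGBM
cd LightGBM
mkdir build && cd build
cmake -DUSE_MPI=ON ..
make -j4
```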
Parallel Learning Guide
=======================
-This is a guide for parallel learning of LightGBM.
+This guide describes distributed learning in LightGBM. Distributed learning allows the use of multiple machines to produce a single model.
Follow the `Quick Start <./Quick-Start.rst>`__ to know how to use LightGBM first.
@@ -20,7 +20,7 @@ Follow the `Quick Start <./Quick-Start.rst>`__ to know how to use LightGBM first
Choose Appropriate Parallel Algorithm
-------------------------------------
-LightGBM provides 3 parallel learning algorithms now.
+LightGBM provides 3 distributed learning algorithms now.
+--------------------+---------------------------+
| Parallel Algorithm | How to Use                |
@@ -57,7 +57,7 @@ Preparation
Socket Version
^^^^^^^^^^^^^^
-It needs to collect IP of all machines that want to run parallel learning in and allocate one TCP port (assume 12345 here) for all machines,
+It needs to collect IP of all machines that want to run distributed learning in and allocate one TCP port (assume 12345 here) for all machines,
and change firewall rules to allow income of this port (12345). Then write these IP and ports in one file (assume ``mlist.txt``), like following:
.. code::
@@ -68,7 +68,7 @@ and change firewall rules to allow income of this port (12345). Then write these
MPI Version
^^^^^^^^^^^
-It needs to collect IP (or hostname) of all machines that want to run parallel learning in.
+It needs to collect IP (or hostname) of all machines that want to run distributed learning in.
Then write these IP in one file (assume ``mlist.txt``) like following:
.. code::
@@ -132,7 +132,7 @@ MPI Version
Example
^^^^^^^
-- `A simple parallel example`_
+- `A simple distributed learning example`_
.. _MMLSpark: https://aka.ms/spark
@@ -148,4 +148,4 @@ Example
.. _here: https://www.youtube.com/watch?v=iqzXhp5TxUY
-.. _A simple parallel example: https://github.com/microsoft/lightgbm/tree/master/examples/parallel_learning
+.. _A simple distributed learning example: https://github.com/microsoft/lightgbm/tree/master/examples/parallel_learning
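Both the socket and MPI hunks above refer to an ``mlist.txt`` machine list. A sketch of what such a file might contain for two machines on the assumed port 12345 (the IP addresses are placeholders; the socket version expects ``ip port`` pairs, while the MPI version only needs the IPs or hostnames):

```
192.168.0.1 12345
192.168.0.2 12345
```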
@@ -181,7 +181,7 @@ Core Parameters
- ``voting``, voting parallel tree learner, aliases: ``voting_parallel``
-- refer to `Parallel Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details
+- refer to `Distributed Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details
- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
@@ -195,7 +195,7 @@ Core Parameters
- be aware a task manager or any similar CPU monitoring tool might report that cores not being fully utilized. **This is normal**
-- for parallel learning, do not use all CPU cores because this will cause poor performance for the network communication
+- for distributed learning, do not use all CPU cores because this will cause poor performance for the network communication
- **Note**: please **don't** change this during training, especially when running multiple jobs simultaneously by external packages, otherwise it may cause undesirable errors
@@ -714,7 +714,7 @@ Dataset Parameters
- ``pre_partition`` :raw-html:`<a id="pre_partition" title="Permalink to this parameter" href="#pre_partition">&#x1F517;&#xFE0E;</a>`, default = ``false``, type = bool, aliases: ``is_pre_partition``
-- used for parallel learning (excluding the ``feature_parallel`` mode)
+- used for distributed learning (excluding the ``feature_parallel`` mode)
- ``true`` if training data are pre-partitioned, and different machines use different partitions
@@ -1133,7 +1133,7 @@ Network Parameters
- ``num_machines`` :raw-html:`<a id="num_machines" title="Permalink to this parameter" href="#num_machines">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``num_machine``, constraints: ``num_machines > 0``
-- the number of machines for parallel learning application
+- the number of machines for distributed learning application
- this parameter is needed to be set in both **socket** and **mpi** versions
@@ -1149,7 +1149,7 @@ Network Parameters
- ``machine_list_filename`` :raw-html:`<a id="machine_list_filename" title="Permalink to this parameter" href="#machine_list_filename">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``machine_list_file``, ``machine_list``, ``mlist``
-- path of file that lists machines for this parallel learning application
+- path of file that lists machines for this distributed learning application
- each line contains one IP and one port for one machine. The format is ``ip port`` (space as a separator)
...
@@ -77,7 +77,7 @@ Examples
- `Lambdarank <https://github.com/microsoft/LightGBM/tree/master/examples/lambdarank>`__
-- `Parallel Learning <https://github.com/microsoft/LightGBM/tree/master/examples/parallel_learning>`__
+- `Distributed Learning <https://github.com/microsoft/LightGBM/tree/master/examples/parallel_learning>`__
.. _CSV: https://en.wikipedia.org/wiki/Comma-separated_values
...
@@ -17,7 +17,7 @@ Welcome to LightGBM's documentation!
- Faster training speed and higher efficiency.
- Lower memory usage.
- Better accuracy.
-- Support of parallel and GPU learning.
+- Support of parallel, distributed, and GPU learning.
- Capable of handling large-scale data.
For more details, please refer to `Features <./Features.rst>`__.
@@ -36,7 +36,7 @@ For more details, please refer to `Features <./Features.rst>`__.
C API <C-API>
Python API <Python-API>
R API <https://lightgbm.readthedocs.io/en/latest/R/reference/>
-Parallel Learning Guide <Parallel-Learning-Guide>
+Distributed Learning Guide <Parallel-Learning-Guide>
GPU Tutorial <GPU-Tutorial>
Advanced Topics <Advanced-Topics>
FAQ <FAQ>
...
@@ -98,13 +98,13 @@ output_model = LightGBM_model.txt
# output_result= prediction.txt
-# number of machines in parallel training, alias: num_machine
+# number of machines in distributed training, alias: num_machine
num_machines = 1
-# local listening port in parallel training, alias: local_port
+# local listening port in distributed training, alias: local_port
local_listen_port = 12400
-# machines list file for parallel training, alias: mlist
+# machines list file for distributed training, alias: mlist
machine_list_file = mlist.txt
# force splits
...
@@ -100,13 +100,13 @@ output_model = LightGBM_model.txt
# output_result= prediction.txt
-# number of machines in parallel training, alias: num_machine
+# number of machines in distributed training, alias: num_machine
num_machines = 1
-# local listening port in parallel training, alias: local_port
+# local listening port in distributed training, alias: local_port
local_listen_port = 12400
-# machines list file for parallel training, alias: mlist
+# machines list file for distributed training, alias: mlist
machine_list_file = mlist.txt
# force splits
...
@@ -103,11 +103,11 @@ output_model = LightGBM_model.txt
# output_result= prediction.txt
-# number of machines in parallel training, alias: num_machine
+# number of machines in distributed training, alias: num_machine
num_machines = 1
-# local listening port in parallel training, alias: local_port
+# local listening port in distributed training, alias: local_port
local_listen_port = 12400
-# machines list file for parallel training, alias: mlist
+# machines list file for distributed training, alias: mlist
machine_list_file = mlist.txt
Parallel Learning Example
=========================
-Here is an example for LightGBM to perform parallel learning for 2 machines.
+Here is an example for LightGBM to perform distributed learning for 2 machines.
1. Edit [mlist.txt](./mlist.txt): write the ip of these 2 machines that you want to run application on.
@@ -16,6 +16,6 @@ Here is an example for LightGBM to perform parallel learning for 2 machines.
```"./lightgbm" config=train.conf```
-This parallel learning example is based on socket. LightGBM also supports parallel learning based on mpi.
+This distributed learning example is based on socket. LightGBM also supports distributed learning based on MPI.
-For more details about the usage of parallel learning, please refer to [this](https://github.com/microsoft/LightGBM/blob/master/docs/Parallel-Learning-Guide.rst).
+For more details about the usage of distributed learning, please refer to [this](https://github.com/microsoft/LightGBM/blob/master/docs/Parallel-Learning-Guide.rst).
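To make the run step above concrete: with the socket version, the same binary and config are launched on every machine listed in ``mlist.txt``; with an MPI build, the job can instead be launched once through ``mpiexec``. A hedged sketch only (the exact machine-file flag depends on your MPI implementation):

```
# socket version: run this on each of the 2 machines listed in mlist.txt
./lightgbm config=train.conf

# MPI version: launch from one machine (flag may differ, e.g. --hostfile for Open MPI)
mpiexec -machinefile mlist.txt ./lightgbm config=train.conf
```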
@@ -101,11 +101,11 @@ output_model = LightGBM_model.txt
# output_result= prediction.txt
-# number of machines in parallel training, alias: num_machine
+# number of machines in distributed training, alias: num_machine
num_machines = 1
-# local listening port in parallel training, alias: local_port
+# local listening port in distributed training, alias: local_port
local_listen_port = 12400
-# machines list file for parallel training, alias: mlist
+# machines list file for distributed training, alias: mlist
machine_list_file = mlist.txt
@@ -104,11 +104,11 @@ output_model = LightGBM_model.txt
# output_result= prediction.txt
-# number of machines in parallel training, alias: num_machine
+# number of machines in distributed training, alias: num_machine
num_machines = 1
-# local listening port in parallel training, alias: local_port
+# local listening port in distributed training, alias: local_port
local_listen_port = 12400
-# machines list file for parallel training, alias: mlist
+# machines list file for distributed training, alias: mlist
machine_list_file = mlist.txt
@@ -200,7 +200,7 @@ struct Config {
// desc = ``feature``, feature parallel tree learner, aliases: ``feature_parallel``
// desc = ``data``, data parallel tree learner, aliases: ``data_parallel``
// desc = ``voting``, voting parallel tree learner, aliases: ``voting_parallel``
-// desc = refer to `Parallel Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details
+// desc = refer to `Distributed Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details
std::string tree_learner = "serial";
// alias = num_thread, nthread, nthreads, n_jobs
@@ -209,7 +209,7 @@ struct Config {
// desc = for the best speed, set this to the number of **real CPU cores**, not the number of threads (most CPUs use `hyper-threading <https://en.wikipedia.org/wiki/Hyper-threading>`__ to generate 2 threads per CPU core)
// desc = do not set it too large if your dataset is small (for instance, do not use 64 threads for a dataset with 10,000 rows)
// desc = be aware a task manager or any similar CPU monitoring tool might report that cores not being fully utilized. **This is normal**
-// desc = for parallel learning, do not use all CPU cores because this will cause poor performance for the network communication
+// desc = for distributed learning, do not use all CPU cores because this will cause poor performance for the network communication
// desc = **Note**: please **don't** change this during training, especially when running multiple jobs simultaneously by external packages, otherwise it may cause undesirable errors
int num_threads = 0;
@@ -634,7 +634,7 @@ struct Config {
bool feature_pre_filter = true;
// alias = is_pre_partition
-// desc = used for parallel learning (excluding the ``feature_parallel`` mode)
+// desc = used for distributed learning (excluding the ``feature_parallel`` mode)
// desc = ``true`` if training data are pre-partitioned, and different machines use different partitions
bool pre_partition = false;
@@ -961,7 +961,7 @@ struct Config {
// check = >0
// alias = num_machine
-// desc = the number of machines for parallel learning application
+// desc = the number of machines for distributed learning application
// desc = this parameter is needed to be set in both **socket** and **mpi** versions
int num_machines = 1;
@@ -976,7 +976,7 @@ struct Config {
int time_out = 120;
// alias = machine_list_file, machine_list, mlist
-// desc = path of file that lists machines for this parallel learning application
+// desc = path of file that lists machines for this distributed learning application
// desc = each line contains one IP and one port for one machine. The format is ``ip port`` (space as a separator)
// desc = **Note**: can be used only in CLI version
std::string machine_list_filename = "";
...
@@ -80,7 +80,7 @@ class Metadata {
/*!
* \brief Partition meta data according to local used indices if need
-* \param num_all_data Number of total training data, including other machines' data on parallel learning
+* \param num_all_data Number of total training data, including other machines' data on distributed learning
* \param used_data_indices Indices of local used training data
*/
void CheckOrPartition(data_size_t num_all_data,
...
@@ -2333,7 +2333,7 @@ class Booster:
listen_time_out : int, optional (default=120)
Socket time-out in minutes.
num_machines : int, optional (default=1)
-The number of machines for parallel learning application.
+The number of machines for distributed learning application.
Returns
-------
...
@@ -105,7 +105,7 @@ void Application::LoadData() {
config_.num_class, config_.data.c_str());
// load Training data
if (config_.is_data_based_parallel) {
-// load data for parallel training
+// load data for distributed training
train_data_.reset(dataset_loader.LoadFromFile(config_.data.c_str(),
Network::rank(), Network::num_machines()));
} else {
...