---
id: installation
---

# Installation

SuperBench runs validations for AI infrastructure,
so you need to prepare one __control node__, which runs the SuperBench commands,
and one or more __managed nodes__, which are the machines to be validated.

Typically, the __control node__ can be a CPU node, while the __managed nodes__ are GPU nodes with high-speed interconnects.

:::tip Tips
It is fine if you have only one GPU node and want to try SuperBench on it.
The control node and the managed node can be co-located on the same machine.
:::

## Control node

The control node has the following system requirements.

### Requirements

* A recent version of Linux; Ubuntu 18.04 or later is highly recommended.
* [Python](https://www.python.org/) version 3.6 or later (which can be checked by running `python3 --version`).
* [Pip](https://pip.pypa.io/en/stable/installing/) version 18.0 or later (which can be checked by running `python3 -m pip --version`).

:::note
Windows is not supported due to the lack of Ansible support, but you can still use WSL2.
:::
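
The version requirements above can be checked in one pass. The following is a minimal sketch, not part of SuperBench: the helper names `version_ge`, `extract_version`, and `check_tool` are hypothetical, and the comparison assumes GNU `sort -V` is available, as it is on Ubuntu.

```bash
#!/usr/bin/env bash

# Succeed if dotted version $1 is at least $2 (assumes GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the first dotted version number found in a string.
extract_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Usage: check_tool <label> <minimum version> <output of the version command>
check_tool() {
  local found
  found="$(extract_version "$3")"
  if [ -n "$found" ] && version_ge "$found" "$2"; then
    echo "$1 $found: OK (>= $2)"
  else
    echo "$1 ${found:-not found}: needs $2 or later"
    return 1
  fi
}

check_tool "Python" "3.6" "$(python3 --version 2>&1)" || true
check_tool "pip" "18.0" "$(python3 -m pip --version 2>&1)" || true
```

Each check prints one line per tool, so a missing or outdated interpreter is visible before you start the build.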

In addition, the control node must be able to access all managed nodes through SSH.
If you are going to use a password instead of a private key for SSH, you also need to install `sshpass`.

```bash
sudo apt-get install sshpass
```
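
Before running SuperBench, it can help to confirm that the control node reaches every managed node over SSH. The sketch below assumes key-based login: `BatchMode=yes` makes `ssh` fail instead of prompting for a password, so a password-only setup is reported as not reachable. The function name `check_ssh_access` and the host names in the usage comment are hypothetical.

```bash
#!/usr/bin/env bash

# Report which hosts accept a non-interactive SSH login.
# Returns non-zero if at least one host cannot be reached.
check_ssh_access() {
  local failed=0 host
  for host in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "$host: reachable"
    else
      echo "$host: NOT reachable"
      failed=1
    fi
  done
  return "$failed"
}

# Example with placeholder hosts (replace with your managed nodes):
# check_ssh_access user@node0 user@node1
```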

It is also recommended to use [venv](https://docs.python.org/3/library/venv.html) for virtual environments,
but it is not strictly necessary.

```bash
# create a new virtual environment
python3 -m venv --system-site-packages ./venv
# activate the virtual environment
source ./venv/bin/activate

# exit the virtual environment later
# after you finish running superbench
deactivate
```
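
To confirm which environment the build will install into, you can test whether a virtual environment is active. This is a small sketch with a hypothetical helper name, `in_venv`; it relies on the fact that the `activate` script exports `VIRTUAL_ENV` and `deactivate` unsets it.

```bash
#!/usr/bin/env bash

# Succeed if a virtual environment is active in the current shell.
in_venv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
  echo "Using virtual environment: $VIRTUAL_ENV"
else
  echo "No virtual environment is active."
fi
```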

### Build

You can clone the source from GitHub and build it.

:::note Note
To use a release version, check out the corresponding tag. For example:

`git clone -b v0.3.0 https://github.com/microsoft/superbenchmark`
:::

```bash
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark

python3 -m pip install .
make postinstall
```

After installation, you should be able to run the SB CLI.

```bash
sb
```
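
If the shell reports that `sb` cannot be found, checking where the command resolves from is a quick first step. The helper below is a hypothetical sketch, not part of SuperBench; it only uses the shell built-in `command -v`.

```bash
#!/usr/bin/env bash

# Print where a command resolves from, or a hint if it is not on PATH.
locate_cli() {
  local path
  if path="$(command -v "$1")"; then
    echo "$1 -> $path"
  else
    echo "$1 not found on PATH (is the virtual environment activated?)"
    return 1
  fi
}

locate_cli sb || true
```

When a virtual environment is active, the reported path should point inside that environment's `bin` directory.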

## Managed nodes

All managed GPU nodes have the following system requirements.

### Requirements

* A recent version of Linux; Ubuntu 18.04 or later is highly recommended.
* Compatible GPU drivers must be installed correctly.
  * For NVIDIA GPUs, driver version can be checked by running `nvidia-smi`.
* [Docker CE](https://docs.docker.com/engine/install/) version 19.03 or later (which can be checked by running `docker --version`).
* GPU support in Docker.
  * For NVIDIA GPUs, install
  [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit).
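
The requirements above can be spot-checked on each managed node with a short script. This is a sketch under stated assumptions: it targets NVIDIA GPUs, the helper names `version_ge` and `docker_version_ok` are hypothetical, the version comparison assumes GNU `sort -V`, and the `ubuntu` image in the final comment is only an example for a container that should see the GPUs.

```bash
#!/usr/bin/env bash

# Succeed if dotted version $1 is at least $2 (assumes GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Check a `docker --version` style string against the 19.03 minimum.
docker_version_ok() {
  local v
  v="$(printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1)"
  [ -n "$v" ] && version_ge "$v" "19.03"
}

# GPU driver: print the GPU name and driver version if nvidia-smi exists.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
  echo "nvidia-smi not found: install the NVIDIA driver first"
fi

# Docker: compare the installed version with the documented minimum.
if docker_version_ok "$(docker --version 2>/dev/null)"; then
  echo "Docker version OK"
else
  echo "Docker 19.03 or later is required"
fi

# GPU support in Docker (run manually; the image name is an assumption):
# docker run --rm --gpus all ubuntu nvidia-smi
```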