# TransferBench

TransferBench is a simple utility capable of benchmarking simultaneous copies between user-specified
CPU and GPU devices.

Documentation for TransferBench is available at
[https://rocm.docs.amd.com/projects/TransferBench/en/latest/index.html](https://rocm.docs.amd.com/projects/TransferBench/en/latest/index.html).

## Requirements

* You must have a ROCm stack installed on your system (HIP runtime)
* You must have `libnuma` installed on your system
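
A rough way to check both requirements before building is sketched below. It assumes `hipcc` is on your `PATH` or under the default `/opt/rocm/bin` location, and that `ldconfig` is available to query the linker cache; the function name `check_requirements` is purely illustrative and not part of TransferBench.

```shell
# Illustrative helper (not part of TransferBench): report whether the
# build prerequisites appear to be present on this machine.
check_requirements() {
  # HIP compiler: look on PATH first, then in the default ROCm location.
  if command -v hipcc >/dev/null 2>&1 || [ -x /opt/rocm/bin/hipcc ]; then
    echo "hipcc: found"
  else
    echo "hipcc: not found"
  fi

  # libnuma: search the dynamic linker cache. If ldconfig is missing or
  # not on PATH, this reports "not found" even when the library exists.
  if ldconfig -p 2>/dev/null | grep -q libnuma; then
    echo "libnuma: found"
  else
    echo "libnuma: not found"
  fi
}

check_requirements
```

A "not found" result only means the check could not locate the component in these default places; a ROCm installation in a non-default prefix will not be detected.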

## Documentation

To build the documentation locally, run the following commands:

```shell
cd docs

pip3 install -r .sphinx/requirements.txt

python3 -m sphinx -T -E -b html -d _build/doctrees -D language=en . _build/html
```

## Building TransferBench

You can build TransferBench using either the Makefile or CMake.

* Makefile:

  ```shell
  make
  ```

* CMake:

  ```shell
  mkdir build
  cd build
  CXX=/opt/rocm/bin/hipcc cmake ..
  make
  ```

  If ROCm is not installed in `/opt/rocm/`, you must set `ROCM_PATH` to the correct location.
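
As a minimal sketch of the non-default case, the CMake steps above might look like the following. The prefix `/opt/rocm-custom` is a placeholder, not a real default, and the sketch assumes the build reads `ROCM_PATH` from the environment and that `hipcc` lives under `${ROCM_PATH}/bin`.

```shell
# Assumption: /opt/rocm-custom is a placeholder; substitute your real ROCm prefix
# (or export ROCM_PATH yourself before running this).
export ROCM_PATH="${ROCM_PATH:-/opt/rocm-custom}"

if [ -x "${ROCM_PATH}/bin/hipcc" ]; then
  # Same CMake steps as above, with the compiler taken from ROCM_PATH.
  mkdir -p build
  cd build
  CXX="${ROCM_PATH}/bin/hipcc" cmake ..
  make
else
  echo "hipcc not found under ${ROCM_PATH}; check ROCM_PATH" >&2
fi
```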

## NVIDIA platform support

You can build TransferBench to run on NVIDIA platforms via HIP or native NVCC.

Use the following code to build with HIP for NVIDIA (note that you must have a HIP-compatible CUDA
version installed, e.g., CUDA 11.5):

```shell
CUDA_PATH=<path_to_CUDA> HIP_PLATFORM=nvidia make
```

Use the following code to build with native NVCC (builds `TransferBenchCuda`):

```shell
make
```

## Things to note

* Running TransferBench with no arguments displays usage instructions and detected topology
  information
* You can use several preset configurations instead of a configuration file:
  * `p2p`: Peer-to-peer benchmark test
  * `sweep`: Sweep across possible sets of transfers
  * `rsweep`: Random sweep across possible sets of transfers
* When the same GPU executor is used in multiple simultaneous transfers, the transfers may be
  serialized because of the limit on the number of available hardware queues
  * The maximum number of hardware queues can be adjusted via `GPU_MAX_HW_QUEUES`
  * Alternatively, running in single-stream mode (`USE_SINGLE_STREAM=1`) may avoid this issue
    by launching all transfers on a single stream, rather than on individual streams
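
The notes above can be combined roughly as follows. This is a sketch under stated assumptions: that the binary is `./TransferBench` in the current directory, that a preset name is passed as the first argument in place of a configuration file, and that both settings are read from the environment. The variable `TB` and the queue count `8` are illustrative only.

```shell
# Assumption: the binary is ./TransferBench in the current directory;
# override TB if yours lives elsewhere.
TB="${TB:-./TransferBench}"

if [ -x "$TB" ]; then
  # Run the peer-to-peer preset instead of supplying a configuration file.
  "$TB" p2p

  # Same preset, with all transfers launched on a single stream.
  USE_SINGLE_STREAM=1 "$TB" p2p

  # Same preset, with a different hardware queue limit (8 is an arbitrary example).
  GPU_MAX_HW_QUEUES=8 "$TB" p2p
else
  echo "TransferBench not found at $TB; build it first" >&2
fi
```

Comparing the three runs on your own system is one way to see whether hardware queue limits are affecting results; the outcome depends on the hardware and ROCm version.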