# MiniGo
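## Model Introduction
Minigo is a Go-playing program based on deep reinforcement learning, inspired by the AlphaGo algorithm developed by Google DeepMind. It is implemented on the TensorFlow framework.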

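## Model Structure
The core of Minigo is the reinforcement learning loop described in the AlphaZero paper. In short, selfplay with the current generation of network weights produces games, and those games serve as the training data for the next generation of network weights.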
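
The loop can be sketched in a few lines of Python. This is only an illustration of the control flow, not Minigo's implementation: `selfplay`, `train`, and `reinforcement_loop` are hypothetical names, the network is reduced to a generation counter, and game outcomes are random placeholders.

```python
import random
from typing import List, Tuple

Game = Tuple[int, int]  # (generation that played the game, outcome: +1 or -1)


def selfplay(generation: int, num_games: int, rng: random.Random) -> List[Game]:
    """Placeholder: real selfplay plays full games of Go guided by the network."""
    return [(generation, rng.choice([-1, 1])) for _ in range(num_games)]


def train(generation: int, games: List[Game]) -> int:
    """Placeholder: real training fits the network to the selfplay games."""
    if not games:
        raise ValueError("training needs at least one selfplay game")
    return generation + 1


def reinforcement_loop(
    generations: int, games_per_generation: int, seed: int = 0
) -> List[Tuple[int, int]]:
    """Alternate selfplay and training; return (new generation, games used) per step."""
    rng = random.Random(seed)
    generation = 0
    history = []
    for _ in range(generations):
        games = selfplay(generation, games_per_generation, rng)
        generation = train(generation, games)
        history.append((generation, len(games)))
    return history
```

Each pass through the loop consumes only the games produced by the current generation, which is why no external training set is needed.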

## Target Accuracy

50% win rate vs. checkpoint
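
As a rough illustration of this criterion, the check below computes a win rate from a list of game results. The function names are hypothetical, and treating exactly 50% as a pass (`>=`) is an assumption; the benchmark's own evaluation code is authoritative.

```python
from typing import Sequence


def win_rate(results: Sequence[bool]) -> float:
    """Fraction of evaluation games won (True means a win for the trained model)."""
    if not results:
        raise ValueError("at least one evaluation game is required")
    return sum(1 for won in results if won) / len(results)


def reached_target(results: Sequence[bool], threshold: float = 0.5) -> bool:
    """Assumed rule: the target is met when the win rate is at least the threshold."""
    return win_rate(results) >= threshold
```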

## MLPerf Reference Code Version

Version: v2.1

Original code location: https://github.com/mlcommons/training_results_v2.1/tree/main/NVIDIA/benchmarks/minigo/implementations/tensorflow-22.09

## Dataset
Training data: all training data is generated during the selfplay phase of the reinforcement learning loop.
The only data that needs to be downloaded is the checkpoint and the target model, which can be fetched as follows:

    # Download & extract bootstrap checkpoint.
    gsutil cp gs://minigo-pub/ml_perf/0.7/checkpoint.tar.gz .
    tar xfz checkpoint.tar.gz -C ml_perf/
    # Download and freeze the target model.
    mkdir -p ml_perf/target/
    gsutil cp gs://minigo-pub/ml_perf/0.7/target.* ml_perf/target/
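
Before starting a run it can help to confirm that the download step left files where the commands above put them. The sketch below is a hypothetical helper, not part of the repository. It checks only the two locations named in those commands (`ml_perf/` and `ml_perf/target/target.*`) and makes no assumption about what the checkpoint archive contains.

```python
from pathlib import Path
from typing import List


def check_bootstrap_data(root: str = "ml_perf") -> List[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    root_dir = Path(root)
    target_dir = root_dir / "target"
    problems = []
    # The checkpoint archive is extracted into this directory.
    if not root_dir.is_dir():
        problems.append(f"missing directory: {root_dir}")
    # The target model files are copied here with names starting with "target.".
    if not target_dir.is_dir() or not any(target_dir.glob("target.*")):
        problems.append(f"no target.* files in: {target_dir}")
    return problems
```

Calling `check_bootstrap_data()` from the directory that contains `ml_perf/` returns an empty list when both locations are populated.
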
## Training

### Test Scale

Performance and accuracy are tested on a single node with 8 accelerator cards.

### Environment Setup

A Docker image for training can be pulled from [光源](https://www.sourcefind.cn/#/service-details):

    docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:mlperf-minigo-latest

Install the Python dependencies:

    pip3 install -r requirement.txt
    python3 setup.py install
    cd ./cocoapi-0.7.0/PythonAPI; python3 setup.py install

Build the minigo library by following the Dockerfile; the main steps are:

    ENV MINIGO_BAZEL_CACHE_DIR /opt/reinforcement/minigo-bazel-cache
    
    # Copy TF dependency
    RUN mkdir minigo/cc/tensorflow/lib \
     && cp /usr/local/lib/python3.8/dist-packages/tensorflow_core/libtensorflow_framework.so.1 minigo/cc/tensorflow/lib \
     && cp /usr/local/lib/python3.8/dist-packages/tensorflow_core/libtensorflow_cc.so.1 minigo/cc/tensorflow/lib \
     && cp -r /usr/local/lib/python3.8/dist-packages/tensorflow_core/include minigo/cc/tensorflow/include
    
    # Build Minigo
    RUN mkdir -p "${MINIGO_BAZEL_CACHE_DIR}" \
     && bazel --output_user_root="${MINIGO_BAZEL_CACHE_DIR}" build -c opt \
          --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
          --copt=-O3 \
          --define=board_size="${BOARD_SIZE}" \
          --define=tf=1 \
          cc:minigo_python.so
    
    ENV PYTHONPATH "${PYTHONPATH}:/opt/reinforcement/minigo/bazel-bin/cc"
    RUN echo '/usr/local/lib/python3.8/dist-packages/tensorflow_core' > /etc/ld.so.conf.d/tensorflow.conf && ldconfig
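
After the build, a quick way to see whether the `PYTHONPATH` setting took effect is to ask Python whether it can locate the extension. The module name `minigo_python` is assumed from the Bazel target `cc:minigo_python.so`, and `extension_available` is a hypothetical helper.

```python
import importlib.util


def extension_available(module_name: str = "minigo_python") -> bool:
    """Return True if a top-level module with this name can be found on sys.path."""
    # find_spec only locates the module; it does not load the shared library.
    return importlib.util.find_spec(module_name) is not None
```

Because `find_spec` does not load the library, a `True` result confirms the search path only. Link-time problems, such as a missing `libtensorflow_cc.so.1`, would surface on the actual `import`.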

### Training

Training command:

    bash sbatch.sh >& log.txt &

The output is written to log.txt.
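
To inspect a finished run, the log can be scanned for MLPerf logging events. The sketch below assumes that `log.txt` contains lines in the MLCommons logging format, that is, a `:::MLLOG` marker followed by a JSON object with a `key` field, and that the final `run_stop` event carries a `status` entry in its `metadata`. The helper names are hypothetical, and the exact events emitted by this implementation may differ.

```python
import json
from typing import Iterable, List, Optional

MLLOG_PREFIX = ":::MLLOG"


def parse_mllog_lines(lines: Iterable[str]) -> List[dict]:
    """Collect the JSON payloads of all well-formed :::MLLOG lines."""
    events = []
    for line in lines:
        start = line.find(MLLOG_PREFIX)
        if start == -1:
            continue  # ordinary output, not a logging event
        payload = line[start + len(MLLOG_PREFIX):].strip()
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip truncated or interleaved lines
        if isinstance(event, dict):
            events.append(event)
    return events


def run_status(events: List[dict]) -> Optional[str]:
    """Return the status recorded on the first run_stop event, if any."""
    for event in events:
        if event.get("key") == "run_stop":
            return (event.get("metadata") or {}).get("status")
    return None
```

A typical use is `run_status(parse_mllog_lines(open("log.txt")))`, which returns `None` while the run is still in progress or if no `run_stop` event was logged.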

## Test Results
With the input data described above and 8 Z100L accelerator cards, training reaches the official convergence requirement.

## Version History
* https://developer.hpccube.com/codes/modelzoo/mlperf_minigo

## References
* https://mlcommons.org/en/
* https://github.com/mlcommons