Update the image to dtk24.04.1

Commit 7acb6a42 authored Aug 23, 2024 by dcuai

Showing with 58 additions and 8 deletions

--- a/README.md
+++ b/README.md
@@ -27,33 +27,78 @@ SAM consists of an image encoder and a fast prompt encoder/mask decoder, and can reuse
 ### Docker (Method 1)
 Pull the image from [SourceFind (光源)](https://www.sourcefind.cn/#/service-list)
 ```
-docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-22.10-py39-latest
+docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
-docker run -it --network=host --name=SAM_pytorch  --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G --group-add video --cap-add=SYS_PTRACE image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.10.0-centos7.6-dtk-22.10-py39-latest /bin/bash
+docker run -it --network=host --name=SAM_pytorch -v /opt/hyhal:/opt/hyhal:ro --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G --group-add video --cap-add=SYS_PTRACE image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10 /bin/bash
 ```
 Install the other dependencies:
 ```
-pip install opencv-python pycocotools matplotlib onnxruntime onnx
+pip install opencv-python pycocotools matplotlib onnxruntime onnx 
 ```
 ### Dockerfile (Method 2)
 ```
 cd /path/to/dockerfile
-docker build --no-cache -t SAM_pytorch:latest .
+docker build --no-cache -t sam_pytorch:latest .
-docker run -it --network=host --name=SAM_pytorch  --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16g --group-add video --cap-add=SYS_PTRACE -it SAM_pytorch:latest bash
+docker run -it --network=host --name=SAM_pytorch -v /opt/hyhal:/opt/hyhal:ro  --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16g --group-add video --cap-add=SYS_PTRACE -it SAM_pytorch:latest bash
 ```
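 ### Anaconda (Method 3)
+1. The special deep learning libraries that this project needs for DCU cards can be downloaded and installed from the Guanghe developer community (光合开发者社区): https://developer.hpccube.com/tool/
+```
+DTK software stack: dtk24.04.1
+python: python3.10
+torch: 2.1.0
+torchvision: 0.16.0
+```
+Tips: the versions of the DCU-related tools listed above (DTK software stack, python, torch, etc.) must match one another exactly
+2. Install the other dependencies
 Install directly with pip install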
 ```
+pip install opencv-python pycocotools matplotlib onnxruntime onnx
 pip install git+https://github.com/facebookresearch/segment-anything.git
 ```
 Or download the source and install it locally
 ```
+pip install opencv-python pycocotools matplotlib onnxruntime onnx
 git clone git@github.com:facebookresearch/segment-anything.git
 cd segment-anything
 pip install -e .
 ```
 ## Dataset
-Dataset name: SA-1B Dataset
+In this test, the training part uses the COCO2017 dataset.
+- Fast dataset download center:
+  - [SCNet AIDatasets](http://113.200.138.88:18080/aidatasets)
+- Fast-channel download address for the dataset:
+  - [Fast dataset download address](http://113.200.138.88:18080/aidatasets/coco2017)
+- Official download addresses
+  - [Training data](http://images.cocodataset.org/zips/train2017.zip)
+  - [Validation data](http://images.cocodataset.org/zips/val2017.zip)
+  - [Test data](http://images.cocodataset.org/zips/test2017.zip)
+  - [Label data](https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels.zip)
+The dataset directory structure is as follows:
+```
+├── images 
+│   ├── train2017
+│   ├── val2017
+│   ├── test2017
+├── labels
+│   ├── train2017
+│   ├── val2017
+├── annotations
+│   ├── instances_val2017.json
+├── LICENSE
+├── README.txt 
+├── test-dev2017.txt
+├── train2017.txt
+├── val2017.txt
+```
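The directory layout above can be checked before training starts. The following is a minimal sketch, not part of the commit: the function name `missing_entries` and the two constant lists are hypothetical, and the entries are copied from the tree shown above (`LICENSE` and `README.txt` are left out on the assumption that training does not read them).

```python
from pathlib import Path

# Entries copied from the directory tree in the README diff.
# LICENSE and README.txt are omitted (assumption: not needed for training).
EXPECTED_DIRS = [
    "images/train2017",
    "images/val2017",
    "images/test2017",
    "labels/train2017",
    "labels/val2017",
]
EXPECTED_FILES = [
    "annotations/instances_val2017.json",
    "test-dev2017.txt",
    "train2017.txt",
    "val2017.txt",
]


def missing_entries(root):
    """Return the expected directories and files that are absent under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_entries("/path/to/coco2017")` returns an empty list when every listed entry exists, and otherwise the relative paths that still need to be downloaded or unpacked.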
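+Inference dataset name: SA-1B Dataset
 The full dataset can be downloaded [here](https://ai.facebook.com/datasets/segment-anything-downloads/)
 The mini dataset used in this project for trial training is structured as follows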
 ```
@@ -68,14 +113,19 @@ pip install -e .
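 The official site provides pretrained weights and a script for generating masks, but no training script; a third-party example script can be used for fine-tuning
 If you are interested, see [here](https://github.com/luca-medeiros/lightning-sam/blob/main/lightning_sam/train.py).
-### Single node, single card
+### Single node, multiple cards
+The pretrained model can be downloaded [here](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth)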
 ```
 git clone https://github.com/luca-medeiros/lightning-sam.git
 cd lightning-sam
+# Change line 6 of pyproject.toml to: documentation = "https://this/needs/to/be/something/otherwise/poetry/complains"
 pip install .
+pip install tensorboardX==2.6.2.2
+cd lightning_sam
+# Adjust the relevant parameters in config.py to your setup: number of cards, dataset path, checkpoint model path
 python train.py
 ```
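-pip install . may overwrite the DCU build of pytorch; the matching DCU packages can be downloaded from the [developer community](https://cancon.hpccube.com:65024/4/main/pytorch)
+pip install . overwrites the DCU builds of pytorch, torchvision, and triton; the matching DCU packages must be downloaded from the [developer community](https://cancon.hpccube.com:65024/4/main/pytorch)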
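Because `pip install .` can replace the DCU packages, it may help to compare the installed versions against the ones the README lists after the install finishes. This is a rough sketch, not part of the commit: `find_mismatches` and `EXPECTED` are hypothetical names, the expected versions are the torch and torchvision values given under Anaconda (Method 3), and triton is left out because the README states no version for it. The check compares only the base version, so it cannot tell a DCU wheel from an upstream wheel with the same version number.

```python
from importlib import metadata

# Versions taken from the README's Anaconda section; triton is not listed there.
EXPECTED = {"torch": "2.1.0", "torchvision": "0.16.0"}


def find_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Map each package whose installed base version differs to what was found.

    A value of None means the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Drop any local version suffix ("+...") before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches
```

An empty result from `find_mismatches()` means the base versions still match; any entry in the result names a package to reinstall from the developer community.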
 ## Inference
 ```
 python scripts/amg.py --checkpoint <path/to/checkpoint> --model-type <model_type> --input <image_or_folder> --output <path/to/output>