Commit 2d952503 authored by dcuai

Update dtk24.04.1 image

parent 2f261e8f
@@ -23,11 +23,12 @@ mkdir -p OFA/checkpoints
../../checkpoints/ofa_large.pt # before finetune training, download the pretrained weights ofa_large.pt into the checkpoints folder.
```
- https://ofa-beijing.oss-cn-beijing.aliyuncs.com/checkpoints/ofa_large.pt
Alternatively, download the weights as described in the [pretrained weights](## 预训练权重) section.
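The download step can be made repeatable with a small guard that skips the transfer when the file is already in place. This is only a sketch: `fetch_if_missing` is a hypothetical helper, not part of the OFA repository, and the destination path in the commented example assumes the `OFA/checkpoints` folder created earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): download a file only
# when it is not already present and non-empty.
fetch_if_missing() {
    url=$1
    dest=$2
    if [ -s "$dest" ]; then
        echo "skip: $dest already exists"
        return 0
    fi
    mkdir -p "$(dirname "$dest")"
    # --fail makes curl exit non-zero on HTTP errors instead of saving an error page;
    # -L follows redirects.
    curl --fail -L -o "$dest" "$url"
}

# Example call, commented out so that sourcing this file triggers no download.
# The destination path is an assumption based on the checkpoints folder above.
# fetch_if_missing https://ofa-beijing.oss-cn-beijing.aliyuncs.com/checkpoints/ofa_large.pt OFA/checkpoints/ofa_large.pt
```

Checking with `-s` rather than `-e` means an empty file left behind by an interrupted download is fetched again.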
### Docker (Method 1)
```
-docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04-py38-latest
+docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
# Replace <your IMAGE ID> with the image ID of the docker image pulled above
-docker run --shm-size=32G --name=ofa --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video -v $PWD/OFA:/home/OFA -it <your IMAGE ID> bash
+docker run --shm-size=32G --name=ofa --privileged=true --device=/dev/kfd --device=/dev/dri/ -v /opt/hyhal:/opt/hyhal:ro --group-add video -v $PWD/OFA:/home/OFA -it <your IMAGE ID> bash
pip install -r requirements.txt
cp -r OFA/nltk_data /root/ # place the .zip archives that the nltk library needs to load
```
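The updated `docker run` command passes `/dev/kfd` and `/dev/dri` into the container and bind-mounts `/opt/hyhal` read-only, so all three must exist on the host. A minimal pre-flight sketch follows; `missing_host_paths` is a hypothetical helper, and its optional root-prefix argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the project): print every host
# path used by the docker run command that does not exist.
# $1 is an optional root prefix, used only for testing against a scratch tree.
missing_host_paths() {
    root=${1:-}
    for p in /dev/kfd /dev/dri /opt/hyhal; do
        [ -e "$root$p" ] || echo "$p"
    done
}

# Example use before starting the container (commented out):
# missing=$(missing_host_paths)
# [ -z "$missing" ] || echo "missing on host: $missing"
```

An empty result only shows that the paths are present; it does not confirm that the driver stack behind them works.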
@@ -35,7 +36,7 @@ cp -r OFA/nltk_data /root/ # place the .zip archives that the nltk library needs to load
```
cd OFA/docker
docker build --no-cache -t ofa:latest .
-docker run --shm-size=32G --name=ofa --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video -v $PWD/../../OFA:/home/OFA -it ofa bash
+docker run --shm-size=32G --name=ofa --privileged=true --device=/dev/kfd --device=/dev/dri/ -v /opt/hyhal:/opt/hyhal:ro --group-add video -v $PWD/../../OFA:/home/OFA -it ofa:latest bash
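# If building the environment through the Dockerfile takes a long time, comment out the pip install step in it and install the Python libraries after starting the container: pip install -r requirements.txt
cp -r OFA/nltk_data /root/ # place the .zip archives that the nltk library needs to load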
cd OFA && pip install -e ./fairseq/
@@ -44,11 +45,11 @@ cd OFA && pip install -e ./fairseq/
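1. The special deep learning libraries this project requires for DCU GPUs can be downloaded and installed from the 光合 developer community: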
- https://developer.hpccube.com/tool/
```
-DTK driver: dtk23.04
-python: python3.8
-torch: 1.13.1
-torchvision: 0.14.1
-torchaudio: 0.13.1
+DTK driver: dtk24.04.1
+python: python3.10
+torch: 2.1.0
+torchvision: 0.16.0
+torchaudio: 2.1.2
```
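`Tips: the versions of the DCU-related tools above (dtk driver, python, torch, etc.) must match each other exactly, and fairseq must be the version bundled with this project, as modified by the original open-source authors (v1.0.0).`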
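Because the versions must match exactly, it may help to compare what is installed against the list above before training. The sketch below uses a hypothetical `version_matches` helper; accepting a `+...` local suffix is an assumption, in case the DCU builds of torch report one.

```shell
#!/bin/sh
# Hypothetical helper (not part of the project): succeed when the installed
# version equals the expected one, optionally followed by a "+local" suffix.
# Tolerating the suffix is an assumption about how vendor builds are labelled.
version_matches() {
    expected=$1
    actual=$2
    case "$actual" in
        "$expected"|"$expected"+*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use inside the container (commented out because it needs torch):
# actual=$(python -c 'import torch; print(torch.__version__)')
# version_matches 2.1.0 "$actual" || echo "torch version mismatch: $actual"
```

The same call can be repeated for torchvision and torchaudio with their expected versions from the list.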
@@ -101,7 +102,7 @@ nohup sh train_caption_stage2.sh > train_stage2.out & # stage 2, load the best
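## Inference
The fairseq version installed earlier cannot run inference, so fairseq has to be reinstalled at this point; the official open-source fairseq code on GitHub may also fail to install. Installing it as follows is recommended: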
```
-pip install fairseq==0.12.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
+pip install fairseq==0.12.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
```
```
cp stage2_checkpoints/1e-5_3/checkpoint_best.pt ../../checkpoints/caption_large_best_clean.pt
......