Latest tested image: torch2.4.1-py3.10-dtk25.04-beta-das-alpha

This image ships with transformer_engine 1.8.

Clone this project with git.

Start the container:

```bash
docker run -it \
    --shm-size=32G \
    --device=/dev/kfd \
    --device=/dev/mkfd \
    --device=/dev/dri \
    --cap-add=SYS_PTRACE \
    --security-opt seccomp=unconfined \
    --ulimit memlock=-1:-1 \
    --ipc=host \
    --network=host \
    --group-add video \
    --privileged \
    --name nemo_dtk25.4 \
    -v /opt/hyhal:/opt/hyhal \
    -v /path/to/data/:/data \
    -v /path/to/workspace/:/workspace \
    ce83b4a462d9 \
    /bin/bash
```

Install dependencies:

```bash
# Install the NeMo package and its requirements from the Tsinghua PyPI mirror
cd nemo_dtk25-2.0.0.rc0.beta
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install . -i https://pypi.tuna.tsinghua.edu.cn/simple

# Install Megatron-LM from the sibling directory
cd .. && cd Megatron-LM-core_r0.7.0.beta
pip install . -i https://pypi.tuna.tsinghua.edu.cn/simple
```

Run the fine-tuning script:

Single node, eight cards: `bash K100AI_finetune.sh >& K100AI_finetune.log`
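
If the container fails to start or cannot see the accelerators, one likely cause is that a device node or host path passed to `docker run` does not exist on the host. The following is a minimal sketch of a pre-flight check to run on the host before `docker run`. The helper name `missing_paths` is hypothetical and not part of this project; the paths are the ones used in the `docker run` command.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this project):
# print every path from the argument list that does not exist.
missing_paths() {
    local p
    for p in "$@"; do
        [ -e "$p" ] || printf '%s\n' "$p"
    done
}

# Device nodes and host directory taken from the docker run command.
missing=$(missing_paths /dev/kfd /dev/mkfd /dev/dri /opt/hyhal)
if [ -n "$missing" ]; then
    printf 'Missing on this host:\n%s\n' "$missing" >&2
else
    echo "All device nodes and /opt/hyhal are present."
fi
```

The check only tests for existence; it does not verify driver versions or permissions, so a clean result is necessary but not sufficient for the container to work.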