Commit 95e21a4a authored by zhaoying1
Update README.md

parent 884640da
@@ -92,7 +92,7 @@ $ tree ./data/
── dataset.pt
```
-## Downloading model weights
+### Downloading model weights
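1. Option 1: download the model in Hugging Face format. Taking the 7B model as an example, first download the pretrained [LLaMA weights](https://huggingface.co/decapoda-research/llama-7b-hf), then convert them to TencentPretrain format: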
```commandline
python3 scripts/convert_llama_from_huggingface_to_tencentpretrain.py --input_model_path $LLaMA_HF_PATH \
@@ -183,7 +183,7 @@ cd multi_node
bash run-13b.sh
```
-## Splitting the model into blocks
+### Splitting the model into blocks
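When training is initialized, each GPU loads its own copy of the model, so the memory requirement is model size * number of GPUs. If memory is insufficient, the model can be split into blocks with the script below and then loaded block by block.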
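As a rough illustration of the estimate above (model size * number of GPUs), here is a minimal sketch. The function name `init_memory_gib` and the 2-bytes-per-parameter (fp16) figure are assumptions for the example; neither comes from the TencentPretrain code.

```python
def init_memory_gib(num_params: float, bytes_per_param: int, num_gpus: int) -> float:
    """Approximate memory (GiB) needed when every GPU process loads a full model copy."""
    model_bytes = num_params * bytes_per_param  # size of one copy of the weights
    return model_bytes * num_gpus / 2**30       # one copy per GPU, converted to GiB


# Hypothetical setup: 7e9 parameters, 2 bytes each (fp16 assumed), 8 GPUs
print(f"{init_memory_gib(7e9, 2, 8):.1f} GiB")
```

The figure grows linearly with the number of GPUs, which is why block-wise loading helps on nodes with many cards.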
```commandline
python3 scripts/convert_model_into_blocks.py \
......