"git@developer.sourcefind.cn:orangecat/ollama.git" did not exist on "ed5fb088c4d7f57ea3ccf629e36ecb2e857e22eb"
Commit 74c7c1cc authored by chenych's avatar chenych
Browse files

Add infos in README

parent 21ddcc03
# Contributing to Llama 3
We want to make contributing to this project as easy and transparent as
possible.
## Our Development Process
... (in particular how this is synced with internal changes to the project)
## Pull Requests
We actively welcome your pull requests.
1. Fork the repo and create your branch from `main`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").
## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Meta's open source projects.
Complete your CLA here: <https://code.facebook.com/cla>
## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.
Meta has a [bounty program](http://facebook.com/whitehat/info) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.
## Coding Style
* 2 spaces for indentation rather than tabs
* 80 character line length
* ...
## License
By contributing to Llama 3, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.
# Contributors
None
Gemma 2 does not use absolute positional encoding; instead, a RoPE embedding is applied before each layer.
Gemma 2 replaces the ReLU activation with GeGLU. GeGLU is an improved technique based on the gated linear unit (GLU) and delivers better performance.
Gemma 2 applies normalization both before and after every transformer layer, using RMSNorm as the normalization layer. This normalization strategy helps improve the model's stability and performance.
<div align=center>
<img src="./docs/gemma2.jpg"/>
</div>
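The three components above (RoPE, GeGLU, RMSNorm) can be sketched in a few lines of NumPy. This is an illustrative sketch of the generic techniques, not Gemma 2's actual implementation; all function and variable names below are my own.

```python
import numpy as np

def rope(x, base=10000.0):
    """Rotary position embedding: rotate channel pairs by position-dependent angles."""
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)           # per-pair rotation frequency
    angles = np.arange(seq)[:, None] * freqs[None, :]   # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def gelu(x):
    """tanh approximation of GELU."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def geglu(x, W, V):
    """GeGLU feed-forward gate: GELU(x @ W) elementwise-scales the linear path x @ V."""
    return gelu(x @ W) * (x @ V)

def rmsnorm(x, weight, eps=1e-6):
    """RMSNorm: scale by the root-mean-square of the features, with no mean subtraction."""
    return x / np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps) * weight
```

Note that RoPE leaves position 0 unrotated (all angles are zero there), and RMSNorm, unlike LayerNorm, does not subtract the mean, which makes it cheaper per token.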
```bash
pip uninstall flash_attn
```
### Single-node single-GPU / single-node multi-GPU
1. Refer to the gemma2 examples provided in `train_full` and `train_lora` under `gemma-2_pytorch/llama-factory-v0.8.3/examples`, modify parameters such as `model_name_or_path`, `dataset`, `learning_rate`, and `cutoff_len` according to your actual needs, then place the modified examples into the corresponding directories under `examples` in the `llama-factory` framework.
```bash
# copy the train_full example
cp llama-factory-v0.8.3/examples/train_full/gemma2_full_sft_ds3.yaml /path/of/llama-factory/examples/train_full/
# copy the train_lora example
cp llama-factory-v0.8.3/examples/train_lora/gemma2_lora_sft_ds3.yaml /path/of/llama-factory/examples/train_lora/
```
2. Run the fine-tuning command:
```bash
cd llama-factory
# full-parameter incremental fine-tuning example
HIP_VISIBLE_DEVICES=0,1 FORCE_TORCHRUN=1 llamafactory-cli train examples/train_l
```
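The YAML examples referenced in step 1 combine model, method, dataset, and training settings in one file. The fragment below is an illustrative sketch of the kinds of fields to adjust; the values are placeholders, not the shipped defaults, and the DeepSpeed config path is an assumption.

```yaml
### model
model_name_or_path: /path/of/gemma2   # placeholder: your local gemma-2 weights

### method
stage: sft
finetuning_type: lora                 # use `full` for full-parameter fine-tuning
deepspeed: examples/deepspeed/ds_z3_config.json  # assumed DeepSpeed stage-3 config path

### dataset
dataset: alpaca_en_demo
cutoff_len: 1024

### output
output_dir: saves/gemma2-lora-sft     # placeholder output directory

### train
learning_rate: 1.0e-4
num_train_epochs: 3.0
```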
## Inference
Use the `transformers` framework for inference; the vllm version must be > 0.5.0.
### Single-node single-GPU
```bash
python inference.py --model_path /path/of/gemma2
```
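For reference, the core of a script like `inference.py` (whose exact contents are not shown here) can be sketched with the standard `transformers` generation API. The function below is a hypothetical sketch; the model path and prompt are placeholders.

```python
def generate(model_path: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Minimal causal-LM generation with transformers (hypothetical sketch)."""
    # Imports are inside the function so the sketch reads without the heavy deps installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Called as `generate("/path/of/gemma2", "Why is the sky blue?")`, this mirrors the command-line invocation above.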
## Results
- Accelerator: K100*2
- Model: gemma-2-9b

<div align=center>
<img src="./docs/results.png" width=1200 height=400/>
</div>
### Accuracy
- Model: gemma-2-2b
- Data: alpaca_en_demo
- Training mode: LoRA finetune + deepspeed_stage3
- Hardware: 4 cards, K100/A800

Training convergence on DCU/NV:
<div align=center>
<img src="./docs/training_loss.png"/>
<img src="./docs/training_loss_nv.png" />
</div>
## Application Scenarios
...