"git@developer.sourcefind.cn:orangecat/ollama.git" did not exist on "ed5fb088c4d7f57ea3ccf629e36ecb2e857e22eb"
Commit 74c7c1cc authored by chenych's avatar chenych
Browse files

Add infos in README

parent 21ddcc03
# Contributing to Llama 3
We want to make contributing to this project as easy and transparent as
possible.
## Our Development Process
... (in particular how this is synced with internal changes to the project)
## Pull Requests
We actively welcome your pull requests.
1. Fork the repo and create your branch from `main`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").
## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Meta's open source projects.
Complete your CLA here: <https://code.facebook.com/cla>
## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.
Meta has a [bounty program](http://facebook.com/whitehat/info) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.
## Coding Style
* 2 spaces for indentation rather than tabs
* 80 character line length
* ...
## License
By contributing to Llama 3, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.
# Contributors
None
Gemma 2 does not use absolute positional encoding; instead, a RoPE embedding is applied before each layer.
Gemma 2 replaces the ReLU activation with GeGLU. GeGLU is an improved technique based on the gated linear unit (GLU) and delivers better performance.
Gemma 2 applies normalization both before and after every transformer layer, using RMSNorm as the normalization layer. This normalization strategy helps improve the model's stability and performance.
<div align=center>
<img src="./docs/gemma2.jpg"/>
</div>
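The three components above (RoPE, GeGLU, RMSNorm) can be sketched in a few lines of NumPy. This is an illustrative sketch of the generic techniques, not Gemma 2's actual implementation; all function and variable names below are my own.

```python
import numpy as np

def rope(x, base=10000.0):
    """Rotary position embedding: rotate channel pairs by position-dependent angles."""
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)           # per-pair rotation frequency
    angles = np.arange(seq)[:, None] * freqs[None, :]   # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def gelu(x):
    """tanh approximation of GELU."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def geglu(x, W, V):
    """GeGLU feed-forward gate: GELU(x @ W) elementwise-scales the linear path x @ V."""
    return gelu(x @ W) * (x @ V)

def rmsnorm(x, weight, eps=1e-6):
    """RMSNorm: scale by the root-mean-square of the features, with no mean subtraction."""
    return x / np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps) * weight
```

Note that RoPE leaves position 0 unrotated (all angles are zero there), and RMSNorm, unlike LayerNorm, does not subtract the mean, which makes it cheaper per token.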
```bash
pip uninstall flash_attn
```
### Single-node single-GPU / single-node multi-GPU
1. Refer to the gemma2 examples provided in `train_full` and `train_lora` under `gemma-2_pytorch/llama-factory-v0.8.3/examples`, modify parameters such as `model_name_or_path`, `dataset`, `learning_rate`, and `cutoff_len` according to your actual needs, then place the modified examples into the corresponding directories under `examples` in the `llama-factory` framework.
```bash
# copy the train_full example
cp llama-factory-v0.8.3/examples/train_full/gemma2_full_sft_ds3.yaml /path/of/llama-factory/examples/train_full/
# copy the train_lora example
cp llama-factory-v0.8.3/examples/train_lora/gemma2_lora_sft_ds3.yaml /path/of/llama-factory/examples/train_lora/
```
2. Run the fine-tuning command:
```bash
cd llama-factory
# full-parameter incremental fine-tuning example
HIP_VISIBLE_DEVICES=0,1 FORCE_TORCHRUN=1 llamafactory-cli train examples/train_l
```
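The YAML examples referenced in step 1 combine model, method, dataset, and training settings in one file. The fragment below is an illustrative sketch of the kinds of fields to adjust; the values are placeholders, not the shipped defaults, and the DeepSpeed config path is an assumption.

```yaml
### model
model_name_or_path: /path/of/gemma2   # placeholder: your local gemma-2 weights

### method
stage: sft
finetuning_type: lora                 # use `full` for full-parameter fine-tuning
deepspeed: examples/deepspeed/ds_z3_config.json  # assumed DeepSpeed stage-3 config path

### dataset
dataset: alpaca_en_demo
cutoff_len: 1024

### output
output_dir: saves/gemma2-lora-sft     # placeholder output directory

### train
learning_rate: 1.0e-4
num_train_epochs: 3.0
```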
## Inference
Use the `transformers` framework for inference; the vllm version must be > 0.5.0.
### Single-node single-GPU
```bash
python inference.py --model_path /path/of/gemma2
```
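For reference, the core of a script like `inference.py` (whose exact contents are not shown here) can be sketched with the standard `transformers` generation API. The function below is a hypothetical sketch; the model path and prompt are placeholders.

```python
def generate(model_path: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Minimal causal-LM generation with transformers (hypothetical sketch)."""
    # Imports are inside the function so the sketch reads without the heavy deps installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Called as `generate("/path/of/gemma2", "Why is the sky blue?")`, this mirrors the command-line invocation above.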
## Results
- Accelerator: K100*2
- Model: gemma-2-9b

<div align=center>
<img src="./docs/results.png" width=1200 height=400/>
</div>
### Accuracy
- Model: gemma-2-2b
- Data: alpaca_en_demo
- Training mode: LoRA finetune + deepspeed_stage3
- Hardware: 4 cards, K100/A800

Training convergence on DCU/NV:
<div align=center>
<img src="./docs/training_loss.png"/>
<img src="./docs/training_loss_nv.png" />
</div>
## Application Scenarios
...