Commit b269bb88 authored by chenych

Update README, Fix qwen2 loss 0 and add olmo.

parent 1b350057
......@@ -21,8 +21,9 @@ LLaMA Factory is a framework for training and inference of large language models, with support for ModelScope
| [Gemma 2](https://huggingface.co/google) | 2B/9B | gemma |
| [Llama 2](https://huggingface.co/meta-llama) | 7B/13B/70B | llama2 |
| [Llama 3/Llama 3.1](https://huggingface.co/meta-llama) | 8B/70B | llama3 |
-| [Qwen1.5 (Code/MoE)](https://huggingface.co/Qwen) | 0.5B/1.8B/4B/7B/14B/32B/72B | qwen |
-| [XVERSE](https://hf-mirror.com/xverse) | 7B | xverse |
+| [Qwen1.5 (Code/MoE)/Qwen2](https://huggingface.co/Qwen) | 0.5B/1.8B/4B/7B/14B/32B/72B | qwen |
+| [XVERSE](https://hf-mirror.com/xverse) | 7B/13B | xverse |
+| [OLMo](https://hf-mirror.com/allenai) | 1B/7B | olmo |
More models are being added...
......@@ -37,6 +38,8 @@ LLaMA Factory is a framework for training and inference of large language models, with support for ModelScope
> 1. `Baichuan 2` requires uninstalling the xformers library from the environment, and currently supports only LoRA training.
>
> 2. `XVERSE` has a compatibility problem with `tokenizer > 0.19` and fails with `Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper`. To fix it, replace the corresponding tokenizer files in the original model directory with `tokenizer_config.json.update`/`tokenizer.json.update` from [XVERSE-13B-256K-hf](https://huggingface.co/xverse/XVERSE-13B-256K/tree/main). For details, see [xverse-ai/XVERSE-7B issues](https://github.com/xverse-ai/XVERSE-7B/issues/1).
+>
+> 3. `Qwen2` training supports only the bf16 format; **with fp16, both loss and lr become 0**. See this [issue](https://github.com/hiyouga/LLaMA-Factory/issues/4848).
## Install by Building from Source
### Environment Setup
......
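
The three notes in the second hunk can be read as pre-flight checks before starting a training run. The following is a minimal sketch of that idea, not part of LLaMA Factory: the function `check_training_setup`, the helper `major_minor`, and the `model_family` labels are hypothetical names chosen for illustration. It also assumes that "`tokenizer > 0.19`" refers to the major.minor version of the installed tokenizer library, so any 0.19.x release is treated as unaffected; adjust the comparison if that reading is wrong.

```python
import re
from typing import List, Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.19.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_training_setup(
    model_family: str,
    fp16: bool = False,
    bf16: bool = False,
    tokenizers_version: Optional[str] = None,
    xformers_installed: bool = False,
) -> List[str]:
    """Return a list of problems based on the README notes (hypothetical helper)."""
    problems: List[str] = []
    family = model_family.lower()

    # Note 1: Baichuan 2 needs xformers removed from the environment.
    if family == "baichuan2" and xformers_installed:
        problems.append("Baichuan 2: uninstall xformers before training")

    # Note 2: XVERSE tokenizer files fail to load above version 0.19.
    # Assumption: only major.minor is compared, so 0.19.x passes.
    if family == "xverse" and tokenizers_version is not None:
        if major_minor(tokenizers_version) > (0, 19):
            problems.append(
                "XVERSE: replace the tokenizer files with the *.update "
                "versions, as described in the README note"
            )

    # Note 3: Qwen2 trains only in bf16; fp16 gives loss 0 and lr 0.
    if family == "qwen2":
        if fp16:
            problems.append("Qwen2: fp16 gives loss 0 and lr 0; use bf16")
        elif not bf16:
            problems.append("Qwen2: enable bf16 for training")

    return problems


if __name__ == "__main__":
    for problem in check_training_setup("qwen2", fp16=True):
        print(problem)
```

The checks return messages instead of raising, so a caller can decide whether to abort or only log a warning.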