自我认知微调最佳实践.md


# 自我认知微调最佳实践
10分钟微调专属于自己的大模型！

## 目录
- [环境安装](#环境安装)
- [微调前推理](#微调前推理)
- [微调](#微调)
- [微调后推理](#微调后推理)
- [Web-UI](#web-ui)


## 环境安装
```bash
# 设置pip全局镜像 (加速下载)
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
# 安装ms-swift
pip install 'ms-swift[llm]' -U

# 环境对齐 (通常不需要运行. 如果你运行错误, 可以跑下面的代码, 仓库使用最新环境测试)
pip install -r requirements/framework.txt  -U
pip install -r requirements/llm.txt  -U
```

## 微调前推理

使用python:
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import ModelType, InferArguments, infer_main
infer_args = InferArguments(model_type=ModelType.qwen2_7b_instruct)
infer_main(infer_args)

"""
<<< 你是谁？
我是阿里云开发的一款超大规模语言模型，我叫通义千问。
--------------------------------------------------
<<< what's your name?
My name is Qianwen, which is also known as Tongyi Qianwen. I am a large-scale language model created by Alibaba Cloud.
--------------------------------------------------
<<< 你是谁研发的？
我是由阿里云研发的。如果您有任何问题或需要帮助，请随时告诉我，我会尽力提供支持。
--------------------------------------------------
<<< 浙江的省会在哪？
浙江省的省会是杭州市。
--------------------------------------------------
<<< 这有什么好吃的？
浙江省，简称“浙”，位于中国东南沿海长江三角洲南翼，是中国东南沿海的一个重要省份，拥有丰富的美食文化。以下是一些浙江省内非常有名的美食：

1. **西湖醋鱼**：一道以西湖草鱼为主料，用酸甜口味烹制的名菜，口感鲜美，酸甜适中。

2. **东坡肉**：源于宋代大文豪苏东坡的名菜，以五花肉为主料，经过长时间的炖煮，肉质酥软，味道醇厚。

3. **龙井虾仁**：以龙井茶为调料，搭配新鲜虾仁，色香味俱佳，是杭州的特色菜之一。

4. **宁波汤圆**：宁波汤圆以皮薄馅多、甜而不腻著称，是宁波的传统小吃。

5. **金华火腿**：金华火腿以其色泽红润、香气浓郁、肉质鲜美而闻名，是浙江省的特产之一。

6. **绍兴黄酒**：绍兴黄酒是中国最著名的黄酒之一，以其独特的酿造工艺和丰富的口感深受人们喜爱。

7. **海鲜**：浙江省沿海，海鲜种类繁多，新鲜的海鲜如大闸蟹、海虾、海鱼等，是不可错过的美味。

8. **杭州小笼包**：与上海小笼包类似，但杭州的小笼包皮更薄，汤汁更丰富，是杭州的特色小吃之一。

浙江省的美食丰富多样，以上只是其中的一部分，希望您有机会亲自品尝，享受美食带来的乐趣。
--------------------------------------------------
<<< 晚上睡不着觉怎么办
晚上睡不着觉可能由多种原因引起，包括压力、焦虑、生活习惯、环境因素等。以下是一些帮助改善睡眠质量的建议：

1. **建立规律的睡眠习惯**：每天尽量在同一时间上床睡觉和起床，即使在周末也是如此。这有助于调整你的生物钟。

2. **创造良好的睡眠环境**：确保你的卧室安静、黑暗、凉爽，并且床铺舒适。使用遮光窗帘、耳塞或白噪音机可以帮助改善睡眠环境。

3. **限制咖啡因和酒精的摄入**：尤其是在睡前几小时内，避免摄入咖啡因和酒精，因为它们可能干扰睡眠。

4. **减少蓝光暴露**：睡前避免使用手机、电脑和电视等发出蓝光的设备，因为蓝光可能抑制褪黑激素的产生，影响睡眠。

5. **放松身心**：尝试进行深呼吸、冥想、瑜伽或温水浴等放松活动，帮助减轻压力和焦虑。

6. **避免午睡过长**：如果你白天有长时间的午睡，可能会干扰晚上的睡眠。尽量控制午睡时间在30分钟以内。

7. **适量运动**：定期进行适量的体育活动，如散步、游泳或骑自行车，可以帮助改善睡眠质量。但避免在睡前进行剧烈运动。

8. **避免在床上做非睡眠活动**：将床作为睡觉和性活动的地方，避免在床上工作、看电视或使用电子设备。

如果尝试了上述方法后仍然无法改善睡眠问题，建议咨询医生或睡眠专家，以排除潜在的健康问题。
"""
```
如果你要进行单样本推理, 可以参考[LLM推理文档](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E6%8E%A8%E7%90%86%E6%96%87%E6%A1%A3.md#qwen-7b-chat)

使用CLI:
```bash
CUDA_VISIBLE_DEVICES=0 swift infer --model_type qwen2-7b-instruct
```

## 微调
提示: 因为自我认知训练涉及到知识编辑, 建议对**MLP**加lora_target_modules. 你可以通过指定`--lora_target_modules ALL`在所有的linear层(包括qkvo以及mlp)加lora. 这**通常是效果最好的**.

使用python:
```python
# Experimental environment: 3090, V100, ...
# 24GB GPU memory
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import DatasetName, ModelType, SftArguments, sft_main

sft_args = SftArguments(
    model_type=ModelType.qwen2_7b_instruct,
    dataset=[f'{DatasetName.alpaca_zh}#500', f'{DatasetName.alpaca_en}#500',
             f'{DatasetName.self_cognition}#500'],
    logging_steps=5,
    max_length=2048,
    learning_rate=1e-4,
    output_dir='output',
    lora_target_modules=['ALL'],
    model_name=['小黄', 'Xiao Huang'],
    model_author=['魔搭', 'ModelScope'])
output = sft_main(sft_args)
best_model_checkpoint = output['best_model_checkpoint']
print(f'best_model_checkpoint: {best_model_checkpoint}')

"""Out[0]
[INFO:swift] The logging file will be saved in: /xxx/output/qwen2-7b-instruct/v2-20240607-101038/logging.jsonl
{'loss': 1.8210969, 'acc': 0.6236614, 'grad_norm': 2.75, 'learning_rate': 2e-05, 'memory(GiB)': 16.79, 'train_speed(iter/s)': 0.155172, 'epoch': 0.01, 'global_step': 1}
{'loss': 1.75309932, 'acc': 0.63371617, 'grad_norm': 3.765625, 'learning_rate': 0.0001, 'memory(GiB)': 18.48, 'train_speed(iter/s)': 0.210486, 'epoch': 0.05, 'global_step': 5}
{'loss': 1.42493172, 'acc': 0.65476351, 'grad_norm': 1.671875, 'learning_rate': 9.432e-05, 'memory(GiB)': 18.48, 'train_speed(iter/s)': 0.221159, 'epoch': 0.11, 'global_step': 10}
{'loss': 1.16402645, 'acc': 0.69853611, 'grad_norm': 2.3125, 'learning_rate': 8.864e-05, 'memory(GiB)': 18.48, 'train_speed(iter/s)': 0.223072, 'epoch': 0.16, 'global_step': 15}
{'loss': 1.18519087, 'acc': 0.68314366, 'grad_norm': 1.7578125, 'learning_rate': 8.295e-05, 'memory(GiB)': 18.48, 'train_speed(iter/s)': 0.224677, 'epoch': 0.21, 'global_step': 20}
{'loss': 1.09617777, 'acc': 0.69949636, 'grad_norm': 1.4296875, 'learning_rate': 7.727e-05, 'memory(GiB)': 19.46, 'train_speed(iter/s)': 0.225241, 'epoch': 0.27, 'global_step': 25}
{'loss': 1.09035854, 'acc': 0.70226536, 'grad_norm': 1.34375, 'learning_rate': 7.159e-05, 'memory(GiB)': 19.46, 'train_speed(iter/s)': 0.226112, 'epoch': 0.32, 'global_step': 30}
{'loss': 1.04421387, 'acc': 0.71705227, 'grad_norm': 1.65625, 'learning_rate': 6.591e-05, 'memory(GiB)': 19.46, 'train_speed(iter/s)': 0.225783, 'epoch': 0.38, 'global_step': 35}
{'loss': 0.97917967, 'acc': 0.73127871, 'grad_norm': 1.2265625, 'learning_rate': 6.023e-05, 'memory(GiB)': 19.46, 'train_speed(iter/s)': 0.226212, 'epoch': 0.43, 'global_step': 40}
{'loss': 0.94920969, 'acc': 0.74032536, 'grad_norm': 0.9140625, 'learning_rate': 5.455e-05, 'memory(GiB)': 19.46, 'train_speed(iter/s)': 0.225991, 'epoch': 0.48, 'global_step': 45}
{'loss': 0.99205322, 'acc': 0.73348026, 'grad_norm': 1.1640625, 'learning_rate': 4.886e-05, 'memory(GiB)': 19.46, 'train_speed(iter/s)': 0.224141, 'epoch': 0.54, 'global_step': 50}
Train:  54%|███████████████████████████████████▍                              | 50/93 [03:42<03:19,  4.64s/it]
{'eval_loss': 1.03679836, 'eval_acc': 0.67676003, 'eval_runtime': 1.2396, 'eval_samples_per_second': 8.874, 'eval_steps_per_second': 8.874, 'epoch': 0.54, 'global_step': 50}
Val: 100%|████████████████████████████████████████████████████████████████████| 11/11 [00:01<00:00, 10.15it/s]
[INFO:swift] Saving model checkpoint to /xxx/output/qwen2-7b-instruct/v2-20240607-101038/checkpoint-50
{'loss': 0.98644152, 'acc': 0.73600368, 'grad_norm': 2.0625, 'learning_rate': 4.318e-05, 'memory(GiB)': 20.5, 'train_speed(iter/s)': 0.220983, 'epoch': 0.59, 'global_step': 55}
{'loss': 0.97522211, 'acc': 0.7305594, 'grad_norm': 1.1640625, 'learning_rate': 3.75e-05, 'memory(GiB)': 20.5, 'train_speed(iter/s)': 0.218717, 'epoch': 0.64, 'global_step': 60}
{'loss': 1.02459459, 'acc': 0.71822615, 'grad_norm': 1.125, 'learning_rate': 3.182e-05, 'memory(GiB)': 20.5, 'train_speed(iter/s)': 0.216185, 'epoch': 0.7, 'global_step': 65}
{'loss': 0.90719929, 'acc': 0.73806977, 'grad_norm': 1.078125, 'learning_rate': 2.614e-05, 'memory(GiB)': 20.5, 'train_speed(iter/s)': 0.21451, 'epoch': 0.75, 'global_step': 70}
{'loss': 0.88519163, 'acc': 0.74690943, 'grad_norm': 1.3359375, 'learning_rate': 2.045e-05, 'memory(GiB)': 20.5, 'train_speed(iter/s)': 0.21366, 'epoch': 0.81, 'global_step': 75}
{'loss': 0.95856657, 'acc': 0.72634115, 'grad_norm': 1.359375, 'learning_rate': 1.477e-05, 'memory(GiB)': 20.5, 'train_speed(iter/s)': 0.213132, 'epoch': 0.86, 'global_step': 80}
{'loss': 0.88609543, 'acc': 0.75917048, 'grad_norm': 0.90625, 'learning_rate': 9.09e-06, 'memory(GiB)': 20.5, 'train_speed(iter/s)': 0.211609, 'epoch': 0.91, 'global_step': 85}
{'loss': 0.97113533, 'acc': 0.73501945, 'grad_norm': 2.40625, 'learning_rate': 3.41e-06, 'memory(GiB)': 20.5, 'train_speed(iter/s)': 0.210918, 'epoch': 0.97, 'global_step': 90}
Train: 100%|██████████████████████████████████████████████████████████████████| 93/93 [07:21<00:00,  5.05s/it]
{'eval_loss': 1.03077412, 'eval_acc': 0.68508706, 'eval_runtime': 1.2226, 'eval_samples_per_second': 8.997, 'eval_steps_per_second': 8.997, 'epoch': 1.0, 'global_step': 93}
Val: 100%|████████████████████████████████████████████████████████████████████| 11/11 [00:01<00:00, 10.26it/s]
[INFO:swift] Saving model checkpoint to /xxx/output/qwen2-7b-instruct/v2-20240607-101038/checkpoint-93
{'train_runtime': 443.3746, 'train_samples_per_second': 3.358, 'train_steps_per_second': 0.21, 'train_loss': 1.07190883, 'epoch': 1.0, 'global_step': 93}
Train: 100%|██████████████████████████████████████████████████████████████████| 93/93 [07:23<00:00,  4.77s/it]
[INFO:swift] last_model_checkpoint: /xxx/output/qwen2-7b-instruct/v2-20240607-101038/checkpoint-93
[INFO:swift] best_model_checkpoint: /xxx/output/qwen2-7b-instruct/v2-20240607-101038/checkpoint-93
[INFO:swift] images_dir: /xxx/output/qwen2-7b-instruct/v2-20240607-101038/images
[INFO:swift] End time of running main: 2024-06-07 10:18:41.386561
best_model_checkpoint: /xxx/output/qwen2-7b-instruct/v2-20240607-101038/checkpoint-93
"""
```

使用CLI (单卡):
```bash
# Experimental environment: A10, 3090, V100, ...
# 22GB GPU memory
CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model_type qwen2-7b-instruct \
    --dataset alpaca-zh#500 alpaca-en#500 self-cognition#500 \
    --logging_steps 5 \
    --max_length 2048 \
    --learning_rate 1e-4 \
    --output_dir output \
    --lora_target_modules ALL \
    --model_name 小黄 'Xiao Huang' \
    --model_author 魔搭 ModelScope \
```

使用CLI (DeepSpeed-ZeRO2):
> 如果你使用的是3090等卡, 可以降低`max_length`来减少显存消耗.
```bash
# Experimental environment: 4 * 3090
# 4 * 24GB GPU memory
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NPROC_PER_NODE=4 \
swift sft \
    --model_type qwen2-7b-instruct \
    --dataset alpaca-zh#500 alpaca-en#500 self-cognition#500 \
    --logging_steps 5 \
    --max_length 2048 \
    --learning_rate 1e-4 \
    --output_dir output \
    --lora_target_modules ALL \
    --model_name 小黄 'Xiao Huang' \
    --model_author 魔搭 ModelScope \
    --deepspeed default-zero2
```

## 微调后推理
你需要设置`best_model_checkpoint`的值, 该值会在sft的最后被打印出来.

使用python:
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import InferArguments, merge_lora, infer_main

best_model_checkpoint = 'qwen2-7b-instruct/vx-xxx/checkpoint-xxx'
infer_args = InferArguments(ckpt_dir=best_model_checkpoint)
merge_lora(infer_args, device_map='cpu')
result = infer_main(infer_args)


"""Out[0]
<<< 你好
你好！有什么我可以帮助你的吗？
--------------------------------------------------
<<< clear
<<< 你是谁？
我是小黄，由魔搭训练的人工智能语言模型。我的目的是帮助用户解答问题、提供信息和进行交流。有什么我可以帮助你的吗？
--------------------------------------------------
<<< what's your name?
I am a language model developed by ModelScope, and you can call me Xiao Huang.
--------------------------------------------------
<<< 你是谁研发的？
我是由魔搭研发的人工智能语言模型。
--------------------------------------------------
<<< 浙江的省会在哪？
浙江省的省会是杭州市。
--------------------------------------------------
<<< 这有什么好吃的？
杭州有许多美食，其中一些著名的有：

1. 西湖醋鱼：这是一道经典的杭州菜，以西湖的鱼为主要原料，用醋和糖烹制而成。

2. 龙井虾仁：这道菜以龙井茶和虾仁为主要原料，口感鲜美，清香扑鼻。

3. 红烧肉：这是一道非常受欢迎的杭州菜，以五花肉为主料，用酱油、糖等调料烹制而成。

4. 老鸭汤：这是一道以老鸭为主料的汤，口感鲜美，营养丰富。

5. 龙井虾球：这道菜以龙井茶和虾球为主要原料，口感鲜美，清香扑鼻。

这只是杭州美食中的一部分，还有很多其他美味的菜肴等待您去品尝。
--------------------------------------------------
<<< 晚上睡不着觉怎么办
如果晚上睡不着觉，可以尝试以下方法来帮助自己放松和入睡：

1. 保持规律的作息时间：每天尽量在同一时间上床睡觉和起床，帮助身体建立规律的生物钟。

2. 避免使用电子设备：在睡前一小时内避免使用电子设备，因为屏幕发出的蓝光会抑制褪黑激素的分泌，影响睡眠。

3. 放松身心：可以尝试深呼吸、冥想、瑜伽等放松身心的方法，帮助自己放松。

4. 避免咖啡因和酒精：咖啡因和酒精会影响睡眠质量，尽量避免在睡前摄入。

5. 保持舒适的睡眠环境：保持卧室的温度、湿度和光线适宜，使用舒适的床垫和枕头，有助于提高睡眠质量。

如果以上方法都无法帮助您入睡，建议咨询医生或睡眠专家，以获得更专业的建议和治疗。
"""
```

使用CLI:
```bash
# 直接推理
CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'qwen2-7b-instruct/vx-xxx/checkpoint-xxx'

# Merge LoRA增量权重并推理
# 如果你需要量化, 可以指定`--quant_bits 4`.
CUDA_VISIBLE_DEVICES=0 swift export \
    --ckpt_dir 'qwen2-7b-instruct/vx-xxx/checkpoint-xxx' --merge_lora true
CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'qwen2-7b-instruct/vx-xxx/checkpoint-xxx-merged'
```

## Web-UI
使用python:
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import AppUIArguments, merge_lora, app_ui_main

best_model_checkpoint = 'qwen2-7b-instruct/vx-xxx/checkpoint-xxx'
app_ui_args = AppUIArguments(ckpt_dir=best_model_checkpoint)
merge_lora(app_ui_args, device_map='cpu')
result = app_ui_main(app_ui_args)
```

使用CLI:
```bash
# 直接使用app-ui
CUDA_VISIBLE_DEVICES=0 swift app-ui --ckpt_dir 'qwen2-7b-instruct/vx-xxx/checkpoint-xxx'

# Merge LoRA增量权重并使用app-ui
# 如果你需要量化, 可以指定`--quant_bits 4`.
CUDA_VISIBLE_DEVICES=0 swift export \
    --ckpt_dir 'qwen2-7b-instruct/vx-xxx/checkpoint-xxx' --merge_lora true
CUDA_VISIBLE_DEVICES=0 swift app-ui --ckpt_dir 'qwen2-7b-instruct/vx-xxx/checkpoint-xxx-merged'
```