# Add PEFT support for SFT and prompt model training
The original implementation uses loralib directly and merges the LoRA layers into the final model. The Hugging Face PEFT library is a better LoRA implementation and is easier to train and distribute.
Since the reward model is relatively small, I keep it as it was. I suggest training the full model to get a proper reward/critic model.
# Preliminary installation
Since the current PyPI peft package (0.2) has some bugs, please install peft from source.
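One way to install peft from source is via pip's git support (assuming a standard pip setup; pin a commit or tag if you need reproducibility):

```shell
# Install peft directly from the main branch of the official repository
pip install git+https://github.com/huggingface/peft.git
```

Alternatively, clone the repository and run `pip install .` inside it, which makes it easier to inspect or patch the source.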
@@ -166,7 +166,7 @@ class EasyRewardDataset(Dataset):
...
'''
Easy SFT accepts a text file that can be read line by line. However, the dataset groups texts together up to max_length so the LLM learns the meaning of the texts better.
If individual lines are not related, just set is_group_texts to False.
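The grouping step can be sketched roughly as follows. This is a simplified illustration, not the actual dataset code: `group_texts` and the whitespace "tokenizer" are hypothetical stand-ins for the real tokenizer output, and the tail that does not fill a full chunk is dropped.

```python
def group_texts(lines, max_length):
    """Concatenate token sequences from all lines, then split them
    into fixed-size chunks of max_length tokens (hypothetical sketch)."""
    tokens = []
    for line in lines:
        tokens.extend(line.split())  # stand-in for tokenizer token ids
    # Keep only as many tokens as fill complete chunks
    total = (len(tokens) // max_length) * max_length
    return [tokens[i:i + max_length] for i in range(0, total, max_length)]

chunks = group_texts(["a b c", "d e", "f g h i"], max_length=4)
# chunks -> [['a', 'b', 'c', 'd'], ['e', 'f', 'g', 'h']]
```

With `is_group_texts` disabled, each line would instead be padded or truncated to max_length on its own, which is preferable when the lines are unrelated.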