feat(server): Add native support for PEFT Lora models (#762)
- Detects a `peft` model by finding `adapter_config.json`.
- This triggers a fully dedicated `download-weights` path.
- This path loads the adapter config and finds the base model_id.
- It loads the base_model
- Then the peft_model
- Then `merge_and_unload()`
- Then `save_pretrained(.., safe_serialization=True)`
- Add back the config + tokenizer.
- The chosen location is a **local folder with the name of the user chosen model id**

PROs:

- Easier than expecting the user to merge manually.
- Barely any change outside of the `download-weights` command.
- This means everything will work in a single load.
- Should enable out of the box SM + HFE

CONs:

- Creates a local merged model in an unusual location, potentially not saved across docker reloads, or overwriting some files if the PEFT itself was local and containing other files ...
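
For illustration, the dedicated `download-weights` path described in this message could look roughly like the sketch below. It is not the code from this commit (the diff is not shown here). The helper names `find_adapter_config`, `read_base_model_id` and `merge_peft_adapter` are hypothetical, and the sketch assumes a causal LM base model and that the adapter config stores the base model id under `base_model_name_or_path`, as PEFT adapter configs usually do.

```python
import json
from pathlib import Path
from typing import Optional

ADAPTER_CONFIG = "adapter_config.json"


def find_adapter_config(model_dir: str) -> Optional[Path]:
    """Return the adapter config path if the folder looks like a PEFT adapter."""
    candidate = Path(model_dir) / ADAPTER_CONFIG
    return candidate if candidate.is_file() else None


def read_base_model_id(adapter_config_path: Path) -> str:
    """Read the base model id from the adapter config (assumed key name)."""
    with open(adapter_config_path, encoding="utf-8") as f:
        config = json.load(f)
    return config["base_model_name_or_path"]


def merge_peft_adapter(adapter_dir: str, output_dir: str) -> None:
    """Hypothetical helper: merge a LoRA adapter into its base model and save it."""
    # Heavy dependencies are imported lazily so that detection alone stays cheap.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    config_path = find_adapter_config(adapter_dir)
    if config_path is None:
        raise FileNotFoundError(f"No {ADAPTER_CONFIG} found in {adapter_dir}")

    base_model_id = read_base_model_id(config_path)

    # Load the base model, then wrap it with the adapter weights.
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    peft_model = PeftModel.from_pretrained(base_model, adapter_dir)

    # Fold the LoRA weights into the base weights and drop the PEFT wrapper.
    merged = peft_model.merge_and_unload()

    # Save the merged weights as safetensors, then add back config + tokenizer.
    merged.save_pretrained(output_dir, safe_serialization=True)
    merged.config.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
```

In this sketch the caller picks `output_dir`; the commit itself uses a local folder named after the user chosen model id, which is what leads to the overwrite risk listed under CONs.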