    Enable multiple LoRA adapters (#2010) · 04e1af94
    drbh authored
    * feat: first draft load multiple lora
    
    * feat: load weights within layer and refactor lora pass
    
    * fix: refactor and reduce lora math
    
    * feat: baseline impl single request multi lora support
    
    * feat: prefer lorax implementation and port loading logic
    
    * fix: prefer adapter_data and refactors
    
    * feat: prefer lorax's custom punica kernels and add mlp loras
    
    * fix: adjust batch for bgmv
    
    * fix: adjust adapter_segments logic when in batch
    
    * fix: refactor and move changes to v3 proto
    
    * fix: pass model_id for all flash causal lms
    
    * fix: pass model_id for all causal and seq2seq lms
    
    * fix: add model_id to model test
    
    * feat: add lora support to mistral and refactors
    
    * feat: prefer model id in request
    
    * fix: include rust code for adapter id
    
    * feat: bump launcher and add new lora docs
    
    * feat: support base model generation and refactors
    
    * fix: rename doc to retry ci build
    
    * feat: support vlm models
    
    * fix: add adapter_data par...
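
    Several of the commits above (the bgmv batching fix and the adapter_segments adjustments) revolve around grouping a batch by adapter so a punica-style batched LoRA kernel can process one adapter per contiguous segment. The following is an illustrative sketch of that grouping step, not the repository's actual code; the function name and return shape are assumptions for clarity:

    ```python
    def adapter_segments(adapter_indices):
        """Group a batch into contiguous per-adapter segments.

        adapter_indices: one adapter id per request in the batch,
        assumed to already be sorted/grouped by adapter.

        Returns (boundaries, segment_adapter_ids), where boundaries
        holds the start offsets of each segment plus the batch length,
        so segment k spans boundaries[k]:boundaries[k + 1].
        """
        boundaries = [0]
        segment_ids = []
        for i, idx in enumerate(adapter_indices):
            # A new segment starts whenever the adapter id changes.
            if not segment_ids or idx != segment_ids[-1]:
                segment_ids.append(idx)
                if i != 0:
                    boundaries.append(i)
        boundaries.append(len(adapter_indices))
        return boundaries, segment_ids
    ```

    For example, a batch whose requests use adapters `[0, 0, 1, 1, 1, 2]` yields boundaries `[0, 2, 5, 6]` and segment ids `[0, 1, 2]`: three segments, each dispatchable to one adapter's weights in a single batched kernel call.
    
    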