    Move to moe-kernels package and switch to common MoE layer (#2511) · ce85efa9
    Daniël de Kok authored
    * Move to moe-kernels package and switch to common MoE layer
    
    This change introduces the new `moe-kernels` package:
    
    - Add `moe-kernels` as a dependency.
    - Introduce a `SparseMoELayer` module that can be used by MoE
      models.
    - Port over Mixtral and DeepSeek.
    
    * Make `cargo check` pass
    
    * Update runner
weights.py 14.9 KB