    [Example] Add topk into sparse mla example and append some docs (#901) · 6021ef32
    Lei Wang authored
    * Remove the unused `fp8_mqa_logits.py` file and update README.md to reflect the new directory structure and file descriptions for the deepseek_v32 example. Add README sections covering the architecture overview and the Lightning Indexer, Top-k Selector, and Sparse MLA Forward implementations.
    
    * Update linting configurations and improve code formatting in deepseek_v32 example scripts
    
    - Added per-file ignores for the inference directory in `pyproject.toml`.
    - Refactored code in `topk_selector.py`, `convert.py`, `generate.py`, `kernel.py`, and `model.py` to enhance readability by adjusting spacing and line breaks.
    - Ensured consistent formatting across function definitions and assertions for better clarity.
    
    * Refactor test functions in deepseek_v32 example scripts for improved clarity and consistency
    
    - Updated `fp8_lighting_indexer.py` to define a dedicated test function for the Lightning Indexer.
    - Refactored `sparse_mla_fwd_pipelined.py` and `sparse_mla_fwd.py` to standardize test function parameters and improve readability.
    - Enhanced `topk_selector.py` by introducing a test function with parameters for batch size and sequence length.
    - Ensured all test functions are invoked correctly in the main execution block.
    
    * Enhance test functions in deepseek_v32 example scripts with CUDA requirements and parameterization
    
    - Added CUDA requirements decorators to `test_example_sparse_mla_fwd` and `test_example_sparse_mla_fwd_pipelined`.
    - Parameterized the test functions to run on specific small shapes, making the tested configurations explicit.
    
    * lint fix
    
    * Update README.md to correct image path for DeepSeek V3.2 architecture diagram
kernel.py 9.77 KB