Update FAQ on interleaving sliding windows support (#29796)

Signed-off-by: Finbarr Timbers <finbarrtimbers@gmail.com>

Commit 38caf7fa (unverified), authored Dec 01, 2025 by Finbarr Timbers; committed by GitHub Dec 01, 2025

Showing 1 changed file with 0 additions and 2 deletions

docs/contributing/model/basic.md +0 -2

--- a/docs/contributing/model/basic.md
+++ b/docs/contributing/model/basic.md
@@ -113,8 +113,6 @@ See [this page](registration.md) for instructions on how to register your new mo
 ### How to support models with interleaving sliding windows?
-For models with interleaving sliding windows (e.g. `google/gemma-2-2b-it` and `mistralai/Ministral-8B-Instruct-2410`), the scheduler will treat the model as a full-attention model, i.e., kv-cache of all tokens will not be dropped. This is to make sure prefix caching works with these models. Sliding window only appears as a parameter to the attention kernel computation.
 To support a model with interleaving sliding windows, we need to take care of the following details:
 - Make sure the model's `config.json` contains `layer_types`.
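
The `layer_types` requirement in the retained list item can be checked with a short script. The following is a minimal sketch, not vLLM code: the helper name `sliding_layer_indices` is hypothetical, and the label strings `"sliding_attention"` and `"full_attention"` are assumptions about how a `config.json` marks each layer. The diff itself only names the `layer_types` key.

```python
import json

# Assumed label for sliding-window layers; the diff only names the
# `layer_types` key, not the values it holds.
SLIDING = "sliding_attention"


def sliding_layer_indices(config_text: str) -> list[int]:
    """Return the indices of layers whose type equals SLIDING.

    Raises KeyError when the config has no `layer_types` entry, which is
    the condition the FAQ item asks contributors to avoid.
    """
    config = json.loads(config_text)
    layer_types = config.get("layer_types")
    if layer_types is None:
        raise KeyError("config.json has no `layer_types` entry")
    return [i for i, kind in enumerate(layer_types) if kind == SLIDING]


# Hypothetical four-layer config that alternates the two layer kinds.
example = json.dumps(
    {
        "layer_types": [
            "sliding_attention",
            "full_attention",
            "sliding_attention",
            "full_attention",
        ]
    }
)
print(sliding_layer_indices(example))
```

To check a real model, pass the text of its `config.json` instead of `example`.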