[Doc] Update OOT model docs (#18742)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Commit 25a817f2, authored May 27, 2025 by Cyrus Leung, committed by GitHub May 27, 2025

Showing with 19 additions and 18 deletions

docs/contributing/model/registration.md +15 -16

docs/design/plugin_system.md +4 -2

--- a/docs/contributing/model/registration.md
+++ b/docs/contributing/model/registration.md
@@ -23,33 +23,32 @@ Finally, update our [list of supported models][supported-models] to promote your

 ## Out-of-tree models

-You can load an external model using a plugin without modifying the vLLM codebase.
-
-!!! info
-    [vLLM's Plugin System][plugin-system]
+You can load an external model [using a plugin][plugin-system] without modifying the vLLM codebase.

 To register the model, use the following code:

 ```python
-from vllm import ModelRegistry
-from your_code import YourModelForCausalLM
-ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
+# The entrypoint of your plugin
+def register():
+    from vllm import ModelRegistry
+    from your_code import YourModelForCausalLM
+
+    ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
 ```

 If your model imports modules that initialize CUDA, consider lazy-importing it to avoid errors like `RuntimeError: Cannot re-initialize CUDA in forked subprocess`:

 ```python
-from vllm import ModelRegistry
+# The entrypoint of your plugin
+def register():
+    from vllm import ModelRegistry

-ModelRegistry.register_model(
+    ModelRegistry.register_model(
        "YourModelForCausalLM",
        "your_code:YourModelForCausalLM"
-)
+    )
 ```

 !!! warning
    If your model is a multimodal model, ensure the model class implements the [SupportsMultiModal][vllm.model_executor.models.interfaces.SupportsMultiModal] interface.
    Read more about that [here][supports-multimodal].
-
-!!! note
-    Although you can directly put these code snippets in your script using `vllm.LLM`, the recommended way is to place these snippets in a vLLM plugin. This ensures compatibility with various vLLM features like distributed inference and the API server.
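
Note on the lazy form above: passing a `"module:Class"` string instead of the class object lets the import wait until the class is actually needed. The following is a toy sketch of that idea only. `LazyRegistry` is a hypothetical class written for illustration and is not vLLM's `ModelRegistry`, whose internals this diff does not show.

```python
import importlib


class LazyRegistry:
    """Toy registry (hypothetical, not vLLM's ModelRegistry).

    Values are either class objects or "module:Class" strings.
    String targets are imported only when resolve() is called.
    """

    def __init__(self):
        self._entries = {}

    def register(self, name, target):
        # Storing a string does not import anything yet.
        self._entries[name] = target

    def resolve(self, name):
        target = self._entries[name]
        if isinstance(target, str):
            # Split "package.module:ClassName" and import on first use.
            module_name, _, attr = target.partition(":")
            module = importlib.import_module(module_name)
            target = getattr(module, attr)
            self._entries[name] = target  # cache the resolved class
        return target


registry = LazyRegistry()
# A standard-library class stands in for "your_code:YourModelForCausalLM".
registry.register("OrderedDict", "collections:OrderedDict")
resolved = registry.resolve("OrderedDict")
```

In this sketch the module named in the string is not imported at registration time, which is the property the docs rely on to keep CUDA-initializing imports out of the parent process.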
--- a/docs/design/plugin_system.md
+++ b/docs/design/plugin_system.md
@@ -30,8 +30,10 @@ def register():
    from vllm import ModelRegistry

    if "MyLlava" not in ModelRegistry.get_supported_archs():
-        ModelRegistry.register_model("MyLlava",
-                                        "vllm_add_dummy_model.my_llava:MyLlava")
+        ModelRegistry.register_model(
+            "MyLlava",
+            "vllm_add_dummy_model.my_llava:MyLlava",
+        )
 ```

 For more information on adding entry points to your package, please check the [official documentation](https://setuptools.pypa.io/en/latest/userguide/entry_point.html).