Commit 25a817f2 (unverified), authored by Cyrus Leung, committed by GitHub

[Doc] Update OOT model docs (#18742)


Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
parent d260f799
@@ -23,33 +23,32 @@ Finally, update our [list of supported models][supported-models] to promote your
 ## Out-of-tree models
-You can load an external model using a plugin without modifying the vLLM codebase.
+You can load an external model [using a plugin][plugin-system] without modifying the vLLM codebase.
-!!! info
-    [vLLM's Plugin System][plugin-system]
 To register the model, use the following code:
 ```python
-from vllm import ModelRegistry
-from your_code import YourModelForCausalLM
-ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
+# The entrypoint of your plugin
+def register():
+    from vllm import ModelRegistry
+    from your_code import YourModelForCausalLM
+    ModelRegistry.register_model("YourModelForCausalLM", YourModelForCausalLM)
 ```
 If your model imports modules that initialize CUDA, consider lazy-importing it to avoid errors like `RuntimeError: Cannot re-initialize CUDA in forked subprocess`:
 ```python
-from vllm import ModelRegistry
-ModelRegistry.register_model(
-    "YourModelForCausalLM",
-    "your_code:YourModelForCausalLM"
-)
+# The entrypoint of your plugin
+def register():
+    from vllm import ModelRegistry
+    ModelRegistry.register_model(
+        "YourModelForCausalLM",
+        "your_code:YourModelForCausalLM"
+    )
 ```
 !!! warning
     If your model is a multimodal model, ensure the model class implements the [SupportsMultiModal][vllm.model_executor.models.interfaces.SupportsMultiModal] interface.
     Read more about that [here][supports-multimodal].
-!!! note
-    Although you can directly put these code snippets in your script using `vllm.LLM`, the recommended way is to place these snippets in a vLLM plugin. This ensures compatibility with various vLLM features like distributed inference and the API server.
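
The `"your_code:YourModelForCausalLM"` string in the lazy-import variant defers importing the model module until the class is actually needed. As a rough illustration of that idea only (this is not vLLM's implementation), a `module:ClassName` reference could be resolved on demand as sketched below. `resolve_lazy_reference` is a hypothetical helper, and standard-library names stand in for `your_code`.

```python
import importlib


def resolve_lazy_reference(reference):
    """Import the module named in "module:attribute" and return the attribute.

    Hypothetical helper for illustration; nothing is imported until this runs.
    """
    module_name, sep, attr_name = reference.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:ClassName', got {reference!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in for "your_code:YourModelForCausalLM", using a stdlib class.
ordered_dict_cls = resolve_lazy_reference("collections:OrderedDict")
print(ordered_dict_cls.__name__)
```

Because the reference is only a string at registration time, registering it does not import the model module, which is what keeps CUDA-initializing imports from running too early.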
@@ -30,8 +30,10 @@ def register():
     from vllm import ModelRegistry
     if "MyLlava" not in ModelRegistry.get_supported_archs():
-        ModelRegistry.register_model("MyLlava",
-                                     "vllm_add_dummy_model.my_llava:MyLlava")
+        ModelRegistry.register_model(
+            "MyLlava",
+            "vllm_add_dummy_model.my_llava:MyLlava",
+        )
 ```
 For more information on adding entry points to your package, please check the [official documentation](https://setuptools.pypa.io/en/latest/userguide/entry_point.html).
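
To check that a plugin's entry point is visible after the package is installed, the installed entry points can be listed with the standard library. This is a minimal sketch that assumes the plugin is declared under a group named `vllm.general_plugins`. That group name is an assumption and does not appear in this diff, and `list_entry_point_names` is a hypothetical helper.

```python
from importlib import metadata


def list_entry_point_names(group):
    """Return the sorted names of installed entry points in `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry-point collection
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict keyed by group name
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


# Assumed group name; prints an empty list if no such plugin is installed.
print(list_entry_point_names("vllm.general_plugins"))
```

If the package is installed and its entry point is declared correctly, its name should appear in the printed list.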