vllm · Commit 62ec38ea (unverified)

Authored Jun 02, 2023 by Woosuk Kwon, committed by GitHub on Jun 02, 2023

    Document supported models (#127)

Parent: 0eda2e09
Showing 5 changed files, with 58 additions and 3 deletions (+58 −3):

- cacheflow/entrypoints/llm.py  (+3 −1)
- docs/README.md  (+1 −2)
- docs/source/index.rst  (+7 −0)
- docs/source/models/adding_model.rst  (+7 −0)
- docs/source/models/supported_models.rst  (+40 −0)
cacheflow/entrypoints/llm.py

@@ -39,11 +39,13 @@ class LLM:
     def generate(
         self,
-        prompts: List[str],
+        prompts: Union[str, List[str]],
         sampling_params: Optional[SamplingParams] = None,
         prompt_token_ids: Optional[List[List[int]]] = None,
         use_tqdm: bool = True,
     ) -> List[RequestOutput]:
+        if isinstance(prompts, str):
+            prompts = [prompts]
         if sampling_params is None:
             # Use default sampling params.
             sampling_params = SamplingParams()
     ...
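The change above lets `generate` accept either a single prompt string or a list of prompts. A minimal, self-contained sketch of that normalization pattern (the function name here is illustrative, not part of CacheFlow):

```python
from typing import List, Union

def normalize_prompts(prompts: Union[str, List[str]]) -> List[str]:
    # Same pattern as the diff above: a bare string is wrapped into a
    # one-element list so downstream code always iterates over a list.
    if isinstance(prompts, str):
        prompts = [prompts]
    return prompts

print(normalize_prompts("Hello, my name is"))  # ['Hello, my name is']
print(normalize_prompts(["a", "b"]))           # ['a', 'b'] (unchanged)
```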
docs/README.md

@@ -14,7 +14,6 @@ make html

 ## Open the docs with your brower

 ```bash
-cd build/html
-python -m http.server
+python -m http.server -d build/html/
 ```

 Launch your browser and open localhost:8000.
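The `-d`/`--directory` flag used in the new command (available since Python 3.7) serves a directory without `cd`-ing into it. It corresponds to the `directory` keyword of `SimpleHTTPRequestHandler`; a hedged stdlib-only sketch of the programmatic equivalent:

```python
import functools
import http.server

# `python -m http.server -d build/html/` binds the handler's `directory`
# keyword; functools.partial does the same thing in code.
Handler = functools.partial(
    http.server.SimpleHTTPRequestHandler, directory="build/html"
)

# To actually serve the docs (blocks until interrupted):
# http.server.ThreadingHTTPServer(("", 8000), Handler).serve_forever()
print(Handler.keywords["directory"])  # build/html
```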
docs/source/index.rst

@@ -10,3 +10,10 @@ Documentation

    getting_started/installation
    getting_started/quickstart

+.. toctree::
+   :maxdepth: 1
+   :caption: Models
+
+   models/supported_models
+   models/adding_model
docs/source/models/adding_model.rst  (new file, mode 100644)

.. _adding_a_new_model:

Adding a New Model
==================

Placeholder
docs/source/models/supported_models.rst  (new file, mode 100644)

.. _supported_models:

Supported Models
================

CacheFlow supports a variety of generative Transformer models in `HuggingFace Transformers <https://github.com/huggingface/transformers>`_.
The following is the list of model architectures that are currently supported by CacheFlow.
Alongside each architecture, we include some popular models that use it.

.. list-table::
   :widths: 25 75
   :header-rows: 1

   * - Architecture
     - Models
   * - :code:`GPT2LMHeadModel`
     - GPT-2
   * - :code:`GPTNeoXForCausalLM`
     - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
   * - :code:`LlamaForCausalLM`
     - LLaMA, Vicuna, Alpaca, Koala
   * - :code:`OPTForCausalLM`
     - OPT, OPT-IML

If your model uses one of the above model architectures, you can seamlessly run your model with CacheFlow.
Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
Alternatively, you can raise an issue on our `GitHub <https://github.com/WoosukKwon/cacheflow/issues>`_ project.

.. tip::
    The easiest way to check if your model is supported is to run the program below:

    .. code-block:: python

        from cacheflow import LLM

        llm = LLM(model=...)  # Name or path of your model
        output = llm.generate("Hello, my name is")
        print(output)

    If CacheFlow successfully generates text, it indicates that your model is supported.
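The architecture names in the table correspond to the `architectures` field of a model's HuggingFace `config.json`. A minimal stdlib-only sketch of that lookup (the helper name and the inline JSON are illustrative, not CacheFlow code):

```python
import json

# Architectures from the supported-models table in this commit.
SUPPORTED_ARCHITECTURES = {
    "GPT2LMHeadModel",
    "GPTNeoXForCausalLM",
    "LlamaForCausalLM",
    "OPTForCausalLM",
}

def is_supported(config_json: str) -> bool:
    # A HuggingFace config.json lists the model class under
    # "architectures"; the model is supported if any listed class
    # appears in the table above.
    config = json.loads(config_json)
    return any(a in SUPPORTED_ARCHITECTURES
               for a in config.get("architectures", []))

print(is_supported('{"architectures": ["GPT2LMHeadModel"]}'))  # True
print(is_supported('{"architectures": ["BertModel"]}'))        # False
```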