Unverified Commit 370e2f9e authored by Jan Kaniecki, committed by GitHub

Fix max_tokens handling in vllm_vlms.py (#2637)

* Update vllm_vlms.py

* pre-commit

---------
Co-authored-by: Baber <baber@hey.com>
parent b2c090cc
@@ -271,7 +271,9 @@ class VLLM_VLM(VLLM):
                 left_truncate_len=max_ctx_len,
             )
-            cont = self._model_generate(inputs, stop=until, generate=True, **kwargs)
+            cont = self._model_generate(
+                inputs, stop=until, generate=True, max_tokens=max_gen_toks, **kwargs
+            )
             for output, context in zip(cont, contexts):
                 generated_text = output.outputs[0].text
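For context, a minimal sketch of why forwarding max_tokens matters when a generate helper builds vLLM sampling parameters. The helper name below (build_sampling_params) is hypothetical and not the repository's actual code; it only illustrates that vLLM's SamplingParams falls back to a small default (16 tokens) when max_tokens is not supplied, so dropping the kwarg silently truncates generations.

```python
# Hypothetical sketch; build_sampling_params is illustrative, not lm-eval code.
from vllm import SamplingParams


def build_sampling_params(stop=None, max_tokens=None, **kwargs):
    # If max_tokens is not forwarded here, SamplingParams uses its own
    # default (16 tokens), which truncates long generations.
    return SamplingParams(stop=stop, max_tokens=max_tokens, **kwargs)


# With the fix, the requested generation budget (e.g. max_gen_toks=256)
# actually reaches vLLM instead of being dropped on the floor.
params = build_sampling_params(stop=["</s>"], max_tokens=256, temperature=0.0)
```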