- 21 May, 2025 3 commits
-
-
achervyakov authored
* first version of image resizing
* fixed bug
* clean up `resize_image`
Co-authored-by: Artem Safin <artemsafin67@gmail.com>
Co-authored-by: Baber <baber@hey.com>
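The `resize_image` helper referenced above is not shown here; as a rough illustration of the kind of logic such a commit typically adds, the sketch below computes an aspect-ratio-preserving target size. The function name `fit_within` and its behavior are assumptions, not the PR's code.

```python
# Hypothetical sketch: shrink an image's dimensions so its longer side
# fits within a budget, preserving aspect ratio. Not the harness's code.
def fit_within(width: int, height: int, max_side: int) -> tuple:
    """Scale (width, height) down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return (width, height)  # already small enough; never upscale
    scale = max_side / longest
    return (max(1, round(width * scale)), max(1, round(height * scale)))
```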
-
Baber Abbasi authored
* use images with apis
* pacify pre-commit
-
Rob Geada authored
* Log tokenized request warning only once
* Fix logging for concurrent use case as well
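One common way to emit a warning only once per process, which this commit's "only once" behavior may resemble, is to deduplicate on the message. This is a generic sketch under that assumption, not the code from the PR; note that `lru_cache` is not a strict concurrency guarantee, only best-effort deduplication.

```python
import logging
from functools import lru_cache

logger = logging.getLogger(__name__)

@lru_cache(maxsize=None)
def warn_once(message: str) -> None:
    # lru_cache memoizes by message, so repeated calls with the same text
    # (including from parallel request paths) log at most once per process.
    logger.warning(message)
```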
-
- 19 May, 2025 1 commit
-
-
Baber Abbasi authored
* add `sglang-generate`
* nit
* nit
* nit
* pacify pre-commit
-
- 15 May, 2025 1 commit
-
-
Filippo Momentè authored
* fix: pass device arg in model_ar in vllm_causallms
* casting device arg to str in vLLM model args
-
- 10 May, 2025 1 commit
-
-
Sungjae Lee authored
-
- 09 May, 2025 1 commit
-
-
Baber Abbasi authored
-
- 06 May, 2025 1 commit
-
-
Alexandre Marques authored
-
- 18 Apr, 2025 1 commit
-
-
Avelina9X authored
* Added softmax_dtype argument to coerce log_softmax computations
* move softmax_dtype
Co-authored-by: Baber <baber@hey.com>
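The motivation for a `softmax_dtype` argument is that log-softmax over half-precision logits loses precision, so the reduction is commonly coerced to a wider dtype first. The sketch below illustrates that idea with NumPy; the function body is illustrative and is not the PR's implementation.

```python
import numpy as np

def log_softmax(logits: np.ndarray, softmax_dtype=np.float32) -> np.ndarray:
    # Coerce to the requested dtype before reducing, so float16/bfloat16
    # logits do not underflow or lose precision in the exp/sum step.
    x = logits.astype(softmax_dtype)
    x = x - x.max(axis=-1, keepdims=True)  # stabilize against overflow
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))
```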
-
- 16 Apr, 2025 2 commits
-
-
achervyakov authored
-
Baber Abbasi authored
* fix resolve_hf_chat_template version
* pre-commit
-
- 15 Apr, 2025 1 commit
-
-
Jerry Zhang authored
* Add support for quantization_config
  Summary: previously quantization_config was ignored, so torchao-quantized models were not supported; this PR adds that support.
  Test Plan: lm_eval --model hf --model_args pretrained=jerryzh168/gemma3-int4wo --tasks hellaswag --device cuda:0 --batch_size 8
* quantization_config is optional
-
- 14 Apr, 2025 1 commit
-
-
Alexandre Marques authored
* Add support for chat templates defined outside of tokenizer_config.json, as supported by vLLM
* Update template name to avoid a conflict with another variable
-
- 04 Apr, 2025 1 commit
-
-
Nikodem Szwast authored
* update authentication methods, add support for deployment_id
* run pre-commit on changed file
-
- 20 Mar, 2025 2 commits
-
-
Baber Abbasi authored
-
Yifei Zhang authored
-
- 18 Mar, 2025 1 commit
-
-
Baber Abbasi authored
* add min_pixels, max_pixels
* fix
-
- 17 Mar, 2025 1 commit
-
-
Kiersten Stokes authored
* Add support for token-based auth for watsonx models
* Fix lint
* Move dotenv import to inner scope
* Improve readability of _verify_credentials
-
- 14 Mar, 2025 2 commits
-
-
achervyakov authored
* Added audio-modality pipeline for qwen2-audio model
* Beauty imports
* fix apply_chat_template args
* update default audio placeholders list
* add demo task - common_voice subset
* add audiolm_qwen libs to pyproject.toml
* pre-commit beautify
Co-authored-by: Alexandra Rak <rakalexandra@mail.ru>
-
daniel-salib authored
-
- 11 Mar, 2025 1 commit
-
-
Baber Abbasi authored
-
- 04 Mar, 2025 1 commit
-
-
Lucia Quirke authored
* Enable steering HF models
* increase HF download timeout
* Update readme; improve steering vector device handling
* Update latest news
* remove HF timeout increase
* fix tests
* ignore sae lens test
* fix accidental force push
Co-authored-by: Matthew Khoriaty <matthewkhoriaty2026@u.northwestern.edu>
-
- 27 Feb, 2025 1 commit
-
-
Baber Abbasi authored
* remove ray.remote resources
* remove kobtest tag (registered as group)
-
- 25 Feb, 2025 1 commit
-
-
Jinwei authored
* initial components to support sglang
* init of class SGLangLM
* draft for generate_until of SGLang model
* mock loglikelihood
* initial loglikelihood_tokens
* todo: fix bug of sglang engine init
* implement generation tasks and test
* support output type loglikelihood and loglikelihood_rolling (#1)
* .
* loglikelihood_rolling
* /
* support dp_size>1
* typo
* add tests and clean code
* skip tests of sglang for now
* fix OOM error of sglang pytest
* finish test for sglang
* add sglang to readme
* fix OOM of tests and clean SGLang model
* update readme
* clean pyproject and add tests for evaluator
* add accuracy tests and it passed locally
* add notes for test
* Update README.md
* pre-commit
Co-authored-by: Xiaotong Jiang <xiaotong.jiang@databricks.com>
Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
Co-authored-by: Baber <baber@hey.com>
-
- 24 Feb, 2025 1 commit
-
-
Jocelyn authored
* add o3-mini support
* fix linter tests
-
- 21 Feb, 2025 1 commit
-
-
Lintang Sutawika authored
* changed source of eval_logger
* allow eval_logger to be set from args
* removed verbosity arg from non-main methods
* fix logging
* pre-commit
* set verbosity in eval logger
* replace utils.eval_logger
* fix logging in main
* add logging to docs
* add logging message
* nit
* add logging to docs
* refactor setup_logging to utils
Co-authored-by: Baber <baber@hey.com>
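A `setup_logging`-style helper refactored into a utils module, as described above, might look like the minimal sketch below. The exact signature and logger name here are guesses, not the harness's actual API.

```python
import logging

def setup_logging(verbosity: str = "INFO") -> logging.Logger:
    """Configure root logging once and return the project logger."""
    logging.basicConfig(
        format="%(asctime)s %(levelname)s [%(name)s] %(message)s",
        level=getattr(logging, verbosity.upper()),
        force=True,  # reconfigure even if handlers were already attached
    )
    return logging.getLogger("lm-eval")
```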
-
- 17 Feb, 2025 1 commit
-
-
Baber Abbasi authored
* fix vllm
* fix data_parallel
* copy to multimodal
-
- 12 Feb, 2025 1 commit
-
-
achervyakov authored
-
- 07 Feb, 2025 1 commit
-
-
Baber Abbasi authored
-
- 21 Jan, 2025 1 commit
-
-
Jan Kaniecki authored
* Update vllm_vlms.py
* pre-commit
Co-authored-by: Baber <baber@hey.com>
-
- 19 Jan, 2025 1 commit
-
-
Baber Abbasi authored
* update pre-commit
-
- 15 Jan, 2025 1 commit
-
-
Baber Abbasi authored
* add assistant prefix
* add arc_challenge from llama
* nit
* nit
* nit
* add assistant prefix
* add mmlu_llama
* nit
* nit
* Revert "nit" (reverts commit 6a97f8356237305e375212b966b30e8de59dd4bc)
* fix regex bug
* add assistant_prefix to vllm
* add `Question:`
* add mmlu_pro
* add fewshot assistant_prefill
* use `assistant_prefill`
* typehints
* nits
* nits
* add to docs
* add readme
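The `assistant_prefill` idea is to seed the assistant turn with a fixed prefix so the model continues from it rather than starting a fresh answer. The sketch below only mimics that concept; `build_prompt` and its template are hypothetical, not the harness's implementation.

```python
# Hypothetical illustration of assistant prefill: append a fixed prefix
# after the answer cue so generation continues from it.
def build_prompt(question: str, assistant_prefill: str = "") -> str:
    prompt = f"Question: {question}\nAnswer:"
    if assistant_prefill:
        prompt += " " + assistant_prefill
    return prompt
```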
-
- 07 Jan, 2025 1 commit
-
-
CL-ModelCloud authored
* hf support load gguf file
* code review
* code review
* code clean up
* note about use_fast compat with gguf
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
-
- 25 Dec, 2024 1 commit
-
-
Wang, Yi authored
* fix exact_match low if batch_size > 1
* add sorting to logprobs
* nit
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Baber <baber@hey.com>
-
- 19 Dec, 2024 1 commit
-
-
Baber Abbasi authored
* add warning for truncation
-
- 16 Dec, 2024 1 commit
-
-
Baber Abbasi authored
* batch all rolling token windows
* nit
* copy to vllm
* fix max_length for `get_rolling_token_windows`
* bugfix
* bugfix
* add type hints
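Rolling token windows let a model score a sequence longer than its context by splitting it into chunks, each carried with a little left context. The sketch below is a simplified illustration of that idea only; it is not `get_rolling_token_windows` from the harness, whose signature and context handling differ.

```python
# Illustrative sketch (assumed behavior): split a token list into
# (context, continuation) pairs whose continuations cover every token,
# giving each chunk one token of left context.
def rolling_windows(tokens, max_len):
    windows = []
    start = 0
    while start < len(tokens):
        chunk = tokens[start:start + max_len]
        context = tokens[max(0, start - 1):start]  # 1 token of left context
        windows.append((context, chunk))
        start += max_len
    return windows
```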
-
- 13 Dec, 2024 1 commit
-
-
Yao Matrix authored
* initial support for optimum-intel ipex model; LM model as first step
* format
* pass dtype
* update README
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
-
- 09 Dec, 2024 2 commits
-
-
Maanu Grover authored
* update import
* run formatting
Signed-off-by: Maanu Grover <maanug@nvidia.com>
-
Baber Abbasi authored
* left truncate for generate_until
* pre-commit
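Left truncation for `generate_until` means that when a prompt exceeds the model's context budget, tokens are dropped from the left so the most recent context and the generation budget survive. A minimal sketch under that assumption (not the harness's code):

```python
# Assumed behavior: keep the rightmost prompt tokens that fit after
# reserving room for max_gen generated tokens.
def left_truncate(tokens: list, max_ctx: int, max_gen: int) -> list:
    budget = max(1, max_ctx - max_gen)  # always keep at least one token
    return tokens[-budget:] if len(tokens) > budget else tokens
```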
-
- 04 Dec, 2024 1 commit
-
-
Slawomir Strehlke authored
* Handle pipeline_parallel parameter
* Add description of pipeline parallelism with OV models
-