- 17 Feb, 2025 1 commit
-
-
shangmingc authored
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
-
- 16 Feb, 2025 2 commits
-
-
凌 authored
-
Roger Wang authored
Signed-off-by: Roger Wang <ywang@roblox.com>
-
- 15 Feb, 2025 2 commits
-
-
Cyrus Leung authored
-
Nicolò Lucchesi authored
-
- 13 Feb, 2025 5 commits
-
-
Nicolò Lucchesi authored
-
Cyrus Leung authored
-
Cyrus Leung authored
-
Russell Bryant authored
-
Cody Yu authored
-
- 11 Feb, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 10 Feb, 2025 4 commits
-
-
Farzad Abdolhosseini authored
Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai>
-
மனோஜ்குமார் பழனிச்சாமி authored
Signed-off-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Yuan Tang authored
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
-
- 08 Feb, 2025 3 commits
-
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
Cyrus Leung authored
-
Jun Duan authored
-
- 07 Feb, 2025 1 commit
-
-
TJian authored
[ROCm] [Feature] [Doc] [Dockerfile] [BugFix] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing (#12501)
-
- 06 Feb, 2025 3 commits
-
-
Jitse Klomp authored
-
Sumit Vij authored
-
Cyrus Leung authored
-
- 05 Feb, 2025 3 commits
-
-
Roger Wang authored
-
Russell Bryant authored
-
Michael Goin authored
-
- 04 Feb, 2025 3 commits
-
-
Isotr0py authored
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
-
Thomas Parnell authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
-
- 03 Feb, 2025 2 commits
-
-
Arthur authored
# Adds support for `transformers` as a backend

Following https://github.com/huggingface/transformers/pull/35235, a bunch of models should already be supported, and we are ramping up support for more models.

Thanks @Isotr0py for the TP support, and @hmellor for his help as well!

This includes:
- `trust_remote_code=True` support: any model on the hub can be natively supported, provided it implements attention the correct way.
- tensor parallel support

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <41363108+Isotr0py@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
youkaichao authored
As more and more people try DeepSeek models with multi-node inference, https://github.com/vllm-project/vllm/issues/7815 is being hit more frequently. Let's give users a clear message.

Signed-off-by: youkaichao <youkaichao@gmail.com>
-
- 02 Feb, 2025 2 commits
-
-
Russell Bryant authored
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**

commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:18:24 2025 -0500

Add SPDX license headers to python source files

This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance.

The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job.

More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/

Signed-off-by: Russell Bryant <rbryant@redhat.com>

commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:36:32 2025 -0500

Check for SPDX headers using pre-commit

Signed-off-by: Russell Bryant <rbryant@redhat.com>

---------

Signed-off-by: Russell Bryant <rbryant@redhat.com>
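The commit text does not show the check itself, so the following is only a minimal sketch of what an SPDX header check could look like. The names `has_spdx_header` and `main`, and the choice to scan the first five lines, are assumptions for illustration and are not taken from the vLLM repository.

```python
from pathlib import Path

# Tag defined by the SPDX specification for short-form license identifiers.
SPDX_TAG = "SPDX-License-Identifier:"


def has_spdx_header(source: str, max_lines: int = 5) -> bool:
    """Return True if a comment within the first `max_lines` lines carries the SPDX tag."""
    head = source.splitlines()[:max_lines]
    return any(line.lstrip().startswith("#") and SPDX_TAG in line for line in head)


def main(paths: list[str]) -> int:
    """Print every file lacking a header; return 1 if any is missing, else 0."""
    missing = [
        p for p in paths
        if not has_spdx_header(Path(p).read_text(encoding="utf-8"))
    ]
    for path in missing:
        print(f"missing SPDX header: {path}")
    return 1 if missing else 0
```

A pre-commit hook passes the staged file names as arguments, so a wrapper script could end with `sys.exit(main(sys.argv[1:]))` to fail the commit when a header is missing.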
-
Kunshang Ji authored
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
-
- 31 Jan, 2025 5 commits
-
-
Brian Dellabetta authored
Based on a request by @mgoin, with @kylesayrs we have added an example doc for int4 w4a16 quantization, following the pre-existing int8 w8a8 quantization example and the example available in [`llm-compressor`](https://github.com/vllm-project/llm-compressor/blob/main/examples/quantization_w4a16/llama3_example.py)

FIX #n/a (no issue created)

@kylesayrs and I have discussed a couple additional improvements for the quantization docs. We will revisit at a later date, possibly including:
- A section for "choosing the correct quantization scheme/ compression technique"
- Additional vision or audio calibration datasets

---------

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
Harry Mellor authored
- Make device tab names more explicit
- Add comprehensive list of devices to https://docs.vllm.ai/en/latest/getting_started/installation/index.html
- Add `attention` blocks to the intro of all devices that don't have pre-built wheels/images

---------

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Cody Yu authored
- Create v1 design document section in docs.
- Add prefix caching design doc.

@WoosukKwon @ywang96

---------

Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
-
Cody Yu authored
It's very annoying when I forget to add `-s` in `git commit` to sign off, because I then need to `git rebase HEAD~1 --signoff` and `git push -f` to fix the DCO. This PR adds a hook that signs off commits automatically when `-s` is missing.

The only change on the user side is that users now have to install 2 hooks, so instead of just

```
pre-commit install
```

we now need to run

```
pre-commit install --hook-type pre-commit --hook-type commit-msg
```

Note that users who still install only the pre-commit hook won't get any error in `git commit`; the sign-off hook just won't run.

cc @hmellor @youkaichao

---------

Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
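The hook added by this PR is not reproduced in the log. As a rough illustration of the idea only, a `commit-msg` hook could append the trailer when it is absent, along the lines of the sketch below. The function name `add_signoff` and its handling of existing trailers are assumptions, not the PR's actual implementation.

```python
def add_signoff(message: str, name: str, email: str) -> str:
    """Append a Signed-off-by trailer for (name, email) unless it is already present."""
    trailer = f"Signed-off-by: {name} <{email}>"
    body = message.rstrip("\n")
    lines = body.splitlines()

    # Already signed off by this identity: leave the message untouched.
    if any(line.strip() == trailer for line in lines):
        return message

    # Join an existing trailer block directly; otherwise start one after a blank line.
    ends_with_trailer = bool(lines) and lines[-1].startswith(
        ("Signed-off-by:", "Co-authored-by:")
    )
    separator = "\n" if ends_with_trailer else "\n\n"
    return body + separator + trailer + "\n"
```

Git invokes a `commit-msg` hook with the path of the message file as its first argument, so a wrapper would read that file, look up the identity with `git config user.name` and `git config user.email`, call `add_signoff`, and write the result back.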
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 30 Jan, 2025 1 commit
-
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
- 29 Jan, 2025 2 commits
-
-
Alphi authored
Signed-off-by: hzh <hezhihui_thu@163.com>
Signed-off-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
Signed-off-by: shaochangxu.scx <shaochangxu.scx@antgroup.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Akshat Tripathi <akshat@krai.ai>
Signed-off-by: Oleg Mosalov <oleg@krai.ai>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>
Signed-off-by: Chenguang Li <757486878@qq.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Shanshan Shen <467638484@qq.com>
Signed-off-by: elijah <f1renze.142857@gmail.com>
Signed-off-by: Yikun <yikunkero@gmail.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
Co-authored-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
Co-authored-by: shaochangxu <85155497+shaochangxu@users.noreply.github.com>
Co-authored-by: shaochangxu.scx <shaochangxu.scx@antgroup.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: sixgod <evethwillbeok@outlook.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Rafael Vasquez <rafvasq21@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Akshat Tripathi <Akshat.tripathi6568@gmail.com>
Co-authored-by: Oleg Mosalov <oleg@krai.ai>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Avshalom Manevich <12231371+avshalomman@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Yangcheng Li <liyangcheng.lyc@alibaba-inc.com>
Co-authored-by: Siyuan Li <94890248+liaoyanqing666@users.noreply.github.com>
Co-authored-by: Concurrensee <yida.wu@amd.com>
Co-authored-by: Chenguang Li <757486878@qq.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Alex Brooks <alex.brooks@ibm.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Shanshan Shen <467638484@qq.com>
Co-authored-by: elijah <30852919+e1ijah1@users.noreply.github.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Steve Luo <36296769+SunflowerAries@users.noreply.github.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: TJian <tunjian1996@gmail.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Elfie Guo <164945471+elfiegg@users.noreply.github.com>
Co-authored-by: Rui Qiao <161574667+ruisearch42@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-