Commit 3a6dfc39 authored by Haotian Tang

[Minor] Add information for CUDA kernel.

parent d6d6d2d4
/*
Modified from NVIDIA FasterTransformer: https://github.com/NVIDIA/FasterTransformer/blob/main/src/fastertransformer/cutlass_extensions/include/cutlass_extensions/interleaved_numeric_conversion.h
@article{lin2023awq,
title={AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration},
author={Lin, Ji and Tang, Jiaming and Tang, Haotian and Yang, Shang and Dang, Xingyu and Han, Song},
journal={arXiv},
year={2023}
}
*/
#pragma once
......