norm/vllm · Commits · 923797fe

Unverified commit 923797fe, authored Feb 02, 2024 by zhaoyang-star, committed by GitHub Feb 01, 2024
Fix compile error when using rocm (#2648)
parent cd9e60c7

Showing 3 changed files with 9 additions and 1 deletion (+9 -1)
csrc/attention/attention_kernels.cu                  +2 -0
csrc/cache_kernels.cu                                +7 -0
csrc/quantization/fp8_e5m2_kvcache/quant_utils.cuh   +0 -1
csrc/attention/attention_kernels.cu

@@ -25,7 +25,9 @@
 #include "attention_dtypes.h"
 #include "attention_utils.cuh"
+#ifdef ENABLE_FP8_E5M2
 #include "../quantization/fp8_e5m2_kvcache/quant_utils.cuh"
+#endif
 #include <algorithm>
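ENABLE_FP8_E5M2 is a compile-time switch, so the CUDA-specific FP8 E5M2 KV-cache helpers are only pulled in on builds that define it; a ROCm build that leaves the macro undefined never sees the header at all, which is what fixes the compile error. A minimal sketch of the pattern — the file name and the exact -D invocation below are illustrative, not taken from the commit:

// guard_sketch.cu -- illustrative only
#ifdef ENABLE_FP8_E5M2
  // CUDA-only FP8 conversion helpers; skipped entirely when the
  // macro is undefined (e.g. on ROCm builds)
  #include "../quantization/fp8_e5m2_kvcache/quant_utils.cuh"
#endif

// The FP8 path is enabled explicitly at build time, e.g.:
//   nvcc -DENABLE_FP8_E5M2 -c csrc/attention/attention_kernels.cu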
csrc/cache_kernels.cu

@@ -4,13 +4,20 @@
 #include "cuda_compat.h"
 #include "dispatch_utils.h"
+#ifdef ENABLE_FP8_E5M2
 #include "quantization/fp8_e5m2_kvcache/quant_utils.cuh"
+#endif
 
 #include <algorithm>
 #include <cassert>
 #include <map>
 #include <vector>
 
+#ifdef USE_ROCM
+#include <hip/hip_bf16.h>
+typedef __hip_bfloat16 __nv_bfloat16;
+#endif
+
 void swap_blocks(
   torch::Tensor& src,
   torch::Tensor& dst,
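The typedef is the substance of the ROCm fix: this file spells bfloat16 with the CUDA name __nv_bfloat16, which the ROCm toolchain does not define, so aliasing it to HIP's __hip_bfloat16 from <hip/hip_bf16.h> lets the same kernel source compile under both toolchains. A minimal sketch of the pattern, with a hypothetical kernel and assuming the CUDA-style conversion intrinsics that both <cuda_bf16.h> and <hip/hip_bf16.h> provide:

// bf16_alias_sketch.cu -- hypothetical example, not from the commit
#ifdef USE_ROCM
  #include <hip/hip_bf16.h>
  typedef __hip_bfloat16 __nv_bfloat16;  // map HIP's bf16 type onto the CUDA name
#else
  #include <cuda_bf16.h>                 // defines __nv_bfloat16 natively
#endif

// Written once against the CUDA spelling; compiles under nvcc and hipcc.
__global__ void scale_bf16(__nv_bfloat16* x, float scale, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    // both toolchains expose __bfloat162float / __float2bfloat16
    x[i] = __float2bfloat16(__bfloat162float(x[i]) * scale);
  }
}

A typedef (rather than a #define) keeps the alias type-safe, but it must be seen before the first use of the CUDA name, hence its placement up with the includes.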
csrc/quantization/fp8_e5m2_kvcache/quant_utils.cuh

@@ -9,7 +9,6 @@
 #include "../../attention/dtype_float16.cuh"
 #include "../../attention/dtype_bfloat16.cuh"
 
-#pragma once
 
 namespace vllm {
 #ifdef ENABLE_FP8_E5M2
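The single deletion removes a stray #pragma once sitting after the includes. The pragma acts as an include guard and conventionally appears as the very first line of a header; a copy buried mid-file is at best redundant, and it was dropped here as part of this ROCm build fix. The conventional layout, sketched for a generic header (not the vLLM file itself):

// example_utils.cuh -- generic layout, file name is hypothetical
#pragma once            // include guard: first line, before anything else

#include <cstdint>

namespace vllm {
// declarations ...
}  // namespace vllm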