Commit 27a09dc5
Authored Feb 20, 2025 by Kaixi Hou
Committed by GitHub, Feb 20, 2025
[NVIDIA] Fix an issue to use current stream for the nvfp4 quant (#13632)
Parent: 981f3c83
Showing 1 changed file with 1 addition and 4 deletions
--- a/csrc/quantization/fp4/nvfp4_quant_kernels.cu
+++ b/csrc/quantization/fp4/nvfp4_quant_kernels.cu
@@ -348,10 +348,7 @@ void scaled_fp4_quant_sm100a(torch::Tensor const& output,
   auto sf_out = static_cast<int32_t*>(output_sf.data_ptr());
   auto output_ptr = static_cast<int64_t*>(output.data_ptr());
   at::cuda::CUDAGuard device_guard{(char)input.get_device()};
-  auto stream = at::cuda::getStreamFromPool(false, input.get_device());
-  if (stream == nullptr) {
-    std::cerr << "Warning: Null CUDA stream" << std::endl;
-  }
+  auto stream = at::cuda::getCurrentCUDAStream(input.get_device());
 
   // We don't support e8m0 scales at this moment.
   bool useUE8M0 = false;