OpenDAS / DeepEP

Commit adc6e24c (Unverified)
Authored May 08, 2025 by fzyzcjy
Committed by GitHub, May 08, 2025

Update deep_ep.cpp

Parent: 23ded3bd
Showing 1 changed file with 3 additions and 0 deletions

csrc/deep_ep.cpp (+3 -0)
@@ -614,6 +614,9 @@ Buffer::internode_dispatch(const torch::Tensor& x, const std::optional<torch::Te
                            const std::optional<torch::Tensor>& cached_rdma_channel_prefix_matrix, const std::optional<torch::Tensor>& cached_recv_rdma_rank_prefix_sum,
                            const std::optional<torch::Tensor>& cached_gbl_channel_prefix_matrix, const std::optional<torch::Tensor>& cached_recv_gbl_rank_prefix_sum,
                            int expert_alignment, const Config& config, std::optional<EventHandle>& previous_event, bool async, bool allocate_on_comm_stream) {
+    // In dispatch, CPU will busy-wait until GPU receive tensor size metadata from other ranks, which can be quite long.
+    // If users of DeepEP need to execute other Python code on other threads, such as KV transfer, their code will get stuck due to GIL
+    // unless we release GIL here.
     pybind11::gil_scoped_release release;
 
     const int num_channels = config.num_sms / 2;
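The comment added in this commit explains why the GIL is released before the CPU busy-wait. The effect can be sketched without Python or a GPU. In the snippet below, a plain `std::mutex` stands in for the GIL and a timer thread stands in for the GPU writing size metadata. All names (`fake_gil`, `ScopedRelease`, `busy_wait_dispatch`) are hypothetical and are not part of DeepEP or pybind11. `ScopedRelease` only imitates the RAII shape of `pybind11::gil_scoped_release`: it releases on construction and re-acquires on destruction.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <optional>
#include <thread>

// Hypothetical stand-in for the Python GIL: a plain mutex.
static std::mutex fake_gil;

// Hypothetical RAII guard imitating the shape of pybind11::gil_scoped_release:
// unlocks the "GIL" on construction and re-locks it on destruction.
struct ScopedRelease {
    ScopedRelease() { fake_gil.unlock(); }
    ~ScopedRelease() { fake_gil.lock(); }
    ScopedRelease(const ScopedRelease&) = delete;
    ScopedRelease& operator=(const ScopedRelease&) = delete;
};

// Models a dispatch call that is entered with the "GIL" held and then
// busy-waits for metadata. Returns how many times a second thread managed
// to take the "GIL" while the wait was in progress.
int busy_wait_dispatch(bool release_gil) {
    std::atomic<bool> metadata_ready{false};
    std::atomic<bool> stop{false};
    std::atomic<int> progress{0};

    // The caller holds the GIL, as Python does when it calls into an extension.
    fake_gil.lock();

    // Another "Python thread" (e.g. KV transfer) that needs the GIL to make progress.
    std::thread other([&] {
        while (!stop.load()) {
            if (fake_gil.try_lock()) {
                ++progress;
                fake_gil.unlock();
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });

    // Stand-in for the GPU delivering size metadata after some delay.
    std::thread gpu([&] {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        metadata_ready.store(true);
    });

    {
        // Optionally drop the "GIL" for the duration of the busy-wait only.
        std::optional<ScopedRelease> guard;
        if (release_gil) guard.emplace();
        while (!metadata_ready.load()) std::this_thread::yield();
    }  // guard (if any) re-acquires the "GIL" here

    stop.store(true);
    other.join();
    gpu.join();
    fake_gil.unlock();
    return progress.load();
}
```

Calling `busy_wait_dispatch(false)` returns 0, because the second thread can never take the lock while the wait holds it. With `busy_wait_dispatch(true)`, the second thread is expected to make progress during the wait. In real pybind11 code the guard must be destroyed (or the GIL re-acquired) before any Python object is touched again.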