Commit c905684c authored by Chenheli Hua, committed by GitHub

[Core] Asynchronous h2d in merge_multimodal_embeddings via pinned memory. (#23686)


Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
parent 78683580
@@ -508,7 +508,9 @@ def merge_multimodal_embeddings(
     """
     if isinstance(placeholder_token_id, list):
         placeholder_token_id = torch.tensor(placeholder_token_id,
-                                            device=input_ids.device)
+                                            pin_memory=True).to(
+                                                device=input_ids.device,
+                                                non_blocking=True)
         return _merge_multimodal_embeddings(
             inputs_embeds,
             torch.isin(input_ids, placeholder_token_id),
...
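
The pattern in this change (build the tensor in pinned host memory, then copy it to the device with `non_blocking=True`) can be tried in isolation. The following is a minimal sketch, not code from the commit: the helper name `to_device_async` and the token ids are hypothetical, and the CUDA-availability guard is an addition so that the example also runs on CPU-only machines, where requesting pinned memory is expected to fail.

```python
import torch


def to_device_async(values: list[int], device: torch.device) -> torch.Tensor:
    """Build a tensor from a Python list and copy it to `device`.

    Hypothetical helper for illustration only; it is not part of the commit.
    """
    # Pinned (page-locked) host memory is only useful, and only requested
    # here, when the target is a CUDA device that is actually available.
    use_pinned = device.type == "cuda" and torch.cuda.is_available()
    host = torch.tensor(values, pin_memory=use_pinned)
    # With a pinned source, non_blocking=True lets the host-to-device copy
    # be queued without making the host wait for it to finish.
    # For a CPU target the flag has no practical effect.
    return host.to(device=device, non_blocking=True)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Made-up token ids: 32000 and 32001 stand in for placeholder tokens.
input_ids = torch.tensor([1, 32000, 5, 32001, 7], device=device)
placeholder_ids = to_device_async([32000, 32001], device)

# torch.isin is launched after the copy on the same stream, so it sees
# the fully transferred placeholder ids.
mask = torch.isin(input_ids, placeholder_ids)
print(mask.tolist())  # [False, True, False, True, False]
```

Later kernels on the same CUDA stream are ordered after the copy, so the `torch.isin` call needs no explicit synchronization. Reading the result back on the host (here via `tolist()`) does synchronize.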