SIYIXNI / vllm · Commit 6f88f762

Fix OOM in attention kernel test (#1223)

Authored by Woosuk Kwon on Sep 28, 2023; committed via GitHub on Sep 28, 2023 (unverified signature).
Parent: 202351d5

Showing 1 changed file with 5 additions and 2 deletions (+5 −2): tests/kernels/test_attention.py
tests/kernels/test_attention.py @ 6f88f762

@@ -247,8 +247,11 @@ def test_multi_query_kv_attention(
     torch.random.manual_seed(seed)
     torch.cuda.manual_seed(seed)
-    seq_lens = random.sample(range(1, MAX_SEQ_LEN), num_seqs)
-    seq_lens[-1] = MAX_SEQ_LEN
+    # MAX_SEQ_LEN sometimes causes OOM in the reference implementation.
+    # As the xformers library is already tested with its own tests, we can use
+    # a smaller MAX_SEQ_LEN here.
+    max_len = min(MAX_SEQ_LEN, 4096)
+    seq_lens = random.sample(range(1, max_len), num_seqs)
     num_tokens = sum(seq_lens)
     scale = float(1.0 / (head_size ** 0.5))
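The effect of the change can be sketched in isolation. This is a minimal illustration, not the test code itself: MAX_SEQ_LEN and num_seqs below are stand-in values, and the seed is arbitrary. The old code forced one sequence to the full MAX_SEQ_LEN, which the reference attention (materializing a score matrix quadratic in sequence length) could not always fit in memory; the new code caps sampled lengths at 4096.

```python
import random

# Stand-in values for the test's parameters (illustrative, not vLLM's).
MAX_SEQ_LEN = 8192
num_seqs = 4
random.seed(0)

# Before: the last sequence length was pinned to MAX_SEQ_LEN, and the
# reference implementation builds an O(len^2) score matrix, so the
# longest sequence could trigger an out-of-memory error.
old_lens = random.sample(range(1, MAX_SEQ_LEN), num_seqs)
old_lens[-1] = MAX_SEQ_LEN

# After: cap sampled lengths at 4096. Since xformers is exercised by its
# own test suite, the smaller bound does not reduce coverage here.
max_len = min(MAX_SEQ_LEN, 4096)
new_lens = random.sample(range(1, max_len), num_seqs)

assert old_lens[-1] == MAX_SEQ_LEN
assert max(new_lens) < 4096
```

Note that `random.sample` draws distinct lengths without replacement, so the cap simply shrinks the pool of candidate lengths without changing the sampling scheme.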