jerrrrry / infinilm
Commit 93cd62d5, authored Jul 31, 2025 by wooway777
issue/21 - Adjusted attn_val_buf Shape
Parent: 115badb9
Showing 1 changed file with 1 addition and 1 deletion
src/models/jiuge/jiuge.cpp
@@ -181,7 +181,7 @@ void inferDeviceBatch(const JiugeMeta &meta, DeviceResource &rsrc,
     auto qk_buf = Tensor::buffer(dt_logits, {nh, max_qk_size}, rsrc.memory_pool);
     auto rearrange_q_buf = Tensor::buffer(dt_logits, {nkvh, ngroup * max_seq_len, dh}, rsrc.memory_pool);
-    auto attn_val_buf = Tensor::buffer(dt_logits, {nh, max_seq_len, dh}, rsrc.memory_pool);
+    auto attn_val_buf = Tensor::buffer(dt_logits, {nkvh, ngroup * max_seq_len, dh}, rsrc.memory_pool);
     // MLP buffers
     auto gate_buf = gate_up_buf->slice(1, 0, di);