OpenDAS / AutoAWQ
Commit dc99d2fc, authored Sep 13, 2023 by Casper Hansen
Fix Falcon benchmark format
Parent: 7cf3c790
Showing 1 changed file with 5 additions and 4 deletions
README.md (+5, -4)
@@ -228,10 +228,11 @@ generation_output = model.generate(
 ### Falcon 7B
-Note: Fast generation, fast context processing
-GPU: NVIDIA GeForce RTX 3090
-Command: `python examples/benchmark.py --model_path casperhansen/falcon-7b-awq --quant_file awq_model_w4_g64.pt`
-Version: GEMM
+- Note: Fast generation, fast context processing
+- GPU: NVIDIA GeForce RTX 3090
+- Command: `python examples/benchmark.py --model_path casperhansen/falcon-7b-awq --quant_file awq_model_w4_g64.pt`
+- Version: GEMM
 | Batch Size | Prefill Length | Decode Length | Prefill tokens/s | Decode tokens/s | Memory (VRAM) |
 |-------------:|-----------------:|----------------:|-------------------:|------------------:|:-----------------|
 | 1 | 32 | 32 | 466.826 | 95.1413 | 4.47 GB (18.88%) |
...
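To make the one visible benchmark row easier to read, here is a minimal sketch that converts its throughput figures into wall-clock time per phase. It assumes each tokens/s value is an average over the whole prefill or decode phase at batch size 1, and that the VRAM percentage is relative to the total memory the GPU reports. Neither assumption is confirmed by the README excerpt, and the helper name `phase_seconds` is hypothetical, not part of `examples/benchmark.py`.

```python
# Values copied from the batch-size-1 row of the Falcon 7B table.
prefill_length = 32          # tokens
decode_length = 32           # tokens
prefill_tps = 466.826        # prefill tokens/s
decode_tps = 95.1413         # decode tokens/s
vram_used_gb = 4.47          # reported memory use
vram_used_fraction = 0.1888  # reported as 18.88%


def phase_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time for one phase, assuming throughput is averaged over the phase."""
    return num_tokens / tokens_per_second


prefill_s = phase_seconds(prefill_length, prefill_tps)
decode_s = phase_seconds(decode_length, decode_tps)

# Assumption: the percentage is used / total device memory.
implied_total_vram_gb = vram_used_gb / vram_used_fraction

print(f"prefill: {prefill_s * 1000:.1f} ms")
print(f"decode:  {decode_s * 1000:.1f} ms")
print(f"implied total VRAM: {implied_total_vram_gb:.2f} GB")
```

Under these assumptions the row implies roughly 69 ms of prefill and 336 ms of decode, and the implied total of about 23.7 GB is consistent with the 24 GB RTX 3090 named in the benchmark notes.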