- 03 Feb, 2026 1 commit
one authored
- Update run.sh to include new command-line options for prompt stretching and token limits.
- Modify test_evo2_generation_batched.py to adjust profiling settings and improve output formatting.
- Add support for stretching prompts to the longest length for batch processing.
- 01 Feb, 2026 4 commits
one authored
- Update run.sh to include trace logging options with gzip support.
- Modify test_evo2_generation_batched.py to add command-line arguments for trace log directory and gzip option.
- Refactor custom trace handler to utilize gzip compression for trace outputs.
one authored
- Remove gemv_export.cpp.
- Update Makefile and README for compiler variable changes.
- Adjust run-all.sh for consistent build commands.
one authored
- Introduce kernel_launch_overhead.cu to measure kernel launch latency, system throughput, CPU dispatch overhead, and GPU dispatch time.
- Create Makefile for building the benchmark with support for nvcc and hipcc.
- Add run-all.sh script to execute the benchmark with specified device settings.
one authored
- Add fix-pt-trace.sh for repairing non-UTF-8 traces.
- Remove deprecated run-rocblas.sh.
- Update trace handler (worker names) and tune GPU bindings in run-all.sh.
- 31 Jan, 2026 2 commits
- 30 Jan, 2026 3 commits
- 29 Jan, 2026 3 commits
- 28 Jan, 2026 1 commit
one authored
- 27 Jan, 2026 1 commit
one authored
- 26 Jan, 2026 4 commits