  1. 27 Feb, 2026 6 commits
  2. 26 Feb, 2026 4 commits
  3. 24 Feb, 2026 4 commits
  4. 22 Feb, 2026 1 commit
  5. 21 Feb, 2026 6 commits
  6. 20 Feb, 2026 1 commit
  7. 11 Feb, 2026 3 commits
    • Refactor RCCL log parser to enhance transfer reporting · 4bdccdbc
      one authored
      - Rename transfer fields for clarity and introduce separate methods for reporting non-P2P and P2P transfers.
      - Add new P2P fields extraction and sorting logic to improve data presentation.
      - Update method names and comments for better understanding of functionality.
    • Update repo structure · e417e7f5
      one authored
    • Add RCCL log parser script · 6ea76bda
      one authored
      - Introduce `rccl_log_parser.py` for parsing RCCL logs, extracting system information, user-defined environment variables, graph info, and transfer arguments.
      - Add usage examples in `README.md` for running the parser as a wrapper and processing existing log files.
  8. 05 Feb, 2026 1 commit
    • Enhance profiling and warmup functionality in evo2 scripts · 72ec54e3
      one authored
      - Update run.sh to include new options for warmups and prompt stretching.
      - Refactor test_evo2_generation_batched.py to improve trace output formatting and add support for warmup sequences.
      - Adjust batch processing to include detailed profiling for each step.
  9. 04 Feb, 2026 1 commit
    • Update prompt_stretch for evo2 · 23db469a
      one authored
      - Remove prompt_stretch option from run.sh
      - Adjust condition in test_evo2_generation_batched.py to allow prompt stretching based on batch size
  10. 03 Feb, 2026 1 commit
    • Enhance prompt handling and profiling in evo2 scripts · af277ff1
      one authored
      - Update run.sh to include new command-line options for prompt stretching and token limits.
      - Modify test_evo2_generation_batched.py to adjust profiling settings and improve output formatting.
      - Add support for stretching prompts to the longest length for batch processing.
  11. 01 Feb, 2026 4 commits
    • Enhance profiling capabilities in evo2 scripts · c647fd9a
      one authored
      - Update run.sh to include trace logging options with gzip support.
      - Modify test_evo2_generation_batched.py to add command-line arguments for trace log directory and gzip option.
      - Refactor custom trace handler to utilize gzip compression for trace outputs.
    • Update gemv benchmark scripts · 3bb2e7a5
      one authored
      - Remove gemv_export.cpp
      - Update Makefile and README for compiler variable changes
      - Adjust run-all.sh for consistent build commands
    • Add kernel launch overhead benchmark and associated build scripts · 0fe0b01f
      one authored
      - Introduce kernel_launch_overhead.cu to measure kernel launch latency, system throughput, CPU dispatch overhead, and GPU dispatch time.
      - Create Makefile for building the benchmark with support for nvcc and hipcc.
      - Add run-all.sh script to execute the benchmark with specified device settings.
    • Add trace fix script and refactor evo2 launch scripts · 65bf476e
      one authored
      - Add fix-pt-trace.sh for repairing non-UTF-8 traces.
      - Remove deprecated run-rocblas.sh.
      - Update trace handler (worker names) and tune GPU bindings in run-all.sh.
  12. 31 Jan, 2026 2 commits
  13. 30 Jan, 2026 3 commits
  14. 29 Jan, 2026 3 commits