1. 02 Feb, 2025 1 commit
    • [Misc] Add SPDX-License-Identifier headers to python source files (#12628) · e489ad7a
      Russell Bryant authored
      - **Add SPDX license headers to python source files**
      - **Check for SPDX headers using pre-commit**
      
      commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
      Author: Russell Bryant <rbryant@redhat.com>
      Date:   Fri Jan 31 14:18:24 2025 -0500
      
          Add SPDX license headers to python source files

          This commit adds SPDX license headers to python source files as
          recommended to the project by the Linux Foundation. These headers
          provide a concise way that is both human and machine readable for
          communicating license information for each source file. It helps
          avoid any ambiguity about the license of the code and can also be
          easily used by tools to help manage license compliance.

          The Linux Foundation runs license scans against the codebase to help
          ensure we are in compliance with the licenses of the code we use,
          including dependencies. Having these headers in place helps that
          tool do its job.

          More information can be found on the SPDX site:

          - https://spdx.dev/learn/handling-license-info/
      
      Signed-off-by: Russell Bryant <rbryant@redhat.com>
      
      commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
      Author: Russell Bryant <rbryant@redhat.com>
      Date:   Fri Jan 31 14:36:32 2025 -0500
      
          Check for SPDX headers using pre-commit
      Signed-off-by: Russell Bryant <rbryant@redhat.com>
      
      ---------
      Signed-off-by: Russell Bryant <rbryant@redhat.com>
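The commit message for #12628 describes checking for SPDX headers from pre-commit but does not show the check itself. The following is only a minimal sketch of what such a check might look like, not the script used in the repository. The names `has_spdx_header` and `main`, the five-line search window, and the `Apache-2.0` identifier in the usage note are all assumptions made for illustration.

```python
from pathlib import Path

SPDX_TAG = "SPDX-License-Identifier:"  # standard SPDX short-form tag


def has_spdx_header(source: str, max_lines: int = 5) -> bool:
    """Return True if a '#' comment in the first few lines carries the SPDX tag.

    The max_lines window is an assumption; it leaves room for a shebang
    or an encoding line ahead of the header.
    """
    for line in source.splitlines()[:max_lines]:
        stripped = line.strip()
        if stripped.startswith("#") and SPDX_TAG in stripped:
            return True
    return False


def main(paths: list[str]) -> int:
    """Print each file lacking a header; a non-zero return would fail the hook."""
    missing = [
        p for p in paths
        if not has_spdx_header(Path(p).read_text(encoding="utf-8"))
    ]
    for p in missing:
        print(f"missing SPDX header: {p}")
    return 1 if missing else 0
```

Under these assumptions, a file beginning with `# SPDX-License-Identifier: Apache-2.0` passes the check. pre-commit hands the staged file names to a hook as command-line arguments, so an entry point could call `sys.exit(main(sys.argv[1:]))`.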
  2. 01 Feb, 2025 1 commit
  3. 30 Jan, 2025 1 commit
  4. 23 Jan, 2025 1 commit
  5. 21 Jan, 2025 1 commit
  6. 17 Jan, 2025 1 commit
  7. 16 Jan, 2025 1 commit
  8. 17 Dec, 2024 1 commit
  9. 19 Nov, 2024 1 commit
  10. 18 Nov, 2024 1 commit
  11. 06 Nov, 2024 1 commit
  12. 29 Oct, 2024 1 commit
  13. 28 Oct, 2024 1 commit
  14. 16 Oct, 2024 1 commit
  15. 23 Sep, 2024 1 commit
  16. 18 Sep, 2024 2 commits
  17. 22 Aug, 2024 1 commit
  18. 20 Aug, 2024 1 commit
  19. 16 Aug, 2024 1 commit
  20. 02 Aug, 2024 1 commit
  21. 27 Jul, 2024 2 commits
  22. 16 Jul, 2024 1 commit
  23. 11 Jul, 2024 1 commit
  24. 20 Jun, 2024 1 commit
  25. 15 Jun, 2024 1 commit
  26. 14 Jun, 2024 1 commit
  27. 05 Jun, 2024 1 commit
  28. 04 Jun, 2024 2 commits
  29. 31 May, 2024 2 commits
  30. 23 May, 2024 1 commit
  31. 22 May, 2024 1 commit
  32. 16 May, 2024 1 commit
  33. 03 May, 2024 1 commit
  34. 01 May, 2024 1 commit
    • [Kernel] Update fused_moe tuning script for FP8 (#4457) · 24bb4fe4
      Philipp Moritz authored
      This PR updates the tuning script for the fused_moe kernel to support FP8 and also adds configurations for TP4. For small batch sizes, I removed num_warps and num_stages from the configuration: doing so improved performance and brought the benchmarks in that regime back on par with the previous numbers, so this PR is a strict improvement over the status quo.
      
      All the numbers below are for mistralai/Mixtral-8x7B-Instruct-v0.1, 1000 input and 50 output tokens.
      
      Before this PR (with static activation scaling):
      
      qps = 1: 9.8 ms ITL, 0.49s e2e latency
      qps = 2: 9.7 ms ITL, 0.49s e2e latency 
      qps = 4: 10.1 ms ITL, 0.52s e2e latency
      qps = 6: 11.9 ms ITL, 0.59s e2e latency
      qps = 8: 14.0 ms ITL, 0.70s e2e latency
      qps = 10: 15.7 ms ITL, 0.79s e2e latency
      
      After this PR (with static activation scaling):
      
      qps = 1: 9.8 ms ITL, 0.49s e2e latency
      qps = 2: 9.7 ms ITL, 0.49s e2e latency
      qps = 4: 10.2 ms ITL, 0.53s e2e latency
      qps = 6: 11.9 ms ITL, 0.59s e2e latency
      qps = 8: 11.9 ms ITL, 0.59s e2e latency
      qps = 10: 12.1 ms ITL, 0.61s e2e latency
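To make the before/after comparison in #4457 easier to read, the relative change in ITL can be computed directly from the figures quoted in the PR description. This is only a small arithmetic sketch; the helper name `itl_change_percent` is hypothetical, and the values are copied from the lists above.

```python
# ITL in milliseconds, keyed by qps, copied from the PR description.
itl_before = {1: 9.8, 2: 9.7, 4: 10.1, 6: 11.9, 8: 14.0, 10: 15.7}
itl_after = {1: 9.8, 2: 9.7, 4: 10.2, 6: 11.9, 8: 11.9, 10: 12.1}


def itl_change_percent(before: dict[int, float],
                       after: dict[int, float]) -> dict[int, float]:
    """Return the percentage change in ITL per qps (negative means faster)."""
    return {
        qps: round((after[qps] - before[qps]) / before[qps] * 100, 1)
        for qps in before
    }


changes = itl_change_percent(itl_before, itl_after)
for qps, pct in changes.items():
    print(f"qps = {qps}: {pct:+.1f}% ITL")
```

By this calculation ITL is essentially unchanged up to qps = 6 (about 1% higher at qps = 4) and drops by roughly 15% at qps = 8 and 23% at qps = 10.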
  35. 25 Apr, 2024 1 commit
  36. 23 Apr, 2024 1 commit