  1. 15 Sep, 2025 1 commit
  2. 02 Aug, 2025 1 commit
  3. 20 Jun, 2025 1 commit
    • Enable ROCm backend with custom ops integration (#1683) · 888788d7
      pnunna93 authored
      
      
      * Port ROCm changes from multi-backend-refactor branch
      
      * Update ops.py
      
      * Update functional.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update functional.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update functional.py
      
      * Update functional.py
      
      * Update functional.py
      
      * Update functional.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update functional.py
      
      * Update functional.py
      
      * Update functional.py
      
      * Update test_ops.py
      
      * Update test_functional.py
      
      * Update test_ops.py
      
      * Update test_functional.py
      
      * Update test_functional.py
      
      * Update functional.py
      
      * Update functional.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update test_functional.py
      
      * Update test_functional.py
      
      * Update cextension.py
      
      * Update cuda_specs.py
      
      * Update cuda_specs.py
      
      * Update test_functional.py
      
      * Update test_linear4bit.py
      
      * Update test_cuda_setup_evaluator.py
      
      * Update test_functional.py
      
      * Update modules.py
      
      * Update modules.py
      
      * Update ops.py
      
      * Update test_linear4bit.py
      
      * Update ops.py
      
      * Update ops.py
      
      * Update test_linear4bit.py
      
      * Update test_linear4bit.py
      
      * Update python-package.yml
      
      * Update python-package.yml
      
      * Update python-package.yml
      
      * Update python-package.yml
      
      * Create build-rocm.sh
      
      * Update cuda_specs.py
      
      * Fix trailing whitespace
      
      * Remove conflicts.diff
      
      * update for hipblasVersionMajor >=3
      
      * Update test_functional.py
      
      * Update test_linear4bit.py
      
      * Update test_ops.py
      
      * Update main.py
      
      * Update test_functional.py
      
      * Update test_linear4bit.py
      
      * Update test_ops.py
      
      * Update test_linear4bit.py
      
      * Lint
      
      * Lint
      
      * Update helpers.py
      
      * Update test_functional.py
      
      * Update test_linear4bit.py
      
      * Update test_ops.py
      
      * Lint
      
      * Update pythonInterface.cpp
      
      * lint fix
      
      * lint
      
      * Update pythonInterface.cpp
      
      * revert permissions change
      
      * Fix indentation
      
      * Update kernels_hip.cuh
      
      * Update kernels.hip
      
      * Update ops.hip
      
      * Update ops_hip.cuh
      
      * Update kernels_hip.cuh
      
      * Update kernels.hip
      
      * Update kernels.hip
      
      * Update ops.hip
      
      * Update ops_hip.cuh
      
      * Update ops.hip
      
      * Update CMakeLists.txt
      
      * Update functional.py
      
      * Update cextension.py
      
      * Update cextension.py
      
      ---------
      Co-authored-by: MISHANMAURYA <118961433+MISHANMAURYA@users.noreply.github.com>
      Co-authored-by: MISHANMAUYRA <mishanmaurya31081@gmail.com>
      Co-authored-by: amcamd <andrew.chapman@amd.com>
      Co-authored-by: Prasanth Nunna <root@banff-cyxtera-s78-1.amd.com>
  4. 13 Jun, 2025 1 commit
  5. 04 Jun, 2025 1 commit
    • Deprecation cleanup (#1669) · 849d9449
      Matthew Douglas authored
      * Deprecation cleanup: remove histogram_scatter_add_2d
      
      * Deprecation cleanup: vectorwise_mm_dequant
      
      * Deprecation cleanup: vectorwise_quant
      
      * Remove unused test
      
      * Optimizer test cleanup
      
      * Deprecations: remove estimate_quantiles, create_quantile_map
      
      * Move deprecated test
  6. 25 Mar, 2025 1 commit
    • PyTorch Custom Operator Integration (#1544) · e82f72b3
      Matthew Douglas authored
      
      
      * Sketch out first custom op registration
      
      * Add note
      
      * Initial int8 op registration
      
      * Cleanup some deprecated functions.
      
      * Int8 ops updates; tests
      
      * Implement 4bit quant/dequant ops
      
      * Fix nested quant
      
      * cleanup
      
      * Test improvements
      
      * Clean up and improve tests
      
      * Add higher level custom op for int8 matmul + dequant + bias
      
      * Add gemv 4bit custom op
      
      * Cleanup
      
      * Implement out kwarg overloads for custom ops
      
      * Update PyTorch minimum to 2.1
      
      * Deprecation updates
      
      * Deprecation updates
      
      * Cleanup; rename int8_linear_dequant -> int8_scaled_mm
      
      * Bump min pytorch to 2.2
      
      * cleanup
      
      * Test reorganization
      
      * Remove deprecated supports_igemmlt
      
      * More cleanup
      
      * Cleanup obsolete C++/CUDA code
      
      * Cleanup
      
      * Create 'default' backend for fallback op implementations; initial CPU nf4 work
      
      * Stub out for multi-platform
      
      * Fix serialization tests for torch>=2.6.0
      
      * Add example for torch.compile e2e inference
      
      * Test update
      
      ---------
      Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
  7. 14 Jan, 2025 1 commit
  8. 05 Dec, 2024 1 commit
    • LLM.int8() Refactoring: Part 1 (#1401) · 81e6345d
      Matthew Douglas authored
      
      
      * Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation
      
      * Fix unintended change
      
      * New naive mm_dequant kernel for row-major; cleanup
      
      * fix
      
      * int8 refactor: initial sparse decomp, cleanup
      
      * Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup
      
      * int8: inference optimizations, some cleanup
      
      * int8: more tests passing, cleanup
      
      * int8 - more cleanup, most tests passing
      
      * int8: specify CUDA stream for int8 ops
      
      * perf: reduce overhead from getting cudaStream ptr
      
      * Mark some functions for deprecation.
      
      * int8 sparse decomp: small perf improvement
      
      * update setup.py
      
      * Update bitsandbytes/autograd/_functions.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/research/autograd/_functions.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn
      
      * int8 cleanup
      
      * Ignore ruff rule ISC001 (incompatible with formatter)
      
      * add comment
      
      * int8 more cleanup
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * int8: rename / deprecate old fn signatures
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * type annotation
      
      * format update
      
      * Update bitsandbytes/research/autograd/_functions.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * cleanup
      
      * Add comment to explain division optimization
      
      * more cleanup
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * cleanup
      
      * Type annotations, cleanup
      
      * remove unused kernels; improved type annotations
      
      * small perf optimization for single-GPU systems
      
      * small perf optimization for single-GPU systems
      
      * update docstrings
      
      * Improve docs and tests
      
      * Update docstring
      
      * Update test
      
      * add benchmarking script
      
      * test cleanup: add deprecated marker, move benchmarks out
      
      * Add int8 dequant function; misc improvements
      
      * int8 matmul fallback for inner dims not divisible by 4
      
      * improve register usage of kInt8VectorQuant - especially for A100/H100
      
      * disable fail-fast for package build
      
      * maxwell compat
      
      * ptxas verbose
      
      * docs update
      
      * doc update
      
      * backward fix
      
      * Bugfix sparse decomp
      
      * Int8 fix for PEFT OLoRA init
      
      * Fix test for deprecated spmm_coo
      
      * test improvement
      
      * doc update
      
      * typo
      
      * doc cleanup
      
      * docs
      
      * add inference benchmark script
      
      * Add benchmarks, doc update
      
      ---------
      Co-authored-by: Aarni Koskela <akx@iki.fi>
  9. 23 Oct, 2024 1 commit
  10. 20 Sep, 2024 2 commits
  11. 26 Aug, 2024 1 commit
  12. 22 Aug, 2024 1 commit
  13. 12 Jul, 2024 1 commit
    • Fix CUDA 12.5 build issue (#1273) · 85e01276
      Markus Hennerbichler authored
      pythonInterface.cpp depends on ops.cuh
      which in turn depends on some thrust headers.
      It is defined as a C++ compilation unit
      which is problematic because thrust doesn't guarantee
      compatibility with a host compiler.
      
      This is starting to cause issues with CUDA 12.5.
      There is no dependency on the thrust headers,
      which means they can be removed without other consequences.
  14. 29 Mar, 2024 1 commit
  15. 23 Feb, 2024 1 commit
  16. 14 Feb, 2024 1 commit
  17. 05 Feb, 2024 3 commits
    • Make native code portable and add GitHub workflow for building (#949) · 73d3e7b6
      Rickard authored
      
      
      * Make native code portable and add GitHub workflow for building
      
      * Removed deprecated Python versions
      
      * Update python-package.yml
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update python-package.yml
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update python-package.yml
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update python-package.yml
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update python-package.yml
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update python-package.yml
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update python-package.yml
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update python-package.yml
      
      * Do not test on Python 3.13 until released
      
      * Update python-package.yml
      
      * Update python-package.yml
      
      * Update python-package.yml
      
      * Update python-package.yml
      
      * Refactor build stage
      
      * Fixed breaking actions change
      
      * Slim down Windows cuda
      
      * Create dependabot.yml
      
      * Bespoke local dev requirements.txt
      
      * Enable VS integration
      
      * Group Dependabot updates
      
      * Cleanup
      
      * Update python-package.yml
      
      * Reinstate file that was wrongly merged
      
      * Fixed regression caused by new version of download-artifact
      
      * Update python-package.yml
      
      * Update python-package.yml
      
      * Fix matrix
      
      * Update python-package.yml
      
      * Merge
      
      * Pipeline
      
      * Fixed conflict
      
      * Fixed conflict
      
      * Update CMakeLists.txt
      
      * Fixed merge error
      
      * cleanup
      
      * cleanup
      
      * Find CUDA
      
      * Fix
      
      * Fixing merge error from latest merge from main
      
      * Fix setup.py
      
      * Fixed typo in artifact name
      
      * Remove linker flags
      
      * Build nocublaslt versions
      
      * Fixed formatting
      
      * Fixed VS Code format on save
      
      * Ran format on save from VScode
      
      * Re-saved the json files using the new settings
      
      * Re-saved CMakeLists.txt to get formatting right
      
      * Add path filter
      
      * Formatting
      
      ---------
      Co-authored-by: Aarni Koskela <akx@iki.fi>
    • 332530ba
    • Enable crate-ci/typos lint; fix typos (#1005) · 8c507d92
      Aarni Koskela authored
      
      Co-authored-by: Titus von Koeller <titus@vonkoeller.com>
      
      fix erroneous correction
  18. 01 Feb, 2024 1 commit
  19. 31 Jan, 2024 1 commit
  20. 30 Jan, 2024 1 commit
  21. 09 Dec, 2023 1 commit
  22. 19 Jul, 2023 1 commit
  23. 17 Jul, 2023 1 commit
  24. 11 Jul, 2023 1 commit
  25. 10 Jul, 2023 5 commits
  26. 09 Jul, 2023 2 commits
  27. 08 Jul, 2023 2 commits
  28. 05 Jul, 2023 1 commit
  29. 04 Jul, 2023 2 commits
  30. 31 May, 2023 1 commit