Unverified Commit 198d08fc authored by Matthew Douglas, committed by GitHub

Documentation Cleanup (#1644)

* Start cleaning up docs

* Remove page

* Minor update

* correction

* Minor doc revisions

* Update installation.mdx

* Update _toctree.yml
parent 9f858294
@@ -2,18 +2,15 @@
   sections:
   - local: index
     title: bitsandbytes
-  - local: quickstart
-    title: Quickstart
   - local: installation
     title: Installation
-- title: Guides
+  - local: quickstart
+    title: Quickstart
+- title: Usage Guides
   sections:
   - local: optimizers
     title: 8-bit optimizers
-  - local: algorithms
-    title: Algorithms
-  - local: non_cuda_backends
-    title: Non-CUDA compute backends
   - local: fsdp_qlora
     title: FSDP-QLoRA
   - local: integrations
@@ -56,7 +53,7 @@
     title: RMSprop
   - local: reference/optim/sgd
     title: SGD
-- title: k-bit quantizers
+- title: Modules
   sections:
   - local: reference/nn/linear8bit
     title: LLM.int8()
# Other algorithms
_WIP: Still incomplete... Community contributions are greatly welcome!_
This is an overview of parts of the `bnb.functional` API in `bitsandbytes` that we think are also useful as standalone entities.
## Using Int8 Matrix Multiplication
For straight Int8 matrix multiplication without mixed precision decomposition, you can use `bnb.matmul(...)`. To enable mixed precision decomposition, use the `threshold` parameter:
```py
bnb.matmul(..., threshold=6.0)
```
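Putting the pieces together, here is a minimal sketch of both call styles. The tensor shapes, the CUDA device, and the random weights are illustrative assumptions rather than part of the original snippet; only the `threshold` keyword comes from the text above.

```py
import torch
import bitsandbytes as bnb

# Illustrative inputs: fp16 activations A and weights B on a CUDA device,
# laid out as [batch, in_features] x [out_features, in_features].
A = torch.randn(4, 1024, dtype=torch.float16, device="cuda")
B = torch.randn(768, 1024, dtype=torch.float16, device="cuda")

# Straight Int8 matrix multiplication (no mixed precision decomposition).
out = bnb.matmul(A, B)

# With mixed precision decomposition: outlier features whose magnitude exceeds
# the threshold are computed in fp16 instead of Int8.
out_mixed = bnb.matmul(A, B, threshold=6.0)
```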
-# Contributors guidelines
-... still under construction ... (feel free to propose materials, `bitsandbytes` is a community project)
+# Contribution Guide
 ## Setup
@@ -3,5 +3,3 @@
 Please submit your questions in [this Github Discussion thread](https://github.com/bitsandbytes-foundation/bitsandbytes/discussions/1013) if you feel that they will likely affect a lot of other users and that they haven't been sufficiently covered in the documentation.
 We'll pick the most generally applicable ones and post the QAs here or integrate them into the general documentation (also feel free to submit doc PRs, please).
-# ... under construction ...
# Multi-backend support (non-CUDA backends)
> [!Tip]
> If you feel these docs need additional information, please consider submitting a PR, or respectfully request the missing information in one of the GitHub discussion threads mentioned below.
As part of a recent refactoring effort, we will soon offer official multi-backend support. Currently, this feature is available in a preview alpha release, allowing us to gather early feedback from users to improve the functionality and identify any bugs.
At present, the Intel CPU and AMD ROCm backends are considered fully functional. The Intel XPU backend has limited functionality and is less mature.
Please refer to the [installation instructions](./installation#multi-backend) for details on installing the backend you intend to test (and hopefully provide feedback on).
> [!Tip]
> Apple Silicon support is planned for Q4 2024. We are actively seeking contributors to help implement this, develop a concrete plan, and create a detailed list of requirements. Due to limited resources, we rely on community contributions for this implementation effort. To discuss further, please share your thoughts in [this GitHub discussion](https://github.com/bitsandbytes-foundation/bitsandbytes/discussions/1340) and tag `@Titus-von-Koeller` and `@matthewdouglas`. Thank you!
## Alpha Release
As we are currently in the alpha testing phase, bugs are expected, and performance might not meet expectations. However, this is exactly what we want to discover from **your** perspective as the end user!
Please share and discuss your feedback with us here:
- [Github Discussion: Multi-backend refactor: Alpha release ( AMD ROCm ONLY )](https://github.com/bitsandbytes-foundation/bitsandbytes/discussions/1339)
- [Github Discussion: Multi-backend refactor: Alpha release ( Intel ONLY )](https://github.com/bitsandbytes-foundation/bitsandbytes/discussions/1338)
Thank you for your support!
## Benchmarks
### Intel
The following performance data was collected on an Intel 4th Gen Xeon (Sapphire Rapids) platform. The tables show the speed-up and memory usage of [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) with different data types.
#### Inference (CPU)
| Data Type | BF16 | INT8 | NF4 | FP4 |
|---|---|---|---|---|
| Speed-Up (vs BF16) | 1.0x | 0.6x | 2.3x | 0.03x |
| Memory (GB) | 13.1 | 7.6 | 5.0 | 4.6 |
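As a rough illustration of what the NF4 column corresponds to, here is a hedged sketch of 4-bit (NF4) inference through the `transformers` integration. The prompt, generation settings, and running everything on CPU without an accelerator are assumptions for illustration only.

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-chat-hf"

# NF4 4-bit quantization, comparable to the NF4 column above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

inputs = tokenizer("What does 4-bit quantization do?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```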
#### Fine-Tuning (CPU)
| Data Type | AMP BF16 | INT8 | NF4 | FP4 |
|---|---|---|---|---|
| Speed-Up (vs AMP BF16) | 1.0x | 0.38x | 0.07x | 0.07x |
| Memory (GB) | 40 | 9 | 6.6 | 6.6 |
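For the fine-tuning rows, a typical setup attaches LoRA adapters on top of the 4-bit base model. The sketch below uses `peft` with illustrative hyperparameters (rank, alpha, target modules) that are assumptions rather than the benchmark's actual configuration.

```py
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-chat-hf"

# Load the base model in NF4, as in the inference sketch above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# Illustrative LoRA configuration; only the adapter weights are trained.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```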
@@ -9,8 +9,6 @@ The `bitsandbytes.functional` API provides the low-level building blocks for the
 * For experimental or research purposes requiring non-standard quantization or performance optimizations.
 ## LLM.int8()
-[[autodoc]] functional.int8_double_quant
 [[autodoc]] functional.int8_linear_matmul
 [[autodoc]] functional.int8_mm_dequant
@@ -19,7 +17,6 @@ The `bitsandbytes.functional` API provides the low-level building blocks for the
 [[autodoc]] functional.int8_vectorwise_quant
 ## 4-bit
 [[autodoc]] functional.dequantize_4bit
@@ -49,5 +46,3 @@ For more details see [8-Bit Approximations for Parallelism in Deep Learning](htt
 ## Utility
 [[autodoc]] functional.get_ptr
-[[autodoc]] functional.is_on_gpu