Safetensors conceptual guide (#905)

IDK what else to add to this guide. I looked for relevant code in the TGI codebase and saw that safetensors is used in quantization as well (maybe I could add that?)

Commit af1ed38f (unverified), authored Sep 07, 2023 by Merve Noyan; committed by GitHub Sep 07, 2023

Showing with 9 additions and 0 deletions

docs/source/_toctree.yml +2 -0

docs/source/conceptual/safetensors.md +7 -0

--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -21,6 +21,8 @@
 - sections:
  - local: conceptual/streaming
    title: Streaming
+  - local: conceptual/safetensors
+    title: Safetensors
  - local: conceptual/flash_attention
    title: Flash Attention
  title: Conceptual Guides
--- a/docs/source/conceptual/safetensors.md
+++ b/docs/source/conceptual/safetensors.md
+# Safetensors
+
+Safetensors is a serialization format for deep learning model weights. It is [faster](https://huggingface.co/docs/safetensors/speed) and safer than other serialization formats such as pickle, which many deep learning libraries use under the hood. Loading a pickle file can execute arbitrary code, whereas a safetensors file contains only tensor data and metadata.
+
+TGI depends on the safetensors format mainly to enable [tensor parallelism sharding](./tensor_parallelism). When serving a model, TGI looks for safetensors weights in the model repository. If none are found, TGI converts the PyTorch weights to the safetensors format.
+
+You can learn more about safetensors by reading the [safetensors documentation](https://huggingface.co/docs/safetensors/index).
\ No newline at end of file
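
The guide's two claims (safer than pickle, and well suited to tensor parallelism sharding) both come down to the file layout: an 8-byte little-endian header length, a JSON header giving each tensor's dtype, shape, and byte offsets, then the raw tensor bytes. The following is a minimal stdlib-only sketch of that layout, not the safetensors library or TGI's loading code. The helper names (`write_safetensors`, `read_header`, `read_rows`) and the row-wise split are hypothetical, chosen only for illustration, and the sketch skips the validation and alignment handling a real implementation would need.

```python
import json
import struct


def write_safetensors(tensors):
    """Serialize {name: (dtype, shape, raw_bytes)} into safetensors-style bytes.

    Hypothetical helper: a simplified writer without header padding or validation.
    """
    header = {}
    offset = 0
    for name, (dtype, shape, raw) in tensors.items():
        # Offsets are relative to the start of the data section, not the file.
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    payload = b"".join(raw for _, _, raw in tensors.values())
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_header(blob):
    """Return (header dict, byte position where the data section starts)."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8 : 8 + header_len])
    return header, 8 + header_len


def read_rows(blob, name, row_start, row_stop):
    """Read only rows [row_start, row_stop) of a 2-D F32 tensor.

    Only the header is parsed as JSON; the tensor bytes are never interpreted
    as code, and bytes outside the requested rows are never touched.
    """
    header, data_start = read_header(blob)
    info = header[name]
    assert info["dtype"] == "F32"
    rows, cols = info["shape"]
    assert 0 <= row_start <= row_stop <= rows
    row_bytes = cols * 4  # 4 bytes per float32 value
    base = data_start + info["data_offsets"][0]
    chunk = blob[base + row_start * row_bytes : base + row_stop * row_bytes]
    count = (row_stop - row_start) * cols
    return list(struct.unpack(f"<{count}f", chunk))


# A 4x2 float32 "weight" holding the values 0.0 .. 7.0 in row-major order.
values = [float(i) for i in range(8)]
blob = write_safetensors({"weight": ("F32", [4, 2], struct.pack("<8f", *values))})

# Split the rows across two hypothetical shards; the second reads rows 2..3 only.
shard = read_rows(blob, "weight", 2, 4)
```

Because every tensor's position is known from the header alone, a process can read just its own slice of each weight, which is the property that makes the format convenient for sharding.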