[Doc] Reorganize user guide (#18661)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Unverified Commit 1cb194a0 authored May 24, 2025 by Cyrus Leung Committed by GitHub May 24, 2025
7 changed files
--- a/docs/serving/seed_parameter_behavior.md
+++ b/docs/serving/seed_parameter_behavior.md
-# Seed Parameter Behavior
+# Reproducibility

 ## Overview


--- a/docs/deployment/security.md
+++ b/docs/deployment/security.md
-# Security Guide
+# Security

 ## Inter-Node Communication


--- a/docs/getting_started/troubleshooting.md
+++ b/docs/getting_started/troubleshooting.md
@@ -23,7 +23,7 @@ It'd be better to store the model in a local disk. Additionally, have a look at

 ## Out of memory

-If the model is too large to fit in a single GPU, you will get an out-of-memory (OOM) error. Consider adopting [these options][reducing-memory-usage] to reduce the memory consumption.
+If the model is too large to fit in a single GPU, you will get an out-of-memory (OOM) error. Consider adopting [these options](../configuration/conserving_memory.md) to reduce the memory consumption.

 ## Generation quality changed

@@ -159,7 +159,7 @@ If you have seen a warning in your logs like this:
 WARNING 12-11 14:50:37 multiproc_worker_utils.py:281] CUDA was previously
    initialized. We must use the `spawn` multiprocessing start method. Setting
    VLLM_WORKER_MULTIPROC_METHOD to 'spawn'. See
-    https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html#python-multiprocessing
+    https://docs.vllm.ai/en/latest/usage/troubleshooting.html#python-multiprocessing
    for more information.
 ```

@@ -258,7 +258,7 @@ or:
 ValueError: Model architectures ['<arch>'] are not supported for now. Supported architectures: [...]
 ```

-But you are sure that the model is in the [list of supported models][supported-models], there may be some issue with vLLM's model resolution. In that case, please follow [these steps][model-resolution] to explicitly specify the vLLM implementation for the model.
+But you are sure that the model is in the [list of supported models][supported-models], there may be some issue with vLLM's model resolution. In that case, please follow [these steps](../configuration/model_resolution.md) to explicitly specify the vLLM implementation for the model.

 ## Failed to infer device type


--- a/docs/serving/usage_stats.md
+++ b/docs/serving/usage_stats.md
--- a/docs/getting_started/v1_user_guide.md
+++ b/docs/getting_started/v1_user_guide.md
-# vLLM V1 User Guide
+# vLLM V1

 V1 is now enabled by default for all supported use cases, and we will gradually enable it for every use case we plan to support. Please share any feedback on [GitHub](https://github.com/vllm-project/vllm) or in the [vLLM Slack](https://inviter.co/vllm-slack).


--- a/vllm/envs.py
+++ b/vllm/envs.py
@@ -164,7 +164,7 @@ def get_vllm_port() -> Optional[int]:
                raise ValueError(
                    f"VLLM_PORT '{port}' appears to be a URI. "
                    "This may be caused by a Kubernetes service discovery issue"
-                    "check the warning in: https://docs.vllm.ai/en/stable/serving/env_vars.html"
+                    "check the warning in: https://docs.vllm.ai/en/stable/usage/env_vars.html"
                )
        except Exception:
            pass

--- a/vllm/utils.py
+++ b/vllm/utils.py
@@ -2531,7 +2531,7 @@ def _maybe_force_spawn():
        logger.warning(
            "We must use the `spawn` multiprocessing start method. "
            "Overriding VLLM_WORKER_MULTIPROC_METHOD to 'spawn'. "
-            "See https://docs.vllm.ai/en/latest/getting_started/"
+            "See https://docs.vllm.ai/en/latest/usage/"
            "troubleshooting.html#python-multiprocessing "
            "for more information. Reason: %s", reason)
        os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"
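
The `vllm/utils.py` hunk above only shows the tail of `_maybe_force_spawn()`. As a rough illustration of the pattern it follows (log a warning with a reason, then override the environment variable), here is a minimal standalone sketch. The helper name `force_spawn_if_needed` and its `reason` argument are hypothetical and are not taken from vLLM; only the environment variable name and the warning text come from the diff.

```python
import logging
import os
from typing import Optional

logger = logging.getLogger(__name__)

# Environment variable name as it appears in the diff above.
ENV_VAR = "VLLM_WORKER_MULTIPROC_METHOD"


def force_spawn_if_needed(reason: Optional[str]) -> bool:
    """Hypothetical helper, not vLLM's actual implementation.

    If a reason is given and the variable is not already 'spawn',
    log a warning and override it. Returns True when an override happened.
    """
    if reason is None or os.environ.get(ENV_VAR) == "spawn":
        return False
    logger.warning(
        "We must use the `spawn` multiprocessing start method. "
        "Overriding %s to 'spawn'. Reason: %s", ENV_VAR, reason)
    os.environ[ENV_VAR] = "spawn"
    return True
```

Calling `force_spawn_if_needed("CUDA is initialized")` in a process where the variable is unset would set it to `spawn`; a second call would then be a no-op.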