Unverified commit 1bff6a0b authored by Joao Gante, committed by GitHub

Generate: update links on LLM tutorial doc (#30550)

parent 75bbfd5b
@@ -247,10 +247,11 @@ While the autoregressive generation process is relatively straightforward, makin
 
 ### Advanced generate usage
 
-1. [Guide](generation_strategies) on how to control different generation methods, how to set up the generation configuration file, and how to stream the output;
-2. [Guide](chat_templating) on the prompt template for chat LLMs;
-3. [Guide](tasks/prompting) on to get the most of prompt design;
-4. API reference on [`~generation.GenerationConfig`], [`~generation.GenerationMixin.generate`], and [generate-related classes](internal/generation_utils). Most of the classes, including the logits processors, have usage examples!
+1. Guide on how to [control different generation methods](generation_strategies), how to set up the generation configuration file, and how to stream the output;
+2. [Accelerating text generation](llm_optims);
+3. [Prompt templates for chat LLMs](chat_templating);
+4. [Prompt design guide](tasks/prompting);
+5. API reference on [`~generation.GenerationConfig`], [`~generation.GenerationMixin.generate`], and [generate-related classes](internal/generation_utils). Most of the classes, including the logits processors, have usage examples!
 
 ### LLM leaderboards
 
@@ -259,10 +260,12 @@ While the autoregressive generation process is relatively straightforward, makin
 
 ### Latency, throughput and memory utilization
 
-1. [Guide](llm_tutorial_optimization) on how to optimize LLMs for speed and memory;
-2. [Guide](main_classes/quantization) on quantization such as bitsandbytes and autogptq, which shows you how to drastically reduce your memory requirements.
+1. Guide on how to [optimize LLMs for speed and memory](llm_tutorial_optimization);
+2. Guide on [quantization](main_classes/quantization) such as bitsandbytes and autogptq, which shows you how to drastically reduce your memory requirements.
 
 ### Related libraries
 
-1. [`text-generation-inference`](https://github.com/huggingface/text-generation-inference), a production-ready server for LLMs;
-2. [`optimum`](https://github.com/huggingface/optimum), an extension of 🤗 Transformers that optimizes for specific hardware devices.
+1. [`optimum`](https://github.com/huggingface/optimum), an extension of 🤗 Transformers that optimizes for specific hardware devices.
+2. [`outlines`](https://github.com/outlines-dev/outlines), a library where you can constrain text generation (e.g. to generate JSON files);
+3. [`text-generation-inference`](https://github.com/huggingface/text-generation-inference), a production-ready server for LLMs;
+4. [`text-generation-webui`](https://github.com/oobabooga/text-generation-webui), a UI for text generation;