Unverified commit 1bff6a0b authored by Joao Gante, committed by GitHub

Generate: update links on LLM tutorial doc (#30550)

parent 75bbfd5b
@@ -247,10 +247,11 @@ While the autoregressive generation process is relatively straightforward, makin
 
 ### Advanced generate usage
 
-1. [Guide](generation_strategies) on how to control different generation methods, how to set up the generation configuration file, and how to stream the output;
-2. [Guide](chat_templating) on the prompt template for chat LLMs;
-3. [Guide](tasks/prompting) on to get the most of prompt design;
-4. API reference on [`~generation.GenerationConfig`], [`~generation.GenerationMixin.generate`], and [generate-related classes](internal/generation_utils). Most of the classes, including the logits processors, have usage examples!
+1. Guide on how to [control different generation methods](generation_strategies), how to set up the generation configuration file, and how to stream the output;
+2. [Accelerating text generation](llm_optims);
+3. [Prompt templates for chat LLMs](chat_templating);
+4. [Prompt design guide](tasks/prompting);
+5. API reference on [`~generation.GenerationConfig`], [`~generation.GenerationMixin.generate`], and [generate-related classes](internal/generation_utils). Most of the classes, including the logits processors, have usage examples!
 
 ### LLM leaderboards
 
@@ -259,10 +260,12 @@ While the autoregressive generation process is relatively straightforward, makin
 
 ### Latency, throughput and memory utilization
 
-1. [Guide](llm_tutorial_optimization) on how to optimize LLMs for speed and memory;
-2. [Guide](main_classes/quantization) on quantization such as bitsandbytes and autogptq, which shows you how to drastically reduce your memory requirements.
+1. Guide on how to [optimize LLMs for speed and memory](llm_tutorial_optimization);
+2. Guide on [quantization](main_classes/quantization) such as bitsandbytes and autogptq, which shows you how to drastically reduce your memory requirements.
 
 ### Related libraries
 
-1. [`text-generation-inference`](https://github.com/huggingface/text-generation-inference), a production-ready server for LLMs;
-2. [`optimum`](https://github.com/huggingface/optimum), an extension of 🤗 Transformers that optimizes for specific hardware devices.
+1. [`optimum`](https://github.com/huggingface/optimum), an extension of 🤗 Transformers that optimizes for specific hardware devices.
+2. [`outlines`](https://github.com/outlines-dev/outlines), a library where you can constrain text generation (e.g. to generate JSON files);
+3. [`text-generation-inference`](https://github.com/huggingface/text-generation-inference), a production-ready server for LLMs;
+4. [`text-generation-webui`](https://github.com/oobabooga/text-generation-webui), a UI for text generation;