"docs/source/vscode:/vscode.git/clone" did not exist on "cf4aa3597f2fd333a71bc3c837f2a2ce15c81734"
Unverified commit 00ba7cad, authored by Matt and committed by GitHub

Rewrite a couple of lines in the TF XLA doc (#21177)

* Rewrite a couple of lines in the TF XLA doc to explain that jit_compile can be used in model.compile() too

* Remove extra )
parent c59d71b2
@@ -18,9 +18,9 @@ Accelerated Linear Algebra, dubbed XLA, is a compiler for accelerating the runti
 XLA (Accelerated Linear Algebra) is a domain-specific compiler for linear algebra that can accelerate TensorFlow models with potentially no source code changes.
-Using XLA in TensorFlow is simple - it comes packaged inside the `tensorflow` library, and it can be triggered with the `jit_compile` argument in any graph-creating function such as [`tf.function`](https://www.tensorflow.org/guide/intro_to_graphs).
+Using XLA in TensorFlow is simple - it comes packaged inside the `tensorflow` library, and it can be triggered with the `jit_compile` argument in any graph-creating function such as [`tf.function`](https://www.tensorflow.org/guide/intro_to_graphs). When using Keras methods like `fit()` and `predict()`, you can enable XLA simply by passing the `jit_compile` argument to `model.compile()`. However, XLA is not limited to these methods - it can also be used to accelerate any arbitrary `tf.function`.
-🤗 Transformers supports XLA-compatible TensorFlow (TF) models for text generation (such as [GPT2](https://huggingface.co/docs/transformers/model_doc/gpt2), [T5](https://huggingface.co/docs/transformers/model_doc/t5), [OPT](https://huggingface.co/docs/transformers/model_doc/opt)) and speech processing (such as [Whisper](https://huggingface.co/docs/transformers/model_doc/whisper)). XLA can be used during training as well as inference. This document focuses on the inference part.
+Several TensorFlow methods in 🤗 Transformers have been rewritten to be XLA-compatible, including text generation for models such as [GPT2](https://huggingface.co/docs/transformers/model_doc/gpt2), [T5](https://huggingface.co/docs/transformers/model_doc/t5) and [OPT](https://huggingface.co/docs/transformers/model_doc/opt), as well as speech processing for models such as [Whisper](https://huggingface.co/docs/transformers/model_doc/whisper).
 While the exact amount of speed-up is very much model-dependent, for TensorFlow text generation models inside 🤗 Transformers, we noticed a speed-up of ~100x. This document will explain how you can use XLA for these models to get the maximum amount of performance. We'll also provide links to additional resources if you're interested in learning more about the benchmarks and our design philosophy behind the XLA integration.
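
For readers landing on this commit, here is a minimal sketch of the two entry points the new text describes, assuming TensorFlow 2.x with Keras (where both `model.compile()` and `tf.function` accept `jit_compile`); the toy model and data are placeholders:

```python
import tensorflow as tf

# A toy Keras model; passing jit_compile=True to compile() asks Keras
# to run fit()/predict() through XLA.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse", jit_compile=True)

# XLA is not limited to Keras methods: any graph-creating function can
# be compiled by passing jit_compile=True to tf.function.
@tf.function(jit_compile=True)
def forward(x):
    return model(x)

print(forward(tf.random.normal((4, 8))))
```

Note that the first call traces and compiles the function; later calls with the same input shapes reuse the compiled program, which is where the speed-up comes from.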