"docs/source/vscode:/vscode.git/clone" did not exist on "cf4aa3597f2fd333a71bc3c837f2a2ce15c81734"
Unverified commit 00ba7cad, authored by Matt and committed by GitHub

Rewrite a couple of lines in the TF XLA doc (#21177)

* Rewrite a couple of lines in the TF XLA doc to explain that jit_compile can be used in model.compile() too

* Remove extra )
parent c59d71b2
@@ -18,9 +18,9 @@ Accelerated Linear Algebra, dubbed XLA, is a compiler for accelerating the runti
 XLA (Accelerated Linear Algebra) is a domain-specific compiler for linear algebra that can accelerate TensorFlow models with potentially no source code changes.
-Using XLA in TensorFlow is simple - it comes packaged inside the `tensorflow` library, and it can be triggered with the `jit_compile` argument in any graph-creating function such as [`tf.function`](https://www.tensorflow.org/guide/intro_to_graphs).
+Using XLA in TensorFlow is simple - it comes packaged inside the `tensorflow` library, and it can be triggered with the `jit_compile` argument in any graph-creating function such as [`tf.function`](https://www.tensorflow.org/guide/intro_to_graphs). When using Keras methods like `fit()` and `predict()`, you can enable XLA simply by passing the `jit_compile` argument to `model.compile()`. However, XLA is not limited to these methods - it can also be used to accelerate any arbitrary `tf.function`.
-🤗 Transformers supports XLA-compatible TensorFlow (TF) models for text generation (such as [GPT2](https://huggingface.co/docs/transformers/model_doc/gpt2), [T5](https://huggingface.co/docs/transformers/model_doc/t5), [OPT](https://huggingface.co/docs/transformers/model_doc/opt)) and speech processing (such as [Whisper](https://huggingface.co/docs/transformers/model_doc/whisper)). XLA can be used during training as well as inference. This document focuses on the inference part.
+Several TensorFlow methods in 🤗 Transformers have been rewritten to be XLA-compatible, including text generation for models such as [GPT2](https://huggingface.co/docs/transformers/model_doc/gpt2), [T5](https://huggingface.co/docs/transformers/model_doc/t5) and [OPT](https://huggingface.co/docs/transformers/model_doc/opt), as well as speech processing for models such as [Whisper](https://huggingface.co/docs/transformers/model_doc/whisper).
 While the exact amount of speed-up is very much model-dependent, for TensorFlow text generation models inside 🤗 Transformers, we noticed a speed-up of ~100x. This document will explain how you can use XLA for these models to get the maximum amount of performance. We'll also provide links to additional resources if you're interested in learning more about the benchmarks and our design philosophy behind the XLA integration.
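
For readers landing on this commit, here is a minimal sketch of the two entry points the new text describes, assuming TensorFlow 2.x with Keras (where both `model.compile()` and `tf.function` accept `jit_compile`); the toy model and data are placeholders:

```python
import tensorflow as tf

# A toy Keras model; passing jit_compile=True to compile() asks Keras
# to run fit()/predict() through XLA.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse", jit_compile=True)

# XLA is not limited to Keras methods: any graph-creating function can
# be compiled by passing jit_compile=True to tf.function.
@tf.function(jit_compile=True)
def forward(x):
    return model(x)

print(forward(tf.random.normal((4, 8))))
```

Note that the first call traces and compiles the function; later calls with the same input shapes reuse the compiled program, which is where the speed-up comes from.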