Commit 25387b24 authored by Tri Dao

Mention AITemplate Stable Diffusion in usage.md

parent 2e33fc8e
@@ -64,6 +64,11 @@ yields the fastest BERT training on cloud instances in MLPerf training 2.0 (June
of Stable Diffusion: with FlashAttention as one of its components, it speeds up
pretraining by up to 6.5x, and reduces the hardware cost of fine-tuning by 7x.
- Meta's
[AITemplate](https://ai.facebook.com/blog/gpu-inference-engine-nvidia-amd-open-source/)
with FlashAttention as one of its components, is currently the [fastest](https://twitter.com/bing_xu_/status/1590447334055632897) Stable
Diffusion inference engine that we know of.
- Stable Diffusion inference from
[Labml.ai](https://twitter.com/labmlai/status/1573634095732490240): 50% speedup.
@@ -84,8 +89,10 @@ yields the fastest BERT training on cloud instances in MLPerf training 2.0 (June
language and compiler for parallel programming.
- [xformers](https://github.com/facebookresearch/xformers): The xformers team
has implemented [memory-efficient
attention](https://twitter.com/fvsmassa/status/1580229170629849089) in a
similar spirit to FlashAttention.
xformers dynamically dispatches to whichever implementation is available / faster (see the sketch after this list).
- [Jax](https://github.com/google/jax): an [implementation](https://github.com/lucidrains/flash-attention-jax)
in Jax by [lucidrains](https://github.com/lucidrains/).
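
A minimal sketch (not part of this commit) of how xformers' dispatching attention op is typically called; the tensor shapes and dtype below are illustrative assumptions, and xformers picks the fastest available backend (which can be a FlashAttention-based kernel) for the given inputs:

```python
# Minimal sketch, assuming xformers is installed with GPU support.
# Shapes and dtype are illustrative, not prescribed by usage.md.
import torch
import xformers.ops as xops

# (batch, seq_len, num_heads, head_dim), fp16 on GPU so a fused kernel can be used
q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

# xformers dispatches this call to whichever backend is available / fastest
# for these inputs (e.g. a FlashAttention-based kernel or its CUTLASS kernel).
out = xops.memory_efficient_attention(q, k, v)
print(out.shape)  # torch.Size([2, 1024, 8, 64])
```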