gaoqiong / flash-attention

Commit d1a3b52f, authored Jul 17, 2023 by Tri Dao

Add instruction about limiting number of ninja jobs

parent b4cc152e
Showing 1 changed file with 8 additions and 0 deletions

README.md (+8, -0) @ d1a3b52f
...
@@ -54,6 +54,14 @@ Alternatively you can compile from source:
python setup.py install
```
If your machine has less than 96GB of RAM and lots of CPU cores, `ninja` might
run too many parallel compilation jobs that could exhaust the available RAM. To
limit the number of parallel compilation jobs, you can set the environment
variable `MAX_JOBS`:
```
MAX_JOBS=4 pip install flash-attn --no-build-isolation
```
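As a side note (an illustration, not part of this commit): if you want `MAX_JOBS` to track the machine you are building on, you can derive it from available memory instead of hardcoding it. The sketch below assumes a Linux system with `free` and a rough budget of 8GB of RAM per parallel compile job; both the command and the per-job figure are assumptions to adjust for your setup.
```
# Illustrative sketch (not from the flash-attn docs): size MAX_JOBS by
# available RAM, assuming ~8GB peak per parallel compile job (a guess).
AVAIL_GB=$(free -g | awk '/^Mem:/ {print $7}')  # "available" column, in GB
JOBS=$(( AVAIL_GB / 8 ))                        # budget ~8GB per job
MAX_JOBS=$(( JOBS > 0 ? JOBS : 1 )) pip install flash-attn --no-build-isolation
```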
Interface: `src/flash_attention_interface.py`

FlashAttention-2 currently supports:
...