gaoqiong / flash-attention

Commit e2e4333c
Authored May 26, 2024 by Tri Dao
Parent: ce735035

    Limit to MAX_JOBS=1 with CUDA 12.2
Showing 3 changed files with 5 additions and 4 deletions:

    .github/workflows/publish.yml   +2 -1
    flash_attn/__init__.py          +1 -1
    training/Dockerfile             +2 -2
.github/workflows/publish.yml

@@ -168,7 +168,8 @@ jobs:
       export PATH=/usr/local/nvidia/bin:/usr/local/nvidia/lib64:$PATH
       export LD_LIBRARY_PATH=/usr/local/nvidia/lib64:/usr/local/cuda/lib64:$LD_LIBRARY_PATH
       # Limit MAX_JOBS otherwise the github runner goes OOM
-      MAX_JOBS=2 FLASH_ATTENTION_FORCE_BUILD="TRUE" FLASH_ATTENTION_FORCE_CXX11_ABI=${{ matrix.cxx11_abi}} python setup.py bdist_wheel --dist-dir=dist
+      # CUDA 11.8 can compile with 2 jobs, but CUDA 12.2 goes OOM
+      MAX_JOBS=$([ "$MATRIX_CUDA_VERSION" == "122" ] && echo 1 || echo 2) FLASH_ATTENTION_FORCE_BUILD="TRUE" FLASH_ATTENTION_FORCE_CXX11_ABI=${{ matrix.cxx11_abi}} python setup.py bdist_wheel --dist-dir=dist
       tmpname=cu${MATRIX_CUDA_VERSION}torch${MATRIX_TORCH_VERSION}cxx11abi${{ matrix.cxx11_abi }}
       wheel_name=$(ls dist/*whl | xargs -n 1 basename | sed "s/-/+$tmpname-/2")
       ls dist/*whl |xargs -I {} mv {} dist/${wheel_name}
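The changed line packs the CUDA-version check into a command substitution. A minimal standalone sketch of the same selection logic, assuming an example value for MATRIX_CUDA_VERSION (in CI it is set by the build matrix):

# Sketch: reproduce the MAX_JOBS selection from the workflow.
# "122" is an assumed example value meaning CUDA 12.2.
MATRIX_CUDA_VERSION="122"

# [ ... ] && echo 1 || echo 2 prints 1 when the test succeeds, else 2;
# $( ... ) captures that output as the value of MAX_JOBS.
MAX_JOBS=$([ "$MATRIX_CUDA_VERSION" == "122" ] && echo 1 || echo 2)
echo "MAX_JOBS=$MAX_JOBS"   # -> MAX_JOBS=1 for CUDA 12.2, MAX_JOBS=2 otherwise

The unchanged sed invocation below it splices the local version tag into the wheel filename by replacing the second hyphen. A hedged example with a hypothetical wheel name (the real one depends on the Python and platform tags):

# Hypothetical wheel name and tag, for illustration only.
tmpname=cu122torch2.3cxx11abiFALSE
echo "flash_attn-2.5.9.post1-cp310-cp310-linux_x86_64.whl" \
  | sed "s/-/+$tmpname-/2"
# -> flash_attn-2.5.9.post1+cu122torch2.3cxx11abiFALSE-cp310-cp310-linux_x86_64.whl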
flash_attn/__init__.py

-__version__ = "2.5.9"
+__version__ = "2.5.9.post1"
 from flash_attn.flash_attn_interface import (
     flash_attn_func,
...
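The .post1 suffix is a PEP 440 post-release, which sorts strictly after the base 2.5.9, so installers treat it as an upgrade. A quick check, assuming the packaging library is available:

python -c "from packaging.version import Version; print(Version('2.5.9.post1') > Version('2.5.9'))"
# -> True: the post-release is considered newer than the base release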
training/Dockerfile

@@ -85,7 +85,7 @@ RUN pip install transformers==4.25.1 datasets==2.8.0 pytorch-lightning==1.8.6 tr
 RUN pip install git+https://github.com/mlcommons/logging.git@2.1.0
 # Install FlashAttention
-RUN pip install flash-attn==2.5.9
+RUN pip install flash-attn==2.5.9.post1
 # Install CUDA extensions for fused dense
-RUN pip install git+https://github.com/HazyResearch/flash-attention@v2.5.9#subdirectory=csrc/fused_dense_lib
+RUN pip install git+https://github.com/HazyResearch/flash-attention@v2.5.9.post1#subdirectory=csrc/fused_dense_lib
...
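After rebuilding the image, the bumped version can be confirmed inside the container. A minimal sketch, with a hypothetical image tag (use whatever name the Dockerfile was built under):

# "flash-attn-training" is an assumed image tag, not from the repo.
docker run --rm flash-attn-training \
  python -c "import flash_attn; print(flash_attn.__version__)"
# -> 2.5.9.post1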