"lib/llm/tests/vscode:/vscode.git/clone" did not exist on "ecf53ce2b38971dc9b5dde6f70d74cfc5b870c35"
Unverified commit 070df689 authored by Hongxin Liu, committed by GitHub

[devops] fix extension building (#5427)

parent 822241a9
 {
   "build": [
     {
-      "torch_command": "pip install torch==1.12.1+cu102 torchvision==0.13.1+cu102 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu102",
-      "cuda_image": "hpcaitech/cuda-conda:10.2"
+      "torch_command": "pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121",
+      "cuda_image": "hpcaitech/cuda-conda:12.1"
     },
     {
-      "torch_command": "pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113",
-      "cuda_image": "hpcaitech/cuda-conda:11.3"
+      "torch_command": "pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118",
+      "cuda_image": "hpcaitech/cuda-conda:11.8"
     },
     {
-      "torch_command": "pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116",
-      "cuda_image": "hpcaitech/cuda-conda:11.6"
+      "torch_command": "pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1",
+      "cuda_image": "hpcaitech/cuda-conda:11.7"
     }
   ]
 }
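The updated build matrix above can be sketched being consumed by CI like this; the `build_commands` helper and the docker step are illustrative assumptions, not part of the commit — only the matrix values and the `BUILD_EXT=1 pip install -v .` step come from the diffs here:

```python
import json

# Build matrix as updated by this commit (values copied from the diff above).
MATRIX = json.loads("""
{
  "build": [
    {"torch_command": "pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121",
     "cuda_image": "hpcaitech/cuda-conda:12.1"},
    {"torch_command": "pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118",
     "cuda_image": "hpcaitech/cuda-conda:11.8"},
    {"torch_command": "pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1",
     "cuda_image": "hpcaitech/cuda-conda:11.7"}
  ]
}
""")

def build_commands(matrix):
    """Expand each matrix entry into the shell steps a CI job might run.
    The docker pull step is a hypothetical illustration."""
    return [
        [
            f"docker pull {entry['cuda_image']}",  # hypothetical container step
            entry["torch_command"],                # pin the torch stack
            "BUILD_EXT=1 pip install -v .",        # AOT-build the extensions
        ]
        for entry in matrix["build"]
    ]
```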
@@ -83,7 +83,7 @@ jobs:
       fi
     - name: Install Colossal-AI
       run: |
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
         pip install -r requirements/requirements-test.txt
     - name: Unit Testing
       run: |
...
@@ -78,7 +78,7 @@ jobs:
     - name: Install Colossal-AI
       run: |
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
         pip install -r requirements/requirements-test.txt
     - name: Unit Testing
       run: |
...
@@ -75,7 +75,7 @@ jobs:
     - name: Install Colossal-AI
       run: |
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
         pip install -r requirements/requirements-test.txt
     - name: Unit Testing
...
@@ -51,4 +51,4 @@ jobs:
     - name: Build
       run: |
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
@@ -89,7 +89,7 @@ jobs:
     - name: Install ColossalAI
       run: |
         source activate pytorch
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
     - name: Test the Doc
       run: |
...
@@ -32,7 +32,7 @@ jobs:
     - name: Install ColossalAI
       run: |
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
     - name: Install Doc Test Requirements
       run: |
...
@@ -53,7 +53,7 @@ jobs:
       uses: actions/checkout@v3
     - name: Install Colossal-AI
       run: |
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
     - name: Test the example
       run: |
         dir=${{ matrix.directory }}
...
@@ -88,7 +88,7 @@ jobs:
     - name: Install Colossal-AI
       run: |
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
     - name: Test the example
       run: |
...
@@ -42,7 +42,7 @@ jobs:
     - name: Install Colossal-AI
       run: |
-        CUDA_EXT=1 pip install -v .
+        BUILD_EXT=1 pip install -v .
     - name: Traverse all files
       run: |
...
@@ -76,7 +76,7 @@ def check_installation():
     click.echo("")
     click.echo(f"Note:")
     click.echo(
-        f"1. AOT (ahead-of-time) compilation of the CUDA kernels occurs during installation when the environment variable CUDA_EXT=1 is set"
+        f"1. AOT (ahead-of-time) compilation of the CUDA kernels occurs during installation when the environment variable BUILD_EXT=1 is set"
     )
     click.echo(f"2. If AOT compilation is not enabled, stay calm as the CUDA kernels can still be built during runtime")
...
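The note printed by `check_installation` above describes an env-var gate on AOT compilation. A minimal sketch of that gating pattern follows; it illustrates the convention only and is not ColossalAI's actual `setup.py` logic, and the accepted truthy spellings are an assumption:

```python
import os

def should_build_ext(env=None):
    """Return True when AOT compilation of the CUDA kernels should happen at
    install time; otherwise kernels are JIT-built at runtime on first use.
    (Sketch of the BUILD_EXT convention, not the real install script.)"""
    env = os.environ if env is None else env
    return str(env.get("BUILD_EXT", "0")).lower() in ("1", "true", "yes")
```

With `BUILD_EXT=1 pip install -v .` the gate is on; a plain `pip install .` leaves it off and defers kernel builds to runtime.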
@@ -25,7 +25,7 @@ conda install -c conda-forge cupy cudnn cutensor nccl cuda-version=11.6
 # install colossalai with PyTorch extensions
 cd <path_to_ColossalAI_repo>
-CUDA_EXT=1 pip install -e .
+BUILD_EXT=1 pip install -e .
 # install other dependencies
 pip install triton==2.0.0.dev20221202
...
...@@ -25,7 +25,7 @@ conda install -c "nvidia/label/cuda-11.6.2" cuda-toolkit ...@@ -25,7 +25,7 @@ conda install -c "nvidia/label/cuda-11.6.2" cuda-toolkit
cd <path_to_ColossalAI_repo> cd <path_to_ColossalAI_repo>
pip install -r requirements/requirements.txt pip install -r requirements/requirements.txt
pip install -r requirements/requirements-test.txt pip install -r requirements/requirements-test.txt
CUDA_EXT=1 pip install -e . BUILD_EXT=1 pip install -e .
# install torchserve # install torchserve
cd <path_to_torch_serve_repo> cd <path_to_torch_serve_repo>
......
@@ -38,7 +38,7 @@ ARG VERSION=main
 RUN git clone -b ${VERSION} https://github.com/hpcaitech/ColossalAI.git && \
     cd ./ColossalAI && \
     git checkout 3e05c07bb8921f2a8f9736b6f6673d4e9f1697d0 && \
-    CUDA_EXT=1 pip install -v --no-cache-dir .
+    BUILD_EXT=1 pip install -v --no-cache-dir .
 # install titans
 RUN pip install --no-cache-dir titans
...
@@ -78,7 +78,7 @@ class CPUAdam(NVMeOptimizer):
         super(CPUAdam, self).__init__(model_params, default_args, nvme_offload_fraction, nvme_offload_dir)
         self.adamw_mode = adamw_mode
         cpu_adam = CPUAdamLoader().load()
-        # if you find yourself stuck here, make sure that you install colossalai with CUDA_EXT=1 specification
+        # if you find yourself stuck here, make sure that you install colossalai with BUILD_EXT=1 specification
         self.cpu_adam_op = cpu_adam.CPUAdamOptimizer(lr, betas[0], betas[1], eps, weight_decay, adamw_mode)

     def torch_adam_update(
...
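For context, `torch_adam_update` above is the plain-PyTorch fallback used when the fused `CPUAdamOptimizer` kernel has not been built. Below is a scalar reference of the standard Adam/AdamW update it implements, written from the textbook update rule rather than from the ColossalAI source, so treat the exact argument order and defaults as assumptions:

```python
import math

def adam_update(param, grad, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
                eps=1e-8, weight_decay=0.0, adamw_mode=True):
    """One scalar Adam/AdamW step (reference math only; the fused
    CPUAdamOptimizer kernel applies this over whole tensors in C++)."""
    if adamw_mode:
        param *= 1 - lr * weight_decay   # decoupled weight decay (AdamW)
    else:
        grad += weight_decay * param     # L2-style decay folded into the gradient
    m = beta1 * m + (1 - beta1) * grad           # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad * grad    # second-moment EMA
    m_hat = m / (1 - beta1 ** step)              # bias correction
    v_hat = v / (1 - beta2 ** step)
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```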
@@ -37,7 +37,7 @@ RUN git clone https://github.com/NVIDIA/apex && \
 ARG VERSION=main
 RUN git clone -b ${VERSION} https://github.com/hpcaitech/ColossalAI.git \
     && cd ./ColossalAI \
-    && CUDA_EXT=1 pip install -v --no-cache-dir .
+    && BUILD_EXT=1 pip install -v --no-cache-dir .
 # install titans
 RUN pip install --no-cache-dir titans
...
@@ -146,8 +146,8 @@ Colossal-AI provides a series of parallel components. Our goal is to make your
 [[HuggingFace model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-13b-base)
 [[Modelscope model weights]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-13b-base/summary)
 | Model | Backbone | Tokens Consumed | MMLU (5-shot) | CMMLU (5-shot) | AGIEval (5-shot) | GAOKAO (0-shot) | CEval (5-shot) |
 |:------------------------------:|:----------:|:---------------:|:-------------:|:--------------:|:----------------:|:---------------:|:--------------:|
 | Baichuan-7B | - | 1.2T | 42.32 (42.30) | 44.53 (44.02) | 38.72 | 36.74 | 42.80 |
 | Baichuan-13B-Base | - | 1.4T | 50.51 (51.60) | 55.73 (55.30) | 47.20 | 51.41 | 53.60 |
 | Baichuan2-7B-Base | - | 2.6T | 46.97 (54.16) | 57.67 (57.07) | 45.76 | 52.60 | 54.00 |
@@ -406,10 +406,10 @@ pip install colossalai
 **Note: only Linux is supported for now.**

-However, if you want to build the PyTorch extensions during installation, you can set the environment variable `CUDA_EXT=1`:
+However, if you want to build the PyTorch extensions during installation, you can set the environment variable `BUILD_EXT=1`:
 ```bash
-CUDA_EXT=1 pip install colossalai
+BUILD_EXT=1 pip install colossalai
 ```
 **Otherwise, the PyTorch extensions will only be built at runtime, when you actually need them.**
...
@@ -438,7 +438,7 @@ pip install .
 By default, we do not install the PyTorch extensions during `pip install`; they are compiled just-in-time at runtime. If you want to pre-install these extensions (required when using fused optimizers), use the following command:
 ```shell
-CUDA_EXT=1 pip install .
+BUILD_EXT=1 pip install .
 ```
 <p align="right">(<a href="#top">back to top</a>)</p>
...
@@ -42,7 +42,7 @@ pip install -r requirements/requirements.txt
 BUILD_EXT=1 pip install .
 ```
-If you don't want to install and enable CUDA kernel fusion (compulsory installation when using fused optimizer), just don't specify the `CUDA_EXT`:
+If you don't want to install and enable CUDA kernel fusion (compulsory installation when using fused optimizer), just don't specify the `BUILD_EXT`:
 ```shell
 pip install .
...
@@ -77,7 +77,7 @@ git clone https://github.com/hpcaitech/ColossalAI.git
 cd ColossalAI
 # install colossalai
-CUDA_EXT=1 pip install .
+BUILD_EXT=1 pip install .
 ```
 #### Step 3: Accelerate with flash attention by xformers (Optional)
...
@@ -8,7 +8,7 @@ conda activate ldm
 conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
 pip install transformers diffusers invisible-watermark
-CUDA_EXT=1 pip install colossalai
+BUILD_EXT=1 pip install colossalai
 wget https://huggingface.co/stabilityai/stable-diffusion-2-base/resolve/main/512-base-ema.ckpt
...