Commit 5900e997 authored by zhanggzh

add Uni-Core src code and change setup.py

parent 44f6386f
# <div align="center"><strong>Uni-Core</strong></div>
## Introduction
Uni-Core is built for rapidly creating high-performance PyTorch models, especially Transformer-based models. See README_ORIGIN.md for details.
## Installation
Install by building from source. This method requires the torch and fastpt packages. Note that when building from source with the fastpt package, the version numbers of fastpt, torch, and dtk must match exactly: for example, when building against dtk2504, both the fastpt and torch packages must be dtk2504 builds. The fastpt-to-torch version mapping is:
| | fastpt version | torch version | DTK version |
| - | ------------------ | ------- | -------- |
| 1 | 2.0.1+das.dtk2504 | v2.4.1 | dtk2504 |
| 2 | 2.1.0+das.dtk2504 | v2.5.1 | dtk2504 |
| 3 | 2.0.1+das.dtk25041 | v2.4.1 | dtk25041 |
| 4 | 2.1.0+das.dtk25041 | v2.5.1 | dtk25041 |
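Before building, it can help to confirm that the installed versions actually match one row of this table. A minimal check (assuming the installed distribution names are `torch` and `fastpt`):
```
# check_versions.py - print installed torch/fastpt versions (assumed names)
from importlib.metadata import version, PackageNotFoundError

for pkg in ("torch", "fastpt"):
    try:
        # e.g. torch "2.4.1", fastpt "2.0.1+das.dtk2504" for the dtk2504 row
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "is not installed")
```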
### Build steps
```
pip3 install wandb
pip3 install -r requirements.txt
pip3 install fastpt-2.0.1+das.dtk2504-py3-none-any.whl # example: torch 2.4.1 with dtk2504
git clone https://developer.sourcefind.cn/codes/OpenDAS/Uni-Core.git
cd Uni-Core
git checkout v0.0.1-fastpt # switch to the matching branch
source /usr/local/bin/fastpt -c
python3 setup.py bdist_wheel
```
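`python3 setup.py bdist_wheel` writes the wheel to `dist/`; install it afterwards with `pip3 install dist/unicore-*.whl` (the exact filename depends on your Python and platform tags).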
## Verify the installation
```
pip3 list | grep unicore
python3
>>> import unicore
>>> unicore.__version__ # prints the version number
```
## Tests
```
source /usr/local/bin/fastpt -e
cd tests
pytest -vs
```
Uni-Core, an efficient distributed PyTorch framework
====================================================

Uni-Core is built for rapidly creating PyTorch models with high performance, especially for Transformer-based models. It supports the following features:
- Distributed training over multi-GPUs and multi-nodes
- Mixed-precision training with fp16 and bf16
- High-performance fused CUDA kernels
- Model checkpoint management
- Friendly logging
- Buffered (GPU-CPU overlapping) data loader
- Gradient accumulation (see the sketch after this list)
- Commonly used optimizers and LR schedulers
- Easy to create new models
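As a generic illustration of the gradient-accumulation feature listed above (plain PyTorch, not the Uni-Core API):
```
# Generic gradient-accumulation sketch: step the optimizer every
# `accum_steps` micro-batches to emulate a larger effective batch size.
import torch

model = torch.nn.Linear(16, 4)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
accum_steps = 4  # effective batch size = accum_steps * micro-batch size

for step in range(100):
    x, y = torch.randn(8, 16), torch.randn(8, 4)
    loss = torch.nn.functional.mse_loss(model(x), y)
    (loss / accum_steps).backward()  # scale so accumulated grads average
    if (step + 1) % accum_steps == 0:
        opt.step()
        opt.zero_grad()
```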
Installation
------------
**Build from source**
You can use `python setup.py install` or `pip install .` to build Uni-Core from source. The CUDA version in the build environment should be the same as the one in PyTorch.
You can also use `python setup.py install --disable-cuda-ext` to disable the CUDA extension operators when CUDA is not available.
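The setup.py diff at the end of this commit references a `DISABLE_CUDA_EXTENSION` variable; a hedged sketch of how such a custom flag is commonly wired up in a setup.py (Uni-Core's actual implementation may differ):
```
# Sketch: strip a custom flag from sys.argv before setuptools sees it.
import sys

DISABLE_CUDA_EXTENSION = False
if "--disable-cuda-ext" in sys.argv:
    sys.argv.remove("--disable-cuda-ext")  # hide the flag from setuptools
    DISABLE_CUDA_EXTENSION = True
```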
**Use pre-compiled Python wheels**
We also pre-compile wheels via GitHub Actions. You can download them from the [Release](https://github.com/dptech-corp/Uni-Core/releases) page. Check that the Python version, PyTorch version, and CUDA version match your environment. For example, for PyTorch 1.12.1, Python 3.7, and CUDA 11.3, you can install [unicore-0.0.1+cu113torch1.12.1-cp37-cp37m-linux_x86_64.whl](https://github.com/dptech-corp/Uni-Core/releases/download/0.0.1/unicore-0.0.1+cu113torch1.12.1-cp37-cp37m-linux_x86_64.whl).
**Docker image**
We also provide a Docker image. You can pull it with `docker pull dptechnology/unicore:0.0.1-pytorch1.11.0-cuda11.3`. To use GPUs within Docker, you need to [install nvidia-docker-2](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker) first.
Example
-------
To build a model, you can refer to [example/bert](https://github.com/dptech-corp/Uni-Core/tree/main/examples/bert).
Related projects
----------------
- [Uni-Mol](https://github.com/dptech-corp/Uni-Mol)
- [Uni-Fold](https://github.com/dptech-corp/Uni-Fold)
Acknowledgement
---------------
The main framework is from [facebookresearch/fairseq](https://github.com/facebookresearch/fairseq).
The fused kernels are from [guolinke/fused_ops](https://github.com/guolinke/fused_ops).
Dockerfile is from [guolinke/pytorch-docker](https://github.com/guolinke/pytorch-docker).
License
-------
This project is licensed under the terms of the MIT license. See [LICENSE](https://github.com/dptech-corp/Uni-Core/blob/main/LICENSE) for additional details.
```
@@ -128,8 +128,8 @@ if not DISABLE_CUDA_EXTENSION:
 include_dirs=[os.path.join(this_dir, 'csrc')],
 extra_compile_args={'cxx': ['-O3',] + generator_flag,
 'nvcc':['-O3', '--use_fast_math',
-'-gencode', 'arch=compute_70,code=sm_70',
-'-gencode', 'arch=compute_80,code=sm_80',
+'-gencode=arch=compute_70,code=sm_70',
+'-gencode=arch=compute_80,code=sm_80',
 '-U__CUDA_NO_HALF_OPERATORS__',
 '-U__CUDA_NO_BFLOAT16_OPERATORS__',
 '-U__CUDA_NO_HALF_CONVERSIONS__',
@@ -144,8 +144,8 @@ if not DISABLE_CUDA_EXTENSION:
 include_dirs=[os.path.join(this_dir, 'csrc')],
 extra_compile_args={'cxx': ['-O3'],
 'nvcc':['-O3', '--use_fast_math',
-'-gencode', 'arch=compute_70,code=sm_70',
-'-gencode', 'arch=compute_80,code=sm_80',
+'-gencode=arch=compute_70,code=sm_70',
+'-gencode=arch=compute_80,code=sm_80',
 '-U__CUDA_NO_HALF_OPERATORS__',
 '-U__CUDA_NO_BFLOAT16_OPERATORS__',
 '-U__CUDA_NO_HALF_CONVERSIONS__',
@@ -170,8 +170,8 @@ if not DISABLE_CUDA_EXTENSION:
 include_dirs=[os.path.join(this_dir, 'csrc')],
 extra_compile_args={'cxx': ['-O3',] + generator_flag,
 'nvcc':['-O3', '--use_fast_math',
-'-gencode', 'arch=compute_70,code=sm_70',
-'-gencode', 'arch=compute_80,code=sm_80',
+'-gencode=arch=compute_70,code=sm_70',
+'-gencode=arch=compute_80,code=sm_80',
 '-U__CUDA_NO_HALF_OPERATORS__',
 '-U__CUDA_NO_BFLOAT16_OPERATORS__',
 '-U__CUDA_NO_HALF_CONVERSIONS__',
@@ -187,8 +187,8 @@ if not DISABLE_CUDA_EXTENSION:
 include_dirs=[os.path.join(this_dir, 'csrc')],
 extra_compile_args={'cxx': ['-O3',] + generator_flag,
 'nvcc':['-O3', '--use_fast_math',
-'-gencode', 'arch=compute_70,code=sm_70',
-'-gencode', 'arch=compute_80,code=sm_80',
+'-gencode=arch=compute_70,code=sm_70',
+'-gencode=arch=compute_80,code=sm_80',
 '-U__CUDA_NO_HALF_OPERATORS__',
 '-U__CUDA_NO_BFLOAT16_OPERATORS__',
 '-U__CUDA_NO_HALF_CONVERSIONS__',
@@ -204,8 +204,8 @@ if not DISABLE_CUDA_EXTENSION:
 include_dirs=[os.path.join(this_dir, 'csrc')],
 extra_compile_args={'cxx': ['-O3',] + generator_flag,
 'nvcc':['-O3', '--use_fast_math', '-maxrregcount=50',
-'-gencode', 'arch=compute_70,code=sm_70',
-'-gencode', 'arch=compute_80,code=sm_80',
+'-gencode=arch=compute_70,code=sm_70',
+'-gencode=arch=compute_80,code=sm_80',
 '-U__CUDA_NO_HALF_OPERATORS__',
 '-U__CUDA_NO_BFLOAT16_OPERATORS__',
 '-U__CUDA_NO_HALF_CONVERSIONS__',
```
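Each hunk makes the same substitution: the two-token `'-gencode', 'arch=...'` spelling becomes the single-token `'-gencode=arch=...'` spelling. A hedged sketch of what one extension definition looks like after the change (the extension name and source file below are hypothetical; `torch.utils.cpp_extension.CUDAExtension` is the standard API such setup scripts use):
```
# Sketch of one post-change extension definition (name/sources hypothetical).
import os
from torch.utils.cpp_extension import CUDAExtension

this_dir = os.path.dirname(os.path.abspath(__file__))
generator_flag = []  # placeholder; the real setup.py computes this

ext = CUDAExtension(
    name="unicore_fused_example",        # hypothetical extension name
    sources=["csrc/example_kernel.cu"],  # hypothetical source file
    include_dirs=[os.path.join(this_dir, 'csrc')],
    extra_compile_args={
        'cxx': ['-O3'] + generator_flag,
        'nvcc': ['-O3', '--use_fast_math',
                 # single-token form introduced by this commit:
                 '-gencode=arch=compute_70,code=sm_70',
                 '-gencode=arch=compute_80,code=sm_80',
                 '-U__CUDA_NO_HALF_OPERATORS__',
                 '-U__CUDA_NO_BFLOAT16_OPERATORS__',
                 '-U__CUDA_NO_HALF_CONVERSIONS__'],
    },
)
```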