Commit 9989eaf1 authored by sangwz

Update code and info for the DCU environment.

parent ec396a79
Pipeline #1069 failed with stages in 0 seconds
# Uni-Core
Uni-Core is built for rapidly creating high-performance PyTorch models, especially Transformer-based models. For details, see [README_ORIGIN.md](README_ORIGIN.md).

# Installation
Supported components:
* Python >= 3.7

## Installing with pip
Download the Uni-Core package from http://10.6.10.68:8000/customized/ and pick the wheel that matches your torch and Python versions:
```bash
pip install unicore*.whl
```

## Building from source
Make sure torch is already installed in your environment, then install the fastpt tool. Download the package matching your Python version from [Index of /debug/fastpt/](http://10.6.10.68:8000/debug/fastpt/) and run:
```bash
pip install fastpt*.whl
```
Download the Uni-Core source code and install it:
```bash
# After cloning, remember to switch to the develop branch (or another target branch)
git clone http://developer.hpccube.com/codes/OpenDAS/Uni-Core.git
cd Uni-Core
python setup.py install
```

# Verification
Run the following command to print the package version and confirm that the installation succeeded:
`python -c "import unicore;print(unicore.__version__)"`
Uni-Core, an efficient distributed PyTorch framework
====================================================
Uni-Core is built for rapidly creating PyTorch models with high performance, especially for Transformer-based models. It supports the following features:
- Distributed training over multi-GPUs and multi-nodes
- Mixed-precision training with fp16 and bf16
- High-performance fused CUDA kernels
- Model checkpoint management
- Friendly logging
- Buffered (GPU-CPU overlapping) data loader
- Gradient accumulation
- Commonly used optimizers and LR schedulers
- Easy to create new models
Installation
------------
**Build from source**
You can use `python setup.py install` or `pip install .` to build Uni-Core from source. The CUDA version in the build environment should be the same as the one in PyTorch.
You can also use `python setup.py install --disable-cuda-ext` to disable the CUDA extension operator when CUDA is not available.
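Internally, a custom flag like this has to be stripped from `sys.argv` before `setup()` runs, or setuptools would reject it. A minimal sketch of that pattern (`pop_flag` is a hypothetical helper, not Uni-Core's actual code):

```python
def pop_flag(argv, flag):
    """Remove a custom flag from argv, returning (filtered_argv, was_present)."""
    filtered = [a for a in argv if a != flag]
    return filtered, len(filtered) != len(argv)

argv, disable_cuda_ext = pop_flag(
    ["setup.py", "install", "--disable-cuda-ext"], "--disable-cuda-ext")
print(argv, disable_cuda_ext)  # → ['setup.py', 'install'] True
```

The filtered list would then be assigned back to `sys.argv`, and the boolean used to decide whether to register the CUDA extension modules.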
**Use pre-compiled python wheels**
We also provide pre-compiled wheels built by GitHub Actions. You can download them from the [Release](https://github.com/dptech-corp/Uni-Core/releases) page. Make sure to match the Python version, PyTorch version, and CUDA version. For example, for PyTorch 1.12.1, Python 3.7, and CUDA 11.3, you can install [unicore-0.0.1+cu113torch1.12.1-cp37-cp37m-linux_x86_64.whl](https://github.com/dptech-corp/Uni-Core/releases/download/0.0.1/unicore-0.0.1+cu113torch1.12.1-cp37-cp37m-linux_x86_64.whl).
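The wheel filename encodes all three versions following the standard wheel naming scheme. As an illustration (the `wheel_name` helper below is hypothetical, not part of Uni-Core), the naming can be reproduced like this:

```python
def wheel_name(version, cuda, torch_version, python_tag, abi_tag):
    """Reconstruct a Uni-Core wheel filename from its version components.

    Follows PEP 427 naming: {dist}-{version}+{local}-{python}-{abi}-{platform}.whl
    """
    return (
        f"unicore-{version}+{cuda}torch{torch_version}"
        f"-{python_tag}-{abi_tag}-linux_x86_64.whl"
    )

# The example from the text: PyTorch 1.12.1, Python 3.7, CUDA 11.3.
print(wheel_name("0.0.1", "cu113", "1.12.1", "cp37", "cp37m"))
# → unicore-0.0.1+cu113torch1.12.1-cp37-cp37m-linux_x86_64.whl
```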
**Docker image**
We also provide a Docker image. You can pull it with `docker pull dptechnology/unicore:0.0.1-pytorch1.11.0-cuda11.3`. To use GPUs within Docker, you need to [install nvidia-docker-2](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker) first.
Example
-------
To build a model, you can refer to [example/bert](https://github.com/dptech-corp/Uni-Core/tree/main/examples/bert).
Related projects
----------------
- [Uni-Mol](https://github.com/dptech-corp/Uni-Mol)
- [Uni-Fold](https://github.com/dptech-corp/Uni-Fold)
Acknowledgement
---------------
The main framework is from [facebookresearch/fairseq](https://github.com/facebookresearch/fairseq).
The fused kernels are from [guolinke/fused_ops](https://github.com/guolinke/fused_ops).
Dockerfile is from [guolinke/pytorch-docker](https://github.com/guolinke/pytorch-docker).
License
-------
This project is licensed under the terms of the MIT license. See [LICENSE](https://github.com/dptech-corp/Uni-Core/blob/main/LICENSE) for additional details.
......@@ -6,10 +6,47 @@
#define IF_CONSTEXPR
#endif
// HIP (DCU) platform: hip_bfloat16 has no built-in 2-wide vector type,
// so hip_bfloat162 is defined below as a stand-in for __nv_bfloat162.
#ifdef __HIP_PLATFORM_HCC__
#include <hip/hip_bfloat16.h>

#if defined(__HIPCC_RTC__)
#define __HOST_DEVICE__ __device__
#else
#define __HOST_DEVICE__ __host__ __device__
// TODO: Clang has a bug that allows device functions to call std functions
// when std functions are introduced into the default namespace by a using statement.
// math.h may be included after this bug is fixed.
#if __cplusplus
#include <cmath>
#else
#include "math.h"
#endif
#endif // !defined(__HIPCC_RTC__)
struct hip_bfloat162
{
    hip_bfloat16 x;
    hip_bfloat16 y;

public:
    __HOST_DEVICE__
    hip_bfloat162() = default;

    __HOST_DEVICE__
    hip_bfloat162(const hip_bfloat16& in1, const hip_bfloat16& in2) : x{in1}, y{in2}
    {}

    // Copy each lane, round-tripping through float.
    __HOST_DEVICE__
    hip_bfloat162& operator=(const hip_bfloat162& other)
    {
        this->x = hip_bfloat16(float(other.x));
        this->y = hip_bfloat16(float(other.y));
        return *this;
    }
};
#endif
template <typename T>
__device__ __forceinline__ T SHFL_XOR(T value, int laneMask, int width, unsigned int mask = 0xffffffff)
{
#if CUDA_VERSION >= 9000 && !defined(__HIP_PLATFORM_HCC__)
    return __shfl_xor_sync(mask, value, laneMask, width);
#else
    return __shfl_xor(value, laneMask, width);
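`SHFL_XOR` is the building block of warp-level butterfly reductions: lane `i` exchanges its value with lane `i ^ laneMask`, and halving the mask each step leaves every lane holding the combined result. A host-side Python sketch of that pattern (an illustration of the technique, not Uni-Core code):

```python
def shfl_xor(values, lane_mask):
    """Emulate __shfl_xor across a 'warp': lane i reads lane i ^ lane_mask."""
    return [values[i ^ lane_mask] for i in range(len(values))]

def warp_reduce_sum(values):
    """Butterfly reduction: after log2(width) steps every lane holds the sum."""
    width = len(values)  # must be a power of two, like a warp width
    offset = width // 2
    while offset > 0:
        partners = shfl_xor(values, offset)
        values = [v + p for v, p in zip(values, partners)]
        offset //= 2
    return values

print(warp_reduce_sum([1, 2, 3, 4]))  # every lane ends with 10
```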
......@@ -29,7 +66,11 @@ DEFINE_VEC_TYPE(half, 1, half)
DEFINE_VEC_TYPE(__nv_bfloat16, 1, __nv_bfloat16)
DEFINE_VEC_TYPE(float, 1, float)
DEFINE_VEC_TYPE(half, 2, half2)
#ifdef __HIP_PLATFORM_HCC__
DEFINE_VEC_TYPE(__nv_bfloat16, 2, hip_bfloat162)
#else
DEFINE_VEC_TYPE(__nv_bfloat16, 2, __nv_bfloat162)
#endif
DEFINE_VEC_TYPE(float, 2, float2)
DEFINE_VEC_TYPE(half, 4, uint64_t)
DEFINE_VEC_TYPE(__nv_bfloat16, 4, uint64_t)
......
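The `hip_bfloat162` struct above packs two bfloat16 lanes; bfloat16 itself is just the upper 16 bits of an IEEE float32, which is why the assignment operator can round-trip each lane through `float`. A Python sketch of that bit-level relationship (illustrative only, using the standard `struct` module):

```python
import struct

def float_to_bfloat16_bits(f):
    """Truncate a float32 to bfloat16: keep the upper 16 bits (round toward zero)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", f))
    return bits >> 16

def bfloat16_bits_to_float(b):
    """Widen bfloat16 back to float32 by zero-filling the low 16 mantissa bits."""
    (f,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return f

# Small powers of two survive the round trip exactly.
assert bfloat16_bits_to_float(float_to_bfloat16_bits(1.0)) == 1.0
# 1 + 2**-10 needs more mantissa bits than bfloat16's 7, so it truncates to 1.0.
assert bfloat16_bits_to_float(float_to_bfloat16_bits(1.0 + 2**-10)) == 1.0
```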
......@@ -7,6 +7,7 @@
import torch
from torch.utils import cpp_extension
from torch.utils.cpp_extension import CUDAExtension, BuildExtension
from fastpt import CUDAExtension  # fastpt's CUDAExtension (for DCU builds) overrides the torch one imported above
import os
import subprocess
......@@ -27,6 +28,42 @@ sys.argv = filtered_args
if sys.version_info < (3, 7):
sys.exit("Sorry, Python >= 3.7 is required for unicore.")
def get_abi():
    """Detect the C++11 ABI flag of the local gcc, e.g. 'abi1' or 'abi0'."""
    try:
        command = "echo '#include <string>' | gcc -x c++ -E -dM - | fgrep _GLIBCXX_USE_CXX11_ABI"
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        output = result.stdout.strip()
        abi = "abi" + output.split(" ")[-1]
        return abi
    except Exception:
        return "abiUnknown"
def _get_project_version():
    with open(os.path.join("unicore", "version.txt")) as f:
        version = f.read().strip()
    return version
dcu_version = _get_project_version()
dcu_version += "+das1.1"

sha = "Unknown"
cwd = os.path.dirname(os.path.abspath(__file__))
try:
    sha = subprocess.check_output(["git", "rev-parse", "HEAD"], cwd=cwd).decode("ascii").strip()
except Exception:
    pass
if sha != "Unknown":
    dcu_version += ".git" + sha[:7]
dcu_version += "." + get_abi()

if os.getenv("ROCM_PATH"):
    rocm_path = os.getenv("ROCM_PATH", "")
    rocm_version_path = os.path.join(rocm_path, ".info", "rocm_version")
    with open(rocm_version_path, "r", encoding="utf-8") as file:
        lines = file.readlines()
    rocm_version = lines[0][:-2].replace(".", "")
    dcu_version += ".dtk" + rocm_version

# torch version (torch is already imported at the top of this file)
dcu_version += ".torch" + torch.__version__
def write_version_py():
    with open(os.path.join("unicore", "version.txt")) as f:
......@@ -35,6 +72,7 @@ def write_version_py():
    # write version info to unicore/version.py
    with open(os.path.join("unicore", "version.py"), "w") as f:
        f.write('__version__ = "{}"\n'.format(version))
        f.write("__dcu_version__ = '{}'\n".format(dcu_version))
    return version
......@@ -111,7 +149,7 @@ if not DISABLE_CUDA_EXTENSION:
    cmdclass['build_ext'] = BuildExtension
    if torch.utils.cpp_extension.CUDA_HOME is None and torch.utils.cpp_extension.ROCM_HOME is None:
        raise RuntimeError("Nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.")
    # check_cuda_torch_binary_vs_bare_metal(torch.utils.cpp_extension.CUDA_HOME)
......@@ -216,7 +254,7 @@ if not DISABLE_CUDA_EXTENSION:
setup(
    name="unicore",
    version=dcu_version,
    description="DP Technology's Core AI Framework",
    url="https://github.com/dptech-corp/unicore",
    classifiers=[
......
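The setup.py changes above assemble a PEP 440 local version such as `0.0.1+das1.1.git<sha>.abi1.dtk<rocm>.torch<ver>`. A minimal standalone sketch of that assembly (illustrative values only; the real script reads git, gcc, and ROCm state at build time, and `build_dcu_version` is a hypothetical helper):

```python
def build_dcu_version(base, das, git_sha=None, abi="abiUnknown",
                      dtk=None, torch_version=None):
    """Compose a local version string in the same shape setup.py produces."""
    version = f"{base}+{das}"
    if git_sha and git_sha != "Unknown":
        version += ".git" + git_sha[:7]  # short commit hash
    version += "." + abi
    if dtk:
        version += ".dtk" + dtk          # ROCm/DTK version, dots stripped
    if torch_version:
        version += ".torch" + torch_version
    return version

print(build_dcu_version("0.0.1", "das1.1", git_sha="9989eaf1abcd",
                        abi="abi1", dtk="2310", torch_version="1.13.1"))
# → 0.0.1+das1.1.git9989eaf.abi1.dtk2310.torch1.13.1
```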
......@@ -15,6 +15,11 @@ except ImportError:
    with open(version_txt) as f:
        __version__ = f.read().strip()

try:
    from .version import __dcu_version__  # noqa
except ImportError:
    pass
__all__ = ["pdb"]
# backwards compatibility to support `from unicore.X import Y`
......