Commit f38252a4 authored by fengzch-das
update readme

parent 0ed05516
Pipeline #2965 failed with stages in 0 seconds
# <div align="center"><strong>VeTurboIO</strong></div>

## Introduction

VeTurboIO is a Python library developed by Volcano Engine for high-performance reading and writing of PyTorch model files. The library is built mainly on the safetensors file format to store and load tensor data efficiently.

## Install

Supported component combinations:

| PyTorch version | fastpt version | VeTurboIO version | DTK version | Python version | Recommended build mode |
| --------------- | -------------- | ----------------- | ----------- | --------------- | ---------------------- |
| 2.5.1 | 2.1.0 | main | >= 25.04 | 3.8, 3.10, 3.11 | fastpt, no transcoding |
| 2.4.1 | 2.0.1 | main | >= 25.04 | 3.8, 3.10, 3.11 | fastpt, no transcoding |
| other | other | other | other | 3.8, 3.10, 3.11 | hip transcoding |

+ For PyTorch versions above 2.4.1 with DTK versions above 25.04, the fastpt no-transcoding build is recommended.

### 1. Install with pip

veturboio whl package download directory: [光合开发者社区](https://download.sourcefind.cn:65024/4/main). Choose the veturboio whl package that matches your PyTorch and Python versions.

```shell
pip install torch*              # the downloaded torch whl package
pip install fastpt* --no-deps   # the downloaded fastpt whl package
source /usr/local/bin/fastpt -E
pip install veturboio*          # the downloaded veturboio whl package
```
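The support matrix above can also be expressed as a small helper for choosing a build mode. This is an illustrative sketch only: `recommended_build` and its return strings are hypothetical names, not part of the veturboio package, and version edge cases outside the matrix are collapsed into the fallback row.

```python
# Illustrative helper mirroring the support matrix above.
# Not part of veturboio; names and return strings are hypothetical.
def recommended_build(torch_version: str, dtk_version: str) -> str:
    def major_minor(v: str) -> tuple:
        # "2.5.1" -> (2, 5); "25.04" -> (25, 4)
        return tuple(int(p) for p in v.split(".")[:2])

    # PyTorch 2.4.x/2.5.x with DTK >= 25.04 -> fastpt no-transcoding build
    if major_minor(torch_version) >= (2, 4) and major_minor(dtk_version) >= (25, 4):
        return "fastpt (no transcoding)"
    # Any other combination falls back to the hip-transcoding build
    return "hip transcoding"
```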
### 2. Build and install from source

#### Preparing the build environment

A fastpt no-transcoding build is provided:

1. Based on a 光源 pytorch base image: image download address: [光合开发者社区](https://sourcefind.cn/#/image/dcu/pytorch). Download the image version matching your pytorch, python, dtk, and operating system.
2. Based on an existing python environment: install pytorch and fastpt. whl package download directory: [光合开发者社区](https://sourcefind.cn/#/image/dcu/pytorch). Download the pytorch whl package matching your python and dtk versions, then install:

```shell
pip install torch*              # the downloaded torch whl package
pip install fastpt* --no-deps   # the downloaded fastpt whl package; install torch first, then fastpt
pip install pytest
pip install wheel
```
#### Building and installing from source

- Download the code:

```shell
git clone http://developer.sourcefind.cn/codes/OpenDAS/veturboio.git  # switch branches as needed for your build
```

- Two source-build options are provided (run inside the veturboio directory):
```shell
# 1. Set the no-transcoding build environment variable
source /usr/local/bin/fastpt -C

# 2. Build the whl package and install it
pip install loguru
python setup.py get_libcfs
python setup.py bdist_wheel --cuda_ext
pip install dist/veturboio*

# 3. Alternatively, build and install directly from source
pip install loguru
python setup.py get_libcfs
python setup.py install --cuda_ext
```
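Before running `pip install dist/veturboio*`, it can help to confirm that the built wheel carries the CPython tag of the interpreter you are installing into. A stdlib-only sketch; the helper name is hypothetical, and real compatibility-tag matching (as pip does it) is more involved than a substring check:

```python
import sys

def wheel_matches_interpreter(wheel_name: str) -> bool:
    """Check that a wheel filename carries this interpreter's CPython tag.

    Illustrative only: it merely looks for e.g. 'cp310' in the filename,
    whereas pip evaluates the full set of PEP 425 compatibility tags.
    """
    tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return tag in wheel_name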
#### Notes

+ If pip install downloads are slow, add the Tsinghua PyPI mirror: `-i https://pypi.tuna.tsinghua.edu.cn/simple/`
+ ROCM_PATH is the dtk installation path; the default is /opt/dtk
+ Building under pytorch 2.5.1 requires C++17 support: open setup.py and change `-std=c++14` to `-std=c++17`

## Known Issue

-
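The C++17 note above can be applied mechanically. A stdlib sketch, assuming setup.py contains the literal flag `-std=c++14` as the note describes; the helper name is illustrative, not part of veturboio:

```python
from pathlib import Path

def patch_cxx_standard(setup_py: str) -> bool:
    """Replace -std=c++14 with -std=c++17 in setup.py, per the note above.

    Returns True if a replacement was made, False if the flag was absent.
    Illustrative helper, not part of the veturboio package.
    """
    path = Path(setup_py)
    text = path.read_text()
    if "-std=c++14" not in text:
        return False
    path.write_text(text.replace("-std=c++14", "-std=c++17"))
    return True
```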
## References

- [README_ORIGIN](README_ORIGIN.md)
- [README_zh-CN](README_zh-CN.md)
- [https://github.com/volcengine/veTurboIO](https://github.com/volcengine/veTurboIO.git)
# veTurboIO
[En](./README.md) | [中文](./README.zh.md)
A Python library developed by Volcano Engine for high-performance reading and writing of
PyTorch model files. The library is built mainly on the safetensors file format to store
and load tensor data efficiently.
## Install
It can be installed directly as follows:
```bash
cd veturboio
python setup.py get_libcfs
python setup.py install
```
Tips: this command preferentially downloads the whl file that matches the current Python
and PyTorch versions. If no matching whl file is found, it automatically downloads the
source code and compiles and installs it. If the installation fails, you can also
download the source code and compile and install it manually.
```bash
# CUDA ops, default
python setup.py install --cuda_ext
# NPU ops
python setup.py install --npu_ext
# CPU only
python setup.py install --cpu_ext
```
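Which extension flag to pass depends on the target hardware. A tiny illustrative selector over the three flags shown above; the helper name and the hardware-detection booleans (left to the caller) are not part of veturboio:

```python
def setup_ext_flag(has_cuda: bool = False, has_npu: bool = False) -> str:
    """Map available hardware to the setup.py flag shown above (illustrative)."""
    if has_cuda:
        return "--cuda_ext"   # CUDA ops, default
    if has_npu:
        return "--npu_ext"    # NPU ops
    return "--cpu_ext"        # CPU only
```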
## Quick Start
### Read and write model files
```python
import torch
import veturboio
tensors = {
"weight1": torch.zeros((1024, 1024)),
"weight2": torch.zeros((1024, 1024))
}
veturboio.save_file(tensors, "model.safetensors")
new_tensors = veturboio.load("model.safetensors")
# check if the tensors are the same
for k, v in tensors.items():
assert torch.allclose(v, new_tensors[k])
```
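Since veTurboIO builds on the safetensors format, the on-disk layout is worth knowing: an 8-byte little-endian unsigned integer giving the length of a JSON header, the JSON header itself (tensor names mapped to dtype, shape, and byte offsets into the data section), then the raw tensor bytes. A stdlib-only sketch that round-trips such a header; illustrative only — real files should be produced with `veturboio.save_file`:

```python
import io
import json
import struct

def write_header(buf, meta):
    # meta maps tensor names to {"dtype", "shape", "data_offsets"}
    header = json.dumps(meta).encode("utf-8")
    buf.write(struct.pack("<Q", len(header)))  # 8-byte little-endian length
    buf.write(header)

def read_header(buf):
    (length,) = struct.unpack("<Q", buf.read(8))
    return json.loads(buf.read(length).decode("utf-8"))

# Round-trip the header a file like "model.safetensors" above would carry
meta = {"weight1": {"dtype": "F32", "shape": [1024, 1024], "data_offsets": [0, 4194304]}}
buf = io.BytesIO()
write_header(buf, meta)
buf.seek(0)
assert read_header(buf) == meta
```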
### Convert existing PyTorch files
```bash
python -m veturboio.convert -i model.pt -o model.safetensors
```
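To convert many checkpoints, the command above can be generated once per file. A sketch that only builds the command lines without executing anything; the helper name and directory layout are assumptions for illustration:

```python
import sys
from pathlib import Path

def convert_commands(model_dir):
    """Build one `python -m veturboio.convert` command per .pt file found."""
    cmds = []
    for pt in sorted(Path(model_dir).glob("*.pt")):
        out = pt.with_suffix(".safetensors")
        cmds.append([sys.executable, "-m", "veturboio.convert",
                     "-i", str(pt), "-o", str(out)])
    return cmds
```

Each returned list can be handed to `subprocess.run` once veturboio is installed.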
## Performance test
Run directly:
```bash
bash bench/io_bench.sh
```
Then, you can get the following results:
```
fs_name tensor_size veturboio load_time(s) torch load_time(s)
shm 1073741824 0.08 0.63
shm 2147483648 0.19 1.26
shm 4294967296 0.36 2.32
```
Also, you can run the following command to get more options:
```bash
python bench/io_bench.py -h
```
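The sample output above converts directly into throughput: the first row is 1 GiB loaded in 0.08 s, i.e. 12.5 GiB/s for veturboio versus about 1.6 GiB/s for torch. A one-liner for the conversion (the function name is just for illustration):

```python
def throughput_gib_per_s(size_bytes: int, seconds: float) -> float:
    """Convert a (tensor_size, load_time) pair from io_bench output to GiB/s."""
    return size_bytes / seconds / 2**30

# First row of the sample output above: 1 GiB in 0.08 s with veturboio
print(round(throughput_gib_per_s(1073741824, 0.08), 1))  # → 12.5
```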
## Advanced Features
### Using veMLP to accelerate reading and writing
Volcano Engine Machine Learning Platform (veMLP) provides a distributed cache file system
based on the physical disks of the GPU cluster.
<p align="center">
<img src="./docs/imgs/SFCS.png" style="zoom:15%;">
</p>
When a cluster-level task needs to read
a model file, the caching system can efficiently distribute the model file between GPU
machines via RDMA transfer, thus avoiding network transfer bottlenecks. When using this
system, veTurboIO can maximize its performance advantages.
### Encrypt and decrypt model files
veTurboIO supports encryption and decryption of model files. You can read the [tutorial](./docs/encrypt_model.md)
to learn how to keep your model files secure. When you use GPU as target device, veTurboIO can decrypt the model file on the fly.
## License
[Apache License 2.0](./LICENSE)