# LLAMA
## Paper
LLaMA is a collection of foundation language models ranging from 7B to 65B parameters, trained on trillions of tokens.
## Environment Setup
### Docker (Option 1)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.2-py3.10
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal:/opt/hyhal:ro --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
```
### Build from Source (Option 2)
Based on the SourceFind pytorch 2.1.0 base image. Image download address: [https://sourcefind.cn/#/image/dcu/pytorch](https://sourcefind.cn/#/image/dcu/pytorch); choose the image version matching pytorch 2.1.0, python, dtk, and your operating system. The pytorch 2.1.0 image already ships with triton and flash-attn.
1. Install Rust
```shell
rm -f $PROTOC_ZIP
```
```bash
cd llama_tgi
git clone http://developer.hpccube.com/codes/OpenDAS/text-generation-inference.git # switch to the branch you need, e.g. -b v2.1.1
cd text-generation-inference
# install exllama
cd server
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
```
Additionally, if `cargo install` is too slow, you can speed it up by adding a mirror source in `~/.cargo/config`.
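As one concrete way to do this — a sketch assuming the TUNA crates.io mirror (any crates.io mirror works the same way) — append a replacement source to `~/.cargo/config`:

```shell
# Redirect crates.io downloads to a mirror (TUNA used here as an example).
mkdir -p ~/.cargo
cat >> ~/.cargo/config <<'EOF'
[source.crates-io]
replace-with = 'mirror'

[source.mirror]
registry = "https://mirrors.tuna.tsinghua.edu.cn/git/crates.io-index.git"
EOF
```

After this, `cargo install` fetches the crate index and crates from the mirror instead of crates.io directly.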
5. Check the installed version
```bash
text-generation-launcher -V # the version number tracks the official release
```
## Dataset
| Base model | Chat model | GPTQ model |
| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
| [Llama-2-7b-hf](http://113.200.138.88:18080/aimodels/Llama-2-7b-hf/) | [Llama-2-7b-chat-hf](http://113.200.138.88:18080/aimodels/findsource-dependency/llama-2-7b-chat-hf) | [Llama-2-7B-Chat-GPTQ](http://113.200.138.88:18080/aimodels/Llama-2-7B-Chat-GPTQ) |
| [Llama-2-13b-hf](http://113.200.138.88:18080/aimodels/Llama-2-13b-hf/) | [Llama-2-13b-chat-hf](http://113.200.138.88:18080/aimodels/meta-llama/Llama-2-13b-chat-hf) | [Llama-2-13B-chat-GPTQ](http://113.200.138.88:18080/aimodels/Llama-2-13B-chat-GPTQ) |
| [Llama-2-70b-hf](http://113.200.138.88:18080/aimodels/meta-llama/Llama-2-70b-hf/) | [Llama-2-70b-chat-hf](http://113.200.138.88:18080/aimodels/meta-llama/Llama-2-70b-chat-hf) | [Llama-2-70B-Chat-GPTQ](http://113.200.138.88:18080/aimodels/Llama-2-70B-Chat-GPTQ) |
| [Meta-Llama-3-8B](http://113.200.138.88:18080/aimodels/Meta-Llama-3-8B) | [Meta-Llama-3-8B-Instruct](http://113.200.138.88:18080/aimodels/Meta-Llama-3-8B-Instruct) | |
| [Meta-Llama-3-70B](http://113.200.138.88:18080/aimodels/Meta-Llama-3-70B) | [Meta-Llama-3-70B-Instruct](http://113.200.138.88:18080/aimodels/Meta-Llama-3-70B-Instruct) | |
### Deploy TGI
#### Before use
Disable PyTorch TunableOp before launching the service:
```bash
export PYTORCH_TUNABLEOP_ENABLED=0
```
#### 1. Start the TGI service
```
HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/to/Llama-2-7b-chat-hf --port 3001
```

Benchmark the deployed model with `text-generation-benchmark`:
```
text-generation-benchmark -s 32 -d 128 --runs 10 --tokenizer-name /path/to/Llama-2-7b-chat-hf
text-generation-benchmark --help
```
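With the launcher running, the service can also be exercised directly over HTTP. A minimal sketch, assuming the port 3001 used in the launch command above and TGI's standard `/generate` endpoint (the prompt is a placeholder):

```shell
# Query the running TGI service (assumes port 3001 from the launch command above).
curl 127.0.0.1:3001/generate \
    -X POST \
    -d '{"inputs":"What is deep learning?","parameters":{"max_new_tokens":20}}' \
    -H 'Content-Type: application/json'
```

The response is a JSON object whose `generated_text` field contains the completion.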
### Inference Results
![img1](./readme_images/img1.png)
## Application Scenarios
### Precision
### Algorithm Category
Conversational Q&A