ModelZoo / llama_tgi, commit 28d696f8 (parent 163fb0ff), authored Nov 04, 2024 by dcuai: Update README.md. 1 changed file (README.md) with 20 additions and 21 deletions.
# LLAMA
## Paper
...
...
LLaMA is a collection of foundation language models ranging from 7B to 65B parameters. On trillions of ...
## Environment Setup
### Docker (Method 1)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.2-py3.10
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal:/opt/hyhal:ro --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
```
### Build from Source (Method 2)
**TODO**

Based on the SourceFind pytorch 2.1.0 base image. Image download address: [https://sourcefind.cn/#/image/dcu/pytorch](https://sourcefind.cn/#/image/dcu/pytorch); choose the image version matching your pytorch 2.1.0, python, dtk, and OS versions. The pytorch 2.1.0 image already includes triton and flash-attn.
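The `docker run` line above contains several placeholders (`/path/your_code_data/`, `docker_name`, `imageID`). One way to keep them straight is to parameterize the command before running it; this is only a sketch, and every value below is a placeholder to substitute for your environment:

```shell
# All values below are placeholders; substitute your own paths and names.
CODE_DIR=/path/your_code_data
CONTAINER_NAME=docker_name
IMAGE=image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.2-py3.10

# Assemble the same run command the README uses, with the DCU device flags.
CMD="docker run -it -v ${CODE_DIR}/:${CODE_DIR}/ -v /opt/hyhal:/opt/hyhal:ro --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name ${CONTAINER_NAME} ${IMAGE} bash"
echo "$CMD"
```

Review the echoed command, then run it directly (or via `eval "$CMD"`).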
1. Install Rust
```shell
...
```
```bash
cd llama_tgi
git clone http://developer.hpccube.com/codes/OpenDAS/text-generation-inference.git
# switch to the branch you need, e.g.: -b v2.1.1
cd text-generation-inference
# install exllama
cd server
```
...
...
A pip mirror can be set with `pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple`. In addition, if `cargo install` is too slow, you can speed it up by adding a mirror source in `~/.cargo/config`.
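For example, `~/.cargo/config` can redirect crates.io to a domestic mirror. This is a sketch using the TUNA mirror as an illustrative choice; any mirror works, and the source name `mirror` is arbitrary:

```toml
# ~/.cargo/config: redirect crates.io to a mirror (TUNA shown as an example)
[source.crates-io]
replace-with = "mirror"

[source.mirror]
registry = "https://mirrors.tuna.tsinghua.edu.cn/git/crates.io-index.git"
```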
5. Check the installed version
```bash
text-generation-launcher -V
# the version number matches the official release
```
## Before Use
```bash
export PYTORCH_TUNABLEOP_ENABLED=0
```
## Dataset
None
...
...
| Base model | Chat model | GPTQ model |
| --- | --- | --- |
| [Llama-2-7b-hf](http://113.200.138.88:18080/aimodels/Llama-2-7b-hf/-/archive/main/Llama-2-7b-hf-main.tar.gz) | [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) | [Llama-2-7B-Chat-GPTQ](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GPTQ/tree/gptq-4bit-128g-actorder_True) |
| [Llama-2-13b-hf](http://113.200.138.88:18080/aimodels/Llama-2-13b-hf/-/archive/main/Llama-2-13b-hf-main.tar.gz) | [Llama-2-13b-chat-hf](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) | [Llama-2-13B-GPTQ](https://huggingface.co/TheBloke/Llama-2-13B-GPTQ/tree/gptq-4bit-128g-actorder_True) |
| [Llama-2-70b-hf](http://113.200.138.88:18080/aimodels/meta-llama/Llama-2-70b-hf/-/archive/main/Llama-2-70b-hf-main.tar.gz) | [Llama-2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf) | [Llama-2-70B-Chat-GPTQ](https://huggingface.co/TheBloke/Llama-2-70B-Chat-GPTQ/tree/gptq-4bit-128g-actorder_True) |
| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | |
| [Meta-Llama-3-70B](http://113.200.138.88:18080/aimodels/meta-llama/Meta-Llama-3.1-70B/-/archive/main/Meta-Llama-3.1-70B-main.tar.gz) | [Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | |
| [Llama-2-7b-hf](http://113.200.138.88:18080/aimodels/Llama-2-7b-hf/) | [Llama-2-7b-chat-hf](http://113.200.138.88:18080/aimodels/findsource-dependency/llama-2-7b-chat-hf) | [Llama-2-7B-Chat-GPTQ](http://113.200.138.88:18080/aimodels/Llama-2-7B-Chat-GPTQ) |
| [Llama-2-13b-hf](http://113.200.138.88:18080/aimodels/Llama-2-13b-hf/) | [Llama-2-13b-chat-hf](http://113.200.138.88:18080/aimodels/meta-llama/Llama-2-13b-chat-hf) | [Llama-2-13B-chat-GPTQ](http://113.200.138.88:18080/aimodels/Llama-2-13B-chat-GPTQ) |
| [Llama-2-70b-hf](http://113.200.138.88:18080/aimodels/meta-llama/Llama-2-70b-hf/) | [Llama-2-70b-chat-hf](http://113.200.138.88:18080/aimodels/meta-llama/Llama-2-70b-chat-hf) | [Llama-2-70B-Chat-GPTQ](http://113.200.138.88:18080/aimodels/Llama-2-70B-Chat-GPTQ) |
| [Meta-Llama-3-8B](http://113.200.138.88:18080/aimodels/Meta-Llama-3-8B) | [Meta-Llama-3-8B-Instruct](http://113.200.138.88:18080/aimodels/Meta-Llama-3-8B-Instruct) | |
| [Meta-Llama-3-70B](http://113.200.138.88:18080/aimodels/Meta-Llama-3-70B) | [Meta-Llama-3-70B-Instruct](http://113.200.138.88:18080/aimodels/Meta-Llama-3-70B-Instruct) | |
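The internal-mirror links in the table follow GitLab's archive URL pattern. A small helper can build such download URLs; this is hypothetical (the function name and the pattern inference are mine, not from the README):

```shell
# Hypothetical helper: build a GitLab-style archive URL on the internal mirror
# from a repo path such as "Llama-2-7b-hf" or "meta-llama/Llama-2-70b-hf".
archive_url() {
  repo="$1"
  # ${repo##*/} keeps only the final path component for the tarball name
  echo "http://113.200.138.88:18080/aimodels/${repo}/-/archive/main/${repo##*/}-main.tar.gz"
}

archive_url Llama-2-7b-hf
# -> http://113.200.138.88:18080/aimodels/Llama-2-7b-hf/-/archive/main/Llama-2-7b-hf-main.tar.gz
```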
### Deploy TGI
#### Before Use
```bash
export PYTORCH_TUNABLEOP_ENABLED=0
```
#### 1. Start the TGI service
```
HIP_VISIBLE_DEVICES=3 text-generation-launcher --dtype=float16 --model-id /path/to/Llama-2-7b-chat-hf --port 3001
...
```
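Once the launcher is up, the service speaks TGI's REST API on the chosen port (3001 above). A request body for the `/generate` endpoint looks like this; the prompt text and token budget are arbitrary examples:

```json
{
  "inputs": "What is deep learning?",
  "parameters": { "max_new_tokens": 64 }
}
```

POST it with `Content-Type: application/json` to `http://127.0.0.1:3001/generate`, e.g. via curl.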
```
text-generation-benchmark -s 32 -d 128 --runs 10 --tokenizer-name /path/to/Llama
text-generation-benchmark --help
```
## Inference Results
result

## Application Scenarios
### Accuracy
None
### Algorithm Category
Conversational question answering
...
...