Just run _**bash start.sh**_!
 







Refer to this link (https://r0ddbu55vzx.feishu.cn/wiki/De0rwlTzQig8jqkkdnqc5IrjnJg) for details.

easystart_v0.1: merges dcu_env_check (see https://developer.sourcefind.cn/codes/OpenDAS/dcu_env_check/-/tree/main) and rccl-tests (see https://www.ghproxy.cn/github.com/ROCm/rccl-tests.git), plus some other checks.

The easystart of online-test is coming soon!




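**One-Click Start Usage Guide**

**The current version of one-click start supports:**

1. One-click environment test (suitable for all scenarios)
2. One-click environment test + model download + LLM inference (better suited to delivery scenarios)
3. One-click environment test + batch LLM inference (suited to scenarios that need large-batch testing)

   Run whichever tests match your needs.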

   <a name="heading_0"></a>**1\_env\_check**

   One-click environment test:

   https://developer.sourcefind.cn/codes/jerrrrry/easystart_v0.1/-/tree/main/1_env_check

   |git clone http://developer.sourcefind.cn/codes/jerrrrry/easystart\_v0.1.git<br>cd easystart\_v0.1/1\_env\_check/<br>bash start.sh|
   | :- |

   The tests include:

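   1. rocm\_bandwidth\_test
   2. RCCL 4-card and 8-card bandwidth tests
   3. dcu\_env\_check (the version released by 贵哥)
   4. ACS monitoring
   5. CPU and DCU status
   6. Storage and memory status
   7. Network status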

   <a name="heading_1"></a>**Test results**

   Test results are saved in ./outputs/env\_check\_outputs.


   <a name="heading_2"></a>**2\_env\_check&model\_download&llm\_inference**

   One-click environment test + model download + LLM inference:

   https://developer.sourcefind.cn/codes/jerrrrry/easystart\_v0.1/-/tree/main/2\_env\_check%26model\_download%26llm\_inference

   |git clone http://developer.sourcefind.cn/codes/jerrrrry/easystart\_v0.1.git<br>cd "easystart\_v0.1/2\_env\_check&model\_download&llm\_inference/"<br>bash start.sh|
   | :- |


   Simply add the ID of each model to test **(the corresponding ModelScope model ID)** to **download-list.cfg**.

   <a name="heading_3"></a>**Tips**

1. Each entry in download-list.cfg has the format **model ID;local save path**


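2. You can list multiple models to download, and **they will be tested as a batch**
3. Model test parameters are passed in through **model\_to\_test.cfg**; **pay attention to the parameter format of model\_to\_test.cfg**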
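
The `model ID;local save path` format from tip 1 can be illustrated with a small parsing sketch. It assumes one entry per line and that blank lines are skipped. The helper name `parse_download_list` is hypothetical and is not taken from the repository's scripts, which may read the file differently.

```shell
#!/usr/bin/env bash
# Sketch: read entries of the form "model ID;local save path" (tip 1).
# parse_download_list is a hypothetical helper, not part of easystart_v0.1.
parse_download_list() {
    local cfg_file="$1"
    local model_id save_path
    # IFS=';' puts the text before the first semicolon into model_id
    # and the rest of the line into save_path.
    while IFS=';' read -r model_id save_path || [ -n "$model_id" ]; do
        # Skip blank lines (an assumption about how the file may be written).
        [ -z "$model_id" ] && continue
        printf 'model=%s path=%s\n' "$model_id" "$save_path"
    done < "$cfg_file"
}
```

Calling `parse_download_list download-list.cfg` prints one `model=... path=...` line per entry, which is a quick way to confirm that every line splits into the two expected fields before starting a long download.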


   <a name="heading_4"></a>**Test results**

   Test results are saved in ./outputs/env\_check\_outputs and ./outputs/inference\_outputs.

   Downloaded models are saved in ./outputs/models.
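
After a run, a quick check of the three output locations named above can confirm that each stage produced something. The following is a minimal sketch; `check_outputs` is a hypothetical helper, not part of the repository, and it only tests whether the directories exist.

```shell
#!/usr/bin/env bash
# Sketch: report which of the documented output directories exist.
# check_outputs is a hypothetical helper, not part of easystart_v0.1.
check_outputs() {
    local base="${1:-./outputs}"   # defaults to the documented ./outputs
    local dir
    for dir in env_check_outputs inference_outputs models; do
        if [ -d "$base/$dir" ]; then
            echo "found: $base/$dir"
        else
            echo "missing: $base/$dir"
        fi
    done
}
```

Run `check_outputs` from the directory that contains `start.sh`, or pass a different base directory as the first argument.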


   <a name="heading_5"></a>**Inference results**


   <a name="heading_6"></a>**Test logs**


   <a name="heading_7"></a>**3\_env\_check&batches\_llm\_inference**

   One-click environment test + batch LLM inference:

   https://developer.sourcefind.cn/codes/jerrrrry/easystart\_v0.1/-/tree/main/3\_env\_check%26batches\_llm\_inference

   |git clone http://developer.sourcefind.cn/codes/jerrrrry/easystart\_v0.1.git<br>cd "easystart\_v0.1/3\_env\_check&batches\_llm\_inference/"<br>bash start.sh|
   | :- |

   **Simply mount your local model into the Docker container in start.sh**
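
Mounting a local model into a container is usually done with Docker's `-v HOST:CONTAINER` bind-mount option. The sketch below only builds and prints such a command so it can be reviewed. The helper name, image name, and both paths are hypothetical placeholders. The real `start.sh` may be organized differently and will likely need additional options (for example device access) that are omitted here.

```shell
#!/usr/bin/env bash
# Sketch: build a docker run command that bind-mounts a local model directory.
# The helper name, image name, and paths are hypothetical placeholders;
# the actual start.sh in the repository may look different.
build_docker_cmd() {
    local host_model_dir="$1"       # model directory on the host
    local container_model_dir="$2"  # where it appears inside the container
    local image="$3"                # Docker image to run (placeholder)
    # -v HOST:CONTAINER is Docker's bind-mount option.
    printf 'docker run --rm -v %s:%s %s\n' \
        "$host_model_dir" "$container_model_dir" "$image"
}

# Print the command instead of running it, so it can be checked first.
build_docker_cmd /data/models/my-model /models/my-model example/image:latest
```

The container-side path is the one the test parameters would then need to reference, since the process inside the container cannot see the host path.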


   **Edit the test parameters in model\_to\_test.cfg**


   <a name="heading_8"></a>**Test results**

   Test results are saved in ./outputs/inference\_outputs.