OpenDAS / LLaMA-Factory · Commits

Commit 84987715, authored Apr 07, 2025 by chenych
update to v0.9.2
Parent: 317a82e2
Changes: 58 in total; this page shows 20 changed files, with 69 additions and 378 deletions (+69 −378):
README_en.md (+21 −20)
README_zh.md (+20 −19)
assets/wechat.jpg (+0 −0)
assets/wechat_npu.jpg (+0 −0)
data/mllm_demo.json (+4 −2)
docker/docker-cuda/Dockerfile (+0 −101)
docker/docker-cuda/docker-compose.yml (+0 −37)
docker/docker-npu/Dockerfile (+0 −67)
docker/docker-npu/docker-compose.yml (+0 −33)
docker/docker-rocm/docker-compose.yml (+7 −7)
examples/accelerate/fsdp_config.yaml (+1 −1)
examples/deepspeed/ds_z0_config.json (+1 −1)
examples/deepspeed/ds_z2_config.json (+1 −1)
examples/deepspeed/ds_z2_offload_config.json (+1 −1)
examples/deepspeed/ds_z3_config.json (+1 −1)
examples/deepspeed/ds_z3_offload_config.json (+1 −1)
examples/train_full/llama3_full_sft.yaml (+10 −5)
examples/train_full/qwen2vl_full_sft.yaml (+1 −1)
examples/train_lora/qwen2.5_lora_sft_ds3.yaml (+0 −40)
examples/train_lora/qwen2.5_lora_sft_offload_ds3.yaml (+0 −40)
README_en.md
@@ -5,7 +5,7 @@
[badge row linking to the GitHub contributors graph, the tests workflow, PyPI, Google Scholar citations, open pull requests, and Twitter; the badge images themselves were not captured in this view]
@@ -37,10 +37,10 @@ https://github.com/user-attachments/assets/7c96b465-9df7-45f4-8053-bf03e58386d3
 Choose your path:
-- **Documentation (WIP)**: https://llamafactory.readthedocs.io/zh-cn/latest/
-- **Colab**: https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing
+- **Documentation**: https://llamafactory.readthedocs.io/en/latest/
+- **Colab (free)**: https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing
 - **Local machine**: Please refer to [usage](#getting-started)
-- **PAI-DSW**: [Llama3 Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) | [Qwen2-VL Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) | [DeepSeek-R1-Distill Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b)
+- **PAI-DSW (free trial)**: [Llama3 Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) | [Qwen2-VL Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) | [DeepSeek-R1-Distill Example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b)
 - **Amazon SageMaker**: [Blog](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/)
 > [!NOTE]
@@ -403,7 +403,7 @@ huggingface-cli login
 | Optional     | Minimum | Recommend |
 | ------------ | ------- | --------- |
 | CUDA         | 11.6    | 12.2      |
-| deepspeed    | 0.10.0  | 0.16.2    |
+| deepspeed    | 0.10.0  | 0.16.4    |
 | bitsandbytes | 0.39.0  | 0.43.1    |
 | vllm         | 0.4.3   | 0.7.3     |
 | flash-attn   | 2.3.0   | 2.7.2     |
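A quick way to compare a local environment against the table above (a sketch; the row names are assumed to match their PyPI package names):

# List the installed versions of the optional dependencies pinned above.
pip list 2>/dev/null | grep -Ei "deepspeed|bitsandbytes|vllm|flash"
nvcc --version   # reports the installed CUDA toolkit version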
@@ -412,15 +412,14 @@ huggingface-cli login
 \* *estimated*
-| Method                   | Bits | 7B    | 13B   | 30B   | 70B    | 110B   | 8x7B  | 8x22B  |
-| ------------------------ | ---- | ----- | ----- | ----- | ------ | ------ | ----- | ------ |
-| Full                     | 32   | 120GB | 240GB | 600GB | 1200GB | 2000GB | 900GB | 2400GB |
-| Full                     | 16   | 60GB  | 120GB | 300GB | 600GB  | 900GB  | 400GB | 1200GB |
-| Freeze                   | 16   | 20GB  | 40GB  | 80GB  | 200GB  | 360GB  | 160GB | 400GB  |
-| LoRA/GaLore/APOLLO/BAdam | 16   | 16GB  | 32GB  | 64GB  | 160GB  | 240GB  | 120GB | 320GB  |
-| QLoRA                    | 8    | 10GB  | 20GB  | 40GB  | 80GB   | 140GB  | 60GB  | 160GB  |
-| QLoRA                    | 4    | 6GB   | 12GB  | 24GB  | 48GB   | 72GB   | 30GB  | 96GB   |
-| QLoRA                    | 2    | 4GB   | 8GB   | 16GB  | 24GB   | 48GB   | 18GB  | 48GB   |
+| Method                          | Bits | 7B    | 14B   | 30B   | 70B    | `x`B    |
+| ------------------------------- | ---- | ----- | ----- | ----- | ------ | ------- |
+| Full (`bf16` or `fp16`)         | 32   | 120GB | 240GB | 600GB | 1200GB | `18x`GB |
+| Full (`pure_bf16`)              | 16   | 60GB  | 120GB | 300GB | 600GB  | `8x`GB  |
+| Freeze/LoRA/GaLore/APOLLO/BAdam | 16   | 16GB  | 32GB  | 64GB  | 160GB  | `2x`GB  |
+| QLoRA                           | 8    | 10GB  | 20GB  | 40GB  | 80GB   | `x`GB   |
+| QLoRA                           | 4    | 6GB   | 12GB  | 24GB  | 48GB   | `x/2`GB |
+| QLoRA                           | 2    | 4GB   | 8GB   | 16GB  | 24GB   | `x/4`GB |
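In the new `x`B column, `x` is the parameter count in billions, so the formulas give a rough scaling rule rather than exact figures; the fixed columns are independent estimates. A back-of-envelope check against the 70B column (a sketch):

x=70                                           # parameters in billions
echo "Full (bf16/fp16 AMP): ~$((18 * x)) GB"   # table lists 1200GB
echo "Full (pure_bf16):     ~$((8 * x)) GB"    # table lists 600GB
echo "Freeze/LoRA, 16-bit:  ~$((2 * x)) GB"    # table lists 160GB
echo "QLoRA, 4-bit:         ~$((x / 2)) GB"    # table lists 48GB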
## Getting Started
@@ -491,11 +490,11 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
 ```
 | Requirement  | Minimum | Recommend      |
-| ------------ | ------- | ----------- |
+| ------------ | ------- | -------------- |
 | CANN         | 8.0.RC1 | 8.0.0.alpha002 |
 | torch        | 2.1.0   | 2.4.0          |
 | torch-npu    | 2.1.0   | 2.4.0.post2    |
-| deepspeed    | 0.13.2  | 0.16.2         |
+| deepspeed    | 0.13.2  | 0.13.2         |
Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use.
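For example (a sketch; the LoRA config path is the repo's standard example):

# Select NPU 0 for training; CUDA_VISIBLE_DEVICES has no effect on Ascend devices.
ASCEND_RT_VISIBLE_DEVICES=0 llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml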
@@ -560,6 +559,8 @@ See [examples/README.md](examples/README.md) for advanced usage (including distributed training)
 > [!TIP]
 > Use `llamafactory-cli help` to show help information.
+>
+> Read [FAQs](https://github.com/hiyouga/LLaMA-Factory/issues/4614) first if you encounter any problems.
 ### Fine-Tuning with LLaMA Board GUI (powered by [Gradio](https://github.com/gradio-app/gradio))
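For reference, the quickstart these tips attach to is the usual three-command flow (the export path also appears in the README_zh.md hunk below; the train and chat paths are the repo's standard examples):

llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml    # LoRA fine-tuning
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml      # chat with the trained adapter
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml   # merge the adapter into the base model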
README_zh.md
@@ -5,7 +5,7 @@
[badge row linking to the GitHub contributors graph, the tests workflow, PyPI, Google Scholar citations, open pull requests, and Twitter; the badge images themselves were not captured in this view]
@@ -40,9 +40,9 @@ https://github.com/user-attachments/assets/e6ce34b0-52d5-4f3e-a830-592106c4c272
 - **Getting-started tutorial**: https://zhuanlan.zhihu.com/p/695287607
 - **Framework documentation**: https://llamafactory.readthedocs.io/zh-cn/latest/
-- **Colab**: https://colab.research.google.com/drive/1d5KQtbemerlSDSxZIfAaWXhKr30QypiK?usp=sharing
+- **Colab (free)**: https://colab.research.google.com/drive/1d5KQtbemerlSDSxZIfAaWXhKr30QypiK?usp=sharing
 - **Local machine**: see [How to Use](#如何使用)
-- **PAI-DSW**: [Llama3 example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) | [Qwen2-VL example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) | [DeepSeek-R1-Distill example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b)
+- **PAI-DSW (free trial)**: [Llama3 example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory) | [Qwen2-VL example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_qwen2vl) | [DeepSeek-R1-Distill example](https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory_deepseek_r1_distill_7b)
 - **Amazon SageMaker**: [Blog](https://aws.amazon.com/cn/blogs/china/a-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory/)
 > [!NOTE]
@@ -405,7 +405,7 @@ huggingface-cli login
 | Optional     | Minimum | Recommend |
 | ------------ | ------- | --------- |
 | CUDA         | 11.6    | 12.2      |
-| deepspeed    | 0.10.0  | 0.16.2    |
+| deepspeed    | 0.10.0  | 0.16.4    |
 | bitsandbytes | 0.39.0  | 0.43.1    |
 | vllm         | 0.4.3   | 0.7.3     |
 | flash-attn   | 2.3.0   | 2.7.2     |
@@ -414,15 +414,14 @@ huggingface-cli login
 \* *estimated*
-| Method                   | Bits | 7B    | 13B   | 30B   | 70B    | 110B   | 8x7B  | 8x22B  |
-| ------------------------ | ---- | ----- | ----- | ----- | ------ | ------ | ----- | ------ |
-| Full                     | 32   | 120GB | 240GB | 600GB | 1200GB | 2000GB | 900GB | 2400GB |
-| Full                     | 16   | 60GB  | 120GB | 300GB | 600GB  | 900GB  | 400GB | 1200GB |
-| Freeze                   | 16   | 20GB  | 40GB  | 80GB  | 200GB  | 360GB  | 160GB | 400GB  |
-| LoRA/GaLore/APOLLO/BAdam | 16   | 16GB  | 32GB  | 64GB  | 160GB  | 240GB  | 120GB | 320GB  |
-| QLoRA                    | 8    | 10GB  | 20GB  | 40GB  | 80GB   | 140GB  | 60GB  | 160GB  |
-| QLoRA                    | 4    | 6GB   | 12GB  | 24GB  | 48GB   | 72GB   | 30GB  | 96GB   |
-| QLoRA                    | 2    | 4GB   | 8GB   | 16GB  | 24GB   | 48GB   | 18GB  | 48GB   |
+| Method                          | Bits | 7B    | 14B   | 30B   | 70B    | `x`B    |
+| ------------------------------- | ---- | ----- | ----- | ----- | ------ | ------- |
+| Full (`bf16` or `fp16`)         | 32   | 120GB | 240GB | 600GB | 1200GB | `18x`GB |
+| Full (`pure_bf16`)              | 16   | 60GB  | 120GB | 300GB | 600GB  | `8x`GB  |
+| Freeze/LoRA/GaLore/APOLLO/BAdam | 16   | 16GB  | 32GB  | 64GB  | 160GB  | `2x`GB  |
+| QLoRA                           | 8    | 10GB  | 20GB  | 40GB  | 80GB   | `x`GB   |
+| QLoRA                           | 4    | 6GB   | 12GB  | 24GB  | 48GB   | `x/2`GB |
+| QLoRA                           | 2    | 4GB   | 8GB   | 16GB  | 24GB   | `x/4`GB |
## How to Use
@@ -494,10 +493,10 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
 ```
 | Requirement  | Minimum | Recommend      |
-| ------------ | ------- | ----------- |
-| CANN         | 8.0.RC1 | 8.0.RC1     |
-| torch        | 2.1.0   | 2.1.0       |
-| torch-npu    | 2.1.0   | 2.1.0.post3 |
+| ------------ | ------- | -------------- |
+| CANN         | 8.0.RC1 | 8.0.0.alpha002 |
+| torch        | 2.1.0   | 2.4.0          |
+| torch-npu    | 2.1.0   | 2.4.0.post2    |
 | deepspeed    | 0.13.2  | 0.13.2         |
Use `ASCEND_RT_VISIBLE_DEVICES` rather than `CUDA_VISIBLE_DEVICES` to specify the compute device.
@@ -563,6 +562,8 @@ llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
 > [!TIP]
 > Use `llamafactory-cli help` to show help information.
+>
+> If you hit an error, read the [FAQs](https://github.com/hiyouga/LLaMA-Factory/issues/4614) first.
 ### Fine-tuning with the LLaMA Board GUI (powered by [Gradio](https://github.com/gradio-app/gradio))
assets/wechat.jpg
Binary image replaced (167 KB → 164 KB).
assets/wechat_npu.jpg
Binary image replaced (167 KB → 167 KB).
data/mllm_demo.json
@@ -10,7 +10,7 @@
       "role": "assistant"
     },
     {
-      "content": "What are they doing?",
+      "content": "What are they doing?<image>",
       "role": "user"
     },
     {
@@ -19,6 +19,7 @@
       }
     ],
     "images": [
+      "mllm_demo_data/1.jpg",
       "mllm_demo_data/1.jpg"
     ]
   },
@@ -79,7 +80,7 @@
       "role": "assistant"
     },
     {
-      "content": "他们在做什么?",
+      "content": "他们在做什么?<image>",
       "role": "user"
     },
     {
@@ -88,6 +89,7 @@
       }
     ],
     "images": [
+      "mllm_demo_data/1.jpg",
       "mllm_demo_data/1.jpg"
     ]
   },
docker/docker-cuda/Dockerfile (deleted, 100644 → 0)
# Default use the NVIDIA official image with PyTorch 2.3.0
# https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html
ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:24.02-py3
FROM ${BASE_IMAGE}

# Define environments
ENV MAX_JOBS=4
ENV FLASH_ATTENTION_FORCE_BUILD=TRUE
ENV VLLM_WORKER_MULTIPROC_METHOD=spawn

# Define installation arguments
ARG INSTALL_BNB=false
ARG INSTALL_VLLM=false
ARG INSTALL_DEEPSPEED=false
ARG INSTALL_FLASHATTN=false
ARG INSTALL_LIGER_KERNEL=false
ARG INSTALL_HQQ=false
ARG INSTALL_EETQ=false
ARG PIP_INDEX=https://pypi.org/simple
ARG HTTP_PROXY=

# Set the working directory
WORKDIR /app

# Set http proxy
RUN if [ -n "$HTTP_PROXY" ]; then \
        echo "Configuring proxy..."; \
        export http_proxy=$HTTP_PROXY; \
        export https_proxy=$HTTP_PROXY; \
    fi

# Install the requirements
COPY requirements.txt /app
RUN pip config set global.index-url "$PIP_INDEX" && \
    pip config set global.extra-index-url "$PIP_INDEX" && \
    python -m pip install --upgrade pip && \
    if [ -n "$HTTP_PROXY" ]; then \
        python -m pip install --proxy=$HTTP_PROXY -r requirements.txt; \
    else \
        python -m pip install -r requirements.txt; \
    fi

# Copy the rest of the application into the image
COPY . /app

# Install the LLaMA Factory
RUN EXTRA_PACKAGES="metrics"; \
    if [ "$INSTALL_BNB" == "true" ]; then \
        EXTRA_PACKAGES="${EXTRA_PACKAGES},bitsandbytes"; \
    fi; \
    if [ "$INSTALL_VLLM" == "true" ]; then \
        EXTRA_PACKAGES="${EXTRA_PACKAGES},vllm"; \
    fi; \
    if [ "$INSTALL_DEEPSPEED" == "true" ]; then \
        EXTRA_PACKAGES="${EXTRA_PACKAGES},deepspeed"; \
    fi; \
    if [ "$INSTALL_LIGER_KERNEL" == "true" ]; then \
        EXTRA_PACKAGES="${EXTRA_PACKAGES},liger-kernel"; \
    fi; \
    if [ "$INSTALL_HQQ" == "true" ]; then \
        EXTRA_PACKAGES="${EXTRA_PACKAGES},hqq"; \
    fi; \
    if [ "$INSTALL_EETQ" == "true" ]; then \
        EXTRA_PACKAGES="${EXTRA_PACKAGES},eetq"; \
    fi; \
    if [ -n "$HTTP_PROXY" ]; then \
        pip install --proxy=$HTTP_PROXY -e ".[$EXTRA_PACKAGES]"; \
    else \
        pip install -e ".[$EXTRA_PACKAGES]"; \
    fi

# Rebuild flash attention
RUN pip uninstall -y transformer-engine flash-attn && \
    if [ "$INSTALL_FLASHATTN" == "true" ]; then \
        pip uninstall -y ninja && \
        if [ -n "$HTTP_PROXY" ]; then \
            pip install --proxy=$HTTP_PROXY ninja && \
            pip install --proxy=$HTTP_PROXY --no-cache-dir flash-attn --no-build-isolation; \
        else \
            pip install ninja && \
            pip install --no-cache-dir flash-attn --no-build-isolation; \
        fi; \
    fi

# Unset http proxy
RUN if [ -n "$HTTP_PROXY" ]; then \
        unset http_proxy; \
        unset https_proxy; \
    fi

# Set up volumes
VOLUME [ "/root/.cache/huggingface", "/root/.cache/modelscope", "/app/data", "/app/output" ]

# Expose port 7860 for the LLaMA Board
ENV GRADIO_SERVER_PORT 7860
EXPOSE 7860

# Expose port 8000 for the API service
ENV API_PORT 8000
EXPOSE 8000
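Before its removal, this image was built directly from the repository root, roughly as follows (a sketch using the ARGs defined above; the tag name is illustrative):

docker build -f ./docker/docker-cuda/Dockerfile \
    --build-arg INSTALL_BNB=false \
    --build-arg INSTALL_VLLM=false \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:latest .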
docker/docker-cuda/docker-compose.yml (deleted, 100644 → 0)
services:
  llamafactory:
    build:
      dockerfile: ./docker/docker-cuda/Dockerfile
      context: ../..
      args:
        INSTALL_BNB: false
        INSTALL_VLLM: false
        INSTALL_DEEPSPEED: false
        INSTALL_FLASHATTN: false
        INSTALL_LIGER_KERNEL: false
        INSTALL_HQQ: false
        INSTALL_EETQ: false
        PIP_INDEX: https://pypi.org/simple
    container_name: llamafactory
    volumes:
      - ../../hf_cache:/root/.cache/huggingface
      - ../../ms_cache:/root/.cache/modelscope
      - ../../om_cache:/root/.cache/openmind
      - ../../data:/app/data
      - ../../output:/app/output
    ports:
      - "7860:7860"
      - "8000:8000"
    ipc: host
    tty: true
    shm_size: '16gb'
    stdin_open: true
    command: bash
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: "all"
              capabilities: [gpu]
    restart: unless-stopped
docker/docker-npu/Dockerfile (deleted, 100644 → 0)
# Use the Ubuntu 22.04 image with CANN 8.0.rc1
# More versions can be found at https://hub.docker.com/r/ascendai/cann/tags
# FROM ascendai/cann:8.0.rc1-910-ubuntu22.04-py3.8
FROM ascendai/cann:8.0.0-910b-ubuntu22.04-py3.10
# FROM ascendai/cann:8.0.rc1-910-openeuler22.03-py3.8
# FROM ascendai/cann:8.0.rc1-910b-openeuler22.03-py3.8

# Define environments
ENV DEBIAN_FRONTEND=noninteractive

# Define installation arguments
ARG INSTALL_DEEPSPEED=false
ARG PIP_INDEX=https://pypi.org/simple
ARG TORCH_INDEX=https://download.pytorch.org/whl/cpu
ARG HTTP_PROXY=

# Set the working directory
WORKDIR /app

# Set http proxy
RUN if [ -n "$HTTP_PROXY" ]; then \
        echo "Configuring proxy..."; \
        export http_proxy=$HTTP_PROXY; \
        export https_proxy=$HTTP_PROXY; \
    fi

# Install the requirements
COPY requirements.txt /app
RUN pip config set global.index-url "$PIP_INDEX" && \
    pip config set global.extra-index-url "$TORCH_INDEX" && \
    python -m pip install --upgrade pip && \
    if [ -n "$HTTP_PROXY" ]; then \
        python -m pip install --proxy=$HTTP_PROXY -r requirements.txt; \
    else \
        python -m pip install -r requirements.txt; \
    fi

# Copy the rest of the application into the image
COPY . /app

# Install the LLaMA Factory
RUN EXTRA_PACKAGES="torch-npu,metrics"; \
    if [ "$INSTALL_DEEPSPEED" == "true" ]; then \
        EXTRA_PACKAGES="${EXTRA_PACKAGES},deepspeed"; \
    fi; \
    if [ -n "$HTTP_PROXY" ]; then \
        pip install --proxy=$HTTP_PROXY -e ".[$EXTRA_PACKAGES]"; \
    else \
        pip install -e ".[$EXTRA_PACKAGES]"; \
    fi

# Unset http proxy
RUN if [ -n "$HTTP_PROXY" ]; then \
        unset http_proxy; \
        unset https_proxy; \
    fi

# Set up volumes
VOLUME [ "/root/.cache/huggingface", "/root/.cache/modelscope", "/app/data", "/app/output" ]

# Expose port 7860 for the LLaMA Board
ENV GRADIO_SERVER_PORT 7860
EXPOSE 7860

# Expose port 8000 for the API service
ENV API_PORT 8000
EXPOSE 8000
docker/docker-npu/docker-compose.yml (deleted, 100644 → 0)
services:
  llamafactory:
    build:
      dockerfile: ./docker/docker-npu/Dockerfile
      context: ../..
      args:
        INSTALL_DEEPSPEED: "false"
        PIP_INDEX: https://pypi.org/simple
    container_name: llamafactory
    volumes:
      - ../../hf_cache:/root/.cache/huggingface
      - ../../ms_cache:/root/.cache/modelscope
      - ../../om_cache:/root/.cache/openmind
      - ../../data:/app/data
      - ../../output:/app/output
      - /usr/local/dcmi:/usr/local/dcmi
      - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
      - /usr/local/Ascend/driver:/usr/local/Ascend/driver
      - /etc/ascend_install.info:/etc/ascend_install.info
    ports:
      - "7860:7860"
      - "8000:8000"
    ipc: host
    tty: true
    shm_size: '16gb'
    stdin_open: true
    command: bash
    devices:
      - /dev/davinci0
      - /dev/davinci_manager
      - /dev/devmm_svm
      - /dev/hisi_hdc
    restart: unless-stopped
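Usage sketch for this (now removed) compose file: start the service and confirm the mapped Ascend devices are visible (npu-smi is the device tool the volumes above mount into the container):

cd docker/docker-npu
docker compose up -d
docker compose exec llamafactory npu-smi info   # should list the mapped davinci devices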
docker/docker-rocm/docker-compose.yml
@@ -4,12 +4,12 @@ services:
       dockerfile: ./docker/docker-rocm/Dockerfile
       context: ../..
       args:
-        INSTALL_BNB: false
-        INSTALL_VLLM: false
-        INSTALL_DEEPSPEED: false
-        INSTALL_FLASHATTN: false
-        INSTALL_LIGER_KERNEL: false
-        INSTALL_HQQ: false
+        INSTALL_BNB: "false"
+        INSTALL_VLLM: "false"
+        INSTALL_DEEPSPEED: "false"
+        INSTALL_FLASHATTN: "false"
+        INSTALL_LIGER_KERNEL: "false"
+        INSTALL_HQQ: "false"
         PIP_INDEX: https://pypi.org/simple
     container_name: llamafactory
     volumes:
@@ -24,7 +24,7 @@ services:
       - "8000:8000"
     ipc: host
     tty: true
-    shm_size: '16gb'
+    shm_size: "16gb"
     stdin_open: true
     command: bash
     devices:
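The quoting change above matters because Compose build args are passed to the builder as strings; an unquoted false is parsed as a YAML boolean and may be rejected or coerced by some Compose versions. Usage stays the same (a sketch):

cd docker/docker-rocm
docker compose up -d
docker compose exec llamafactory bash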
examples/accelerate/fsdp_config.yaml
@@ -14,7 +14,7 @@ fsdp_config:
   fsdp_use_orig_params: true
 machine_rank: 0
 main_training_function: main
-mixed_precision: fp16 # or bf16
+mixed_precision: bf16 # or fp16
 num_machines: 1 # the number of nodes
 num_processes: 2 # the number of GPUs in all nodes
 rdzv_backend: static
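With the default now bf16, a launch through Accelerate looks roughly like this (a sketch assuming the repo's src/train.py entry point; keep fp16 on GPUs without bfloat16 support):

accelerate launch --config_file examples/accelerate/fsdp_config.yaml \
    src/train.py examples/train_full/llama3_full_sft.yaml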
examples/deepspeed/ds_z0_config.json
@@ -19,7 +19,7 @@
     "stage": 0,
     "allgather_partitions": true,
     "allgather_bucket_size": 5e8,
-    "overlap_comm": true,
+    "overlap_comm": false,
     "reduce_scatter": true,
     "reduce_bucket_size": 5e8,
     "contiguous_gradients": true,
examples/deepspeed/ds_z2_config.json
@@ -19,7 +19,7 @@
     "stage": 2,
     "allgather_partitions": true,
     "allgather_bucket_size": 5e8,
-    "overlap_comm": true,
+    "overlap_comm": false,
     "reduce_scatter": true,
     "reduce_bucket_size": 5e8,
     "contiguous_gradients": true,
examples/deepspeed/ds_z2_offload_config.json
@@ -23,7 +23,7 @@
     },
     "allgather_partitions": true,
     "allgather_bucket_size": 5e8,
-    "overlap_comm": true,
+    "overlap_comm": false,
     "reduce_scatter": true,
     "reduce_bucket_size": 5e8,
     "contiguous_gradients": true,
examples/deepspeed/ds_z3_config.json
@@ -17,7 +17,7 @@
   },
   "zero_optimization": {
     "stage": 3,
-    "overlap_comm": true,
+    "overlap_comm": false,
     "contiguous_gradients": true,
     "sub_group_size": 1e9,
     "reduce_bucket_size": "auto",
examples/deepspeed/ds_z3_offload_config.json
@@ -25,7 +25,7 @@
       "device": "cpu",
       "pin_memory": true
     },
-    "overlap_comm": true,
+    "overlap_comm": false,
     "contiguous_gradients": true,
     "sub_group_size": 1e9,
     "reduce_bucket_size": "auto",
examples/train_full/llama3_full_sft_ds3.yaml → examples/train_full/llama3_full_sft.yaml (renamed)
 ### model
 model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
 trust_remote_code: true

 ### method
 stage: sft
 do_train: true
 finetuning_type: full
-deepspeed: examples/deepspeed/ds_z3_config.json
+deepspeed: examples/deepspeed/ds_z3_config.json # choices: [ds_z0_config.json, ds_z2_config.json, ds_z3_config.json]

 ### dataset
 dataset: identity,alpaca_en_demo
@@ -14,6 +15,7 @@ cutoff_len: 2048
 max_samples: 1000
 overwrite_cache: true
 preprocessing_num_workers: 16
+dataloader_num_workers: 4

 ### output
 output_dir: saves/llama3-8b/full/sft
@@ -21,6 +23,7 @@ logging_steps: 10
 save_steps: 500
 plot_loss: true
 overwrite_output_dir: true
+save_only_model: false

 ### train
 per_device_train_batch_size: 1
@@ -31,9 +34,11 @@ lr_scheduler_type: cosine
 warmup_ratio: 0.1
 bf16: true
 ddp_timeout: 180000000
+resume_from_checkpoint: null

 ### eval
-val_size: 0.1
-per_device_eval_batch_size: 1
-eval_strategy: steps
-eval_steps: 500
+# eval_dataset: alpaca_en_demo
+# val_size: 0.1
+# per_device_eval_batch_size: 1
+# eval_strategy: steps
+# eval_steps: 500
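With DeepSpeed ZeRO-3 configured, this recipe is launched through torchrun; a sketch (FORCE_TORCHRUN is the launcher switch the project's README uses for distributed runs):

FORCE_TORCHRUN=1 llamafactory-cli train examples/train_full/llama3_full_sft.yaml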
examples/train_full/qwen2vl_full_sft.yaml
@@ -10,7 +10,7 @@ do_train: true
 finetuning_type: full
 freeze_vision_tower: true # choices: [true, false]
 freeze_multi_modal_projector: true # choices: [true, false]
-train_mm_proj_only: false # choices: [true, false]
+freeze_language_model: false # choices: [true, false]
 deepspeed: examples/deepspeed/ds_z3_config.json # choices: [ds_z0_config.json, ds_z2_config.json, ds_z3_config.json]

 ### dataset
examples/train_lora/qwen2.5_lora_sft_ds3.yaml (deleted, 100644 → 0)
### model
model_name_or_path: /data/luopl/Qwen/Qwen2.5-72B

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: q_proj,v_proj
deepspeed: examples/deepspeed/ds_z3_config.json

### dataset
dataset: identity,alpaca_zh_demo,alpaca_en_demo
template: qwen
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 4

### output
output_dir: saves/qwen2.5_72b/lora/sft/
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 250
examples/train_lora/qwen2.5_lora_sft_offload_ds3.yaml (deleted, 100644 → 0)
### model
model_name_or_path: /data/luopl/Qwen/Qwen2.5-72B

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: q_proj,v_proj
deepspeed: examples/deepspeed/ds_z3_offload_config.json

### dataset
dataset: identity,alpaca_zh_demo,alpaca_en_demo
template: qwen
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 4

### output
output_dir: saves/qwen2.5_72b/lora/sft/
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 250