# Yuan2.0 Supervised Finetuning

## Introduction

This document provides instructions for supervised finetuning (SFT) of Yuan2.0.


## Usage

An example script to run Yuan-102B SFT is:

```shell
bash examples/pretrain_yuan2.0_102B_sft.sh
```

### Arguments setting

Before running the script, the relevant arguments should be set correctly.

First, make any desired modifications, including setting the environment variables `CHECKPOINT_PATH`, `DATA_PATH`, `TOKENIZER_MODEL_PATH` and `TENSORBOARD_PATH`.
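
As a minimal sketch, the four variables could be assigned as below. Every path is a hypothetical placeholder, and it is an assumption that the script picks the values up as ordinary shell variables. The `${VAR:?message}` expansions abort with an error if a variable is empty or unset, which catches a forgotten path before training starts.

```shell
# Hypothetical placeholder paths; replace them with your own locations.
CHECKPOINT_PATH=/path/to/checkpoint
DATA_PATH='1 /path/dataset'
TOKENIZER_MODEL_PATH=/path/to/tokenizer
TENSORBOARD_PATH=/path/to/tensorboard

# Abort early if any of the variables is empty or unset.
: "${CHECKPOINT_PATH:?CHECKPOINT_PATH must be set}"
: "${DATA_PATH:?DATA_PATH must be set}"
: "${TOKENIZER_MODEL_PATH:?TOKENIZER_MODEL_PATH must be set}"
: "${TENSORBOARD_PATH:?TENSORBOARD_PATH must be set}"
```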

- `--train-reset` allows you to begin training from iteration 0.
- `--sft-stage` is highly recommended, since it controls how the loss mask is calculated during SFT.
- `--override-opt-param-scheduler` lets you set your own scheduler instead of reusing the values stored in the checkpoint.
- `--finetune` loads the model for finetuning: the optimizer and RNG state are not loaded from the checkpoint, and the iteration count is set to 0. It is assumed when loading a release checkpoint.
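
The sketch below shows one way the four options above might be collected before being passed to the training command. It assumes that each option is a boolean switch taking no value, and `SFT_ARGS` is a hypothetical variable name used only for illustration; check the example script for how it actually groups its arguments.

```shell
# Assumption: all four options are boolean switches that take no value.
# SFT_ARGS is a hypothetical name, not necessarily the one used in the script.
SFT_ARGS="--finetune --train-reset --sft-stage --override-opt-param-scheduler"

# Print one flag per line to confirm what would be passed on.
# $SFT_ARGS is left unquoted on purpose so the shell splits it into words.
for flag in $SFT_ARGS; do
    echo "$flag"
done
```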

If the dataset path is:

```
/path/dataset.bin
```

then `DATA_PATH` can be set as:

```shell
DATA_PATH='1 /path/dataset'
```
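
In other words, the value is the dataset file name without its `.bin` extension, preceded by `1`. Reading that leading `1` as a dataset weight, as in upstream Megatron-LM's weighted data paths, is an assumption here. A minimal sketch of deriving the value from the file name, with `DATASET_BIN` and `DATASET_PREFIX` as hypothetical helper variables:

```shell
# Hypothetical helper variables for illustration.
DATASET_BIN=/path/dataset.bin

# Strip the trailing ".bin" to obtain the dataset prefix.
DATASET_PREFIX="${DATASET_BIN%.bin}"

# Prepend the leading 1 (assumed to be the dataset weight).
DATA_PATH="1 ${DATASET_PREFIX}"

echo "$DATA_PATH"   # prints: 1 /path/dataset
```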

For dataset preprocessing, please refer to the [documentation]().

Further command line arguments are described in the source file [`arguments.py`](./megatron/arguments) and in [README.md](https://github.com/NVIDIA/Megatron-LM/blob/main/README.md).