# export-lora

Apply LoRA adapters to a base model and export the resulting merged model.

```
usage: llama-export-lora [options]

options:
  -m,    --model                  model path from which to load base model (default '')
         --lora FNAME             path to LoRA adapter  (can be repeated to use multiple adapters)
         --lora-scaled FNAME S    path to LoRA adapter with user defined scaling S  (can be repeated to use multiple adapters)
  -t,    --threads N              number of threads to use during computation (default: 4)
  -o,    --output FNAME           output file (default: 'ggml-lora-merged-f16.gguf')
```

For example:

```bash
./bin/llama-export-lora \
    -m open-llama-3b-v2-q8_0.gguf \
    -o open-llama-3b-v2-q8_0-english2tokipona-chat.gguf \
    --lora lora-open-llama-3b-v2-q8_0-english2tokipona-chat-LATEST.gguf
```

Multiple LoRA adapters can be applied by passing multiple `--lora FNAME` or `--lora-scaled FNAME S` command-line parameters:

```bash
./bin/llama-export-lora \
    -m your_base_model.gguf \
    -o your_merged_model.gguf \
    --lora-scaled lora_task_A.gguf 0.5 \
    --lora-scaled lora_task_B.gguf 0.5
```
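
In the example above, the factor `S` scales each adapter's contribution, so both tasks are merged at half strength. The merged GGUF is then a self-contained model and needs no `--lora` flags at inference time. As a minimal sketch, assuming a `llama-cli` binary from the same build and the hypothetical file name from the previous example:

```bash
# The adapter weights are already baked into the exported file,
# so the merged model loads like any other GGUF.
./bin/llama-cli \
    -m your_merged_model.gguf \
    -p "Hello"
```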