<h1 align="center">
  <img width="auto" height="100px" src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/logo_coati.png"/>
  <br/>
  <span>ColossalChat</span>
</h1>


## Table of Contents

- [Table of Contents](#table-of-contents)
- [What is ColossalChat and Coati?](#what-is-colossalchat-and-coati)
- [Online demo](#online-demo)
- [Install](#install)
  - [Install the environment](#install-the-environment)
  - [Install Transformers](#install-transformers)
- [How to use?](#how-to-use)
  - [Supervised dataset collection](#supervised-dataset-collection)
  - [Stage1 - Supervised instruction tuning](#stage1---supervised-instruction-tuning)
  - [Stage2 - Training reward model](#stage2---training-reward-model)
  - [Stage3 - Training model with reinforcement learning from human feedback](#stage3---training-model-with-reinforcement-learning-from-human-feedback)
  - [Inference - After Training](#inference---after-training)
    - [8-bit setup](#8-bit-setup)
    - [4-bit setup](#4-bit-setup)
- [Coati7B examples](#coati7b-examples)
  - [Generation](#generation)
  - [Open QA](#open-qa)
  - [Limitations of LLaMA-finetuned models](#limitations-of-llama-finetuned-models)
  - [Limitations of the dataset](#limitations-of-the-dataset)
- [FAQ](#faq)
  - [How to save/load checkpoint](#how-to-saveload-checkpoint)
  - [How to train with limited resources](#how-to-train-with-limited-resources)
- [The Plan](#the-plan)
  - [Real-time progress](#real-time-progress)
- [Invitation to open-source contribution](#invitation-to-open-source-contribution)
- [Quick Preview](#quick-preview)
- [Authors](#authors)
- [Citations](#citations)
- [Licenses](#licenses)
---
## What is ColossalChat and Coati?

[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat) is a project to implement an LLM with RLHF, powered by the [Colossal-AI](https://github.com/hpcaitech/ColossalAI) project.

Coati stands for `ColossalAI Talking Intelligence`. It is the name of the module implemented in this project, and also the name of the large language model developed by the ColossalChat project.

The Coati package provides a unified large language model framework that implements the following functions:
- Supports comprehensive large-model training acceleration capabilities of Colossal-AI, without requiring knowledge of complex distributed training algorithms
- Supervised datasets collection
- Supervised instruction fine-tuning
- Training reward model
- Reinforcement learning with human feedback
- Quantization inference
- Fast model deployment
- Integrates seamlessly with the Hugging Face ecosystem, allowing a high degree of model customization
<div align="center">
  <p align="center">
    <img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/chatgpt.png" width=700/>
  </p>

   Image source: https://openai.com/blog/chatgpt
</div>

**As Colossal-AI is undergoing some major updates, this project will be actively maintained to stay in line with the Colossal-AI project.**


More details can be found in the latest news:
* [2023/03] [ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b)
* [2023/02] [Open Source Solution Replicates ChatGPT Training Process! Ready to go with only 1.6GB GPU Memory](https://www.hpc-ai.tech/blog/colossal-ai-chatgpt)

## Online demo
You can experience the performance of Coati7B on the demo page below.

[chat.colossalai.org](https://chat.colossalai.org/)

Due to resource constraints, we will only provide this service from 29 March 2023 to 5 April 2023. However, we have provided the inference code in the [inference](./inference/) folder. The WebUI will be open-sourced soon as well.

> Warning: Due to model and dataset size limitations, Coati is still a baby model. Coati7B may output incorrect information and lacks the ability for multi-turn dialogue. There is still significant room for improvement.

## Install

### Install the environment

```shell
conda create -n coati
conda activate coati
pip install .
```

### Install Transformers
Since Hugging Face has not officially supported the LLaMA models yet, we forked a branch of Transformers that is compatible with our code:

```shell
git clone https://github.com/hpcaitech/transformers
cd transformers
pip install .
```
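
As a quick sanity check that the fork is installed correctly, you can try importing the LLaMA classes this README relies on later (a minimal sketch):

```python
# Sanity check: the forked Transformers should expose the LLaMA classes.
from transformers import AutoTokenizer, LlamaForCausalLM

print("LLaMA support available:", LlamaForCausalLM.__name__)
```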

## How to use?

### Supervised dataset collection

We collected 104K bilingual (Chinese and English) instruction datasets. You can find them in this repo:
[InstructionWild](https://github.com/XueFuzhao/InstructionWild)

Here is how we collected the data:
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/data-collect.png" width=500/>
</p>

### Stage1 - Supervised instruction tuning

Stage1 is supervised instruction fine-tuning, which uses the datasets mentioned earlier to fine-tune the model.

You can run `examples/train_sft.sh` to start supervised instruction fine-tuning:

```shell
torchrun --standalone --nproc_per_node=4 train_sft.py \
    --pretrain "/path/to/LLaMa-7B/" \
    --model 'llama' \
    --strategy colossalai_zero2 \
    --log_interval 10 \
    --save_path  /path/to/Coati-7B \
    --dataset /path/to/data.json \
    --batch_size 4 \
    --accimulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1
```
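
The `--dataset` flag points to a JSON file of instruction data. As a purely hypothetical illustration (the exact schema is defined by this repo's dataset loader), an Alpaca-style record looks like:

```python
# Hypothetical Alpaca-style instruction record; the actual schema is
# defined by this repo's dataset loader, so treat this as a sketch only.
example = {
    "instruction": "Summarize the following paragraph.",
    "input": "ColossalChat implements a complete RLHF pipeline ...",
    "output": "ColossalChat provides SFT, reward model and PPO training ...",
}
```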

### Stage2 - Training reward model

Stage2 trains a reward model. Human annotators rank different outputs for the same prompt, and these rankings supervise the training of the reward model, which learns to assign corresponding scores.

You can run `examples/train_rm.sh` to start reward model training:

```shell
torchrun --standalone --nproc_per_node=4 train_reward_model.py \
    --pretrain "/path/to/LLaMa-7B/" \
    --model 'llama' \
    --strategy colossalai_zero2 \
    --loss_fn 'log_exp' \
    --save_path 'rmstatic.pt'
```
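
The `log_exp` loss corresponds to the standard pairwise ranking loss for reward models: the reward of the human-preferred answer should exceed that of the rejected one. A minimal sketch, assuming the reward model outputs one scalar reward per sequence:

```python
import torch
import torch.nn.functional as F

def log_exp_loss(chosen_reward: torch.Tensor, rejected_reward: torch.Tensor) -> torch.Tensor:
    # Equivalent to log(1 + exp(-(r_chosen - r_rejected))), averaged over the batch.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```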

### Stage3 - Training model with reinforcement learning from human feedback

Stage3 uses a reinforcement learning algorithm and is the most complex part of the training process:

<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/stage-3.jpeg" width=800/>
</p>

You can run `examples/train_prompts.sh` to start PPO training with human feedback:

```shell
torchrun --standalone --nproc_per_node=4 train_prompts.py \
         --pretrain "/path/to/LLaMa-7B/" \
         --model 'llama' \
         --strategy colossalai_zero2 \
         --prompt_path /path/to/your/prompt_dataset \
         --pretrain_dataset /path/to/your/pretrain_dataset \
         --rm_pretrain /your/pretrain/rm/definition \
         --rm_path /your/rm/model/path
```
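
For reference, the core of PPO is the clipped surrogate objective. The sketch below illustrates the idea with hypothetical names; it is not this repo's actual implementation:

```python
import torch

def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    # Probability ratio between the current policy and the rollout policy.
    ratio = (log_probs - old_log_probs).exp()
    # Clipped surrogate objective: take the pessimistic (minimum) estimate.
    surrogate = torch.min(ratio * advantages,
                          ratio.clamp(1 - clip_eps, 1 + clip_eps) * advantages)
    return -surrogate.mean()
```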

For more details, see [`examples/`](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/examples).

### Inference - After Training
#### 8-bit setup

8-bit quantization is natively supported by the latest [transformers](https://github.com/huggingface/transformers). Please install it from source.

Please ensure you have downloaded the HF-format model weights of the LLaMA models.

Usage:

```python
import torch
from transformers import LlamaForCausalLM

USE_8BIT = True  # use 8-bit quantization; otherwise, use fp16

model = LlamaForCausalLM.from_pretrained(
    "pretrained/path",
    load_in_8bit=USE_8BIT,
    torch_dtype=torch.float16,
    device_map="auto",
)
if not USE_8BIT:
    model.half()  # use fp16
model.eval()
```
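
A short generation sketch after loading (the prompt format here is illustrative, not the project's exact template):

```python
# Hypothetical generation example with the quantized model loaded above.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("pretrained/path")
prompt = "Instruction: Write a haiku about spring.\nResponse:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```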

**Troubleshooting**: if you get errors indicating your CUDA-related libraries are not found when loading the 8-bit model, you can check whether your `LD_LIBRARY_PATH` is correct.

E.g. you can set `export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH`.

#### 4-bit setup

Please ensure you have downloaded the HF-format model weights of LLaMA models first.

Then you can follow [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa). This library provides efficient CUDA kernels and weight conversion scripts.

After installing this library, you can convert the original HF-format LLaMA model weights to a 4-bit version:

```shell
CUDA_VISIBLE_DEVICES=0 python llama.py /path/to/pretrained/llama-7b c4 --wbits 4 --groupsize 128 --save llama7b-4bit.pt
```

Run this command in your cloned `GPTQ-for-LLaMa` directory, and you will get a 4-bit weight file, `llama7b-4bit.pt`.

**Troubleshooting**: if you get errors about `position_ids`, you can check out commit `50287c3b9ae4a3b66f6b5127c643ec39b769b155` of the `GPTQ-for-LLaMa` repo.

For more details, see [`inference/`](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/inference).

## Coati7B examples

### Generation

<details><summary><b>E-mail</b></summary>

![phd](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/Phd.png)
</details>

<details><summary><b>Coding</b></summary>

![sort](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/quick_sort.png)

</details>

<details><summary><b>Regex</b></summary>

![regex](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/regex.png)

</details>

<details><summary><b>TeX</b></summary>

![tex](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/tex.png)

</details>

<details><summary><b>Writing</b></summary>

![writing](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/writing.png)

</details>

<details><summary><b>Table</b></summary>

![Table](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/table.png)
</details>

### Open QA
<details><summary><b>Game</b></summary>

![Game](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/game.png)

</details>

<details><summary><b>Travel</b></summary>

![Travel](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/travel.png)

</details>

<details><summary><b>Physics</b></summary>

![Physical](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/physical.png)

</details>

<details><summary><b>Chemistry</b></summary>

![Chemical](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/chemical.png)

</details>

<details><summary><b>Economy</b></summary>

![Economy](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/economy.png)

</details>

You can find more examples in this [repo](https://github.com/XueFuzhao/InstructionWild/blob/main/comparison.md).

### Limitations of LLaMA-finetuned models
- Both Alpaca and ColossalChat are based on LLaMA. It is hard to compensate for the missing knowledge in the pre-training stage.
- Lack of counting ability: cannot count the number of items in a list.
- Lack of logical ability (reasoning and calculation).
- Tends to repeat the last sentence (fails to produce the end token).
- Poor multilingual results: LLaMA is mainly trained on English datasets (generation performs better than QA).

### Limitations of the dataset
- Lack of summarization ability: no such instructions in the fine-tuning datasets.
- Lack of multi-turn chat: no such instructions in the fine-tuning datasets.
- Lack of self-recognition: no such instructions in the fine-tuning datasets.
- Lack of safety:
  - When the input contains fake facts, the model makes up false facts and explanations.
  - Cannot abide by OpenAI's policy: because the prompts were generated via the OpenAI API, they always abide by its policy, so no violation cases are present in the datasets.

## FAQ

### How to save/load checkpoint

We have integrated the Transformers save and load pipeline, allowing users to freely call Hugging Face's language models and save them in the HF format.

```python
from coati.models.llama import LlamaLM
from coati.trainer import SFTTrainer
from transformers import AutoTokenizer

model = LlamaLM(pretrained=args.pretrain)
tokenizer = AutoTokenizer.from_pretrained(args.pretrain)

trainer = SFTTrainer(model=model,
    strategy=strategy,
    optim=optim,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    batch_size=args.batch_size,
    max_epochs=args.max_epochs,
    accimulation_steps=args.accimulation_steps
)

trainer.fit()
trainer.save_model(path=args.save_path, only_rank0=True, tokenizer=tokenizer)
```
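
Since checkpoints are saved in the HF format, they can be loaded back with the standard Transformers API; a minimal sketch with a placeholder path:

```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Load the SFT checkpoint saved by trainer.save_model above.
model = LlamaForCausalLM.from_pretrained("/path/to/Coati-7B")
tokenizer = AutoTokenizer.from_pretrained("/path/to/Coati-7B")
```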

### How to train with limited resources

Here are some examples that allow you to train a 7B model on a single consumer-grade GPU or on multiple GPUs.

If you only have a single 24 GB GPU, you can use the following script. `batch_size` and `lora_rank` are the most important parameters for training the model successfully.
```shell
torchrun --standalone --nproc_per_node=1 train_sft.py \
    --pretrain "/path/to/LLaMa-7B/" \
    --model 'llama' \
    --strategy naive \
    --log_interval 10 \
    --save_path  /path/to/Coati-7B \
    --dataset /path/to/data.json \
    --batch_size 1 \
    --accimulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1 \
    --lora_rank 16
```
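
For reference, the effective batch size here is `batch_size` × `accimulation_steps` × number of GPUs, i.e. 1 × 8 × 1 = 8 samples per optimizer step; raising `accimulation_steps` increases the effective batch size without increasing per-step GPU memory.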

The `colossalai_gemini` strategy enables a single 24 GB GPU to train the whole model without LoRA, provided you have sufficient CPU memory. You can use the following script:
```shell
torchrun --standalone --nproc_per_node=1 train_sft.py \
    --pretrain "/path/to/LLaMa-7B/" \
    --model 'llama' \
    --strategy colossalai_gemini \
    --log_interval 10 \
    --save_path  /path/to/Coati-7B \
    --dataset /path/to/data.json \
    --batch_size 1 \
    --accimulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1
```

If you have 4x32 GB GPUs, you can even train the whole 7B model using our `colossalai_zero2_cpu` strategy! The script is as follows:
```shell
torchrun --standalone --nproc_per_node=4 train_sft.py \
    --pretrain "/path/to/LLaMa-7B/" \
    --model 'llama' \
    --strategy colossalai_zero2_cpu \
    --log_interval 10 \
    --save_path  /path/to/Coati-7B \
    --dataset /path/to/data.json \
    --batch_size 1 \
    --accimulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1
```

## The Plan

- [x] implement PPO fine-tuning
- [x] implement training reward model
- [x] support LoRA
- [x] support inference
- [x] support LLaMA from [Facebook Research](https://github.com/facebookresearch/llama)
- [x] implement PPO-ptx fine-tuning
- [ ] integrate with Ray
- [ ] support more RL paradigms, like Implicit Language Q-Learning (ILQL)
- [ ] support chain-of-thought via [LangChain](https://github.com/hwchase17/langchain)

### Real-time progress
You can follow our progress on the GitHub project board:

[Coati](https://github.com/orgs/hpcaitech/projects/17/views/1)

## Invitation to open-source contribution
Referring to the successful attempts of [BLOOM](https://bigscience.huggingface.co/) and [Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion), any and all developers and partners with computing power, datasets, or models are welcome to join and build the Colossal-AI community, making efforts towards the era of big AI models from the starting point of replicating ChatGPT!

You may contact us or participate in the following ways:
1. [Leave a Star ⭐](https://github.com/hpcaitech/ColossalAI/stargazers) to show your support. Thanks!
2. Post an [issue](https://github.com/hpcaitech/ColossalAI/issues/new/choose) or submit a PR on GitHub, following the guidelines in [Contributing](https://github.com/hpcaitech/ColossalAI/blob/main/CONTRIBUTING.md).
3. Join the Colossal-AI community on [Slack](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w) and [WeChat (微信)](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/WeChat.png "qrcode") to share your ideas.
4. Send your official proposal by email to contact@hpcaitech.com.

Thanks so much to all of our amazing contributors!

## Quick Preview
<p id="ChatGPT_scaling" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT%20scaling.png" width=800/>
</p>

- Up to 7.73 times faster single-server training and 1.42 times faster single-GPU inference

<p id="ChatGPT-1GPU" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT-1GPU.jpg" width=450/>
</p>

- Up to 10.3x growth in model capacity on one GPU
- A mini demo training process requires only 1.62 GB of GPU memory (any consumer-grade GPU)

<p id="inference" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/LoRA%20data.jpg" width=600/>
</p>

- Increases the capacity of the fine-tuned model by up to 3.7 times on a single GPU
- While maintaining a sufficiently high running speed

## Authors

Coati is developed by the ColossalAI Team:
- [Fazzie](https://fazzie-key.cool/about/index.html)
- [FrankLeeeee](https://github.com/FrankLeeeee)
- [BlueRum](https://github.com/ht-zhou)
- [ver217](https://github.com/ver217)
- [ofey404](https://github.com/ofey404)

The PhD students from the [HPC-AI Lab](https://ai.comp.nus.edu.sg/) also contributed a lot to this project.
- [Zangwei Zheng](https://github.com/zhengzangw)
- [Xue Fuzhao](https://github.com/XueFuzhao)

## Citations

```bibtex
@article{Hu2021LoRALA,
    title   = {LoRA: Low-Rank Adaptation of Large Language Models},
    author  = {Edward J. Hu and Yelong Shen and Phillip Wallis and Zeyuan Allen-Zhu and Yuanzhi Li and Shean Wang and Weizhu Chen},
    journal = {ArXiv},
    year    = {2021},
    volume  = {abs/2106.09685}
}

@article{ouyang2022training,
  title={Training language models to follow instructions with human feedback},
  author={Ouyang, Long and Wu, Jeff and Jiang, Xu and Almeida, Diogo and Wainwright, Carroll L and Mishkin, Pamela and Zhang, Chong and Agarwal, Sandhini and Slama, Katarina and Ray, Alex and others},
  journal={arXiv preprint arXiv:2203.02155},
  year={2022}
}

@article{touvron2023llama,
  title={LLaMA: Open and Efficient Foundation Language Models},
  author={Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and Rodriguez, Aurelien and Joulin, Armand and Grave, Edouard and Lample, Guillaume},
  journal={arXiv preprint arXiv:2302.13971},
  year={2023}
}

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}

@misc{instructionwild,
  author = {Fuzhao Xue and Zangwei Zheng and Yang You},
  title = {Instruction in the Wild: A User-based Instruction Dataset},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/XueFuzhao/InstructionWild}},
}
```

## Licenses

Coati is licensed under the [Apache 2.0 License](LICENSE).