<h1 align="center">
  <img width="auto" height="100px" src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/logo_coati.png"/>
  <br/>
  <span>ColossalChat</span>
</h1>


## Table of Contents

- [Table of Contents](#table-of-contents)
- [What is ColossalChat and Coati ?](#what-is-colossalchat-and-coati-)
- [Online demo](#online-demo)
- [Install](#install)
  - [Install the environment](#install-the-environment)
  - [Install the Transformers](#install-the-transformers)
- [How to use?](#how-to-use)
  - [Supervised datasets collection](#supervised-datasets-collection)
  - [RLHF Training Stage1 - Supervised instructs tuning](#RLHF-training-stage1---supervised-instructs-tuning)
  - [RLHF Training Stage2 - Training reward model](#RLHF-training-stage2---training-reward-model)
  - [RLHF Training Stage3 - Training model with reinforcement learning by human feedback](#RLHF-training-stage3---training-model-with-reinforcement-learning-by-human-feedback)
  - [Inference Quantization and Serving - After Training](#inference-quantization-and-serving---after-training)
- [Coati7B examples](#coati7b-examples)
  - [Generation](#generation)
  - [Open QA](#open-qa)
  - [Limitation for LLaMA-finetuned models](#limitation)
  - [Limitation of dataset](#limitation)
- [FAQ](#faq)
  - [How to save/load checkpoint](#faq)
  - [How to train with limited resources](#faq)
- [The Plan](#the-plan)
  - [Real-time progress](#real-time-progress)
- [Invitation to open-source contribution](#invitation-to-open-source-contribution)
- [Quick Preview](#quick-preview)
- [Authors](#authors)
- [Citations](#citations)
- [Licenses](#licenses)
---
## What is ColossalChat and Coati ?

[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat) is a project that implements large language models (LLMs) with reinforcement learning from human feedback (RLHF), powered by the [Colossal-AI](https://github.com/hpcaitech/ColossalAI) project.

Coati stands for `ColossalAI Talking Intelligence`. It is the name for the module implemented in this project and is also the name of the large language model developed by the ColossalChat project.

The Coati package provides a unified large language model framework with the following features:

- Supports ColossalAI's full set of large-model training acceleration capabilities, without requiring knowledge of complex distributed training algorithms
- Supervised dataset collection
- Supervised instruction fine-tuning
- Reward model training
- Reinforcement learning with human feedback
- Quantized inference
- Fast model deployment
- Seamless integration with the Hugging Face ecosystem and a high degree of model customization

<div align="center">
  <p align="center">
    <img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/chatgpt.png" width=700/>
  </p>

   Image source: https://openai.com/blog/chatgpt
</div>

**As Colossal-AI is undergoing some major updates, this project will be actively maintained to stay in line with the Colossal-AI project.**


More details can be found in the latest news.
* [2023/03] [ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b)
* [2023/02] [Open Source Solution Replicates ChatGPT Training Process! Ready to go with only 1.6GB GPU Memory](https://www.hpc-ai.tech/blog/colossal-ai-chatgpt)

## Online demo
<div align="center">
   <a href="https://www.youtube.com/watch?v=HcTiHzApHm0">
   <img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/ColossalChat%20YouTube.png" width="700" />
   </a>
</div>

[ColossalChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat): An open-source solution for cloning [ChatGPT](https://openai.com/blog/chatgpt/) with a complete RLHF pipeline.
[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat)
[[blog]](https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b)
[[demo]](https://www.youtube.com/watch?v=HcTiHzApHm0)
[[tutorial]](https://www.youtube.com/watch?v=-qFBZFmOJfg)

<p id="ColossalChat-Speed" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/ColossalChat%20Speed.jpg" width=450/>
</p>

> DeepSpeedChat performance figures come from its blog post of April 12, 2023. ColossalChat performance can be reproduced on an AWS p4d.24xlarge node with 8 A100-40G GPUs using the following command: `torchrun --standalone --nproc_per_node 8 benchmark_opt_lora_dummy.py --num_collect_steps 1 --use_kernels --strategy colossalai_zero2 --experience_batch_size 64 --train_batch_size 32`

## Install

### Install the environment

```shell
conda create -n coati
conda activate coati
git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI/applications/Chat
pip install .
```

### Install the Transformers
Because Hugging Face has not yet officially supported the LLaMA models, we forked a branch of Transformers that is compatible with our code.

```shell
git clone https://github.com/hpcaitech/transformers
cd transformers
pip install .
```

## How to use?

### Supervised datasets collection

We collected a bilingual dataset of 104K Chinese and English samples. You can find it in the [InstructionWild](https://github.com/XueFuzhao/InstructionWild) repository.

Here is how we collected the data:

<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/data-collect.png" width=500/>
</p>

### RLHF Training Stage1 - Supervised instructs tuning

Stage1 is supervised instruction fine-tuning, which fine-tunes the model on the datasets mentioned earlier.

You can run `examples/train_sft.sh` to start supervised instruction fine-tuning.
[[Stage1 tutorial video]](https://www.youtube.com/watch?v=-qFBZFmOJfg)

### RLHF Training Stage2 - Training reward model

Stage2 trains a reward model. Human annotators rank different outputs for the same prompt, and the scores derived from these rankings supervise the training of the reward model.

You can run `examples/train_rm.sh` to start reward model training.
[[Stage2 tutorial video]](https://www.youtube.com/watch?v=gMx2CApKhuo)
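
To make the ranking-based supervision more concrete, here is a minimal sketch of the pairwise ranking loss described in the InstructGPT paper cited below. It is an illustration only, not necessarily the loss implemented in `examples/train_rm.sh`; the function names `ranking_to_pairs` and `pairwise_ranking_loss` are hypothetical.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_outputs):
    """Turn a best-to-worst ranking into (better, worse) comparison pairs."""
    # combinations() keeps the input order, so the first element of each
    # pair is always the one ranked higher by the annotator.
    return list(combinations(ranked_outputs, 2))


def pairwise_ranking_loss(chosen_score, rejected_score):
    """Compute -log(sigmoid(chosen_score - rejected_score)) in a stable way."""
    margin = chosen_score - rejected_score
    # log(1 + exp(-margin)), rewritten so exp() never overflows
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


pairs = ranking_to_pairs(["answer_a", "answer_b", "answer_c"])
# The loss is small when the reward model scores the preferred answer higher
loss_when_correct = pairwise_ranking_loss(chosen_score=2.0, rejected_score=0.0)
loss_when_wrong = pairwise_ranking_loss(chosen_score=0.0, rejected_score=2.0)
```

Minimizing this loss over all pairs pushes the reward model to assign higher scores to outputs that annotators ranked higher.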

### RLHF Training Stage3 - Training model with reinforcement learning by human feedback

Stage3 uses a reinforcement learning algorithm and is the most complex part of the training process:

<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/stage-3.jpeg" width=800/>
</p>

You can run `examples/train_prompts.sh` to start PPO training with human feedback.
[[Stage3 tutorial video]](https://www.youtube.com/watch?v=Z8wwSHxPL9g)
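
As a rough illustration of what PPO optimizes in this stage, the sketch below shows the standard clipped surrogate objective for a single token, together with a KL-penalized reward of the kind used in the InstructGPT paper cited below. The function names, `clip_eps=0.2` and `kl_coef=0.1` are assumptions for illustration and are not taken from the Coati code.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantage, clip_eps=0.2):
    """Clipped PPO surrogate for one token (to be maximized)."""
    # Probability ratio between the updated policy and the one that
    # generated the experience.
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    # Taking the minimum removes the incentive to move the ratio
    # outside [1 - clip_eps, 1 + clip_eps].
    return min(ratio * advantage, clipped_ratio * advantage)


def kl_penalized_reward(reward_model_score, logp_actor, logp_ref, kl_coef=0.1):
    """Reward-model score minus a penalty for drifting from the reference model."""
    return reward_model_score - kl_coef * (logp_actor - logp_ref)


# The ratio is e (about 2.72) here, so the objective is capped at 1.2 * advantage
capped = ppo_clipped_objective(logp_new=1.0, logp_old=0.0, advantage=1.0)
```

The clipping keeps each policy update close to the policy that collected the experience, which is one reason PPO is a common choice for RLHF.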

For more details, see [`examples/`](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/examples).

### Inference Quantization and Serving - After Training

We provide an online inference server and a benchmark. We aim to run inference on a single GPU, so quantization is essential when using large models.

We support 8-bit quantization (RTN), 4-bit quantization (GPTQ), and FP16 inference.
Online inference server scripts can help you deploy your own services.
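
For readers unfamiliar with round-to-nearest (RTN) quantization, here is a minimal sketch of the idea using symmetric per-tensor int8 quantization on a plain Python list. The per-tensor scaling and the function names are assumptions made for illustration; the scripts in `inference/` may quantize differently (for example per channel or per group).

```python
def quantize_rtn_int8(weights):
    """Symmetric round-to-nearest quantization of a non-empty list of floats."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round each weight to the nearest integer step and clamp to the int8 range
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate floating-point weights."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_rtn_int8(weights)
restored = dequantize(quantized, scale)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the price paid for storing one byte per weight instead of two.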

For more details, see [`inference/`](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat/inference).

## Coati7B examples

### Generation

<details><summary><b>E-mail</b></summary>

![phd](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/Phd.png)
</details>

<details><summary><b>coding</b></summary>

![sort](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/quick_sort.png)

</details>

<details><summary><b>regex</b></summary>

![regex](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/regex.png)

</details>

<details><summary><b>Tex</b></summary>

![tex](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/tex.png)

</details>

<details><summary><b>writing</b></summary>

![writing](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/writing.png)

</details>

<details><summary><b>Table</b></summary>

![Table](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/table.png)

</details>

### Open QA
<details><summary><b>Game</b></summary>

![Game](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/game.png)

</details>

<details><summary><b>Travel</b></summary>

![Travel](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/travel.png)

</details>

<details><summary><b>Physical</b></summary>

![Physical](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/physical.png)

</details>

<details><summary><b>Chemical</b></summary>

![Chemical](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/chemical.png)

</details>

<details><summary><b>Economy</b></summary>

![Economy](https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chat/economy.png)

</details>

You can find more examples in this [repo](https://github.com/XueFuzhao/InstructionWild/blob/main/comparison.md).

### Limitation
<details><summary><b>Limitation for LLaMA-finetuned models</b></summary>

- Both Alpaca and ColossalChat are based on LLaMA. It is hard to compensate for knowledge missing from the pre-training stage.
- Lack of counting ability: cannot count the number of items in a list.
- Lack of logic (reasoning and calculation).
- Tends to repeat the last sentence (fails to produce the end token).
- Poor multilingual results: LLaMA is mainly trained on English datasets (generation performs better than QA).

</details>

<details><summary><b>Limitation of dataset</b></summary>

- Lack of summarization ability: no such instructions in the fine-tuning datasets.
- Lack of multi-turn chat: no such instructions in the fine-tuning datasets.
- Lack of self-recognition: no such instructions in the fine-tuning datasets.
- Lack of safety:
  - When the input contains fake facts, the model makes up false facts and explanations.
  - Cannot abide by OpenAI's policy: the prompts were generated through the OpenAI API, which always abides by its policy, so the datasets contain no violation cases.

</details>

## FAQ

<details><summary><b>How to save/load checkpoint</b></summary>

We have integrated the Transformers save and load pipeline, allowing users to freely call Hugging Face's language models and save them in the HF format.

```python
from transformers import AutoTokenizer

from coati.models.llama import LlamaLM
from coati.trainer import SFTTrainer

# `args`, `strategy`, `optim` and the dataloaders are assumed to be
# created earlier in the training script (see `train_sft.py`).
model = LlamaLM(pretrained=args.pretrain)
tokenizer = AutoTokenizer.from_pretrained(args.pretrain)

# Let the strategy wrap the model and optimizer before training
(model, optim) = strategy.prepare((model, optim))
trainer = SFTTrainer(model=model,
                     strategy=strategy,
                     optim=optim,
                     train_dataloader=train_dataloader,
                     eval_dataloader=eval_dataloader,
                     batch_size=args.batch_size,
                     max_epochs=args.max_epochs,
                     accumulation_steps=args.accumulation_steps)

trainer.fit()

# This saves in PyTorch format
strategy.save_model(model, args.save_path, only_rank0=True)

# This saves in HF format. The ColossalAI strategy with stage-3 doesn't support this method
strategy.save_pretrained(model, args.save_path, only_rank0=True, tokenizer=tokenizer)
```

</details>

<details><summary><b>How to train with limited resources</b></summary>

Here are some examples that can allow you to train a 7B model on a single or multiple consumer-grade GPUs.

If you only have a single 24G GPU, you can use the following script. `batch_size`, `lora_rank` and `grad_checkpoint` are the most important parameters to successfully train the model.
```shell
torchrun --standalone --nproc_per_node=1 train_sft.py \
    --pretrain "/path/to/LLaMa-7B/" \
    --model 'llama' \
    --strategy ddp \
    --log_interval 10 \
    --save_path /path/to/Coati-7B \
    --dataset /path/to/data.json \
    --batch_size 1 \
    --accumulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1 \
    --lora_rank 16 \
    --grad_checkpoint
```
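
To give a feel for why `lora_rank`, `batch_size` and `accumulation_steps` matter, the sketch below works through the arithmetic for the values in the script above. It assumes a square 4096 x 4096 projection matrix (the hidden size of LLaMA-7B) and the LoRA parameterization from the paper cited below; the helper names are hypothetical and which layers Coati actually adapts is not covered here.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def effective_batch_size(batch_size, accumulation_steps, num_gpus=1):
    """Samples contributing to each optimizer step."""
    return batch_size * accumulation_steps * num_gpus


hidden = 4096  # assumed hidden size of LLaMA-7B
full_params = hidden * hidden                            # 16,777,216 per matrix
lora_params = lora_trainable_params(hidden, hidden, 16)  # 131,072 per matrix
reduction = full_params // lora_params                   # 128x fewer trainable weights

# --batch_size 1 with --accumulation_steps 8 on one GPU
samples_per_step = effective_batch_size(1, 8, num_gpus=1)  # 8
```

With rank 16, each adapted matrix trains 128 times fewer parameters than full fine-tuning, while gradient accumulation recovers an effective batch size of 8 even though only one sample is held in memory at a time.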

The `colossalai_gemini` strategy enables a single 24G GPU to train the whole model without LoRA, provided you have sufficient CPU memory. You can use the following script.
```shell
torchrun --standalone --nproc_per_node=1 train_sft.py \
    --pretrain "/path/to/LLaMa-7B/" \
    --model 'llama' \
    --strategy colossalai_gemini \
    --log_interval 10 \
    --save_path /path/to/Coati-7B \
    --dataset /path/to/data.json \
    --batch_size 1 \
    --accumulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1 \
    --grad_checkpoint
```

If you have 4x32 GB GPUs, you can even train the whole 7B model using our `colossalai_zero2_cpu` strategy! The script is given as follows.
```shell
torchrun --standalone --nproc_per_node=4 train_sft.py \
    --pretrain "/path/to/LLaMa-7B/" \
    --model 'llama' \
    --strategy colossalai_zero2_cpu \
    --log_interval 10 \
    --save_path /path/to/Coati-7B \
    --dataset /path/to/data.json \
    --batch_size 1 \
    --accumulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1 \
    --grad_checkpoint
```
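
As a back-of-the-envelope illustration of why the full-model scripts above rely on CPU memory, the sketch below estimates training-state memory under the common mixed-precision Adam accounting: 2 bytes of fp16 weights, 2 bytes of fp16 gradients and 12 bytes of fp32 optimizer state per parameter, with ZeRO stage 2 splitting gradients and optimizer state across GPUs. The byte counts, the round figure of 7e9 parameters and the helper names are assumptions for illustration, not measurements of Coati; activations and temporary buffers are ignored.

```python
def mixed_precision_adam_bytes(num_params):
    """Return (fp16 weights, fp16 gradients, fp32 optimizer state) in bytes."""
    fp16_weights = 2 * num_params
    fp16_grads = 2 * num_params
    # fp32 master weights, momentum and variance: 4 bytes each
    fp32_optimizer_state = 12 * num_params
    return fp16_weights, fp16_grads, fp32_optimizer_state


def zero2_per_gpu_bytes(num_params, num_gpus):
    """Per-GPU bytes when gradients and optimizer state are partitioned (ZeRO-2)."""
    weights, grads, optimizer_state = mixed_precision_adam_bytes(num_params)
    return weights + (grads + optimizer_state) / num_gpus


num_params = 7_000_000_000  # round figure for a 7B model
total_gb = sum(mixed_precision_adam_bytes(num_params)) / 1e9  # 112.0
per_gpu_gb = zero2_per_gpu_bytes(num_params, num_gpus=4) / 1e9  # 38.5
```

Under these assumptions the full training state is about 112 GB, far beyond a single 24G GPU, and even ZeRO-2 on four GPUs needs about 38.5 GB per GPU, which is more than 32 GB. That is consistent with offloading part of the state to CPU memory in both cases.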
</details>


## The Plan

- [x] implement PPO fine-tuning
- [x] implement training reward model
- [x] support LoRA
- [x] support inference
- [x] support llama from [facebook](https://github.com/facebookresearch/llama)
- [x] implement PPO-ptx fine-tuning
- [ ] integrate with Ray
- [ ] support more RL paradigms, like Implicit Language Q-Learning (ILQL)
- [ ] support chain-of-thought by [langchain](https://github.com/hwchase17/langchain)

### Real-time progress
You can follow our progress on the GitHub project board:

[Coati](https://github.com/orgs/hpcaitech/projects/17/views/1)

## Invitation to open-source contribution
Following the successful efforts of [BLOOM](https://bigscience.huggingface.co/) and [Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion), we welcome all developers and partners with computing power, datasets, or models to join and build the Colossal-AI community, working towards the era of big AI models, starting with replicating ChatGPT!

You may contact us or participate in the following ways:
1. Leave a [Star ⭐](https://github.com/hpcaitech/ColossalAI/stargazers) to show your support. Thanks!
2. Post an [issue](https://github.com/hpcaitech/ColossalAI/issues/new/choose) or submit a PR on GitHub, following the guidelines in [Contributing](https://github.com/hpcaitech/ColossalAI/blob/main/CONTRIBUTING.md).
3. Join the Colossal-AI community on
[Slack](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w),
and [WeChat(微信)](https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/WeChat.png "qrcode") to share your ideas.
4. Send your official proposal by email to contact@hpcaitech.com.

Thanks so much to all of our amazing contributors!

## Quick Preview
<div align="center">
   <a href="https://chat.colossalai.org/">
   <img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/Chat-demo.png" width="700" />
   </a>
</div>

- An open-source, low-cost solution for cloning [ChatGPT](https://openai.com/blog/chatgpt/) with a complete RLHF pipeline. [[demo]](https://chat.colossalai.org)

<p id="ChatGPT_scaling" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT%20scaling.png" width=800/>
</p>

- Up to 7.73 times faster for single server training and 1.42 times faster for single-GPU inference

<p id="ChatGPT-1GPU" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/ChatGPT-1GPU.jpg" width=450/>
</p>

- Up to 10.3x growth in model capacity on one GPU
- A mini demo training process requires only 1.62GB of GPU memory (any consumer-grade GPU)

<p id="inference" align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/applications/chatgpt/LoRA%20data.jpg" width=600/>
</p>

- Increases the capacity of the fine-tuning model by up to 3.7 times on a single GPU
- Maintains a sufficiently high running speed

|  Model Pair   | Alpaca-7B ⚔ Coati-7B | Coati-7B ⚔ Alpaca-7B |
| :-----------: | :------------------: | :------------------: |
| Better Cases  |     38 ⚔ **41**      |     **45** ⚔ 33      |
|   Win Rate    |    48% ⚔ **52%**     |    **58%** ⚔ 42%     |
| Average Score |   7.06 ⚔ **7.13**    |   **7.31** ⚔ 6.82    |
- Our Coati-7B model performs better than Alpaca-7B when GPT-4 is used to evaluate model performance. The Coati-7B model evaluated here is an older version that we trained a few weeks ago, and the new version is around the corner.
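
The win rates in the table are consistent with dividing each model's "better cases" by the total number of decided cases in that comparison. The short sketch below reproduces the arithmetic; the helper name `win_rate` is hypothetical, and the assumption that ties are excluded is inferred from the numbers rather than stated by the authors.

```python
def win_rate(wins, losses):
    """Share of decided comparisons won by a model."""
    return wins / (wins + losses)


# Alpaca-7B vs Coati-7B ordering: 38 vs 41 better cases
coati_first_table = round(100 * win_rate(41, 38))   # 52
# Coati-7B vs Alpaca-7B ordering: 45 vs 33 better cases
coati_second_table = round(100 * win_rate(45, 33))  # 58
```

Both values match the 52% and 58% reported for Coati-7B in the table.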

## Authors

Coati is developed by the ColossalAI Team:
- [Fazzie](https://fazzie-key.cool/about/index.html)
- [FrankLeeeee](https://github.com/FrankLeeeee)
- [BlueRum](https://github.com/ht-zhou)
- [ver217](https://github.com/ver217)
- [ofey404](https://github.com/ofey404)

PhD students from the [HPC-AI Lab](https://ai.comp.nus.edu.sg/) also contributed a lot to this project:
- [Zangwei Zheng](https://github.com/zhengzangw)
- [Xue Fuzhao](https://github.com/XueFuzhao)

## Citations

```bibtex
@article{Hu2021LoRALA,
    title   = {LoRA: Low-Rank Adaptation of Large Language Models},
    author  = {Edward J. Hu and Yelong Shen and Phillip Wallis and Zeyuan Allen-Zhu and Yuanzhi Li and Shean Wang and Weizhu Chen},
    journal = {ArXiv},
    year    = {2021},
    volume  = {abs/2106.09685}
}

@article{ouyang2022training,
  title={Training language models to follow instructions with human feedback},
  author={Ouyang, Long and Wu, Jeff and Jiang, Xu and Almeida, Diogo and Wainwright, Carroll L and Mishkin, Pamela and Zhang, Chong and Agarwal, Sandhini and Slama, Katarina and Ray, Alex and others},
  journal={arXiv preprint arXiv:2203.02155},
  year={2022}
}

@article{touvron2023llama,
  title={LLaMA: Open and Efficient Foundation Language Models},
  author={Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and Rodriguez, Aurelien and Joulin, Armand and Grave, Edouard and Lample, Guillaume},
  journal={arXiv preprint arXiv:2302.13971},
  year={2023}
}

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}

@misc{instructionwild,
  author = {Fuzhao Xue and Zangwei Zheng and Yang You },
  title = {Instruction in the Wild: A User-based Instruction Dataset},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/XueFuzhao/InstructionWild}},
}
```

## Licenses

Coati is licensed under the [Apache 2.0 License](LICENSE).