Commits · efba0f44b99f595731598d29792f5f229c7c0696 · OpenDAS / ColossalAI

29 Aug, 2023 1 commit

[coati] add chatglm model (#4539) · 1467e3b4

yingliu-hpc authored Aug 29, 2023

* update configuration of chatglm and add support in coati

* add unit test & update chatglm default config & fix bos index issue

* remove chatglm due to oom

* add dataset pkg in requirement-text

* fix parameter issue in test_models

* add ref in tokenize & rm unnessary parts

* separate source & target tokenization in chatglm

* add unit test to chatglm

* fix test dataset issue

* update truncation of chatglm

* fix Colossalai version

* fix colossal ai version in test

1467e3b4

16 Aug, 2023 1 commit

[devops] add large-scale distributed test marker (#4452) · 26e29d58

Hongxin Liu authored Aug 16, 2023

* [test] remove cpu marker

* [test] remove gpu marker

* [test] update pytest markers

* [ci] update unit test ci

26e29d58

02 Aug, 2023 1 commit

[chat] fix bugs and add unit tests (#4213) · da4f7b85

Wenhao Chen authored Aug 02, 2023

* style: rename replay buffer

Experience replay is typically for off policy algorithms.
Use this name in PPO maybe misleading.

* fix: fix wrong zero2 default arg

* test: update experience tests

* style: rename zero_pad fn

* fix: defer init in CycledDataLoader

* test: add benchmark test

* style: rename internal fn of generation

* style: rename internal fn of lora

* fix: remove unused loss fn

* fix: remove unused utils fn

* refactor: remove generate_with_actor fn

* fix: fix type annotation

* test: add models tests

* fix: skip llama due to long execution time

* style: modify dataset

* style: apply formatter

* perf: update reward dataset

* fix: fix wrong IGNORE_INDEX in sft dataset

* fix: remove DataCollatorForSupervisedDataset

* test: add dataset tests

* style: apply formatter

* style: rename test_ci to test_train

* feat: add llama in inference

* test: add inference tests

* test: change test scripts directory

* fix: update ci

* fix: fix typo

* fix: skip llama due to oom

* fix: fix file mod

* style: apply formatter

* refactor: remove duplicated llama_gptq

* style: apply formatter

* to: update rm test

* feat: add tokenizer arg

* feat: add download model script

* test: update train tests

* fix: modify gemini load and save pretrained

* test: update checkpoint io test

* to: modify nproc_per_node

* fix: do not remove existing dir

* fix: modify save path

* test: add random choice

* fix: fix sft path

* fix: enlarge nproc_per_node to avoid oom

* fix: add num_retry

* fix: make lora config of rm and critic consistent

* fix: add warning about lora weights

* fix: skip some gpt2 tests

* fix: remove grad ckpt in rm and critic due to errors

* refactor: directly use Actor in train_sft

* test: add more arguments

* fix: disable grad ckpt when using lora

* fix: fix save_pretrained and related tests

* test: enable zero2 tests

* revert: remove useless fn

* style: polish code

* test: modify test args

da4f7b85