1. 29 Aug, 2023 1 commit
    • [coati] add chatglm model (#4539) · 1467e3b4
      yingliu-hpc authored
      * update configuration of chatglm and add support in coati
      
      * add unit test & update chatglm default config & fix bos index issue
      
      * remove chatglm due to oom
      
      * add dataset pkg in requirement-text
      
      * fix parameter issue in test_models
      
      * add ref in tokenize & rm unnecessary parts
      
      * separate source & target tokenization in chatglm
      
      * add unit test to chatglm
      
      * fix test dataset issue
      
      * update truncation of chatglm
      
      * fix Colossalai version
      
      * fix colossal ai version in test
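The "separate source & target tokenization" change above reflects a common SFT-dataset pattern: prompt (source) and response (target) are tokenized separately so the prompt tokens can be masked out of the loss. A minimal illustrative sketch, not the actual Coati implementation (the helper name is invented; -100 is the default `ignore_index` of `torch.nn.CrossEntropyLoss`):

```python
IGNORE_INDEX = -100  # the label value CrossEntropyLoss ignores by default

def build_sft_labels(source_ids, target_ids, max_length):
    """Concatenate separately tokenized source and target, masking the
    source tokens so only the response contributes to the training loss."""
    input_ids = (source_ids + target_ids)[:max_length]
    labels = ([IGNORE_INDEX] * len(source_ids) + target_ids)[:max_length]
    return input_ids, labels
```

Keeping the two segments separate also makes truncation decisions (which side to cut) explicit, which the later "update truncation of chatglm" commit touches on.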
  2. 16 Aug, 2023 1 commit
  3. 02 Aug, 2023 1 commit
    • [chat] fix bugs and add unit tests (#4213) · da4f7b85
      Wenhao Chen authored
      * style: rename replay buffer
      
      Experience replay is typically used for off-policy algorithms,
      so using this name in PPO may be misleading.
      
      * fix: fix wrong zero2 default arg
      
      * test: update experience tests
      
      * style: rename zero_pad fn
      
      * fix: defer init in CycledDataLoader
      
      * test: add benchmark test
      
      * style: rename internal fn of generation
      
      * style: rename internal fn of lora
      
      * fix: remove unused loss fn
      
      * fix: remove unused utils fn
      
      * refactor: remove generate_with_actor fn
      
      * fix: fix type annotation
      
      * test: add models tests
      
      * fix: skip llama due to long execution time
      
      * style: modify dataset
      
      * style: apply formatter
      
      * perf: update reward dataset
      
      * fix: fix wrong IGNORE_INDEX in sft dataset
      
      * fix: remove DataCollatorForSupervisedDataset
      
      * test: add dataset tests
      
      * style: apply formatter
      
      * style: rename test_ci to test_train
      
      * feat: add llama in inference
      
      * test: add inference tests
      
      * test: change test scripts directory
      
      * fix: update ci
      
      * fix: fix typo
      
      * fix: skip llama due to oom
      
      * fix: fix file mod
      
      * style: apply formatter
      
      * refactor: remove duplicated llama_gptq
      
      * style: apply formatter
      
      * to: update rm test
      
      * feat: add tokenizer arg
      
      * feat: add download model script
      
      * test: update train tests
      
      * fix: modify gemini load and save pretrained
      
      * test: update checkpoint io test
      
      * to: modify nproc_per_node
      
      * fix: do not remove existing dir
      
      * fix: modify save path
      
      * test: add random choice
      
      * fix: fix sft path
      
      * fix: enlarge nproc_per_node to avoid oom
      
      * fix: add num_retry
      
      * fix: make lora config of rm and critic consistent
      
      * fix: add warning about lora weights
      
      * fix: skip some gpt2 tests
      
      * fix: remove grad ckpt in rm and critic due to errors
      
      * refactor: directly use Actor in train_sft
      
      * test: add more arguments
      
      * fix: disable grad ckpt when using lora
      
      * fix: fix save_pretrained and related tests
      
      * test: enable zero2 tests
      
      * revert: remove useless fn
      
      * style: polish code
      
      * test: modify test args
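The "defer init in CycledDataLoader" fix above concerns a loader that cycles its underlying dataloader forever, creating the iterator lazily on first fetch rather than in the constructor (so construction can happen before the runtime environment is ready). An illustrative sketch of the idea, with invented details rather than the actual PR code:

```python
class CycledDataLoader:
    """Endlessly iterate a dataloader, restarting when it is exhausted.

    The underlying iterator is created lazily on the first fetch
    (deferred init), not in __init__.
    """

    def __init__(self, dataloader):
        self.dataloader = dataloader
        self._iter = None  # deferred: created on first next() call

    def next(self):
        if self._iter is None:
            self._iter = iter(self.dataloader)
        try:
            return next(self._iter)
        except StopIteration:
            # Wrap around: restart the dataloader and fetch again.
            self._iter = iter(self.dataloader)
            return next(self._iter)
```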
  4. 29 Jun, 2023 1 commit
    • [chat] remove naive strategy and split colossalai strategy (#4094) · edd75a59
      Wenhao Chen authored
      * feat: remove on_learn_epoch fn as not used
      
      * revert: add _on_learn_epoch fn
      
      * to: remove the use of NaiveStrategy
      
      * test: remove NaiveStrategy tests
      
      * feat: remove NaiveStrategy
      
      * style: modify comments and params
      
      * feat: split ColossalAIStrategy into LowLevelZeroStrategy and GeminiStrategy
      
      * fix: remove naive
      
      * fix: align with modified colossal strategy
      
      * fix: fix ddp _try_init_dist arg
  5. 25 Jun, 2023 1 commit
    • [chat] refactor strategy class with booster api (#3987) · 153b957a
      Wenhao Chen authored
      * refactor: adapt boost API in base and naive strategies
      
      * fix: initialize plugin after setup_distributed
      
      * fix: fix save_pretrained fn
      
      * refactor: adapt boost API in DDPStrategy
      
      * to: add _post_init check
      
      * to: fix ddp backward, modify ddp dataloader and unwrap
      
      * feat: adapt boost API in ColossalAIStrategy
      
      * fix: call setup_distributed before using get_current_device
      
      * fix: fix save_model and save_optimizer
      
      * test: remove save_sharded_optimizer test
      
      * style: apply formatter
      
      * fix: fix stage check and add comments
      
      * feat: allow dict type arg in strategy.prepare
      
      * to: temporarily remove lr_scheduler for testing
      
      * style: simplify init of ColossalAIStrategy
      
      * fix: fix lr_scheduler in sft and rm
      
      * style: modify comments
      
      * test: add train_prompts tests
      
      * fix: fix inference only case and use in train_prompts
      
      * test: skip failed tests in ci
      
      * style: fix CodeFactor check
      
      * fix: do not use model.to('cpu') with GeminiPlugin
      
      * test: enable colossalai_gemini tests
      
      * test: set CUDA_VISIBLE_DEVICES in ci
      
      * docs: add note
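One change above allows dict-type arguments in `strategy.prepare`, so callers can pass named groups of objects and get the same structure back. A toy sketch of what that interface shape allows; the `_boost` step here is a stand-in for the real booster wrapping, not the actual strategy code:

```python
def prepare(*boostables):
    """Accept bare objects or dicts of objects, preserving structure.

    Each dict is processed value-by-value and returned with the same keys;
    a single argument is returned unwrapped rather than as a 1-tuple.
    """
    def _boost(obj):
        # Stand-in for the real boost/wrap step applied to each object.
        return ("boosted", obj)

    results = []
    for item in boostables:
        if isinstance(item, dict):
            results.append({k: _boost(v) for k, v in item.items()})
        else:
            results.append(_boost(item))
    return results[0] if len(results) == 1 else results
```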
  6. 19 Jun, 2023 1 commit
  7. 13 Jun, 2023 1 commit
    • [chat] refactor actor class (#3968) · 9d02590c
      Wenhao Chen authored
      * refactor: separate log_probs fn from Actor forward fn
      
      * refactor: separate generate fn from Actor class
      
      * feat: update unwrap_model and get_base_model
      * unwrap_model returns model not wrapped by Strategy
      * get_base_model returns HF model for Actor, Critic and RewardModel
      
      * feat: simplify Strategy.prepare
      
      * style: remove get_base_model method of Actor
      
      * perf: tokenize text in batches
      
      * refactor: move calc_action_log_probs to utils of model
      
      * test: update test with new forward fn
      
      * style: rename forward fn args
      
      * fix: do not unwrap model in save_model fn of naive strategy
      
      * test: add gemini test for train_prompts
      
      * fix: fix _set_default_generate_kwargs
  8. 27 Apr, 2023 1 commit
    • [chat] refactor model save/load logic (#3654) · 842768a1
      Hongxin Liu authored
      * [chat] strategy refactor unwrap model
      
      * [chat] strategy refactor save model
      
      * [chat] add docstr
      
      * [chat] refactor trainer save model
      
      * [chat] fix strategy typing
      
      * [chat] refactor trainer save model
      
      * [chat] update readme
      
      * [chat] fix unit test
  9. 26 Apr, 2023 1 commit
    • [gemini] accelerate inference (#3641) · 50793b35
      Hongxin Liu authored
      * [gemini] support don't scatter after inference
      
      * [chat] update colossalai strategy
      
      * [chat] fix opt benchmark
      
      * [chat] update opt benchmark
      
      * [gemini] optimize inference
      
      * [test] add gemini inference test
      
      * [chat] fix unit test ci
      
      * [chat] fix ci
      
      * [chat] fix ci
      
      * [chat] skip checkpoint test
  10. 06 Apr, 2023 1 commit
  11. 28 Mar, 2023 1 commit