[run_clm example] add torch_dtype option for model load. (#20971)
* [run_clm example] add torch_dtype option for model load.

  For the BLOOM 175B model, peak memory during inference is reduced by about 350 GB, because the BLOOM weights in the model hub are stored in bfloat16.

  Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add other types to the option

* fix style

  Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
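
The "about 350 GB" figure can be checked with simple arithmetic. The following is a minimal sketch, not the code from this commit: it assumes that a model loaded without an explicit dtype is materialized in float32 (4 bytes per parameter), that bfloat16 and float16 take 2 bytes per parameter, and that only the weights are counted. The helper names `build_parser` and `weight_memory_gb` are hypothetical, and the `--torch_dtype` choices shown are an assumption about what the option accepts.

```python
import argparse

# Bytes per parameter for each dtype name. "auto" is left out on purpose:
# it defers to whatever dtype the checkpoint was saved in.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing a --torch_dtype option (assumed choices)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--torch_dtype",
        default=None,
        choices=["auto", "bfloat16", "float16", "float32"],
        help="dtype to load the model weights in",
    )
    return parser


def weight_memory_gb(num_params: float, dtype_name: str) -> float:
    """Estimate weight memory in GB (10**9 bytes), ignoring activations."""
    return num_params * BYTES_PER_PARAM[dtype_name] / 1e9


args = build_parser().parse_args(["--torch_dtype", "bfloat16"])

num_params = 175e9  # parameter count taken from the commit message
fp32_gb = weight_memory_gb(num_params, "float32")
chosen_gb = weight_memory_gb(num_params, args.torch_dtype)
saving_gb = fp32_gb - chosen_gb

print(f"float32 weights: {fp32_gb:.0f} GB")
print(f"{args.torch_dtype} weights: {chosen_gb:.0f} GB")
print(f"estimated saving: {saving_gb:.0f} GB")
```

In a real script the parsed string would presumably be converted to a `torch.dtype` and passed as the `torch_dtype` argument of `from_pretrained`; that step is omitted here so the sketch runs without PyTorch installed.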