fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105)
* add test for `get_keys_to_not_convert` * add minimum patch to keep mpt lm_head from 8bit quantization * add reivsion to
Showing
Please register or sign in to comment