[fix][FSDP] fix weight init when using apply() (fixes #490 and #444) (#543)
* Add new test for weight init (fails)
* Set FSDP.compute_device so summon_full_params works before module moves to CUDA
* Override FSDP.apply to enable custom weight init
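
To illustrate why overriding `apply()` matters for custom weight init, here is a minimal toy sketch in plain Python. It is not the FSDP code from this change: `ToyLayer`, `ToyShardedWrapper`, and `position_init` are hypothetical names, weights are plain lists, and the "gather" step only pads the local shard into a full-size buffer instead of communicating across ranks. The assumption it demonstrates is that an init function which depends on the full parameter shape gives wrong values if it only ever sees a shard.

```python
from contextlib import contextmanager


class ToyLayer:
    """Hypothetical stand-in for a module with one flat weight."""

    def __init__(self, size):
        self.weight = [0.0] * size

    def apply(self, fn):
        fn(self)
        return self


class ToyShardedWrapper:
    """Hypothetical wrapper that keeps only this rank's slice of the weight."""

    def __init__(self, layer, rank, world_size):
        self.layer = layer
        self.full_size = len(layer.weight)
        per_rank = -(-self.full_size // world_size)  # ceiling division
        self.start = rank * per_rank
        self.end = min(self.start + per_rank, self.full_size)
        layer.weight = layer.weight[self.start:self.end]

    @contextmanager
    def summon_full_params(self):
        # Toy "gather": place the local shard into a full-size buffer.
        # A real sharded wrapper would collect shards from all ranks here.
        full = [0.0] * self.full_size
        full[self.start:self.end] = self.layer.weight
        self.layer.weight = full
        try:
            yield
        finally:
            # Re-shard: keep only this rank's slice of whatever is there now.
            self.layer.weight = self.layer.weight[self.start:self.end]

    def apply(self, fn):
        # Run the init function against the full-size weight, then re-shard.
        with self.summon_full_params():
            self.layer.apply(fn)
        return self


def position_init(layer):
    # Example init that depends on each element's position in the full weight.
    layer.weight = [float(i) for i in range(len(layer.weight))]


wrapped = ToyShardedWrapper(ToyLayer(8), rank=1, world_size=2)
wrapped.apply(position_init)
print(wrapped.layer.weight)
```

In this sketch, calling `wrapped.layer.apply(position_init)` directly would initialize the shard as if it were the whole weight, which is the kind of mismatch the overridden `apply()` avoids.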