[modeling_utils] use less cpu memory with sharded checkpoint loading (#16844)

* less cpu memory with sharded checkpoint loading
* Trigger CI
* Trigger CI

Commit afa1ef09, authored Apr 20, 2022 by Stas Bekman, committed by GitHub Apr 20, 2022

Showing with 5 additions and 0 deletions

src/transformers/modeling_utils.py +5 -0

--- a/src/transformers/modeling_utils.py
+++ b/src/transformers/modeling_utils.py
@@ -14,6 +14,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
+import gc
 import json
 import os
 import re
@@ -2149,6 +2150,10 @@ class PreTrainedModel(nn.Module, ModuleUtilsMixin, GenerationMixin, PushToHubMix
                else:
                    error_msgs += _load_state_dict_into_model(model_to_load, state_dict, start_prefix)
+                # force memory release
+                del state_dict
+                gc.collect()
        if len(error_msgs) > 0:
            error_msg = "\n\t".join(error_msgs)
            raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
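
The pattern in the second hunk can be tried in isolation. Below is a minimal sketch, not the library's code: `FakeTensor`, `load_shard`, `copy_into`, and `load_sharded` are hypothetical names invented for this illustration, and plain Python objects stand in for tensors so the example runs without PyTorch. It loads one shard at a time, copies the values into the "model", then drops the shard with `del state_dict` followed by `gc.collect()`, and uses a weak reference to check that the shard's data is gone before the next shard is loaded.

```python
import gc
import weakref


class FakeTensor:
    """Hypothetical stand-in for a tensor (instances can be weakly referenced)."""

    def __init__(self, values):
        self.values = list(values)


def load_shard(index):
    # Hypothetical loader: returns a fresh "state dict" for shard `index`.
    return {f"layer{index}.weight": FakeTensor([float(index)] * 4)}


def copy_into(model_params, state_dict):
    # Copy the data out of the shard; no reference to the shard's own
    # objects survives once this function returns.
    for name, tensor in state_dict.items():
        model_params[name] = list(tensor.values)


def load_sharded(num_shards):
    model_params = {}
    released = []
    for index in range(num_shards):
        state_dict = load_shard(index)
        # Weak reference used only to observe whether the shard data is freed.
        probe = weakref.ref(next(iter(state_dict.values())))
        copy_into(model_params, state_dict)

        # force memory release (same two statements as in the diff)
        del state_dict
        gc.collect()

        released.append(probe() is None)
    return model_params, released


if __name__ == "__main__":
    params, released = load_sharded(3)
    print(sorted(params))
    print(released)
```

Without the `del`, the name `state_dict` would keep the previous shard alive until it is rebound on the next iteration, so two shards could be resident at once. In CPython, dropping the last reference usually frees the objects immediately; the explicit `gc.collect()` additionally reclaims anything held only through reference cycles.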