Unverified Commit a98a839d authored by Leo Jiang's avatar Leo Jiang Committed by GitHub
Browse files

Reduce Memory Cost in Flux Training (#9829)



* Improve NPU performance

* Improve NPU performance

* Improve NPU performance

* Improve NPU performance

* [bugfix] bugfix for npu free memory

* [bugfix] bugfix for npu free memory

* [bugfix] bugfix for npu free memory

* Reduce memory cost for flux training process

---------
Co-authored-by: 蒋硕 <jiangshuo9@h-partners.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
parent 3deed729
...@@ -1740,6 +1740,9 @@ def main(args):
torch_npu.npu.empty_cache()
gc.collect()
images = None
del pipeline
# Save the lora layers
accelerator.wait_for_everyone()
if accelerator.is_main_process:
...@@ -1798,6 +1801,9 @@ def main(args):
ignore_patterns=["step_*", "epoch_*"],
)
images = None
del pipeline
accelerator.end_training()
......
...@@ -1844,6 +1844,9 @@ def main(args):
del text_encoder_one, text_encoder_two
free_memory()
images = None
del pipeline
# Save the lora layers
accelerator.wait_for_everyone()
if accelerator.is_main_process:
...@@ -1908,6 +1911,9 @@ def main(args):
ignore_patterns=["step_*", "epoch_*"],
)
images = None
del pipeline
accelerator.end_training()
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or sign in to comment