    controlnet training resize inputs to multiple of 8 (#3135) · 7e6886f5
    Will Berman authored
    controlnet training center crop input images to multiple of 8
    
    The pipeline code resizes inputs to multiples of 8.
    Because the training script does not do this resizing,
    the encoded image can end up with different height/width
    dimensions than the encoded conditioning image (which uses a
    separate encoder that's part of the controlnet model).
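
A toy model of why a size that is not a multiple of 8 can produce mismatched encodings. It assumes both encoders halve each spatial dimension three times, with one rounding down and the other rounding up at each step. That rounding behavior and the function names are assumptions made for illustration; the commit message does not state them.

```python
import math


def floor_halvings(size: int, steps: int = 3) -> int:
    """Halve `size` `steps` times, rounding down each time (assumed behavior)."""
    for _ in range(steps):
        size = size // 2
    return size


def ceil_halvings(size: int, steps: int = 3) -> int:
    """Halve `size` `steps` times, rounding up each time (assumed behavior)."""
    for _ in range(steps):
        size = math.ceil(size / 2)
    return size


# A multiple of 8 divides evenly, so both roundings agree.
print(floor_halvings(512), ceil_halvings(512))

# A size that is not a multiple of 8 can diverge under the two roundings.
print(floor_halvings(500), ceil_halvings(500))
```

Under these assumptions, any input whose height and width are multiples of 8 gives the same encoded size in both paths, which is the property the commit relies on.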
    
    We resize and center crop the inputs to make sure they're the
    same size (as well as all other images in the batch). We also
    check that the initial resolution is a multiple of 8.
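
A minimal sketch of the approach described above, using only the geometry so that it runs without an image library. It assumes the shorter side is resized to the target resolution before a square center crop; the helper names `check_resolution` and `resize_and_center_crop_box` are hypothetical and are not taken from `train_controlnet.py`.

```python
from typing import Tuple


def check_resolution(resolution: int) -> None:
    """Reject target resolutions that are not a multiple of 8."""
    if resolution % 8 != 0:
        raise ValueError(
            f"resolution must be a multiple of 8, got {resolution}"
        )


def resize_and_center_crop_box(
    width: int, height: int, resolution: int
) -> Tuple[Tuple[int, int], Tuple[int, int, int, int]]:
    """Return the resized (width, height) and the center-crop box.

    The box is (left, top, right, bottom) in resized-image coordinates.
    Assumption: the shorter side is scaled to `resolution`, then a
    `resolution` x `resolution` square is cropped from the center.
    """
    check_resolution(resolution)
    scale = resolution / min(width, height)
    # Guard against rounding leaving a side one pixel short of the target.
    new_w = max(resolution, round(width * scale))
    new_h = max(resolution, round(height * scale))
    left = (new_w - resolution) // 2
    top = (new_h - resolution) // 2
    return (new_w, new_h), (left, top, left + resolution, top + resolution)


# Every image and conditioning image passed through the same function ends
# up with the same square size, regardless of its original aspect ratio.
size, box = resize_and_center_crop_box(1000, 500, 512)
print(size, box)
```

Applying the same resize and crop to both the image and its conditioning image keeps the pair aligned and lets images of different original sizes be stacked into one batch.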
train_controlnet.py 41.8 KB