Unverified Commit 280fcdc4 authored by NatalieC323, committed by GitHub
parent 4d5d8f98
...@@ -47,40 +47,21 @@ conda env create -f environment.yaml
conda activate ldm
```
You can also update an existing [latent diffusion](https://github.com/CompVis/latent-diffusion) environment by running:
```
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
pip install transformers diffusers invisible-watermark
```
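If you want to confirm that the environment picked up the intended builds, a quick sanity check helps (an illustrative sketch, not part of the original instructions; it only prints the versions the steps above assume):
```
# Illustrative sanity check: confirm the torch build matches the versions pinned above.
import torch

print(torch.__version__)          # expected to start with "1.12.1"
print(torch.version.cuda)         # expected "11.3", matching cudatoolkit=11.3
print(torch.cuda.is_available())  # True if a compatible GPU and driver are visible
```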
#### Step 2: Install [Colossal-AI](https://colossalai.org/download/) From Our Official Website
You can install the latest version (0.2.7) from our official website or from source. Note that the version suited to this training is Colossal-AI 0.2.5, which corresponds to torch 1.12.1.
##### Download the suggested version for this training
```
pip install colossalai==0.2.5
```
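To make sure the pinned release is the one active in your conda environment, you can run a quick check (illustrative; assumes the package exposes `__version__`, which recent releases do):
```
# Illustrative check that the pinned Colossal-AI release is installed in this environment.
import colossalai

print(colossalai.__version__)  # expected: 0.2.5 for this training setup (paired with torch 1.12.1)
```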
##### Download the latest version from pip for the latest torch version
...@@ -89,7 +70,7 @@ pip install colossalai==0.2.5
pip install colossalai
```
##### From source:
```
git clone https://github.com/hpcaitech/ColossalAI.git
...@@ -99,7 +80,7 @@ cd ColossalAI
CUDA_EXT=1 pip install .
```
#### Step 3: Accelerate with flash attention by xformers (Optional)
Notice that xformers will accelerate the training process at the cost of extra disk space. The suitable version of xformers for this training process is 0.12.0. You can install xformers directly via pip. For more release versions, feel free to check its official page: [XFormers](https://pypi.org/project/xformers/)
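After installing, you can verify that xformers is importable from the same environment (an illustrative check; the expected release follows the text above):
```
# Illustrative check that xformers is available to the training environment.
import xformers

print(xformers.__version__)  # should report the release suggested above
```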
...@@ -113,7 +94,7 @@ To use the stable diffusion Docker image, you can either build using the provide
```
# 1. build from dockerfile
cd ColossalAI/examples/images/diffusion/docker
docker build -t hpcaitech/diffusion:0.2.0 .
# 2. pull from our docker hub
...@@ -127,7 +108,7 @@ Once you have the image ready, you can launch the image with the following comma
# On Your Host Machine #
########################
# make sure you start your image in the repository root directory
cd ColossalAI
# run the docker container
docker run --rm \
...@@ -144,13 +125,15 @@ docker run --rm \
# Once you have entered the docker container, go to the stable diffusion directory for training
cd examples/images/diffusion/
# Download the pretrained model checkpoint (see the following steps)
# Set up your configuration in "train_colossalai.sh" (see the following steps)
# start training with colossalai
bash train_colossalai.sh
```
It is important to configure your volume mapping correctly in order to get the best training experience.
1. **Mandatory**, mount your prepared data to `/data/scratch` via `-v <your-data-dir>:/data/scratch`, where you need to replace `<your-data-dir>` with the actual data path on your machine. Notice that within Docker you need to convert Windows-style paths into Linux-style ones, e.g. C:\User\Desktop into /c/User/Desktop.
2. **Recommended**, store the downloaded model weights on your host machine instead of in the container directory via `-v <hf-cache-dir>:/root/.cache/huggingface`, where you need to replace `<hf-cache-dir>` with the actual path. In this way, you don't have to repeatedly download the pretrained weights for every `docker run`.
3. **Optional**, if you encounter any problem stating that shared memory is insufficient inside the container, please add `-v /dev/shm:/dev/shm` to your `docker run` command.
...
...@@ -5,87 +5,105 @@ from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms
# This class is used to create a dataset of images from the LSUN dataset for training
class LSUNBase(Dataset):
    def __init__(self,
                 txt_file,                 # path to the text file containing the list of image paths
                 data_root,                # root directory of the LSUN dataset
                 size=None,                # the size to resize images to
                 interpolation="bicubic",  # interpolation method used while resizing
                 flip_p=0.5                # probability of random horizontal flipping
                 ):
        self.data_paths = txt_file    # store the path to the text file containing the list of images
        self.data_root = data_root    # store the path to the root directory of the dataset
        with open(self.data_paths, "r") as f:    # open and read the text file
            self.image_paths = f.read().splitlines()    # read the lines of the file and store them as a list
        self._length = len(self.image_paths)    # store the number of images
        # create a dictionary holding image path information
        self.labels = {
            "relative_file_path_": [l for l in self.image_paths],
            "file_path_": [os.path.join(self.data_root, l)
                           for l in self.image_paths],
        }
        # set the image size to resize to
        self.size = size
        # set the interpolation method for resizing the image
        self.interpolation = {"linear": PIL.Image.LINEAR,
                              "bilinear": PIL.Image.BILINEAR,
                              "bicubic": PIL.Image.BICUBIC,
                              "lanczos": PIL.Image.LANCZOS,
                              }[interpolation]
        # randomly flip the image horizontally with a given probability
        self.flip = transforms.RandomHorizontalFlip(p=flip_p)
    def __len__(self):
        # return the length of the dataset
        return self._length
    def __getitem__(self, i):
        # get the image path information for the given index
        example = dict((k, self.labels[k][i]) for k in self.labels)
        image = Image.open(example["file_path_"])
        # convert the image to RGB format if needed
        if not image.mode == "RGB":
            image = image.convert("RGB")
        # default to score-sde preprocessing
        img = np.array(image).astype(np.uint8)    # convert the image to a numpy array
        crop = min(img.shape[0], img.shape[1])    # side length of the square centre crop
        h, w, = img.shape[0], img.shape[1]        # get the height and width of the image
        img = img[(h - crop) // 2:(h + crop) // 2,
                  (w - crop) // 2:(w + crop) // 2]    # centre-crop the image to a square
        image = Image.fromarray(img)    # create an image from the numpy array
        if self.size is not None:    # if an image size is provided, resize the image
            image = image.resize((self.size, self.size), resample=self.interpolation)
        image = self.flip(image)    # flip the image horizontally with the given probability
        image = np.array(image).astype(np.uint8)
        example["image"] = (image / 127.5 - 1.0).astype(np.float32)    # normalize pixel values to [-1, 1] and convert to float32
        return example    # return the example dictionary containing the image and its file paths
# A dataset class for the LSUN Churches training set.
# It initializes by calling the constructor of the LSUNBase class with the appropriate arguments:
# the text file containing the paths to the images and the root directory where the images are stored.
# Any additional keyword arguments passed to this class are forwarded to the parent constructor.
class LSUNChurchesTrain(LSUNBase):
    def __init__(self, **kwargs):
        super().__init__(txt_file="data/lsun/church_outdoor_train.txt", data_root="data/lsun/churches", **kwargs)

# A dataset class for the LSUN Churches validation set.
# It is similar to LSUNChurchesTrain except that it uses a different text file and sets the flip probability to zero by default.
class LSUNChurchesValidation(LSUNBase):
    def __init__(self, flip_p=0., **kwargs):
        super().__init__(txt_file="data/lsun/church_outdoor_val.txt", data_root="data/lsun/churches",
                         flip_p=flip_p, **kwargs)
# A dataset class for the LSUN Bedrooms training set.
# It initializes by calling the constructor of the LSUNBase class with the appropriate arguments.
class LSUNBedroomsTrain(LSUNBase):
    def __init__(self, **kwargs):
        super().__init__(txt_file="data/lsun/bedrooms_train.txt", data_root="data/lsun/bedrooms", **kwargs)

# A dataset class for the LSUN Bedrooms validation set.
# It is similar to LSUNBedroomsTrain except that it uses a different text file and sets the flip probability to zero by default.
class LSUNBedroomsValidation(LSUNBase):
    def __init__(self, flip_p=0.0, **kwargs):
        super().__init__(txt_file="data/lsun/bedrooms_val.txt", data_root="data/lsun/bedrooms",
                         flip_p=flip_p, **kwargs)
# A dataset class for the LSUN Cats training set.
# It initializes by calling the constructor of the LSUNBase class with the appropriate arguments:
# the text file containing the paths to the images and the root directory where the images are stored.
class LSUNCatsTrain(LSUNBase):
    def __init__(self, **kwargs):
        super().__init__(txt_file="data/lsun/cat_train.txt", data_root="data/lsun/cats", **kwargs)

# A dataset class for the LSUN Cats validation set.
# It is similar to LSUNCatsTrain except that it uses a different text file and sets the flip probability to zero by default.
class LSUNCatsValidation(LSUNBase):
    def __init__(self, flip_p=0., **kwargs):
        super().__init__(txt_file="data/lsun/cat_val.txt", data_root="data/lsun/cats",
...
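As a rough usage sketch (not part of the committed file): assuming the module lives at `ldm/data/lsun.py` as in the upstream latent-diffusion layout, and that the default txt files and image directories exist locally, the dataset classes above can be exercised like this:
```
from ldm.data.lsun import LSUNChurchesTrain  # module path assumed from the latent-diffusion layout

# Uses the default txt_file/data_root baked into the class; the listed images must exist on disk.
dataset = LSUNChurchesTrain(size=256)

print(len(dataset))           # number of image paths listed in church_outdoor_train.txt
sample = dataset[0]
print(sample["image"].shape)  # (256, 256, 3)
print(sample["image"].dtype)  # float32, values normalized to [-1, 1]
print(sample["file_path_"])   # data_root joined with the relative image path
```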