# Quick Start
## Environment Setup
#### 1. Install PaddlePaddle
Requirements:
* PaddlePaddle >= 2.0.2
* Python >= 3.7
Since image matting models are computationally expensive, the GPU version of PaddlePaddle is recommended.
CUDA 10.0 or later is recommended. See the [PaddlePaddle official website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html) for the installation tutorial.
#### 2. Download the PaddleSeg repository
```shell
git clone https://github.com/PaddlePaddle/PaddleSeg
```
#### 3. Installation
```shell
cd PaddleSeg/Matting
pip install -r requirements.txt
```
## Download the Pre-trained Model
Download a pre-trained model from the [model zoo](../README_CN.md/#模型库) and place it under the `pretrained_models` directory. Here we take PP-MattingV2 as an example.
```shell
mkdir pretrained_models && cd pretrained_models
wget https://paddleseg.bj.bcebos.com/matting/models/ppmattingv2-stdc1-human_512.pdparams
cd ..
```
## Prediction
```shell
export CUDA_VISIBLE_DEVICES=0
python tools/predict.py \
--config configs/ppmattingv2/ppmattingv2-stdc1-human_512.yml \
--model_path pretrained_models/ppmattingv2-stdc1-human_512.pdparams \
--image_path demo/human.jpg \
--save_dir ./output/results \
--fg_estimate True
```
The prediction results are as follows:
<div align="center">
<img src="https://user-images.githubusercontent.com/30919197/201861635-0d139592-7da5-44b1-9bfa-7502d9643320.png" width = "90%" />
</div>
**Note**: `--config` must match `--model_path`.
## Background Replacement
```shell
export CUDA_VISIBLE_DEVICES=0
python tools/bg_replace.py \
--config configs/ppmattingv2/ppmattingv2-stdc1-human_512.yml \
--model_path pretrained_models/ppmattingv2-stdc1-human_512.pdparams \
--image_path demo/human.jpg \
--background 'g' \
--save_dir ./output/results \
--fg_estimate True
```
The background replacement result is as follows:
<div align="center">
<img src="https://user-images.githubusercontent.com/30919197/201861644-15dd5ccf-fb6e-4440-a731-8e7c1d464699.png" width = "90%" />
</div>
**Notes:**
* `--image_path` must be the path of a single image.
* `--config` must match `--model_path`.
* `--background` accepts either a background image path or one of ('r', 'g', 'b', 'w'), representing a red, green, blue, or white background; green is used if not provided.
# Quick Start
## Installation
#### 1. Install PaddlePaddle
Requirements:
* PaddlePaddle >= 2.0.2
* Python >= 3.7
Due to the high computational cost of matting models, the GPU version of PaddlePaddle is recommended.
CUDA 10.0 or later is recommended. See the [PaddlePaddle official website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html) for the installation tutorial.
#### 2. Download the PaddleSeg repository
```shell
git clone https://github.com/PaddlePaddle/PaddleSeg
```
#### 3. Installation
```shell
cd PaddleSeg/Matting
pip install -r requirements.txt
```
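Optionally, verify that PaddlePaddle is installed correctly before continuing:
```python
import paddle

paddle.utils.run_check()   # runs a quick self-test of the installation
print(paddle.__version__)  # should print 2.0.2 or later
```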
## Download the Pre-trained Model
Download a pre-trained model from [Models](../README.md/#Models) and place it under the `pretrained_models` directory. Here we take PP-MattingV2 as an example.
```shell
mkdir pretrained_models && cd pretrained_models
wget https://paddleseg.bj.bcebos.com/matting/models/ppmattingv2-stdc1-human_512.pdparams
cd ..
```
## Prediction
```shell
export CUDA_VISIBLE_DEVICES=0
python tools/predict.py \
--config configs/ppmattingv2/ppmattingv2-stdc1-human_512.yml \
--model_path pretrained_models/ppmattingv2-stdc1-human_512.pdparams \
--image_path demo/human.jpg \
--save_dir ./output/results \
--fg_estimate True
```
Prediction results are as follows:
<div align="center">
<img src="https://user-images.githubusercontent.com/30919197/201861635-0d139592-7da5-44b1-9bfa-7502d9643320.png" width = "90%" />
</div>
**Note**: `--config` needs to match `--model_path`.
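For each input image, the script writes an alpha matte (`*_alpha.png`) and an RGBA cutout (`*_rgba.png`) under `--save_dir` (naming follows `save_result` in `ppmatting/core/predict.py`, included later in this commit). A minimal sketch for inspecting them, assuming the demo image above:
```python
import cv2

# File names assume the demo image `human.jpg`; see save_result() below.
alpha = cv2.imread("output/results/human_alpha.png", cv2.IMREAD_GRAYSCALE)
rgba = cv2.imread("output/results/human_rgba.png", cv2.IMREAD_UNCHANGED)
print(alpha.shape, rgba.shape)  # (H, W) and (H, W, 4)
```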
## Background Replacement
```shell
export CUDA_VISIBLE_DEVICES=0
python tools/bg_replace.py \
--config configs/ppmattingv2/ppmattingv2-stdc1-human_512.yml \
--model_path pretrained_models/ppmattingv2-stdc1-human_512.pdparams \
--image_path demo/human.jpg \
--background 'g' \
--save_dir ./output/results \
--fg_estimate True
```
The background replacement effect is as follows:
<div align="center">
<img src="https://user-images.githubusercontent.com/30919197/201861644-15dd5ccf-fb6e-4440-a731-8e7c1d464699.png" width = "90%" />
</div>
**Notes:**
* `--image_path` must be the path of a single image.
* `--config` needs to match `--model_path`.
* `--background` accepts either a background image path or one of ('r','g','b','w'), representing a red, green, blue, or white background; green is used if not provided.
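Under the hood, background replacement is plain alpha compositing, `out = alpha * fg + (1 - alpha) * bg`. A minimal NumPy sketch of the same operation on the prediction outputs (file names are assumptions based on the demo above; the script does this for you):
```python
import cv2
import numpy as np

# Assumed inputs: the RGBA cutout produced by tools/predict.py and any background image.
rgba = cv2.imread("output/results/human_rgba.png", cv2.IMREAD_UNCHANGED)
bg = cv2.imread("my_background.jpg")
bg = cv2.resize(bg, (rgba.shape[1], rgba.shape[0]))

alpha = rgba[:, :, 3:4].astype(np.float32) / 255.0  # (H, W, 1), broadcasts over BGR
fg = rgba[:, :, :3].astype(np.float32)
out = alpha * fg + (1.0 - alpha) * bg.astype(np.float32)
cv2.imwrite("composited.png", out.astype(np.uint8))
```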
from . import ml, metrics, transforms, datasets, models, utils
from .val import evaluate
from .val_ml import evaluate_ml
from .train import train
from .predict import predict
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import math
import time
import cv2
import numpy as np
import paddle
import paddle.nn.functional as F
from paddleseg import utils
from paddleseg.core import infer
from paddleseg.utils import logger, progbar, TimeAverager
from ppmatting.utils import mkdir, estimate_foreground_ml
def partition_list(arr, m):
"""split the list 'arr' into m pieces"""
n = int(math.ceil(len(arr) / float(m)))
return [arr[i:i + n] for i in range(0, len(arr), n)]
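# For example (illustrative): partition_list([0, 1, 2, 3, 4], 2) returns
# [[0, 1, 2], [3, 4]], since each piece holds ceil(5 / 2) = 3 items.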
def save_result(alpha, path, im_path, trimap=None, fg_estimate=True):
"""
    The value of alpha is in the range [0, 255]; the shape should be [h, w].
"""
dirname = os.path.dirname(path)
if not os.path.exists(dirname):
os.makedirs(dirname)
basename = os.path.basename(path)
name = os.path.splitext(basename)[0]
alpha_save_path = os.path.join(dirname, name + '_alpha.png')
rgba_save_path = os.path.join(dirname, name + '_rgba.png')
# save alpha matte
if trimap is not None:
trimap = cv2.imread(trimap, 0)
alpha[trimap == 0] = 0
alpha[trimap == 255] = 255
alpha = (alpha).astype('uint8')
cv2.imwrite(alpha_save_path, alpha)
# save rgba
im = cv2.imread(im_path)
if fg_estimate:
fg = estimate_foreground_ml(im / 255.0, alpha / 255.0) * 255
else:
fg = im
fg = fg.astype('uint8')
alpha = alpha[:, :, np.newaxis]
rgba = np.concatenate((fg, alpha), axis=-1)
cv2.imwrite(rgba_save_path, rgba)
return fg
def reverse_transform(alpha, trans_info):
"""recover pred to origin shape"""
for item in trans_info[::-1]:
if item[0] == 'resize':
h, w = item[1][0], item[1][1]
alpha = F.interpolate(alpha, [h, w], mode='bilinear')
elif item[0] == 'padding':
h, w = item[1][0], item[1][1]
alpha = alpha[:, :, 0:h, 0:w]
else:
raise Exception("Unexpected info '{}' in im_info".format(item[0]))
return alpha
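# Illustrative trans_info, as recorded by the transforms (assumed format):
#     [('resize', (h0, w0)), ('padding', (h1, w1))]
# The entries are undone in reverse order to map the prediction back to the
# original image size.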
def preprocess(img, transforms, trimap=None):
data = {}
data['img'] = img
if trimap is not None:
data['trimap'] = trimap
data['gt_fields'] = ['trimap']
data['trans_info'] = []
data = transforms(data)
data['img'] = paddle.to_tensor(data['img'])
data['img'] = data['img'].unsqueeze(0)
if trimap is not None:
data['trimap'] = paddle.to_tensor(data['trimap'])
data['trimap'] = data['trimap'].unsqueeze((0, 1))
return data
def predict(model,
model_path,
transforms,
image_list,
image_dir=None,
trimap_list=None,
save_dir='output',
fg_estimate=True):
"""
predict and visualize the image_list.
Args:
model (nn.Layer): Used to predict for input image.
model_path (str): The path of pretrained model.
transforms (transforms.Compose): Preprocess for input image.
image_list (list): A list of image path to be predicted.
image_dir (str, optional): The root directory of the images predicted. Default: None.
trimap_list (list, optional): A list of trimap of image_list. Default: None.
save_dir (str, optional): The directory to save the visualized results. Default: 'output'.
"""
utils.utils.load_entire_model(model, model_path)
model.eval()
nranks = paddle.distributed.get_world_size()
local_rank = paddle.distributed.get_rank()
if nranks > 1:
img_lists = partition_list(image_list, nranks)
trimap_lists = partition_list(
trimap_list, nranks) if trimap_list is not None else None
else:
img_lists = [image_list]
trimap_lists = [trimap_list] if trimap_list is not None else None
logger.info("Start to predict...")
progbar_pred = progbar.Progbar(target=len(img_lists[0]), verbose=1)
preprocess_cost_averager = TimeAverager()
infer_cost_averager = TimeAverager()
postprocess_cost_averager = TimeAverager()
batch_start = time.time()
with paddle.no_grad():
for i, im_path in enumerate(img_lists[local_rank]):
preprocess_start = time.time()
trimap = trimap_lists[local_rank][
i] if trimap_list is not None else None
data = preprocess(img=im_path, transforms=transforms, trimap=trimap)
preprocess_cost_averager.record(time.time() - preprocess_start)
infer_start = time.time()
alpha_pred = model(data)
infer_cost_averager.record(time.time() - infer_start)
postprocess_start = time.time()
alpha_pred = reverse_transform(alpha_pred, data['trans_info'])
alpha_pred = (alpha_pred.numpy()).squeeze()
alpha_pred = (alpha_pred * 255).astype('uint8')
# get the saved name
if image_dir is not None:
im_file = im_path.replace(image_dir, '')
else:
im_file = os.path.basename(im_path)
if im_file[0] == '/' or im_file[0] == '\\':
im_file = im_file[1:]
save_path = os.path.join(save_dir, im_file)
mkdir(save_path)
fg = save_result(
alpha_pred,
save_path,
im_path=im_path,
trimap=trimap,
fg_estimate=fg_estimate)
postprocess_cost_averager.record(time.time() - postprocess_start)
preprocess_cost = preprocess_cost_averager.get_average()
infer_cost = infer_cost_averager.get_average()
postprocess_cost = postprocess_cost_averager.get_average()
if local_rank == 0:
progbar_pred.update(i + 1,
[('preprocess_cost', preprocess_cost),
                                     ('infer_cost', infer_cost),
('postprocess_cost', postprocess_cost)])
preprocess_cost_averager.reset()
infer_cost_averager.reset()
postprocess_cost_averager.reset()
return alpha_pred, fg
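# Minimal usage sketch (assumed setup; in practice tools/predict.py builds
# these objects from a config file):
#     model = ppmatting.models.PPMattingV2(...)  # hypothetical constructor args
#     transforms = ppmatting.transforms.Compose([...])
#     alpha, fg = predict(model, 'pretrained_models/ppmattingv2-stdc1-human_512.pdparams',
#                         transforms, image_list=['demo/human.jpg'])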
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import time
from collections import deque, defaultdict
import pickle
import shutil
import numpy as np
import paddle
import paddle.nn.functional as F
from paddleseg.utils import TimeAverager, calculate_eta, resume, logger, train_profiler
from .val import evaluate
def visual_in_training(log_writer, vis_dict, step):
    """
    Visualize tensors in VisualDL during training.
    Args:
        log_writer (LogWriter): The log writer of vdl.
        vis_dict (dict): Dict of tensors. The shape of each tensor is (C, H, W).
    """
for key, value in vis_dict.items():
value_shape = value.shape
if value_shape[0] not in [1, 3]:
value = value[0]
value = value.unsqueeze(0)
value = paddle.transpose(value, (1, 2, 0))
min_v = paddle.min(value)
max_v = paddle.max(value)
if (min_v > 0) and (max_v < 1):
value = value * 255
elif (min_v < 0 and min_v >= -1) and (max_v <= 1):
value = (1 + value) / 2 * 255
else:
value = (value - min_v) / (max_v - min_v) * 255
value = value.astype('uint8')
value = value.numpy()
log_writer.add_image(tag=key, img=value, step=step)
def save_best(best_model_dir, metrics_data, iter):
with open(os.path.join(best_model_dir, 'best_metrics.txt'), 'w') as f:
for key, value in metrics_data.items():
line = key + ' ' + str(value) + '\n'
f.write(line)
f.write('iter' + ' ' + str(iter) + '\n')
def get_best(best_file, metrics, resume_model=None):
'''Get best metrics and iter from file'''
best_metrics_data = {}
if os.path.exists(best_file) and (resume_model is not None):
values = []
with open(best_file, 'r') as f:
lines = f.readlines()
for line in lines:
line = line.strip()
key, value = line.split(' ')
best_metrics_data[key] = eval(value)
if key == 'iter':
best_iter = eval(value)
else:
for key in metrics:
best_metrics_data[key] = np.inf
best_iter = -1
return best_metrics_data, best_iter
def train(model,
train_dataset,
val_dataset=None,
optimizer=None,
save_dir='output',
iters=10000,
batch_size=2,
resume_model=None,
save_interval=1000,
log_iters=10,
log_image_iters=1000,
num_workers=0,
use_vdl=False,
losses=None,
keep_checkpoint_max=5,
eval_begin_iters=None,
metrics='sad',
precision='fp32',
amp_level='O1',
profiler_options=None):
"""
Launch training.
Args:
model(nn.Layer): A matting model.
train_dataset (paddle.io.Dataset): Used to read and process training datasets.
val_dataset (paddle.io.Dataset, optional): Used to read and process validation datasets.
optimizer (paddle.optimizer.Optimizer): The optimizer.
save_dir (str, optional): The directory for saving the model snapshot. Default: 'output'.
        iters (int, optional): How many iters to train the model. Default: 10000.
batch_size (int, optional): Mini batch size of one gpu or cpu. Default: 2.
resume_model (str, optional): The path of resume model.
save_interval (int, optional): How many iters to save a model snapshot once during training. Default: 1000.
log_iters (int, optional): Display logging information at every log_iters. Default: 10.
log_image_iters (int, optional): Log image to vdl. Default: 1000.
num_workers (int, optional): Num workers for data loader. Default: 0.
use_vdl (bool, optional): Whether to record the data to VisualDL during training. Default: False.
losses (dict, optional): A dict of loss, refer to the loss function of the model for details. Default: None.
keep_checkpoint_max (int, optional): Maximum number of checkpoints to save. Default: 5.
        eval_begin_iters (int, optional): The iteration at which evaluation begins. If None, evaluation begins at iters/2. Default: None.
        metrics (str|list, optional): The metrics to evaluate; any combination of ("sad", "mse", "grad", "conn").
        precision (str, optional): Use AMP if precision='fp16'. If precision='fp32', the training is normal.
        amp_level (str, optional): Auto mixed precision level. Accepted values are 'O1' and 'O2': O1 means mixed precision, where the input data type of each operator is casted according to the white_list and black_list; O2 means pure fp16, where all operator parameters and input data are casted to fp16, except for operators in the black_list, operators without fp16 kernel support, and batch norm. Default: 'O1' (amp).
profiler_options (str, optional): The option of train profiler.
"""
model.train()
nranks = paddle.distributed.ParallelEnv().nranks
local_rank = paddle.distributed.ParallelEnv().local_rank
start_iter = 0
if resume_model is not None:
start_iter = resume(model, optimizer, resume_model)
if not os.path.isdir(save_dir):
if os.path.exists(save_dir):
os.remove(save_dir)
os.makedirs(save_dir)
# Use amp
if precision == 'fp16':
logger.info('use AMP to train. AMP level = {}'.format(amp_level))
scaler = paddle.amp.GradScaler(init_loss_scaling=1024)
if amp_level == 'O2':
model, optimizer = paddle.amp.decorate(
models=model,
optimizers=optimizer,
level='O2',
save_dtype='float32')
if nranks > 1:
# Initialize parallel environment if not done.
if not paddle.distributed.parallel.parallel_helper._is_parallel_ctx_initialized(
):
paddle.distributed.init_parallel_env()
ddp_model = paddle.DataParallel(model)
else:
ddp_model = paddle.DataParallel(model)
batch_sampler = paddle.io.DistributedBatchSampler(
train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
loader = paddle.io.DataLoader(
train_dataset,
batch_sampler=batch_sampler,
num_workers=num_workers,
return_list=True, )
if use_vdl:
from visualdl import LogWriter
log_writer = LogWriter(save_dir)
if isinstance(metrics, str):
metrics = [metrics]
elif not isinstance(metrics, list):
metrics = ['sad']
best_metrics_data, best_iter = get_best(
os.path.join(save_dir, 'best_model', 'best_metrics.txt'),
metrics,
resume_model=resume_model)
avg_loss = defaultdict(float)
iters_per_epoch = len(batch_sampler)
reader_cost_averager = TimeAverager()
batch_cost_averager = TimeAverager()
save_models = deque()
batch_start = time.time()
iter = start_iter
while iter < iters:
for data in loader:
iter += 1
if iter > iters:
break
reader_cost_averager.record(time.time() - batch_start)
if precision == 'fp16':
with paddle.amp.auto_cast(
level=amp_level,
enable=True,
custom_white_list={
"elementwise_add", "batch_norm", "sync_batch_norm"
},
custom_black_list={'bilinear_interp_v2', 'pad3d'}):
logit_dict, loss_dict = ddp_model(
data) if nranks > 1 else model(data)
scaled = scaler.scale(loss_dict['all']) # scale the loss
scaled.backward() # do backward
scaler.minimize(optimizer, scaled) # update parameters
else:
logit_dict, loss_dict = ddp_model(
data) if nranks > 1 else model(data)
loss_dict['all'].backward()
optimizer.step()
lr = optimizer.get_lr()
if isinstance(optimizer._learning_rate,
paddle.optimizer.lr.LRScheduler):
optimizer._learning_rate.step()
train_profiler.add_profiler_step(profiler_options)
model.clear_gradients()
for key, value in loss_dict.items():
avg_loss[key] += float(value)
batch_cost_averager.record(
time.time() - batch_start, num_samples=batch_size)
if (iter) % log_iters == 0 and local_rank == 0:
for key, value in avg_loss.items():
avg_loss[key] = value / log_iters
remain_iters = iters - iter
avg_train_batch_cost = batch_cost_averager.get_average()
avg_train_reader_cost = reader_cost_averager.get_average()
eta = calculate_eta(remain_iters, avg_train_batch_cost)
# loss info
                loss_str = ' ' * 26 + '\t[LOSSES]'
for key, value in avg_loss.items():
if key != 'all':
loss_str = loss_str + ' ' + key + '={:.4f}'.format(
value)
logger.info(
"[TRAIN] epoch={}, iter={}/{}, loss={:.4f}, lr={:.6f}, batch_cost={:.4f}, reader_cost={:.5f}, ips={:.4f} samples/sec | ETA {}\n{}\n"
.format((iter - 1) // iters_per_epoch + 1, iter, iters,
avg_loss['all'], lr, avg_train_batch_cost,
avg_train_reader_cost,
batch_cost_averager.get_ips_average(
), eta, loss_str))
if use_vdl:
for key, value in avg_loss.items():
log_tag = 'Train/' + key
log_writer.add_scalar(log_tag, value, iter)
log_writer.add_scalar('Train/lr', lr, iter)
log_writer.add_scalar('Train/batch_cost',
avg_train_batch_cost, iter)
log_writer.add_scalar('Train/reader_cost',
avg_train_reader_cost, iter)
if iter % log_image_iters == 0:
vis_dict = {}
# ground truth
vis_dict['ground truth/img'] = data['img'][0]
for key in data['gt_fields']:
key = key[0]
vis_dict['/'.join(['ground truth', key])] = data[
key][0]
# predict
for key, value in logit_dict.items():
vis_dict['/'.join(['predict', key])] = logit_dict[
key][0]
                        visual_in_training(
                            log_writer=log_writer, vis_dict=vis_dict, step=iter)
for key in avg_loss.keys():
avg_loss[key] = 0.
reader_cost_averager.reset()
batch_cost_averager.reset()
# save model
if (iter % save_interval == 0 or iter == iters) and local_rank == 0:
current_save_dir = os.path.join(save_dir,
"iter_{}".format(iter))
if not os.path.isdir(current_save_dir):
os.makedirs(current_save_dir)
paddle.save(model.state_dict(),
os.path.join(current_save_dir, 'model.pdparams'))
paddle.save(optimizer.state_dict(),
os.path.join(current_save_dir, 'model.pdopt'))
save_models.append(current_save_dir)
if len(save_models) > keep_checkpoint_max > 0:
model_to_remove = save_models.popleft()
shutil.rmtree(model_to_remove)
# eval model
if eval_begin_iters is None:
eval_begin_iters = iters // 2
if (iter % save_interval == 0 or iter == iters) and (
val_dataset is not None
) and local_rank == 0 and iter >= eval_begin_iters:
num_workers = 1 if num_workers > 0 else 0
metrics_data = evaluate(
model,
val_dataset,
                num_workers=num_workers,
print_detail=True,
save_results=False,
metrics=metrics,
precision=precision,
amp_level=amp_level)
model.train()
# save best model and add evaluation results to vdl
if (iter % save_interval == 0 or iter == iters) and local_rank == 0:
if val_dataset is not None and iter >= eval_begin_iters:
if metrics_data[metrics[0]] < best_metrics_data[metrics[0]]:
best_iter = iter
best_metrics_data = metrics_data.copy()
best_model_dir = os.path.join(save_dir, "best_model")
paddle.save(
model.state_dict(),
os.path.join(best_model_dir, 'model.pdparams'))
save_best(best_model_dir, best_metrics_data, iter)
show_list = []
for key, value in best_metrics_data.items():
show_list.append((key, value))
log_str = '[EVAL] The model with the best validation {} ({:.4f}) was saved at iter {}.'.format(
show_list[0][0], show_list[0][1], best_iter)
if len(show_list) > 1:
log_str += " While"
for i in range(1, len(show_list)):
log_str = log_str + ' {}: {:.4f},'.format(
show_list[i][0], show_list[i][1])
log_str = log_str[:-1]
logger.info(log_str)
if use_vdl:
for key, value in metrics_data.items():
log_writer.add_scalar('Evaluate/' + key, value,
iter)
batch_start = time.time()
# Sleep for half a second to let dataloader release resources.
time.sleep(0.5)
if use_vdl:
log_writer.close()
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import cv2
import numpy as np
import time
import paddle
import paddle.nn.functional as F
from paddleseg.utils import TimeAverager, calculate_eta, logger, progbar
from ppmatting.metrics import metrics_class_dict
np.set_printoptions(suppress=True)
def save_alpha_pred(alpha, path):
"""
    The value of alpha is in the range [0, 255]; the shape should be [h, w].
"""
dirname = os.path.dirname(path)
if not os.path.exists(dirname):
os.makedirs(dirname)
alpha = (alpha).astype('uint8')
cv2.imwrite(path, alpha)
def reverse_transform(alpha, trans_info):
"""recover pred to origin shape"""
for item in trans_info[::-1]:
if item[0][0] == 'resize':
h, w = item[1][0], item[1][1]
alpha = F.interpolate(alpha, [h, w], mode='bilinear')
elif item[0][0] == 'padding':
h, w = item[1][0], item[1][1]
alpha = alpha[:, :, 0:h, 0:w]
else:
raise Exception("Unexpected info '{}' in im_info".format(item[0]))
return alpha
def evaluate(model,
eval_dataset,
num_workers=0,
print_detail=True,
save_dir='output/results',
save_results=True,
metrics='sad',
precision='fp32',
amp_level='O1'):
model.eval()
nranks = paddle.distributed.ParallelEnv().nranks
local_rank = paddle.distributed.ParallelEnv().local_rank
if nranks > 1:
# Initialize parallel environment if not done.
if not paddle.distributed.parallel.parallel_helper._is_parallel_ctx_initialized(
):
paddle.distributed.init_parallel_env()
loader = paddle.io.DataLoader(
eval_dataset,
batch_size=1,
drop_last=False,
num_workers=num_workers,
return_list=True, )
total_iters = len(loader)
# Get metric instances and data saving
metrics_ins = {}
metrics_data = {}
if isinstance(metrics, str):
metrics = [metrics]
elif not isinstance(metrics, list):
metrics = ['sad']
for key in metrics:
key = key.lower()
metrics_ins[key] = metrics_class_dict[key]()
metrics_data[key] = None
if print_detail:
logger.info("Start evaluating (total_samples: {}, total_iters: {})...".
format(len(eval_dataset), total_iters))
progbar_val = progbar.Progbar(
target=total_iters, verbose=1 if nranks < 2 else 2)
reader_cost_averager = TimeAverager()
batch_cost_averager = TimeAverager()
batch_start = time.time()
img_name = ''
i = 0
with paddle.no_grad():
for iter, data in enumerate(loader):
reader_cost_averager.record(time.time() - batch_start)
if precision == 'fp16':
with paddle.amp.auto_cast(
level=amp_level,
enable=True,
custom_white_list={
"elementwise_add", "batch_norm", "sync_batch_norm"
},
custom_black_list={'bilinear_interp_v2', 'pad3d'}):
alpha_pred = model(data)
alpha_pred = reverse_transform(alpha_pred,
data['trans_info'])
else:
alpha_pred = model(data)
alpha_pred = reverse_transform(alpha_pred, data['trans_info'])
alpha_pred = alpha_pred.numpy()
alpha_gt = data['alpha'].numpy() * 255
trimap = data.get('ori_trimap')
if trimap is not None:
trimap = trimap.numpy().astype('uint8')
alpha_pred = np.round(alpha_pred * 255)
for key in metrics_ins.keys():
metrics_data[key] = metrics_ins[key].update(alpha_pred,
alpha_gt, trimap)
if save_results:
alpha_pred_one = alpha_pred[0].squeeze()
if trimap is not None:
trimap = trimap.squeeze().astype('uint8')
alpha_pred_one[trimap == 255] = 255
alpha_pred_one[trimap == 0] = 0
save_name = data['img_name'][0]
name, ext = os.path.splitext(save_name)
if save_name == img_name:
save_name = name + '_' + str(i) + ext
i += 1
else:
img_name = save_name
save_name = name + '_' + str(i) + ext
i = 1
save_alpha_pred(alpha_pred_one,
os.path.join(save_dir, save_name))
batch_cost_averager.record(
time.time() - batch_start, num_samples=len(alpha_gt))
batch_cost = batch_cost_averager.get_average()
reader_cost = reader_cost_averager.get_average()
if local_rank == 0 and print_detail:
show_list = [(k, v) for k, v in metrics_data.items()]
show_list = show_list + [('batch_cost', batch_cost),
('reader cost', reader_cost)]
progbar_val.update(iter + 1, show_list)
reader_cost_averager.reset()
batch_cost_averager.reset()
batch_start = time.time()
for key in metrics_ins.keys():
metrics_data[key] = metrics_ins[key].evaluate()
log_str = '[EVAL] '
for key, value in metrics_data.items():
log_str = log_str + key + ': {:.4f}, '.format(value)
log_str = log_str[:-2]
logger.info(log_str)
return metrics_data
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import cv2
import numpy as np
import time
import paddle
import paddle.nn.functional as F
from paddleseg.utils import TimeAverager, calculate_eta, logger, progbar
from ppmatting.metrics import metric
from pymatting.util.util import load_image, save_image, stack_images
from pymatting.foreground.estimate_foreground_ml import estimate_foreground_ml
np.set_printoptions(suppress=True)
def save_alpha_pred(alpha, path):
"""
    The value of alpha is in the range [0, 255]; the shape should be [h, w].
"""
dirname = os.path.dirname(path)
if not os.path.exists(dirname):
os.makedirs(dirname)
alpha = (alpha).astype('uint8')
cv2.imwrite(path, alpha)
def reverse_transform(alpha, trans_info):
"""recover pred to origin shape"""
for item in trans_info[::-1]:
if item[0][0] == 'resize':
h, w = int(item[1][0]), int(item[1][1])
alpha = cv2.resize(alpha, dsize=(w, h))
elif item[0][0] == 'padding':
h, w = int(item[1][0]), int(item[1][1])
alpha = alpha[0:h, 0:w]
else:
raise Exception("Unexpected info '{}' in im_info".format(item[0]))
return alpha
def evaluate_ml(model,
eval_dataset,
num_workers=0,
print_detail=True,
save_dir='output/results',
save_results=True):
loader = paddle.io.DataLoader(
eval_dataset,
batch_size=1,
drop_last=False,
num_workers=num_workers,
return_list=True, )
total_iters = len(loader)
mse_metric = metric.MSE()
sad_metric = metric.SAD()
grad_metric = metric.Grad()
conn_metric = metric.Conn()
if print_detail:
logger.info("Start evaluating (total_samples: {}, total_iters: {})...".
format(len(eval_dataset), total_iters))
progbar_val = progbar.Progbar(target=total_iters, verbose=1)
reader_cost_averager = TimeAverager()
batch_cost_averager = TimeAverager()
batch_start = time.time()
img_name = ''
i = 0
ignore_cnt = 0
for iter, data in enumerate(loader):
reader_cost_averager.record(time.time() - batch_start)
image_rgb_chw = data['img'].numpy()[0]
image_rgb_hwc = np.transpose(image_rgb_chw, (1, 2, 0))
trimap = data['trimap'].numpy().squeeze() / 255.0
image = image_rgb_hwc * 0.5 + 0.5 # reverse normalize (x/255 - mean) / std
is_fg = trimap >= 0.9
is_bg = trimap <= 0.1
if is_fg.sum() == 0 or is_bg.sum() == 0:
ignore_cnt += 1
            logger.info('Skip sample {}: trimap has no foreground or background.'.format(iter))
continue
alpha_pred = model(image, trimap)
alpha_pred = reverse_transform(alpha_pred, data['trans_info'])
alpha_gt = data['alpha'].numpy().squeeze() * 255
trimap = data['ori_trimap'].numpy().squeeze()
alpha_pred = np.round(alpha_pred * 255)
mse = mse_metric.update(alpha_pred, alpha_gt, trimap)
sad = sad_metric.update(alpha_pred, alpha_gt, trimap)
grad = grad_metric.update(alpha_pred, alpha_gt, trimap)
conn = conn_metric.update(alpha_pred, alpha_gt, trimap)
if sad > 1000:
print(data['img_name'][0])
if save_results:
alpha_pred_one = alpha_pred
alpha_pred_one[trimap == 255] = 255
alpha_pred_one[trimap == 0] = 0
save_name = data['img_name'][0]
name, ext = os.path.splitext(save_name)
if save_name == img_name:
save_name = name + '_' + str(i) + ext
i += 1
else:
img_name = save_name
save_name = name + '_' + str(0) + ext
i = 1
save_alpha_pred(alpha_pred_one, os.path.join(save_dir, save_name))
batch_cost_averager.record(
time.time() - batch_start, num_samples=len(alpha_gt))
batch_cost = batch_cost_averager.get_average()
reader_cost = reader_cost_averager.get_average()
if print_detail:
progbar_val.update(iter + 1,
[('SAD', sad), ('MSE', mse), ('Grad', grad),
('Conn', conn), ('batch_cost', batch_cost),
('reader cost', reader_cost)])
reader_cost_averager.reset()
batch_cost_averager.reset()
batch_start = time.time()
mse = mse_metric.evaluate()
sad = sad_metric.evaluate()
grad = grad_metric.evaluate()
conn = conn_metric.evaluate()
logger.info('[EVAL] SAD: {:.4f}, MSE: {:.4f}, Grad: {:.4f}, Conn: {:.4f}'.
format(sad, mse, grad, conn))
    logger.info('Ignored samples: {}'.format(ignore_cnt))
return sad, mse, grad, conn
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from .matting_dataset import MattingDataset
from .composition_1k import Composition1K
from .distinctions_646 import Distinctions646
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import math
import cv2
import numpy as np
import random
import paddle
from paddleseg.cvlibs import manager
import ppmatting.transforms as T
from ppmatting.datasets.matting_dataset import MattingDataset
@manager.DATASETS.add_component
class Composition1K(MattingDataset):
def __init__(self, **kwargs):
super().__init__(**kwargs)
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import math
import cv2
import numpy as np
import random
import paddle
from paddleseg.cvlibs import manager
import ppmatting.transforms as T
from ppmatting.datasets.matting_dataset import MattingDataset
@manager.DATASETS.add_component
class Distinctions646(MattingDataset):
def __init__(self, **kwargs):
super().__init__(**kwargs)
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import math
import cv2
import numpy as np
import random
import paddle
from paddleseg.cvlibs import manager
import ppmatting.transforms as T
@manager.DATASETS.add_component
class MattingDataset(paddle.io.Dataset):
"""
    Pass in a dataset that conforms to the following format.
matting_dataset/
|--bg/
|
|--train/
| |--fg/
| |--alpha/
|
|--val/
| |--fg/
| |--alpha/
| |--trimap/ (if existing)
|
|--train.txt
|
|--val.txt
    See README.md for more information about the dataset.
    Args:
        dataset_root (str): The root path of the dataset.
        transforms (list): Transforms for the image.
        mode (str, optional): Which part of the dataset to use. It is one of ('train', 'val', 'trainval'). Default: 'train'.
        train_file (str|list, optional): The file list used for training. Each line should be `foreground_image.png background_image.png`
            or `foreground_image.png`. It should be provided if mode is 'train'. Default: None.
        val_file (str|list, optional): The file list used for evaluation. Each line should be `foreground_image.png background_image.png`,
            `foreground_image.png`, or `foreground_image.png background_image.png trimap_image.png`.
            It should be provided if mode is 'val'. Default: None.
        get_trimap (bool, optional): Whether to generate a trimap. Default: True.
        separator (str, optional): The separator used in train_file or val_file. If the file names contain spaces, '|' may be a better choice. Default: ' '.
        key_del (tuple|list, optional): Keys that are not needed will be deleted to speed up the data reader. Default: None.
        if_rssn (bool, optional): Whether to use RSSN when compositing images, including denoising and blurring. Default: False.
"""
def __init__(self,
dataset_root,
transforms,
mode='train',
train_file=None,
val_file=None,
get_trimap=True,
separator=' ',
key_del=None,
if_rssn=False):
super().__init__()
self.dataset_root = dataset_root
self.transforms = T.Compose(transforms)
self.mode = mode
self.get_trimap = get_trimap
self.separator = separator
self.key_del = key_del
self.if_rssn = if_rssn
# check file
if mode == 'train' or mode == 'trainval':
if train_file is None:
raise ValueError(
"When `mode` is 'train' or 'trainval', `train_file must be provided!"
)
if isinstance(train_file, str):
train_file = [train_file]
file_list = train_file
if mode == 'val' or mode == 'trainval':
if val_file is None:
raise ValueError(
"When `mode` is 'val' or 'trainval', `val_file must be provided!"
)
if isinstance(val_file, str):
val_file = [val_file]
file_list = val_file
if mode == 'trainval':
file_list = train_file + val_file
# read file
self.fg_bg_list = []
for file in file_list:
file = os.path.join(dataset_root, file)
with open(file, 'r') as f:
lines = f.readlines()
for line in lines:
line = line.strip()
self.fg_bg_list.append(line)
if mode != 'val':
random.shuffle(self.fg_bg_list)
def __getitem__(self, idx):
data = {}
fg_bg_file = self.fg_bg_list[idx]
fg_bg_file = fg_bg_file.split(self.separator)
        data['img_name'] = fg_bg_file[0]  # used when saving prediction results
fg_file = os.path.join(self.dataset_root, fg_bg_file[0])
alpha_file = fg_file.replace('/fg', '/alpha')
fg = cv2.imread(fg_file)
alpha = cv2.imread(alpha_file, 0)
data['alpha'] = alpha
data['gt_fields'] = []
# line is: fg [bg] [trimap]
if len(fg_bg_file) >= 2:
bg_file = os.path.join(self.dataset_root, fg_bg_file[1])
bg = cv2.imread(bg_file)
data['img'], data['fg'], data['bg'] = self.composite(fg, alpha, bg)
if self.mode in ['train', 'trainval']:
data['gt_fields'].append('fg')
data['gt_fields'].append('bg')
data['gt_fields'].append('alpha')
if len(fg_bg_file) == 3 and self.get_trimap:
if self.mode == 'val':
trimap_path = os.path.join(self.dataset_root, fg_bg_file[2])
if os.path.exists(trimap_path):
data['trimap'] = trimap_path
data['gt_fields'].append('trimap')
data['ori_trimap'] = cv2.imread(trimap_path, 0)
else:
raise FileNotFoundError(
'trimap is not Found: {}'.format(fg_bg_file[2]))
else:
data['img'] = fg
if self.mode in ['train', 'trainval']:
data['fg'] = fg.copy()
data['bg'] = fg.copy()
data['gt_fields'].append('fg')
data['gt_fields'].append('bg')
data['gt_fields'].append('alpha')
data['trans_info'] = [] # Record shape change information
# Generate trimap from alpha if no trimap file provided
if self.get_trimap:
if 'trimap' not in data:
data['trimap'] = self.gen_trimap(
data['alpha'], mode=self.mode).astype('float32')
data['gt_fields'].append('trimap')
if self.mode == 'val':
data['ori_trimap'] = data['trimap'].copy()
        # Delete keys that are not needed
if self.key_del is not None:
for key in self.key_del:
if key in data.keys():
data.pop(key)
if key in data['gt_fields']:
data['gt_fields'].remove(key)
data = self.transforms(data)
        # During evaluation, gt should not be transformed.
if self.mode == 'val':
data['gt_fields'].append('alpha')
data['img'] = data['img'].astype('float32')
for key in data.get('gt_fields', []):
data[key] = data[key].astype('float32')
if 'trimap' in data:
data['trimap'] = data['trimap'][np.newaxis, :, :]
if 'ori_trimap' in data:
data['ori_trimap'] = data['ori_trimap'][np.newaxis, :, :]
data['alpha'] = data['alpha'][np.newaxis, :, :] / 255.
return data
def __len__(self):
return len(self.fg_bg_list)
def composite(self, fg, alpha, ori_bg):
if self.if_rssn:
if np.random.rand() < 0.5:
fg = cv2.fastNlMeansDenoisingColored(fg, None, 3, 3, 7, 21)
ori_bg = cv2.fastNlMeansDenoisingColored(ori_bg, None, 3, 3, 7,
21)
if np.random.rand() < 0.5:
radius = np.random.choice([19, 29, 39, 49, 59])
ori_bg = cv2.GaussianBlur(ori_bg, (radius, radius), 0, 0)
fg_h, fg_w = fg.shape[:2]
ori_bg_h, ori_bg_w = ori_bg.shape[:2]
wratio = fg_w / ori_bg_w
hratio = fg_h / ori_bg_h
ratio = wratio if wratio > hratio else hratio
# Resize ori_bg if it is smaller than fg.
if ratio > 1:
resize_h = math.ceil(ori_bg_h * ratio)
resize_w = math.ceil(ori_bg_w * ratio)
bg = cv2.resize(
ori_bg, (resize_w, resize_h), interpolation=cv2.INTER_LINEAR)
else:
bg = ori_bg
bg = bg[0:fg_h, 0:fg_w, :]
alpha = alpha / 255
alpha = np.expand_dims(alpha, axis=2)
image = alpha * fg + (1 - alpha) * bg
image = image.astype(np.uint8)
return image, fg, bg
@staticmethod
def gen_trimap(alpha, mode='train', eval_kernel=25):
if mode == 'train':
k_size = random.choice(range(2, 5))
iterations = np.random.randint(5, 15)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
(k_size, k_size))
dilated = cv2.dilate(alpha, kernel, iterations=iterations)
eroded = cv2.erode(alpha, kernel, iterations=iterations)
trimap = np.zeros(alpha.shape)
trimap.fill(128)
trimap[eroded > 254.5] = 255
trimap[dilated < 0.5] = 0
else:
k_size = eval_kernel
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
(k_size, k_size))
dilated = cv2.dilate(alpha, kernel)
eroded = cv2.erode(alpha, kernel)
trimap = np.zeros(alpha.shape)
trimap.fill(128)
trimap[eroded > 254.5] = 255
trimap[dilated < 0.5] = 0
return trimap
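# Illustrative: the generated trimap takes values in {0, 128, 255}: 0 for
# definite background, 255 for definite foreground, and 128 for the unknown
# band created by eroding/dilating the alpha matte.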
from .metric import MSE, SAD, Grad, Conn
metrics_class_dict = {'sad': SAD, 'mse': MSE, 'grad': Grad, 'conn': Conn}
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Grad and Conn refer to https://github.com/yucornetto/MGMatting/blob/main/code-base/utils/evaluate.py
# The output of `Grad` is slightly different from the MATLAB version provided by Adobe (less than 0.1%).
# The output of `Conn` is smaller than the MATLAB version (~5%; MATLAB may use a different algorithm).
# So do not report results calculated by these functions in your paper.
# Evaluate your inference with the MATLAB file `DIM_evaluation_code/evaluate.m`.
import cv2
import numpy as np
from scipy.ndimage.filters import convolve
from scipy.special import gamma
from skimage.measure import label
class MSE:
"""
Only calculate the unknown region if trimap provided.
"""
def __init__(self):
self.mse_diffs = 0
self.count = 0
def update(self, pred, gt, trimap=None):
"""
update metric.
Args:
pred (np.ndarray): The value range is [0., 255.].
gt (np.ndarray): The value range is [0, 255].
            trimap (np.ndarray, optional): The value is in {0, 128, 255}. Default: None.
"""
if trimap is None:
trimap = np.ones_like(gt) * 128
if not (pred.shape == gt.shape == trimap.shape):
raise ValueError(
'The shape of `pred`, `gt` and `trimap` should be equal. '
'but they are {}, {} and {}'.format(pred.shape, gt.shape,
trimap.shape))
pred[trimap == 0] = 0
pred[trimap == 255] = 255
mask = trimap == 128
pixels = float(mask.sum())
pred = pred / 255.
gt = gt / 255.
diff = (pred - gt) * mask
mse_diff = (diff**2).sum() / pixels if pixels > 0 else 0
self.mse_diffs += mse_diff
self.count += 1
return mse_diff
def evaluate(self):
mse = self.mse_diffs / self.count if self.count > 0 else 0
return mse
class SAD:
"""
Only calculate the unknown region if trimap provided.
"""
def __init__(self):
self.sad_diffs = 0
self.count = 0
def update(self, pred, gt, trimap=None):
"""
update metric.
Args:
pred (np.ndarray): The value range is [0., 255.].
gt (np.ndarray): The value range is [0., 255.].
            trimap (np.ndarray, optional): The value is in {0, 128, 255}. Default: None.
"""
if trimap is None:
trimap = np.ones_like(gt) * 128
if not (pred.shape == gt.shape == trimap.shape):
raise ValueError(
'The shape of `pred`, `gt` and `trimap` should be equal. '
'but they are {}, {} and {}'.format(pred.shape, gt.shape,
trimap.shape))
pred[trimap == 0] = 0
pred[trimap == 255] = 255
mask = trimap == 128
pred = pred / 255.
gt = gt / 255.
diff = (pred - gt) * mask
sad_diff = (np.abs(diff)).sum()
sad_diff /= 1000
self.sad_diffs += sad_diff
self.count += 1
return sad_diff
def evaluate(self):
sad = self.sad_diffs / self.count if self.count > 0 else 0
return sad
class Grad:
"""
Only calculate the unknown region if trimap provided.
    Refer to: https://github.com/open-mmlab/mmediting/blob/master/mmedit/core/evaluation/metrics.py
"""
def __init__(self):
self.grad_diffs = 0
self.count = 0
def gaussian(self, x, sigma):
return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
def dgaussian(self, x, sigma):
return -x * self.gaussian(x, sigma) / sigma**2
def gauss_filter(self, sigma, epsilon=1e-2):
half_size = np.ceil(
sigma * np.sqrt(-2 * np.log(np.sqrt(2 * np.pi) * sigma * epsilon)))
size = int(2 * half_size + 1)
# create filter in x axis
filter_x = np.zeros((size, size))
for i in range(size):
for j in range(size):
filter_x[i, j] = self.gaussian(
i - half_size, sigma) * self.dgaussian(j - half_size, sigma)
# normalize filter
norm = np.sqrt((filter_x**2).sum())
filter_x = filter_x / norm
filter_y = np.transpose(filter_x)
return filter_x, filter_y
def gauss_gradient(self, img, sigma):
filter_x, filter_y = self.gauss_filter(sigma)
img_filtered_x = cv2.filter2D(
img, -1, filter_x, borderType=cv2.BORDER_REPLICATE)
img_filtered_y = cv2.filter2D(
img, -1, filter_y, borderType=cv2.BORDER_REPLICATE)
return np.sqrt(img_filtered_x**2 + img_filtered_y**2)
def update(self, pred, gt, trimap=None, sigma=1.4):
"""
update metric.
Args:
            pred (np.ndarray): The value range is [0., 255.].
            gt (np.ndarray): The value range is [0, 255].
            trimap (np.ndarray, optional): The value is in {0, 128, 255}. Default: None.
sigma (float, optional): Standard deviation of the gaussian kernel. Default: 1.4.
"""
if trimap is None:
trimap = np.ones_like(gt) * 128
if not (pred.shape == gt.shape == trimap.shape):
raise ValueError(
'The shape of `pred`, `gt` and `trimap` should be equal. '
'but they are {}, {} and {}'.format(pred.shape, gt.shape,
trimap.shape))
pred[trimap == 0] = 0
pred[trimap == 255] = 255
gt = gt.squeeze()
pred = pred.squeeze()
gt = gt.astype(np.float64)
pred = pred.astype(np.float64)
gt_normed = np.zeros_like(gt)
pred_normed = np.zeros_like(pred)
cv2.normalize(gt, gt_normed, 1., 0., cv2.NORM_MINMAX)
cv2.normalize(pred, pred_normed, 1., 0., cv2.NORM_MINMAX)
gt_grad = self.gauss_gradient(gt_normed, sigma).astype(np.float32)
pred_grad = self.gauss_gradient(pred_normed, sigma).astype(np.float32)
grad_diff = ((gt_grad - pred_grad)**2 * (trimap == 128)).sum()
grad_diff /= 1000
self.grad_diffs += grad_diff
self.count += 1
return grad_diff
def evaluate(self):
grad = self.grad_diffs / self.count if self.count > 0 else 0
return grad
class Conn:
"""
Only calculate the unknown region if trimap provided.
    Refer to: https://github.com/open-mmlab/mmediting/blob/master/mmedit/core/evaluation/metrics.py
"""
def __init__(self):
self.conn_diffs = 0
self.count = 0
def update(self, pred, gt, trimap=None, step=0.1):
"""
update metric.
Args:
            pred (np.ndarray): The value range is [0., 255.].
            gt (np.ndarray): The value range is [0, 255].
            trimap (np.ndarray, optional): The value is in {0, 128, 255}. Default: None.
step (float, optional): Step of threshold when computing intersection between
`gt` and `pred`. Default: 0.1.
"""
if trimap is None:
trimap = np.ones_like(gt) * 128
if not (pred.shape == gt.shape == trimap.shape):
raise ValueError(
'The shape of `pred`, `gt` and `trimap` should be equal. '
'but they are {}, {} and {}'.format(pred.shape, gt.shape,
trimap.shape))
pred[trimap == 0] = 0
pred[trimap == 255] = 255
gt = gt.squeeze()
pred = pred.squeeze()
gt = gt.astype(np.float32) / 255
pred = pred.astype(np.float32) / 255
thresh_steps = np.arange(0, 1 + step, step)
round_down_map = -np.ones_like(gt)
for i in range(1, len(thresh_steps)):
gt_thresh = gt >= thresh_steps[i]
pred_thresh = pred >= thresh_steps[i]
intersection = (gt_thresh & pred_thresh).astype(np.uint8)
# connected components
_, output, stats, _ = cv2.connectedComponentsWithStats(
intersection, connectivity=4)
# start from 1 in dim 0 to exclude background
size = stats[1:, -1]
# largest connected component of the intersection
omega = np.zeros_like(gt)
if len(size) != 0:
max_id = np.argmax(size)
# plus one to include background
omega[output == max_id + 1] = 1
mask = (round_down_map == -1) & (omega == 0)
round_down_map[mask] = thresh_steps[i - 1]
round_down_map[round_down_map == -1] = 1
gt_diff = gt - round_down_map
pred_diff = pred - round_down_map
# only calculate difference larger than or equal to 0.15
gt_phi = 1 - gt_diff * (gt_diff >= 0.15)
pred_phi = 1 - pred_diff * (pred_diff >= 0.15)
conn_diff = np.sum(np.abs(gt_phi - pred_phi) * (trimap == 128))
conn_diff /= 1000
self.conn_diffs += conn_diff
self.count += 1
return conn_diff
def evaluate(self):
conn = self.conn_diffs / self.count if self.count > 0 else 0
return conn
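# Minimal usage sketch (hypothetical arrays): every metric accumulates over
# update() calls and evaluate() returns the average across images.
#     sad = SAD()
#     sad.update(pred, gt, trimap)  # pred/gt in [0, 255], trimap in {0, 128, 255}
#     mean_sad = sad.evaluate()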
from .methods import CloseFormMatting, KNNMatting, LearningBasedMatting, FastMatting, RandomWalksMatting
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import pymatting
from paddleseg.cvlibs import manager
class BaseMLMatting(object):
def __init__(self, alpha_estimator, **kargs):
self.alpha_estimator = alpha_estimator
self.kargs = kargs
def __call__(self, image, trimap):
image = self.__to_float64(image)
trimap = self.__to_float64(trimap)
alpha_matte = self.alpha_estimator(image, trimap, **self.kargs)
return alpha_matte
def __to_float64(self, x):
x_dtype = x.dtype
assert x_dtype in ["float32", "float64"]
x = x.astype("float64")
return x
@manager.MODELS.add_component
class CloseFormMatting(BaseMLMatting):
def __init__(self, **kargs):
cf_alpha_estimator = pymatting.estimate_alpha_cf
super().__init__(cf_alpha_estimator, **kargs)
@manager.MODELS.add_component
class KNNMatting(BaseMLMatting):
def __init__(self, **kargs):
knn_alpha_estimator = pymatting.estimate_alpha_knn
super().__init__(knn_alpha_estimator, **kargs)
@manager.MODELS.add_component
class LearningBasedMatting(BaseMLMatting):
def __init__(self, **kargs):
lbdm_alpha_estimator = pymatting.estimate_alpha_lbdm
super().__init__(lbdm_alpha_estimator, **kargs)
@manager.MODELS.add_component
class FastMatting(BaseMLMatting):
def __init__(self, **kargs):
lkm_alpha_estimator = pymatting.estimate_alpha_lkm
super().__init__(lkm_alpha_estimator, **kargs)
@manager.MODELS.add_component
class RandomWalksMatting(BaseMLMatting):
def __init__(self, **kargs):
rw_alpha_estimator = pymatting.estimate_alpha_rw
super().__init__(rw_alpha_estimator, **kargs)
if __name__ == "__main__":
from pymatting.util.util import load_image, save_image, stack_images
from pymatting.foreground.estimate_foreground_ml import estimate_foreground_ml
import cv2
root = "/mnt/liuyi22/PaddlePaddle/PaddleSeg/Matting/data/examples/"
image_path = root + "lemur.png"
trimap_path = root + "lemur_trimap.png"
cutout_path = root + "lemur_cutout.png"
image = cv2.cvtColor(
cv2.imread(image_path).astype("float64"), cv2.COLOR_BGR2RGB) / 255.0
cv2.imwrite("image.png", (image * 255).astype('uint8'))
trimap = load_image(trimap_path, "GRAY")
print(image.shape, trimap.shape)
print(image.dtype, trimap.dtype)
cf = CloseFormMatting()
alpha = cf(image, trimap)
# alpha = pymatting.estimate_alpha_lkm(image, trimap)
foreground = estimate_foreground_ml(image, alpha)
cutout = stack_images(foreground, alpha)
save_image(cutout_path, cutout)
from .backbone import *
from .losses import *
from .modnet import MODNet
from .human_matting import HumanMatting
from .dim import DIM
from .ppmatting import PPMatting
from .gca import GCABaseline, GCA
from .ppmattingv2 import PPMattingV2
from .mobilenet_v2 import *
from .hrnet import *
from .resnet_vd import *
from .vgg import *
from .gca_enc import *
from .stdcnet import *
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# The gca code was heavily based on https://github.com/Yaoyi-Li/GCA-Matting
# and https://github.com/open-mmlab/mmediting
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddleseg.cvlibs import manager, param_init
from paddleseg.utils import utils
from ppmatting.models.layers import GuidedCxtAtten
class ResNet_D(nn.Layer):
def __init__(self,
input_channels,
layers,
late_downsample=False,
pretrained=None):
super().__init__()
self.pretrained = pretrained
self._norm_layer = nn.BatchNorm
self.inplanes = 64
self.late_downsample = late_downsample
self.midplanes = 64 if late_downsample else 32
self.start_stride = [1, 2, 1, 2] if late_downsample else [2, 1, 2, 1]
self.conv1 = nn.utils.spectral_norm(
nn.Conv2D(
input_channels,
32,
kernel_size=3,
stride=self.start_stride[0],
padding=1,
bias_attr=False))
self.conv2 = nn.utils.spectral_norm(
nn.Conv2D(
32,
self.midplanes,
kernel_size=3,
stride=self.start_stride[1],
padding=1,
bias_attr=False))
self.conv3 = nn.utils.spectral_norm(
nn.Conv2D(
self.midplanes,
self.inplanes,
kernel_size=3,
stride=self.start_stride[2],
padding=1,
bias_attr=False))
self.bn1 = self._norm_layer(32)
self.bn2 = self._norm_layer(self.midplanes)
self.bn3 = self._norm_layer(self.inplanes)
self.activation = nn.ReLU()
self.layer1 = self._make_layer(
BasicBlock, 64, layers[0], stride=self.start_stride[3])
self.layer2 = self._make_layer(BasicBlock, 128, layers[1], stride=2)
self.layer3 = self._make_layer(BasicBlock, 256, layers[2], stride=2)
self.layer_bottleneck = self._make_layer(
BasicBlock, 512, layers[3], stride=2)
self.init_weight()
def _make_layer(self, block, planes, block_num, stride=1):
if block_num == 0:
return nn.Sequential(nn.Identity())
norm_layer = self._norm_layer
downsample = None
if stride != 1:
downsample = nn.Sequential(
nn.AvgPool2D(2, stride),
nn.utils.spectral_norm(
conv1x1(self.inplanes, planes * block.expansion)),
norm_layer(planes * block.expansion), )
elif self.inplanes != planes * block.expansion:
downsample = nn.Sequential(
nn.utils.spectral_norm(
conv1x1(self.inplanes, planes * block.expansion, stride)),
norm_layer(planes * block.expansion), )
layers = [block(self.inplanes, planes, stride, downsample, norm_layer)]
self.inplanes = planes * block.expansion
for _ in range(1, block_num):
layers.append(block(self.inplanes, planes, norm_layer=norm_layer))
return nn.Sequential(*layers)
def forward(self, x):
x = self.conv1(x)
x = self.bn1(x)
x = self.activation(x)
x = self.conv2(x)
x = self.bn2(x)
x1 = self.activation(x) # N x 32 x 256 x 256
x = self.conv3(x1)
x = self.bn3(x)
x2 = self.activation(x) # N x 64 x 128 x 128
x3 = self.layer1(x2) # N x 64 x 128 x 128
x4 = self.layer2(x3) # N x 128 x 64 x 64
x5 = self.layer3(x4) # N x 256 x 32 x 32
x = self.layer_bottleneck(x5) # N x 512 x 16 x 16
return x, (x1, x2, x3, x4, x5)
def init_weight(self):
for layer in self.sublayers():
if isinstance(layer, nn.Conv2D):
if hasattr(layer, "weight_orig"):
param = layer.weight_orig
else:
param = layer.weight
param_init.xavier_uniform(param)
elif isinstance(layer, (nn.BatchNorm, nn.SyncBatchNorm)):
param_init.constant_init(layer.weight, value=1.0)
param_init.constant_init(layer.bias, value=0.0)
elif isinstance(layer, BasicBlock):
param_init.constant_init(layer.bn2.weight, value=0.0)
if self.pretrained is not None:
utils.load_pretrained_model(self, self.pretrained)
@manager.MODELS.add_component
class ResShortCut_D(ResNet_D):
def __init__(self,
input_channels,
layers,
late_downsample=False,
pretrained=None):
super().__init__(
input_channels,
layers,
late_downsample=late_downsample,
pretrained=pretrained)
self.shortcut_inplane = [input_channels, self.midplanes, 64, 128, 256]
self.shortcut_plane = [32, self.midplanes, 64, 128, 256]
self.shortcut = nn.LayerList()
for stage, inplane in enumerate(self.shortcut_inplane):
self.shortcut.append(
self._make_shortcut(inplane, self.shortcut_plane[stage]))
def _make_shortcut(self, inplane, planes):
return nn.Sequential(
nn.utils.spectral_norm(
nn.Conv2D(
inplane, planes, kernel_size=3, padding=1,
bias_attr=False)),
nn.ReLU(),
self._norm_layer(planes),
nn.utils.spectral_norm(
nn.Conv2D(
planes, planes, kernel_size=3, padding=1, bias_attr=False)),
nn.ReLU(),
self._norm_layer(planes))
def forward(self, x):
out = self.conv1(x)
out = self.bn1(out)
out = self.activation(out)
out = self.conv2(out)
out = self.bn2(out)
x1 = self.activation(out) # N x 32 x 256 x 256
out = self.conv3(x1)
out = self.bn3(out)
out = self.activation(out)
x2 = self.layer1(out) # N x 64 x 128 x 128
x3 = self.layer2(x2) # N x 128 x 64 x 64
x4 = self.layer3(x3) # N x 256 x 32 x 32
out = self.layer_bottleneck(x4) # N x 512 x 16 x 16
fea1 = self.shortcut[0](x) # input image and trimap
fea2 = self.shortcut[1](x1)
fea3 = self.shortcut[2](x2)
fea4 = self.shortcut[3](x3)
fea5 = self.shortcut[4](x4)
return out, {
'shortcut': (fea1, fea2, fea3, fea4, fea5),
'image': x[:, :3, ...]
}
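# ResGuidedCxtAtten adds a guidance head (three stride-2 convs, i.e. an 8x
# downsample of the RGB input) and a GuidedCxtAtten module applied at the
# 1/8-resolution stage, letting features flow from known trimap regions into
# the unknown region before the deeper layers.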
@manager.MODELS.add_component
class ResGuidedCxtAtten(ResNet_D):
def __init__(self,
input_channels,
layers,
late_downsample=False,
pretrained=None):
super().__init__(
input_channels,
layers,
late_downsample=late_downsample,
pretrained=pretrained)
self.input_channels = input_channels
self.shortcut_inplane = [input_channels, self.midplanes, 64, 128, 256]
self.shortcut_plane = [32, self.midplanes, 64, 128, 256]
self.shortcut = nn.LayerList()
for stage, inplane in enumerate(self.shortcut_inplane):
self.shortcut.append(
self._make_shortcut(inplane, self.shortcut_plane[stage]))
self.guidance_head = nn.Sequential(
nn.Pad2D(
1, mode="reflect"),
nn.utils.spectral_norm(
nn.Conv2D(
3, 16, kernel_size=3, padding=0, stride=2,
bias_attr=False)),
nn.ReLU(),
self._norm_layer(16),
nn.Pad2D(
1, mode="reflect"),
nn.utils.spectral_norm(
nn.Conv2D(
16, 32, kernel_size=3, padding=0, stride=2,
bias_attr=False)),
nn.ReLU(),
self._norm_layer(32),
nn.Pad2D(
1, mode="reflect"),
nn.utils.spectral_norm(
nn.Conv2D(
32,
128,
kernel_size=3,
padding=0,
stride=2,
bias_attr=False)),
nn.ReLU(),
self._norm_layer(128))
self.gca = GuidedCxtAtten(128, 128)
self.init_weight()
def init_weight(self):
for layer in self.sublayers():
if isinstance(layer, nn.Conv2D):
initializer = nn.initializer.XavierUniform()
if hasattr(layer, "weight_orig"):
param = layer.weight_orig
else:
param = layer.weight
initializer(param, param.block)
elif isinstance(layer, (nn.BatchNorm, nn.SyncBatchNorm)):
param_init.constant_init(layer.weight, value=1.0)
param_init.constant_init(layer.bias, value=0.0)
elif isinstance(layer, BasicBlock):
param_init.constant_init(layer.bn2.weight, value=0.0)
if self.pretrained is not None:
utils.load_pretrained_model(self, self.pretrained)
def _make_shortcut(self, inplane, planes):
return nn.Sequential(
nn.utils.spectral_norm(
nn.Conv2D(
inplane, planes, kernel_size=3, padding=1,
bias_attr=False)),
nn.ReLU(),
self._norm_layer(planes),
nn.utils.spectral_norm(
nn.Conv2D(
planes, planes, kernel_size=3, padding=1, bias_attr=False)),
nn.ReLU(),
self._norm_layer(planes))
def forward(self, x):
out = self.conv1(x)
out = self.bn1(out)
out = self.activation(out)
out = self.conv2(out)
out = self.bn2(out)
x1 = self.activation(out) # N x 32 x 256 x 256
out = self.conv3(x1)
out = self.bn3(out)
out = self.activation(out)
im_fea = self.guidance_head(
x[:, :3, ...])  # downsample the original image and extract guidance features
if self.input_channels == 6:
unknown = F.interpolate(
x[:, 4:5, ...], scale_factor=1 / 8, mode='nearest')
else:
unknown = x[:, 3:, ...].equal(paddle.to_tensor([1.]))
unknown = paddle.cast(unknown, dtype='float32')
unknown = F.interpolate(unknown, scale_factor=1 / 8, mode='nearest')
x2 = self.layer1(out) # N x 64 x 128 x 128
x3 = self.layer2(x2) # N x 128 x 64 x 64
x3 = self.gca(im_fea, x3, unknown) # contextual attention
x4 = self.layer3(x3) # N x 256 x 32 x 32
out = self.layer_bottleneck(x4) # N x 512 x 16 x 16
fea1 = self.shortcut[0](x) # input image and trimap
fea2 = self.shortcut[1](x1)
fea3 = self.shortcut[2](x2)
fea4 = self.shortcut[3](x3)
fea5 = self.shortcut[4](x4)
return out, {
'shortcut': (fea1, fea2, fea3, fea4, fea5),
'image_fea': im_fea,
'unknown': unknown,
}
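# Standard residual basic block with spectral normalization on both 3x3
# convolutions; `downsample`, when provided, projects the identity branch to
# the output shape.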
class BasicBlock(nn.Layer):
expansion = 1
def __init__(self,
inplanes,
planes,
stride=1,
downsample=None,
norm_layer=None):
super().__init__()
if norm_layer is None:
norm_layer = nn.BatchNorm
# Both self.conv1 and self.downsample layers downsample the input when stride != 1
self.conv1 = nn.utils.spectral_norm(conv3x3(inplanes, planes, stride))
self.bn1 = norm_layer(planes)
self.activation = nn.ReLU()
self.conv2 = nn.utils.spectral_norm(conv3x3(planes, planes))
self.bn2 = norm_layer(planes)
self.downsample = downsample
self.stride = stride
def forward(self, x):
identity = x
out = self.conv1(x)
out = self.bn1(out)
out = self.activation(out)
out = self.conv2(out)
out = self.bn2(out)
if self.downsample is not None:
identity = self.downsample(x)
out += identity
out = self.activation(out)
return out
def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
"""3x3 convolution with padding"""
return nn.Conv2D(
in_planes,
out_planes,
kernel_size=3,
stride=stride,
padding=dilation,
groups=groups,
bias_attr=False,
dilation=dilation)
def conv1x1(in_planes, out_planes, stride=1):
"""1x1 convolution"""
return nn.Conv2D(
in_planes, out_planes, kernel_size=1, stride=stride, bias_attr=False)
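# Illustrative usage sketch (not part of the original file; the `layers`
# spec is an assumed ResNet34-D-style configuration):
#
#     import paddle
#     encoder = ResShortCut_D(input_channels=6, layers=[3, 4, 4, 2])
#     x = paddle.randn([1, 6, 512, 512])  # RGB image + one-hot trimap
#     out, mid_fea = encoder(x)           # out: N x 512 x 16 x 16 bottleneck
#     shortcuts = mid_fea['shortcut']     # five feature maps for the decoder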
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import math
import paddle
import paddle.nn as nn
import paddle.nn.functional as F
from paddleseg.cvlibs import manager, param_init
from paddleseg.models import layers
import ppmatting
__all__ = [
"HRNet_W18_Small_V1", "HRNet_W18_Small_V2", "HRNet_W18", "HRNet_W30",
"HRNet_W32", "HRNet_W40", "HRNet_W44", "HRNet_W48", "HRNet_W60", "HRNet_W64"
]
class HRNet(nn.Layer):
"""
The HRNet implementation based on PaddlePaddle.
The original article refers to
Jingdong Wang, et al. "HRNet: Deep High-Resolution Representation Learning for Visual Recognition"
(https://arxiv.org/pdf/1908.07919.pdf).
Args:
input_channels (int, optional): Number of channels of the input image. Default 3.
pretrained (str, optional): The path of the pretrained model.
stage1_num_modules (int, optional): Number of modules for stage1. Default 1.
stage1_num_blocks (list, optional): Number of blocks per module for stage1. Default (4, ).
stage1_num_channels (list, optional): Number of channels per branch for stage1. Default (64, ).
stage2_num_modules (int, optional): Number of modules for stage2. Default 1.
stage2_num_blocks (list, optional): Number of blocks per module for stage2. Default (4, 4).
stage2_num_channels (list, optional): Number of channels per branch for stage2. Default (18, 36).
stage3_num_modules (int, optional): Number of modules for stage3. Default 4.
stage3_num_blocks (list, optional): Number of blocks per module for stage3. Default (4, 4, 4).
stage3_num_channels (list, optional): Number of channels per branch for stage3. Default (18, 36, 72).
stage4_num_modules (int, optional): Number of modules for stage4. Default 3.
stage4_num_blocks (list, optional): Number of blocks per module for stage4. Default (4, 4, 4, 4).
stage4_num_channels (list, optional): Number of channels per branch for stage4. Default (18, 36, 72, 144).
has_se (bool, optional): Whether to use the Squeeze-and-Excitation module. Default False.
align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even,
e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False.
padding_same (bool, optional): Whether to use 'same' padding for the 3x3 convolutions. Default True.
"""
def __init__(self,
input_channels=3,
pretrained=None,
stage1_num_modules=1,
stage1_num_blocks=(4, ),
stage1_num_channels=(64, ),
stage2_num_modules=1,
stage2_num_blocks=(4, 4),
stage2_num_channels=(18, 36),
stage3_num_modules=4,
stage3_num_blocks=(4, 4, 4),
stage3_num_channels=(18, 36, 72),
stage4_num_modules=3,
stage4_num_blocks=(4, 4, 4, 4),
stage4_num_channels=(18, 36, 72, 144),
has_se=False,
align_corners=False,
padding_same=True):
super(HRNet, self).__init__()
self.pretrained = pretrained
self.stage1_num_modules = stage1_num_modules
self.stage1_num_blocks = stage1_num_blocks
self.stage1_num_channels = stage1_num_channels
self.stage2_num_modules = stage2_num_modules
self.stage2_num_blocks = stage2_num_blocks
self.stage2_num_channels = stage2_num_channels
self.stage3_num_modules = stage3_num_modules
self.stage3_num_blocks = stage3_num_blocks
self.stage3_num_channels = stage3_num_channels
self.stage4_num_modules = stage4_num_modules
self.stage4_num_blocks = stage4_num_blocks
self.stage4_num_channels = stage4_num_channels
self.has_se = has_se
self.align_corners = align_corners
self.feat_channels = [64] + list(stage4_num_channels)
self.conv_layer1_1 = layers.ConvBNReLU(
in_channels=input_channels,
out_channels=64,
kernel_size=3,
stride=2,
padding=1 if not padding_same else 'same',
bias_attr=False)
self.conv_layer1_2 = layers.ConvBNReLU(
in_channels=64,
out_channels=64,
kernel_size=3,
stride=2,
padding=1 if not padding_same else 'same',
bias_attr=False)
self.la1 = Layer1(
num_channels=64,
num_blocks=self.stage1_num_blocks[0],
num_filters=self.stage1_num_channels[0],
has_se=has_se,
name="layer2",
padding_same=padding_same)
self.tr1 = TransitionLayer(
in_channels=[self.stage1_num_channels[0] * 4],
out_channels=self.stage2_num_channels,
name="tr1",
padding_same=padding_same)
self.st2 = Stage(
num_channels=self.stage2_num_channels,
num_modules=self.stage2_num_modules,
num_blocks=self.stage2_num_blocks,
num_filters=self.stage2_num_channels,
has_se=self.has_se,
name="st2",
align_corners=align_corners,
padding_same=padding_same)
self.tr2 = TransitionLayer(
in_channels=self.stage2_num_channels,
out_channels=self.stage3_num_channels,
name="tr2",
padding_same=padding_same)
self.st3 = Stage(
num_channels=self.stage3_num_channels,
num_modules=self.stage3_num_modules,
num_blocks=self.stage3_num_blocks,
num_filters=self.stage3_num_channels,
has_se=self.has_se,
name="st3",
align_corners=align_corners,
padding_same=padding_same)
self.tr3 = TransitionLayer(
in_channels=self.stage3_num_channels,
out_channels=self.stage4_num_channels,
name="tr3",
padding_same=padding_same)
self.st4 = Stage(
num_channels=self.stage4_num_channels,
num_modules=self.stage4_num_modules,
num_blocks=self.stage4_num_blocks,
num_filters=self.stage4_num_channels,
has_se=self.has_se,
name="st4",
align_corners=align_corners,
padding_same=padding_same)
self.init_weight()
def forward(self, x):
feat_list = []
conv1 = self.conv_layer1_1(x)
feat_list.append(conv1)
conv2 = self.conv_layer1_2(conv1)
la1 = self.la1(conv2)
tr1 = self.tr1([la1])
st2 = self.st2(tr1)
tr2 = self.tr2(st2)
st3 = self.st3(tr2)
tr3 = self.tr3(st3)
st4 = self.st4(tr3)
feat_list = feat_list + st4
return feat_list
def init_weight(self):
for layer in self.sublayers():
if isinstance(layer, nn.Conv2D):
param_init.normal_init(layer.weight, std=0.001)
elif isinstance(layer, (nn.BatchNorm, nn.SyncBatchNorm)):
param_init.constant_init(layer.weight, value=1.0)
param_init.constant_init(layer.bias, value=0.0)
if self.pretrained is not None:
ppmatting.utils.load_pretrained_model(self, self.pretrained)
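# Layer1 is the HRNet stem stage: a single-resolution stack of
# BottleneckBlocks at 1/4 scale; only the first block uses a projection
# shortcut (downsample=True) to widen the channels to num_filters * 4.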
class Layer1(nn.Layer):
def __init__(self,
num_channels,
num_filters,
num_blocks,
has_se=False,
name=None,
padding_same=True):
super(Layer1, self).__init__()
self.bottleneck_block_list = []
for i in range(num_blocks):
bottleneck_block = self.add_sublayer(
"bb_{}_{}".format(name, i + 1),
BottleneckBlock(
num_channels=num_channels if i == 0 else num_filters * 4,
num_filters=num_filters,
has_se=has_se,
stride=1,
downsample=True if i == 0 else False,
name=name + '_' + str(i + 1),
padding_same=padding_same))
self.bottleneck_block_list.append(bottleneck_block)
def forward(self, x):
conv = x
for block_func in self.bottleneck_block_list:
conv = block_func(conv)
return conv
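# TransitionLayer adapts the branches between stages: an existing branch gets
# a 3x3 conv only if its channel width changes (otherwise it passes through
# unchanged), and each newly created lower-resolution branch is produced by a
# stride-2 3x3 conv from the last existing branch.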
class TransitionLayer(nn.Layer):
def __init__(self, in_channels, out_channels, name=None, padding_same=True):
super(TransitionLayer, self).__init__()
num_in = len(in_channels)
num_out = len(out_channels)
self.conv_bn_func_list = []
for i in range(num_out):
residual = None
if i < num_in:
if in_channels[i] != out_channels[i]:
residual = self.add_sublayer(
"transition_{}_layer_{}".format(name, i + 1),
layers.ConvBNReLU(
in_channels=in_channels[i],
out_channels=out_channels[i],
kernel_size=3,
padding=1 if not padding_same else 'same',
bias_attr=False))
else:
residual = self.add_sublayer(
"transition_{}_layer_{}".format(name, i + 1),
layers.ConvBNReLU(
in_channels=in_channels[-1],
out_channels=out_channels[i],
kernel_size=3,
stride=2,
padding=1 if not padding_same else 'same',
bias_attr=False))
self.conv_bn_func_list.append(residual)
def forward(self, x):
outs = []
for idx, conv_bn_func in enumerate(self.conv_bn_func_list):
if conv_bn_func is None:
outs.append(x[idx])
else:
if idx < len(x):
outs.append(conv_bn_func(x[idx]))
else:
outs.append(conv_bn_func(x[-1]))
return outs
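# Branches runs one independent sequence of BasicBlocks per resolution
# branch, preserving each branch's channel width and spatial size.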
class Branches(nn.Layer):
def __init__(self,
num_blocks,
in_channels,
out_channels,
has_se=False,
name=None,
padding_same=True):
super(Branches, self).__init__()
self.basic_block_list = []
for i in range(len(out_channels)):
self.basic_block_list.append([])
for j in range(num_blocks[i]):
in_ch = in_channels[i] if j == 0 else out_channels[i]
basic_block_func = self.add_sublayer(
"bb_{}_branch_layer_{}_{}".format(name, i + 1, j + 1),
BasicBlock(
num_channels=in_ch,
num_filters=out_channels[i],
has_se=has_se,
name=name + '_branch_layer_' + str(i + 1) + '_' +
str(j + 1),
padding_same=padding_same))
self.basic_block_list[i].append(basic_block_func)
def forward(self, x):
outs = []
for idx, input in enumerate(x):
conv = input
for basic_block_func in self.basic_block_list[idx]:
conv = basic_block_func(conv)
outs.append(conv)
return outs
class BottleneckBlock(nn.Layer):
def __init__(self,
num_channels,
num_filters,
has_se,
stride=1,
downsample=False,
name=None,
padding_same=True):
super(BottleneckBlock, self).__init__()
self.has_se = has_se
self.downsample = downsample
self.conv1 = layers.ConvBNReLU(
in_channels=num_channels,
out_channels=num_filters,
kernel_size=1,
bias_attr=False)
self.conv2 = layers.ConvBNReLU(
in_channels=num_filters,
out_channels=num_filters,
kernel_size=3,
stride=stride,
padding=1 if not padding_same else 'same',
bias_attr=False)
self.conv3 = layers.ConvBN(
in_channels=num_filters,
out_channels=num_filters * 4,
kernel_size=1,
bias_attr=False)
if self.downsample:
self.conv_down = layers.ConvBN(
in_channels=num_channels,
out_channels=num_filters * 4,
kernel_size=1,
bias_attr=False)
if self.has_se:
self.se = SELayer(
num_channels=num_filters * 4,
num_filters=num_filters * 4,
reduction_ratio=16,
name=name + '_fc')
self.add = layers.Add()
self.relu = layers.Activation("relu")
def forward(self, x):
residual = x
conv1 = self.conv1(x)
conv2 = self.conv2(conv1)
conv3 = self.conv3(conv2)
if self.downsample:
residual = self.conv_down(x)
if self.has_se:
conv3 = self.se(conv3)
y = self.add(conv3, residual)
y = self.relu(y)
return y
class BasicBlock(nn.Layer):
def __init__(self,
num_channels,
num_filters,
stride=1,
has_se=False,
downsample=False,
name=None,
padding_same=True):
super(BasicBlock, self).__init__()
self.has_se = has_se
self.downsample = downsample
self.conv1 = layers.ConvBNReLU(
in_channels=num_channels,
out_channels=num_filters,
kernel_size=3,
stride=stride,
padding=1 if not padding_same else 'same',
bias_attr=False)
self.conv2 = layers.ConvBN(
in_channels=num_filters,
out_channels=num_filters,
kernel_size=3,
padding=1 if not padding_same else 'same',
bias_attr=False)
if self.downsample:
self.conv_down = layers.ConvBNReLU(
in_channels=num_channels,
out_channels=num_filters,
kernel_size=1,
bias_attr=False)
if self.has_se:
self.se = SELayer(
num_channels=num_filters,
num_filters=num_filters,
reduction_ratio=16,
name=name + '_fc')
self.add = layers.Add()
self.relu = layers.Activation("relu")
def forward(self, x):
residual = x
conv1 = self.conv1(x)
conv2 = self.conv2(conv1)
if self.downsample:
residual = self.conv_down(x)
if self.has_se:
conv2 = self.se(conv2)
y = self.add(conv2, residual)
y = self.relu(y)
return y
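# SELayer: squeeze-and-excitation. Global average pooling squeezes each map
# to a scalar, two linear layers (ReLU, then sigmoid) compute per-channel
# gates, and the input is rescaled by those gates.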
class SELayer(nn.Layer):
def __init__(self, num_channels, num_filters, reduction_ratio, name=None):
super(SELayer, self).__init__()
self.pool2d_gap = nn.AdaptiveAvgPool2D(1)
self._num_channels = num_channels
med_ch = int(num_channels / reduction_ratio)
stdv = 1.0 / math.sqrt(num_channels * 1.0)
self.squeeze = nn.Linear(
num_channels,
med_ch,
weight_attr=paddle.ParamAttr(
initializer=nn.initializer.Uniform(-stdv, stdv)))
stdv = 1.0 / math.sqrt(med_ch * 1.0)
self.excitation = nn.Linear(
med_ch,
num_filters,
weight_attr=paddle.ParamAttr(
initializer=nn.initializer.Uniform(-stdv, stdv)))
def forward(self, x):
pool = self.pool2d_gap(x)
pool = paddle.reshape(pool, shape=[-1, self._num_channels])
squeeze = self.squeeze(pool)
squeeze = F.relu(squeeze)
excitation = self.excitation(squeeze)
excitation = F.sigmoid(excitation)
excitation = paddle.reshape(
excitation, shape=[-1, self._num_channels, 1, 1])
out = x * excitation
return out
class Stage(nn.Layer):
def __init__(self,
num_channels,
num_modules,
num_blocks,
num_filters,
has_se=False,
multi_scale_output=True,
name=None,
align_corners=False,
padding_same=True):
super(Stage, self).__init__()
self._num_modules = num_modules
self.stage_func_list = []
for i in range(num_modules):
if i == num_modules - 1 and not multi_scale_output:
stage_func = self.add_sublayer(
"stage_{}_{}".format(name, i + 1),
HighResolutionModule(
num_channels=num_channels,
num_blocks=num_blocks,
num_filters=num_filters,
has_se=has_se,
multi_scale_output=False,
name=name + '_' + str(i + 1),
align_corners=align_corners,
padding_same=padding_same))
else:
stage_func = self.add_sublayer(
"stage_{}_{}".format(name, i + 1),
HighResolutionModule(
num_channels=num_channels,
num_blocks=num_blocks,
num_filters=num_filters,
has_se=has_se,
name=name + '_' + str(i + 1),
align_corners=align_corners,
padding_same=padding_same))
self.stage_func_list.append(stage_func)
def forward(self, x):
out = x
for idx in range(self._num_modules):
out = self.stage_func_list[idx](out)
return out
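# HighResolutionModule = parallel branches followed by a fuse step, so each
# output resolution aggregates information from every other resolution.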
class HighResolutionModule(nn.Layer):
def __init__(self,
num_channels,
num_blocks,
num_filters,
has_se=False,
multi_scale_output=True,
name=None,
align_corners=False,
padding_same=True):
super(HighResolutionModule, self).__init__()
self.branches_func = Branches(
num_blocks=num_blocks,
in_channels=num_channels,
out_channels=num_filters,
has_se=has_se,
name=name,
padding_same=padding_same)
self.fuse_func = FuseLayers(
in_channels=num_filters,
out_channels=num_filters,
multi_scale_output=multi_scale_output,
name=name,
align_corners=align_corners,
padding_same=padding_same)
def forward(self, x):
out = self.branches_func(x)
out = self.fuse_func(out)
return out
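# FuseLayers: for output branch i, lower-resolution inputs (j > i) are
# projected with 1x1 convs and bilinearly upsampled to branch i's size, while
# higher-resolution inputs (j < i) are downsampled with a chain of stride-2
# 3x3 convs; all contributions are summed and passed through ReLU.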
class FuseLayers(nn.Layer):
def __init__(self,
in_channels,
out_channels,
multi_scale_output=True,
name=None,
align_corners=False,
padding_same=True):
super(FuseLayers, self).__init__()
self._actual_ch = len(in_channels) if multi_scale_output else 1
self._in_channels = in_channels
self.align_corners = align_corners
self.residual_func_list = []
for i in range(self._actual_ch):
for j in range(len(in_channels)):
if j > i:
residual_func = self.add_sublayer(
"residual_{}_layer_{}_{}".format(name, i + 1, j + 1),
layers.ConvBN(
in_channels=in_channels[j],
out_channels=out_channels[i],
kernel_size=1,
bias_attr=False))
self.residual_func_list.append(residual_func)
elif j < i:
pre_num_filters = in_channels[j]
for k in range(i - j):
if k == i - j - 1:
residual_func = self.add_sublayer(
"residual_{}_layer_{}_{}_{}".format(
name, i + 1, j + 1, k + 1),
layers.ConvBN(
in_channels=pre_num_filters,
out_channels=out_channels[i],
kernel_size=3,
stride=2,
padding=1 if not padding_same else 'same',
bias_attr=False))
pre_num_filters = out_channels[i]
else:
residual_func = self.add_sublayer(
"residual_{}_layer_{}_{}_{}".format(
name, i + 1, j + 1, k + 1),
layers.ConvBNReLU(
in_channels=pre_num_filters,
out_channels=out_channels[j],
kernel_size=3,
stride=2,
padding=1 if not padding_same else 'same',
bias_attr=False))
pre_num_filters = out_channels[j]
self.residual_func_list.append(residual_func)
def forward(self, x):
outs = []
residual_func_idx = 0
for i in range(self._actual_ch):
residual = x[i]
residual_shape = paddle.shape(residual)[-2:]
for j in range(len(self._in_channels)):
if j > i:
y = self.residual_func_list[residual_func_idx](x[j])
residual_func_idx += 1
y = F.interpolate(
y,
residual_shape,
mode='bilinear',
align_corners=self.align_corners)
residual = residual + y
elif j < i:
y = x[j]
for k in range(i - j):
y = self.residual_func_list[residual_func_idx](y)
residual_func_idx += 1
residual = residual + y
residual = F.relu(residual)
outs.append(residual)
return outs
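# Registered HRNet variants. For the full-size models the W<k> suffix is the
# channel width of the highest-resolution branch, and each lower-resolution
# branch doubles that width; the Small_V1/V2 variants use fewer blocks and
# modules per stage.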
@manager.BACKBONES.add_component
def HRNet_W18_Small_V1(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[1],
stage1_num_channels=[32],
stage2_num_modules=1,
stage2_num_blocks=[2, 2],
stage2_num_channels=[16, 32],
stage3_num_modules=1,
stage3_num_blocks=[2, 2, 2],
stage3_num_channels=[16, 32, 64],
stage4_num_modules=1,
stage4_num_blocks=[2, 2, 2, 2],
stage4_num_channels=[16, 32, 64, 128],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W18_Small_V2(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[2],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[2, 2],
stage2_num_channels=[18, 36],
stage3_num_modules=3,
stage3_num_blocks=[2, 2, 2],
stage3_num_channels=[18, 36, 72],
stage4_num_modules=2,
stage4_num_blocks=[2, 2, 2, 2],
stage4_num_channels=[18, 36, 72, 144],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W18(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[18, 36],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[18, 36, 72],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[18, 36, 72, 144],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W30(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[30, 60],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[30, 60, 120],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[30, 60, 120, 240],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W32(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[32, 64],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[32, 64, 128],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[32, 64, 128, 256],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W40(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[40, 80],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[40, 80, 160],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[40, 80, 160, 320],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W44(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[44, 88],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[44, 88, 176],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[44, 88, 176, 352],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W48(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[48, 96],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[48, 96, 192],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[48, 96, 192, 384],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W60(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[60, 120],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[60, 120, 240],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[60, 120, 240, 480],
**kwargs)
return model
@manager.BACKBONES.add_component
def HRNet_W64(**kwargs):
model = HRNet(
stage1_num_modules=1,
stage1_num_blocks=[4],
stage1_num_channels=[64],
stage2_num_modules=1,
stage2_num_blocks=[4, 4],
stage2_num_channels=[64, 128],
stage3_num_modules=4,
stage3_num_blocks=[4, 4, 4],
stage3_num_channels=[64, 128, 256],
stage4_num_modules=3,
stage4_num_blocks=[4, 4, 4, 4],
stage4_num_channels=[64, 128, 256, 512],
**kwargs)
return model
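# Illustrative usage sketch (not part of the original file):
#
#     import paddle
#     backbone = HRNet_W18(input_channels=3)
#     x = paddle.randn([1, 3, 512, 512])
#     feats = backbone(x)  # 5 tensors: the first stem conv output (64 ch)
#                          # plus the four stage-4 branches (18/36/72/144 ch)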