Skip to content

Latest commit

 

History

History
608 lines (521 loc) · 26.3 KB

models.md

File metadata and controls

608 lines (521 loc) · 26.3 KB

English | 简体中文

paddleseg.models

The models subpackage contains the following 21 models for image sementic segmentaion.

class paddleseg.models.DeepLabV3P(
        num_classes, 
        backbone, 
        backbone_indices = (0, 3), 
        aspp_ratios = (1, 6, 12, 18), 
        aspp_out_channels = 256, 
        align_corners = False, 
        pretrained = None
)

The DeepLabV3Plus implementation based on PaddlePaddle.

The original article refers to Liang-Chieh Chen, et, al. "Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation"

Args

  • num_classes (int): The unique number of target classes.
  • backbone (paddle.nn.Layer): Backbone network, currently support Resnet50_vd/Resnet101_vd/Xception65.
  • backbone_indices (tuple, optional): Two values in the tuple indicate the indices of output of backbone. Default: (0, 3)
  • aspp_ratios (tuple, optional): The dilation rate using in ASSP module. If output_stride=16, aspp_ratios should be set as (1, 6, 12, 18). If output_stride=8, aspp_ratios is (1, 12, 24, 36). Default: (1, 6, 12, 18)
  • aspp_out_channels (int, optional): The output channels of ASPP module. Default: 256
  • align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.DeepLabV3(
        num_classes, 
        backbone, 
        backbone_indices = (3, ), 
        aspp_ratios = (1, 6, 12, 18), 
        aspp_out_channels = 256, 
        align_corners = False, 
        pretrained = None
)

The DeepLabV3 implementation based on PaddlePaddle.

The original article refers to Liang-Chieh Chen, et, al. "Rethinking Atrous Convolution for Semantic Image Segmentation".

Args

  • num_classes (int): The unique number of target classes.
  • backbone (paddle.nn.Layer): Backbone network, currently support Resnet50_vd/Resnet101_vd/Xception65.
  • backbone_indices (tuple, optional): Two values in the tuple indicate the indices of output of backbone. Default: (3, )
  • aspp_ratios (tuple, optional): The dilation rate using in ASSP module. If output_stride=16, aspp_ratios should be set as (1, 6, 12, 18). If output_stride=8, aspp_ratios is (1, 12, 24, 36). Default: (1, 6, 12, 18)
  • aspp_out_channels (int, optional): The output channels of ASPP module. Default: 256
  • align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.FCN(
        num_classes,
        backbone_indices = (-1, ),
        backbone_channels = (270, ),
        channels = None
)

A simple implementation for FCN based on PaddlePaddle.

The original article refers to Evan Shelhamer, et, al. "Fully Convolutional Networks for Semantic Segmentation".

Args

  • num_classes (int): The unique number of target classes.
  • backbone (paddle.nn.Layer): Backbone networks.
  • backbone_indices (tuple, optional): The values in the tuple indicate the indices of output of backbone. Default: (-1, )
  • channels (int, optional): The channels between conv layer and the last layer of FCNHead. If None, it will be the number of channels of input features. Default: None
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.OCRNet(
        num_classes,
        backbone,
        backbone_indices,
        ocr_mid_channels = 512,
        ocr_key_channels = 256,
        align_corners = False,
        pretrained = None
)

The OCRNet implementation based on PaddlePaddle.

The original article refers to Yuan, Yuhui, et al. "Object-Contextual Representations for Semantic Segmentation"

Args

  • num_classes (int): The unique number of target classes.
  • backbone (Paddle.nn.Layer): Backbone network.
  • backbone_indices (tuple): A tuple indicates the indices of output of backbone. It can be either one or two values, if two values, the first index will be taken as a deep-supervision feature in auxiliary layer; the second one will be taken as input of pixel representation. If one value, it is taken by both above.
  • ocr_mid_channels (int, optional): The number of middle channels in OCRHead. Default: 512
  • ocr_key_channels (int, optional): The number of key channels in ObjectAttentionBlock. Default: 256
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.PSPNet(
        num_classes,
        backbone,
        backbone_indices = (2, 3),
        pp_out_channels = 1024,
        bin_sizes = (1, 2, 3, 6),
        enable_auxiliary_loss = True,
        align_corners = False,
        pretrained = None
)

The PSPNet implementation based on PaddlePaddle.

The original article refers to Zhao, Hengshuang, et al. "Pyramid scene parsing network".

Args

  • num_classes (int): The unique number of target classes.
  • backbone (Paddle.nn.Layer): Backbone network, currently support Resnet50/101.
  • backbone_indices (tuple, optional): Two values in the tuple indicate the indices of output of backbone.
  • pp_out_channels (int, optional): The output channels after Pyramid Pooling Module. Default: 1024
  • bin_sizes (tuple, optional): The out size of pooled feature maps. Default: (1,2,3,6)
  • enable_auxiliary_loss (bool, optional): A bool value indicates whether adding auxiliary loss. Default: True
  • align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.ANN(
        num_classes,
        backbone,
        backbone_indices = (2, 3),
        key_value_channels = 256,
        inter_channels = 512,
        psp_size = (1, 3, 6, 8),
        enable_auxiliary_loss = True,
        align_corners = False,
        pretrained = None
)

The ANN implementation based on PaddlePaddle.

The original article refers to Zhen, Zhu, et al. "Asymmetric Non-local Neural Networks for Semantic Segmentation".

Args

  • num_classes (int): The unique number of target classes.
  • backbone (Paddle.nn.Layer): Backbone network, currently support Resnet50/101.
  • backbone_indices (tuple, optional): Two values in the tuple indicate the indices of output of backbone.
  • key_value_channels (int, optional): The key and value channels of self-attention map in both AFNB and APNB modules. Default: 256
  • inter_channels (int, optional): Both input and output channels of APNB modules. Default: 512
  • psp_size (tuple, optional): The out size of pooled feature maps. Default: (1, 3, 6, 8)
  • enable_auxiliary_loss (bool, optional): A bool value indicates whether adding auxiliary loss. Default: True
  • align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.BiSeNetV2(
        num_classes,
        lambd = 0.25,
        align_corners = False,
        pretrained = None
)

The BiSeNet V2 implementation based on PaddlePaddle.

The original article refers to Yu, Changqian, et al. "BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation"

Args

  • num_classes (int): The unique number of target classes.
  • lambd (float, optional): A factor for controlling the size of semantic branch channels. Default: 0.25
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.DANet(
        num_classes,
        lambd = 0.25,
        align_corners = False,
        pretrained = None
)

The DANet implementation based on PaddlePaddle.

The original article refers to Fu, jun, et al. "Dual Attention Network for Scene Segmentation"

Args

  • num_classes (int): The unique number of target classes.
  • backbone (Paddle.nn.Layer): A backbone network.
  • backbone_indices (tuple): The values in the tuple indicate the indices of output of backbone.
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.FastSCNN(
        num_classes,
        enable_auxiliary_loss = True,
        align_corners = False,
        pretrained = None
)

The FastSCNN implementation based on PaddlePaddle.As mentioned in the original paper, FastSCNN is a real-time segmentation algorithm (123.5fps) even for high resolution images (1024x2048).

The original article refers to Poudel, Rudra PK, et al. "Fast-scnn: Fast semantic segmentation network".

Args

  • num_classes (int): The unique number of target classes.
  • enable_auxiliary_loss (bool, optional): A bool value indicates whether adding auxiliary loss. If true, auxiliary loss will be added after LearningToDownsample module. Default: False
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769.. Default:False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.GCNet(
        num_classes,
        backbone,
        backbone_indices = (2, 3),
        gc_channels = 512,
        ratio = 0.25,
        enable_auxiliary_loss = True,
        align_corners = False,
        pretrained = None
)

The GCNet implementation based on PaddlePaddle.

The original article refers to Cao, Yue, et al. "GCnet: Non-local networks meet squeeze-excitation networks and beyond".

Args

  • num_classes (int): The unique number of target classes.
  • backbone (Paddle.nn.Layer): Backbone network, currently support Resnet50/101.
  • backbone_indices (tuple, optional): Two values in the tuple indicate the indices of output of backbone.
  • gc_channels (int, optional): The input channels to Global Context Block. Default: 512
  • ratio (float, optional): It indicates the ratio of attention channels and gc_channels. Default: 0.25
  • enable_auxiliary_loss (bool, optional): A bool value indicates whether adding auxiliary loss. Default: True
  • align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.GSCNN(
        num_classes,
        backbone,
        backbone_indices = (0, 1, 2, 3),
        aspp_ratios = (1, 6, 12, 18),
        aspp_out_channels = 256,
        align_corners = False,
        pretrained = None
)

The GSCNN implementation based on PaddlePaddle.

The original article refers to Towaki Takikawa, et, al. "Gated-SCNN: Gated Shape CNNs for Semantic Segmentation"

Args

  • num_classes (int): The unique number of target classes.
  • backbone (paddle.nn.Layer): Backbone network, currently support Resnet50_vd/Resnet101_vd.
  • backbone_indices (tuple, optional): Two values in the tuple indicate the indices of output of backbone. Default: (0, 1, 2, 3)
  • aspp_ratios (tuple, optional): The dilation rate using in ASSP module. If output_stride=16, aspp_ratios should be set as (1, 6, 12, 18). If output_stride=8, aspp_ratios is (1, 12, 24, 36). Default: (1, 6, 12, 18)
  • aspp_out_channels (int, optional): The output channels of ASPP module. Default: 256
  • align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.HarDNet(
        num_classes,
        stem_channels = (16, 24, 32, 48),
        ch_list = (64, 96, 160, 224, 320),
        grmul = 1.7,
        gr = (10, 16, 18, 24, 32),
        n_layers = (4, 4, 8, 8, 8),
        align_corners = False,
        pretrained = None
)

[Real Time] The FC-HardDNet 70 implementation based on PaddlePaddle.

The original article refers to Chao, Ping, et al. "HarDNet: A Low Memory Traffic Network"

Args

  • num_classes (int): The unique number of target classes.
  • stem_channels (tuple|list, optional): The number of channels before the encoder. Default: (16, 24, 32, 48)
  • ch_list (tuple|list, optional): The number of channels at each block in the encoder. Default: (64, 96, 160, 224, 320)
  • grmul (float, optional): The channel multiplying factor in HarDBlock, which is m in the paper. Default: 1.7
  • gr (tuple|list, optional): The growth rate in each HarDBlock, which is k in the paper. Default: (10, 16, 18, 24, 32)
  • n_layers (tuple|list, optional): The number of layers in each HarDBlock. Default: (4, 4, 8, 8, 8)
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.UNet(
        num_classes,
        align_corners = False,
        use_deconv = False,
        pretrained = None
)

The UNet implementation based on PaddlePaddle.

The original article refers to Olaf Ronneberger, et, al. "U-Net: Convolutional Networks for Biomedical Image Segmentation".

Args

  • num_classes (int): The unique number of target classes.
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • use_deconv (bool, optional): A bool value indicates whether using deconvolution in upsampling. If False, use resize_bilinear. Default: False
  • pretrained (str, optional): The path or url of pretrained model for fine tuning. Default: None
class paddleseg.models.U2Net(
        num_classes, 
        in_ch = 3, 
        pretrained = None
)

The U^2-Net implementation based on PaddlePaddle.

The original article refers to Xuebin Qin, et, al. "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection".

Args

  • num_classes (int): The unique number of target classes.
  • in_ch (int, optional): Input channels. Default: 3
  • pretrained (str, optional): The path or url of pretrained model for fine tuning. Default: None
class paddleseg.models.U2Netp(
        num_classes, 
        in_ch = 3, 
        pretrained = None
)

The U^2-Netp implementation based on PaddlePaddle.

The original article refers to Xuebin Qin, et, al. "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection".

Args

  • num_classes (int): The unique number of target classes.
  • in_ch (int, optional): Input channels. Default: 3
  • pretrained (str, optional): The path or url of pretrained model for fine tuning. Default:None
class paddleseg.models.AttentionUNet(num_classes, pretrained = None)

The Attention-UNet implementation based on PaddlePaddle.As mentioned in the original paper, author proposes a novel attention gate (AG) that automatically learns to focus on target structures of varying shapes and sizes.Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task.

The original article refers to Oktay, O, et, al. "Attention u-net: Learning where to look for the pancreas.".

Args

  • num_classes (int): The unique number of target classes.
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class UNetPlusPlus(
    in_channels,
    num_classes,
    use_deconv = False,
    align_corners = False,
    pretrained = None,
    is_ds = True
)

The UNet++ implementation based on PaddlePaddle.

The original article refers to Zongwei Zhou, et, al. "UNet++: A Nested U-Net Architecture for Medical Image Segmentation".

Args

  • in_channels (int): The channel number of input image.
  • num_classes (int): The unique number of target classes.
  • use_deconv (bool, optional): A bool value indicates whether using deconvolution in upsampling. If False, use resize_bilinear. Default: False
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model for fine tuning. Default: None
  • is_ds (bool): use deep supervision or not. Default: True
class DecoupledSegNet(
    num_classes,
    backbone,
    backbone_indices = (0, 3),
    aspp_ratios = (1, 6, 12, 18),
    aspp_out_channels = 256,
    align_corners = False,
    pretrained = None
)

The DecoupledSegNet implementation based on PaddlePaddle.

The original article refers to Xiangtai Li, et, al. "Improving Semantic Segmentation via Decoupled Body and Edge Supervision"

Args

  • num_classes (int): The unique number of target classes.
  • backbone (paddle.nn.Layer): Backbone network, currently support Resnet50_vd/Resnet101_vd.
  • backbone_indices (tuple, optional): Two values in the tuple indicate the indices of output of backbone. Default: (0, 3)
  • aspp_ratios (tuple, optional): The dilation rate using in ASSP module. If output_stride=16, aspp_ratios should be set as (1, 6, 12, 18). If output_stride=8, aspp_ratios is (1, 12, 24, 36). Default: (1, 6, 12, 18)
  • aspp_out_channels (int, optional): The output channels of ASPP module. Default: 256
  • align_corners (bool, optional): An argument of F.interpolate. It should be set to False when the feature size is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.ISANet(
        num_classes, 
        backbone, 
        backbone_indices = (2, 3), 
        isa_channels = 256, 
        down_factor = (8, 8), 
        enable_auxiliary_loss = True, 
        align_corners = False, 
        pretrained = None
)

The ISANet implementation based on PaddlePaddle.

The original article refers to Lang Huang, et al. "Interlaced Sparse Self-Attention for Semantic Segmentation".

Args

  • num_classes (int): The unique number of target classes.
  • backbone (Paddle.nn.Layer): A backbone network.
  • backbone_indices (tuple): The values in the tuple indicate the indices of output of backbone.
  • isa_channels (int): The channels of ISA Module.
  • down_factor (tuple): Divide the height and width dimension to (Ph, PW) groups.
  • enable_auxiliary_loss (bool, optional): A bool value indicates whether adding auxiliary loss. Default: True
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.EMANet(
        num_classes, 
        backbone, 
        backbone_indices = (2, 3), 
        ema_channels = 512, 
        gc_channels = 256, 
        num_bases = 64, 
        stage_num = 3, 
        momentum = 0.1, 
        concat_input = True, 
        enable_auxiliary_loss = True, 
        align_corners = False, 
        pretrained = None
)

The EMANet implementation based on PaddlePaddle.

The original article refers to Xia Li, et al. "Expectation-Maximization Attention Networks for Semantic Segmentation"

Args

  • num_classes (int): The unique number of target classes.
  • backbone (Paddle.nn.Layer): A backbone network.
  • backbone_indices (tuple): The values in the tuple indicate the indices of output of backbone.
  • ema_channels (int): EMA module channels.
  • gc_channels (int): The input channels to Global Context Block.
  • num_bases (int): Number of bases.
  • stage_num (int): The iteration number for EM.
  • momentum (float): The parameter for updating bases.
  • concat_input (bool): Whether concat the input and output of convs before classification layer. Default: True
  • enable_auxiliary_loss (bool, optional): A bool value indicates whether adding auxiliary loss. Default: True
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None
class paddleseg.models.DNLNet(
    num_classes, backbone, 
    backbone_indices = (2, 3), 
    reduction = 2, 
    use_scale = True, 
    mode = 'embedded_gaussian', 
    temperature = 0.05, 
    concat_input = True, 
    enable_auxiliary_loss = True, 
    align_corners = False, 
    pretrained = None
)

The DNLNet implementation based on PaddlePaddle.

The original article refers to Minghao Yin, et al. "Disentangled Non-Local Neural Networks"

Args

  • num_classes (int): The unique number of target classes.
  • backbone (Paddle.nn.Layer): A backbone network.
  • backbone_indices (tuple): The values in the tuple indicate the indices of output of backbone.
  • reduction (int): Reduction factor of projection transform. Default: 2
  • use_scale (bool): Whether to scale pairwise_weight by sqrt(1/inter_channels). Default: False
  • mode (str): The nonlocal mode. Options are 'embedded_gaussian', 'dot_product'. Default: 'embedded_gaussian'
  • temperature (float): Temperature to adjust attention. Default: 0.05
  • concat_input (bool): Whether concat the input and output of convs before classification layer. Default: True
  • enable_auxiliary_loss (bool, optional): A bool value indicates whether adding auxiliary loss. Default: True
  • align_corners (bool): An argument of F.interpolate. It should be set to False when the output size of feature is even, e.g. 1024x512, otherwise it is True, e.g. 769x769. Default: False
  • pretrained (str, optional): The path or url of pretrained model. Default: None