hat.models

Models widely used by the upper-level modules in HAT.

Models

structures

Classifier

The basic structure of classifier.

Segmentor

The basic structure of segmentor.

detectors

FCOS

The basic structure of FCOS.

backbones

EfficientNet

A module of EfficientNet.

MobileNetV1

A module of mobilenetv1.

ResNet18

A module of resnet18.

VargConvNet

A module of vargconvnet.

VargNetV2

A module of vargnetv2.

necks

BiFPN

Weighted Bi-directional Feature Pyramid Network(BiFPN).

DwUnet

Unet segmentation neck structure.

FPN

RetinaNetFPN

FPN for RetinaNet.

Unet

Unet neck module.

PAFPN

Path Aggregation Network for Instance Segmentation.

FastSCNNNeck

Upper neck module for segmentation.

task_modules

fcos

FCOSDecoder

param num_classes

Number of categories excluding the background

FCOSHead

Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.

FCOSTarget

Generate cls and reg targets for FCOS in training stage.

seg

SegDecoder

Semantic Segmentation Decoder.

SegHead

Head Module for segmentation task.

FRCNNSegHead

FRCNNSegHead module for segmentation task.

SegTarget

Generate training targets for Seg task.

fcn

FCNHead(input_index, in_channels, …)

Head Module for FCN.

DepthwiseSeparableFCNHead(in_channels, …)

FCNDecoder([upsample_output_scale])

FCN Decoder.

FCNTarget(num_classes, need_oneshot)

Generate Target for FCN.

deeplab

Deeplabv3plusHead(in_channels, c1_index, …)

Head Module for Deeplab.

losses

CEWithLabelSmooth

Cross-entropy loss with label smoothing.

CrossEntropyLoss

Calculate cross entropy loss of multi stride output.

CrossEntropyLossV2

Calculate cross entropy loss of multi stride output.

FocalLoss

Sigmoid focal loss.

SoftmaxFocalLoss

Focal Loss.

GIoULoss

Generalized Intersection over Union Loss.

SegLoss

Segmentation loss wrapper.

SmoothL1Loss

Smooth L1 Loss.

YOLOV3Loss

The loss module of YOLOv3.

API Reference

class hat.models.structures.Classifier(backbone, losses=None)

The basic structure of classifier.

Parameters
  • backbone (torch.nn.Module) – Backbone module.

  • losses (torch.nn.Module) – Losses module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
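The note above applies to every forward method in this reference. The following is a pure-Python sketch of the mechanism, not PyTorch's actual implementation: TinyModule and its hook list are hypothetical stand-ins that only mimic how calling the instance wraps forward with registered hooks.

```python
class TinyModule:
    """Illustrative stand-in for a module; not torch.nn.Module."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Hooks receive (module, inputs, output), loosely following PyTorch's convention.
        self._forward_hooks.append(hook)

    def forward(self, x):
        # The "recipe" for the forward pass lives here.
        return x * 2

    def __call__(self, *args):
        # Calling the instance runs forward and then every registered hook.
        output = self.forward(*args)
        for hook in self._forward_hooks:
            hook(self, args, output)
        return output


seen = []
module = TinyModule()
module.register_forward_hook(lambda m, inputs, output: seen.append(output))

direct = module.forward(3)  # hooks are silently skipped
called = module(3)          # hooks run
```

After both calls, seen holds a single entry, because only the call through the instance triggered the hook.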

class hat.models.structures.Segmentor(backbone, neck, head, losses=None)

The basic structure of segmentor.

Parameters
  • backbone (torch.nn.Module) – Backbone module.

  • neck (torch.nn.Module) – Neck module.

  • head (torch.nn.Module) – Head module.

  • losses (torch.nn.Module) – Losses module.

forward(data: dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.FCOS(backbone: Dict, neck: Optional[Dict] = None, head: Optional[Dict] = None, targets: Optional[Dict] = None, post_process: Optional[Dict] = None, loss_cls: Optional[Dict] = None, loss_reg: Optional[Dict] = None, loss_centerness: Optional[Dict] = None)

The basic structure of FCOS.

Parameters
  • backbone (Dict) – dict for building backbone module.

  • neck (Dict) – dict for building neck module.

  • head (Dict) – dict for building head module.

  • anchors (Dict) – dict for building anchors module.

  • targets (Dict) – dict for building target module.

extract_feat(img)

Directly extract features from the backbone + neck.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.RetinaNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, anchors: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_reg: Optional[torch.nn.modules.module.Module] = None)

The basic structure of retinanet.

Parameters
  • backbone – backbone module or dict for building backbone module.

  • neck – neck module or dict for building neck module.

  • head – head module or dict for building head module.

  • anchors – anchors module or dict for building anchors module.

  • targets – targets module or dict for building target module.

  • post_process – post_process module or dict for building post_process module.

  • loss_cls – loss_cls module or dict for building loss_cls module.

  • loss_reg – loss_reg module or dict for building loss_reg module.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.YOLOV3(backbone: Optional[dict] = None, neck: Optional[dict] = None, head: Optional[dict] = None, anchor_generator: Optional[dict] = None, target_generator: Optional[dict] = None, loss: Optional[dict] = None, postprocess: Optional[dict] = None)

The basic structure of yolov3.

Parameters
  • backbone (torch.nn.Module) – Backbone module.

  • neck (torch.nn.Module) – Neck module.

  • head (torch.nn.Module) – Head module.

  • anchor_generator (torch.nn.Module) – Anchor generator module.

  • target_generator (torch.nn.Module) – Target generator module.

  • loss (torch.nn.Module) – Loss module.

  • postprocess (torch.nn.Module) – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.EfficientNet(model_type: str, coefficient_params: tuple, num_classes: int, bn_kwargs: Optional[dict] = None, bias: bool = False, drop_connect_rate: float = 0.2, depth_division: int = 8, activation: str = 'relu', use_se_block: bool = False, blocks_args: Sequence[Dict] = (BlockArgs(kernel_size=3, num_repeat=1, in_filters=32, out_filters=16, expand_ratio=1, id_skip=True, strides=1, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=2, in_filters=16, out_filters=24, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=2, in_filters=24, out_filters=40, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=3, in_filters=40, out_filters=80, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=3, in_filters=80, out_filters=112, expand_ratio=6, id_skip=True, strides=1, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=4, in_filters=112, out_filters=192, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=1, in_filters=192, out_filters=320, expand_ratio=6, id_skip=True, strides=1, se_ratio=0.25)), include_top: bool = True, flat_output: bool = True, resolution: int = 0, use_drop_connect: bool = False)

A module of EfficientNet.

Parameters
  • model_type (str) – Selects which EfficientNet variant to use (B0-B7 or lite0-4). For an EfficientNet model, model_type must be one of: [‘b0’, ‘b1’, ‘b2’, ‘b3’, ‘b4’, ‘b5’, ‘b6’, ‘b7’]; for an EfficientNet-lite model, model_type must be one of: [‘lite0’, ‘lite1’, ‘lite2’, ‘lite3’, ‘lite4’].

  • coefficient_params (tuple) – Parameter coefficients of EfficientNet, including: width_coefficient (float): scaling coefficient for net width; depth_coefficient (float): scaling coefficient for net depth; default_resolution (int): default input image size; dropout_rate (float): dropout rate for the final classifier layer.

  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for Bn layer.

  • bias (bool) – Whether to use bias in module.

  • drop_connect_rate (float) – Dropout rate at skip connections.

  • depth_division (int) – Depth division. Defaults to 8.

  • activation (str) – Activation layer, defaults to ‘relu’.

  • use_se_block (bool) – Whether to use SEBlock in module.

  • blocks_args (list) – A list of BlockArgs to MBConvBlock modules.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • use_drop_connect (bool) – Whether to use drop connect.

forward(inputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
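To make the roles of width_coefficient, depth_coefficient, and depth_division more concrete, here is a sketch of the rounding rules used by the reference EfficientNet implementation. Whether this class applies exactly the same rules is an assumption; round_filters and round_repeats are illustrative helper names, not part of the documented API.

```python
import math


def round_filters(filters, width_coefficient, depth_division=8):
    """Scale a channel count by the width coefficient, keeping it a multiple of depth_division."""
    scaled = filters * width_coefficient
    rounded = max(
        depth_division,
        int(scaled + depth_division / 2) // depth_division * depth_division,
    )
    # Avoid rounding down by more than 10% of the scaled value.
    if rounded < 0.9 * scaled:
        rounded += depth_division
    return int(rounded)


def round_repeats(num_repeat, depth_coefficient):
    """Scale the number of block repeats by the depth coefficient, rounding up."""
    return int(math.ceil(depth_coefficient * num_repeat))
```

With these rules, a block with out_filters=16 and a width coefficient of 1.2 would be widened to 24 channels, and num_repeat=2 with a depth coefficient of 1.1 would become 3 repeats.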

class hat.models.backbones.MobileNetV1(num_classes: int, bn_kwargs: dict, alpha: float = 1.0, bias: bool = True, dw_with_relu: bool = True, include_top: bool = True, flat_output: bool = True)

A module of mobilenetv1.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • alpha (float) – Alpha for mobilenetv1.

  • bias (bool) – Whether to use bias in module.

  • dw_with_relu (bool) – Whether to use relu in dw conv.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.ResNet18(num_classes: int, bn_kwargs: dict, bias: bool = True, include_top: bool = True, flat_output: bool = True)

A module of resnet18.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • bias (bool) – Whether to use bias in module.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

class hat.models.backbones.ResNet50(num_classes: int, bn_kwargs: dict, bias: bool = True, include_top: bool = True, flat_output: bool = True)

A module of resnet50.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • bias (bool) – Whether to use bias in module.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

class hat.models.backbones.TinyVargNetV2(num_classes, bn_kwargs: dict, alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, extend_features: bool = False, disable_quanti_input: bool = False, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, input_sequence_length: int = 1, head_factor: int = 1, input_resize_scale: Optional[int] = None)

A module of TinyVargNetv2.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • alpha (float) – Alpha for tinyvargnetv2.

  • group_base (int) – Group base for tinyvargnetv2.

  • factor (int) – Factor for channel expansion in basic block.

  • bias (bool) – Whether to use bias in module.

  • extend_features (bool) – Whether to extend features.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • input_channels (int) – Input channels of first conv.

  • input_sequence_length (int) – Length of input sequence.

  • head_factor (int) – Factor for channels expansion of stage1(mod2).

  • input_resize_scale (int) – Narrow models need the input resized by a scale of 0.65 during int_infer, visualization, or evaluation.

class hat.models.backbones.VarGDarkNet53(max_channels: int, bn_kwargs: dict, num_classes: int, include_top: bool, flat_output: bool)
forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.VargConvNet(num_classes, bn_kwargs: dict, channels_list: list, repeats: list, group_list: int, factor_list: int, out_channels: int = 1024, bias: bool = True, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, deep_stem: bool = True)

A module of vargconvnet.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • channels_list (list) – List for output channels

  • repeats (list) – Depth of each stage.

  • group_list (list) – Group of each stage.

  • factor_list (list) – Factor for each stage.

  • out_channels (int) – Output channels.

  • bias (bool) – Whether to use bias in module.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • input_channels (int) – Input channels of first conv.

  • deep_stem (bool) – Whether use deep stem.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.VargNetV2(num_classes, bn_kwargs: dict, model_type: str = 'VargNetV2', alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, extend_features: bool = False, disable_quanti_input: bool = False, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, input_sequence_length: int = 1, head_factor: int = 1, input_resize_scale: Optional[int] = None)

A module of vargnetv2.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • model_type (str) – Choose to use VargNetV2 or TinyVargNetV2.

  • alpha (float) – Alpha for vargnetv2.

  • group_base (int) – Group base for vargnetv2.

  • factor (int) – Factor for channel expansion in basic block.

  • bias (bool) – Whether to use bias in module.

  • extend_features (bool) – Whether to extend features.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • input_channels (int) – Input channels of first conv.

  • input_sequence_length (int) – Length of input sequence.

  • head_factor (int) – Factor for channels expansion of stage1(mod2).

  • input_resize_scale (int) – Narrow models need the input resized by a scale of 0.65 during int_infer, visualization, or evaluation.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

hat.models.backbones.get_vargnetv2_stride2channels(alpha: float, channels: Optional[List[int]] = None, strides: Optional[List[int]] = None) → Dict

Get the VargNetV2 stride-to-channel dict for the given channels and strides.

Parameters
  • alpha – channel multiplier.

  • channels – base channel of each stride.

  • strides – stride list corresponding to channels.

Returns

strides2channels: a stride to channel dict.
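A minimal sketch of what a stride-to-channel mapping of this kind might look like. The helper name make_stride2channels, the truncating int() rounding, and the example channel values are all assumptions; the defaults used when channels or strides is None are not documented here and are therefore not reproduced.

```python
def make_stride2channels(alpha, channels, strides):
    """Map each stride to its base channel count scaled by alpha (illustrative only)."""
    if len(channels) != len(strides):
        raise ValueError("channels and strides must have the same length")
    return {stride: int(alpha * channel) for stride, channel in zip(strides, channels)}


# Hypothetical base channels, one per stride.
example = make_stride2channels(
    0.5, channels=[32, 32, 64, 128, 256], strides=[2, 4, 8, 16, 32]
)
```

A dict of this shape is what the stride2channels argument of necks such as BiFPN and Unet expects: keys are strides and values are channel counts already multiplied by alpha.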

class hat.models.necks.BiFPN(in_strides, out_strides, stride2channels, out_channels, num_outs, stack=3, start_level=0, end_level=-1, fpn_name='bifpn_sum')

Weighted Bi-directional Feature Pyramid Network(BiFPN).

This is an implementation of EfficientDet: Scalable and Efficient Object Detection (https://arxiv.org/abs/1911.09070).

Parameters
  • in_strides (list[int]) – Stride of input feature map

  • out_strides (int) – Stride of output feature map

  • stride2channels (dict) – The key:value is stride:channel; the channels have already been multiplied by alpha.

  • out_channels (int|dict) – Channel number of output layer, the key:value is stride:channel.

  • num_outs (int) – Number of BifpnLayer’s inputs. The value must be 5, because the BiFPN layer is fixed.

  • stack (int) – Number of BifpnLayer

  • start_level (int) – Index of the start input backbone level used to build the feature pyramid. Default: 0.

  • end_level (int) – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, means the last level.

  • fpn_name (str) – The value must be one of ‘bifpn_sum’ or ‘bifpn_fa’.

forward(inputs)

Forward features.

Parameters

inputs (list[tensor]) – Input tensors

Returns (list[tensor]): Output tensors

class hat.models.necks.DwUnet(base_channels: int, bn_kwargs: Optional[Dict] = None, act_type: torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.ReLU'>, use_deconv: bool = False, dw_with_act: bool = False, output_scales: Sequence = (4, 8, 16, 32, 64))

Unet segmentation neck structure.

Built with separable convolution layers.

Parameters
  • base_channels (int) – Output channel number of the output layer of scale 1.

  • bn_kwargs (Dict, optional) – Keyword arguments for BN layer. Defaults to {}.

  • use_deconv (bool, optional) – Whether to use deconv for the upsampling layer. Defaults to False.

  • dw_with_act (bool, optional) – Whether to use relu after the depthwise conv in SeparableConv. Defaults to False.

  • output_scales (Sequence, optional) – The scale of each output layer. Defaults to (4, 8, 16, 32, 64).

forward(inputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.FPN(in_strides: List[int], in_channels: List[int], out_strides: List[int], out_channels: List[int], fix_out_channel: Optional[int] = None, bn_kwargs: Optional[Dict] = None)
forward(features: List[torch.Tensor]) → List[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.FastSCNNNeck(in_channels: List[int], feat_channels: List[int], indexes: List[int], bn_kwargs: Optional[Dict] = None)

Upper neck module for segmentation.

Parameters
  • in_channels (list) – channels of each input feature map

  • feat_channels (list) – channels for feature maps.

  • indexes (list) – indexes of inputs.

  • bn_kwargs (dict) – Dict for Bn layer.

forward(inputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.PAFPN(in_channels, out_channels, num_outs, start_level=0, end_level=-1, add_extra_convs=False, relu_before_extra_convs=False)

Path Aggregation Network for Instance Segmentation.

This is an implementation of the PAFPN in Path Aggregation Network <https://arxiv.org/abs/1803.01534>.

Parameters
  • in_channels (List[int]) – Number of input channels per scale.

  • out_channels (int) – Number of output channels (used at each scale)

  • num_outs (int) – Number of output scales.

  • start_level (int) – Index of the start input backbone level used to build the feature pyramid. Default: 0.

  • end_level (int) – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, which means the last level.

  • add_extra_convs (bool | str) –

    If bool, it decides whether to add conv layers on top of the original feature maps. Default to False. If True, it is equivalent to add_extra_convs=’on_input’. If str, it specifies the source feature map of the extra convs. Only the following options are allowed:

    • ’on_input’: Last feat map of neck inputs (i.e. backbone feature).

    • ’on_lateral’: Last feature map after lateral convs.

    • ’on_output’: The last output feature map after fpn convs.

  • relu_before_extra_convs (bool) – Whether to apply relu before the extra conv. Default: False.

forward(inputs)

Forward function.
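The start_level and end_level parameters select which backbone levels feed the pyramid. The sketch below shows one reading of the description above, where end_level is exclusive and -1 means "through the last level". select_levels is a hypothetical helper, and the string labels stand in for feature tensors.

```python
def select_levels(features, start_level=0, end_level=-1):
    """Return the backbone levels used to build the pyramid (illustrative only)."""
    # -1 is treated as "up to and including the last level".
    stop = len(features) if end_level == -1 else end_level
    if not 0 <= start_level < stop <= len(features):
        raise ValueError("invalid start_level/end_level")
    return features[start_level:stop]


levels = ["c2", "c3", "c4", "c5"]
```

For example, select_levels(levels, 1, 3) keeps only the second and third backbone outputs, while the defaults keep all of them.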

class hat.models.necks.RetinaNetFPN(in_strides: List[int], in_channels: List[int], out_strides: List[int], out_channels: List[int], fix_out_channel: Optional[int] = None)

FPN for RetinaNet.

Unlike FPN, RetinaNetFPN has two extra convs, corresponding to stride 64 and stride 128, in addition to the lateral convs.

Parameters
  • in_strides (list) – strides of each input feature map

  • in_channels (list) – channels of each input feature map; the length of in_channels should be equal to that of in_strides

  • out_strides (list) – strides of each output feature map, should be a subset of in_strides, and continuous (any subsequence of 2, 4, 8, 16, 32, 64 …). The largest stride in in_strides and out_strides should be equal

  • out_channels (list) – channels of each output feature map; the length of out_channels should be equal to that of out_strides

  • fix_out_channel (int, optional) – if set, there will be a 1x1 conv following each output feature map so that each final output has fix_out_channel channels

forward(features: List[torch.Tensor]) → List[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights()

Initialize the weights of FPN module.

class hat.models.necks.Unet(in_strides: List[int], out_strides: List[int], stride2channels: Dict[int, int], factor: int = 2, use_bias: bool = False, bn_kwargs: Optional[Dict] = None, group_base: int = 8)

Unet neck module.

Parameters
  • in_strides – contains the strides of feature maps from backbone.

  • out_strides – contains the strides of feature maps the neck output.

  • stride2channels – stride to channel dict

forward(features)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.YoloGroupNeck(backbone_idx: list, in_channels_list: list, out_channels_list: list, bn_kwargs: dict, bias: bool = True, head_group: bool = True)

Necks module of yolov3.

Parameters
  • backbone_idx (list) – Index of backbone output for necks.

  • in_channels_list (list) – List of input channels.

  • out_channels_list (list) – List of output channels.

  • bn_kwargs (dict) – Config dict for BN layer.

  • bias (bool) – Whether to use bias in module.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.CEWithLabelSmooth(smooth_alpha=0.1)

Cross-entropy loss with label smoothing.

Parameters

smooth_alpha (float) – Alpha of label smooth.

forward(input, target)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
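As a rough illustration of what smooth_alpha controls, the sketch below computes cross-entropy against a smoothed target for a single sample. It uses one common formulation, in which the true class receives 1 - alpha and alpha is spread uniformly over all classes; whether CEWithLabelSmooth uses exactly this variant is an assumption, and the function names are illustrative.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def ce_with_label_smooth(logits, target, smooth_alpha=0.1):
    """Cross-entropy of one sample against a label-smoothed target distribution."""
    num_classes = len(logits)
    log_probs = log_softmax(logits)
    # Smoothed target: alpha / K on every class, plus (1 - alpha) on the true class.
    soft = [smooth_alpha / num_classes] * num_classes
    soft[target] += 1.0 - smooth_alpha
    return -sum(q * lp for q, lp in zip(soft, log_probs))
```

With smooth_alpha=0 this reduces to ordinary cross-entropy; a larger smooth_alpha penalizes very confident predictions more strongly.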

class hat.models.losses.CrossEntropyLoss(loss_name, preds_name, label_name, weight_name='weight', avg_factor_name='avg_factor', use_sigmoid=False, reduction='mean', class_weight=None, loss_weight=1.0, ignore_index=-1)

Calculate cross entropy loss of multi stride output.

This class will be deprecated in the future; use CrossEntropyLossV2 instead.

Parameters
  • loss_name (str) – The key of loss in return dict.

  • preds_name (str) – The key of pred in pred dict.

  • label_name (str) – The key of label in target dict.

  • weight_name (str) – The key of weight in target dict.

  • avg_factor_name (str) – The key of avg_factor in target dict.

  • use_sigmoid (bool) – Whether logits tensor is converted to probability through sigmoid, Defaults to False. If True, use F.binary_cross_entropy_with_logits. If False, use F.cross_entropy.

  • reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

  • class_weight (list[float]) – Weight of each class. Defaults to None.

  • loss_weight (float) – Global weight of loss. Defaults to 1.

  • ignore_index (int) – Only works when using cross_entropy.

Returns

A dict containing the calculated loss, the key of loss is loss_name.

Return type

dict

forward(pred_dict: Mapping, target_dict: Mapping)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.CrossEntropyLossV2(use_sigmoid: bool = False, reduction: str = 'mean', class_weight: Optional[List[float]] = None, loss_weight: float = 1.0, ignore_index: int = -1, loss_name: Optional[str] = None, auto_class_weight: Optional[bool] = False, weight_min: Optional[float] = None, weight_noobj: Optional[float] = None, num_class: int = 0)

Calculate cross entropy loss of multi stride output.

Parameters
  • use_sigmoid (bool) – Whether logits tensor is converted to probability through sigmoid, Defaults to False. If True, use F.binary_cross_entropy_with_logits. If False, use F.cross_entropy.

  • reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

  • class_weight (list[float]) – Weight of each class. Defaults to None.

  • loss_weight (float) – Global weight of loss. Defaults to 1.

  • ignore_index (int) – Only works when using cross_entropy.

  • loss_name (str) – The key of loss in return dict. If None, return loss directly.

Returns

cross entropy loss

forward(pred, target, weight=None, avg_factor=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
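The use_sigmoid switch chooses between two different losses. The pure-Python sketch below shows, for a single sample, the quantities that F.cross_entropy and F.binary_cross_entropy_with_logits compute; class weights, loss_weight, ignore_index, and avg_factor are deliberately left out, and the function names are illustrative.

```python
import math


def softmax_cross_entropy(logits, target):
    """use_sigmoid=False: classes compete through a softmax; target is a class index."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[target]


def binary_cross_entropy_with_logits(logits, targets):
    """use_sigmoid=True: each logit is an independent binary problem; targets are 0/1 values."""
    # Stable form: max(x, 0) - x * t + log(1 + exp(-|x|)), averaged over elements.
    losses = [
        max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
        for x, t in zip(logits, targets)
    ]
    return sum(losses) / len(losses)
```

The practical difference is the target format: the softmax branch takes one class index per sample, while the sigmoid branch takes one 0/1 value per logit.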

class hat.models.losses.FocalLoss(loss_name, num_classes, alpha=0.25, gamma=2.0, loss_weight=1.0, eps=1e-12, reduction='mean')

Sigmoid focal loss.

Parameters
  • loss_name (str) – The key of loss in return dict.

  • num_classes (int) – Num_classes including background, C+1, C is number of foreground categories.

  • alpha (float) – A weighting factor for pos-sample, (1-alpha) is for neg-sample.

  • gamma (float) – Gamma used in focal loss to compress the contribution of easy examples.

  • loss_weight (float) – Global weight of loss. Defaults to 1.0.

  • eps (float) – A small value to avoid zero denominator.

  • reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

Returns

A dict containing the calculated loss, the key of loss is loss_name.

Return type

dict

forward(pred, target, weight=None, avg_factor=None, points_per_strides=None, valid_classes_list=None)

Forward method.

Parameters
  • pred (Tensor) – Cls pred, with shape(N, C), C is num_classes of foreground.

  • target (Tensor) – Cls target, with shape(N,), values in [0, C-1] represent the foreground, C or negative value represent the background.

  • weight (Tensor) – The weight of loss for each prediction. Default is None.

  • avg_factor (float) – Normalized factor.
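The target encoding described above (foreground labels in [0, C-1], background as C or a negative value) can be illustrated with a plain-Python sigmoid focal loss. This is a sketch of the standard formula only: reduction, weight, avg_factor, and loss_weight are omitted, and placing eps inside the logarithm is an assumption about how eps is used.

```python
import math


def sigmoid_focal_loss(pred, target, alpha=0.25, gamma=2.0, eps=1e-12):
    """Sum of sigmoid focal loss over N predictions with C foreground logits each."""
    total = 0.0
    for logits, label in zip(pred, target):
        for cls, logit in enumerate(logits):
            p = 1.0 / (1.0 + math.exp(-logit))
            if cls == label:
                # Positive sample for this class: weighted by alpha.
                total += -alpha * (1.0 - p) ** gamma * math.log(p + eps)
            else:
                # Negative sample: weighted by (1 - alpha). Background labels
                # (C or any negative value) never match cls, so they land here.
                total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p + eps)
    return total
```

The (1 - p) ** gamma and p ** gamma factors shrink the contribution of well-classified examples, which is the "compress the contribution of easy examples" behaviour described for gamma.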

class hat.models.losses.FocalLossV2(alpha: float = 0.25, gamma: float = 2.0, eps: float = 1e-12)

Focal Loss.

Parameters
  • alpha (float) – A weighting factor for pos-sample, (1-alpha) is for neg-sample.

  • gamma (float) – Gamma used in focal loss to compress the contribution of easy examples.

  • eps (float) – A small value to avoid zero denominator.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)

Forward method.

Parameters
  • pred (Tensor) – cls pred, with shape (B, N, C), C is num_classes of foreground.

  • target (Tensor) – cls target, with shape (B, N, C), C is num_classes of foreground.

  • weight (Tensor) – The weight of loss for each prediction. It is mainly used to filter the ignored box. Default is None.

  • avg_factor (float) – Normalized factor.

class hat.models.losses.GIoULoss(loss_name, loss_weight=1.0, eps=1e-06, reduction='mean')

Generalized Intersection over Union Loss.

Parameters
  • loss_name (str) – The key of loss in return dict.

  • loss_weight (float) – Global weight of loss. Defaults to 1.0.

  • eps (float) – A small value to avoid zero denominator.

  • reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

Returns

A dict containing the calculated loss, the key of loss is loss_name.

Return type

dict

forward(pred, target, weight=None, avg_factor=None)

Forward method.

Parameters
  • pred (torch.Tensor) – Predicted bboxes of format (x1, y1, x2, y2), represent upper-left and lower-right point, with shape(N, 4).

  • target (torch.Tensor) – Corresponding gt_boxes, the same shape as pred.

  • weight (torch.Tensor) – Element-wise loss weight, with shape(N,).

  • avg_factor (float) – Average factor that is used to average the loss.
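For reference, the GIoU loss for one pair of (x1, y1, x2, y2) boxes can be sketched as follows. This follows the standard definition, GIoU = IoU - (enclosing area - union) / enclosing area, with loss = 1 - GIoU; exactly where eps is added, and the omission of weight, avg_factor, and reduction, are simplifying assumptions.

```python
def giou_loss(pred_box, target_box, eps=1e-6):
    """GIoU loss for one predicted box and one ground-truth box (illustrative only)."""
    px1, py1, px2, py2 = pred_box
    tx1, ty1, tx2, ty2 = target_box

    area_p = max(px2 - px1, 0.0) * max(py2 - py1, 0.0)
    area_t = max(tx2 - tx1, 0.0) * max(ty2 - ty1, 0.0)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(min(px2, tx2) - max(px1, tx1), 0.0)
    inter_h = max(min(py2, ty2) - max(py1, ty1), 0.0)
    inter = inter_w * inter_h

    union = area_p + area_t - inter + eps
    iou = inter / union

    # Smallest axis-aligned box enclosing both boxes.
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    enclose = enclose_w * enclose_h + eps

    giou = iou - (enclose - union) / enclose
    return 1.0 - giou
```

Unlike a plain IoU loss, the enclosing-box term keeps the loss informative for non-overlapping boxes: two disjoint boxes give a loss above 1 that grows as they move apart.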

class hat.models.losses.SegLoss(loss: List[torch.nn.modules.module.Module])

Segmentation loss wrapper.

Parameters

loss (dict) – loss config.

Note

This class is not universal. Make sure you understand its limitations before using it.

forward(pred: Any, target: List[Dict]) → Dict

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.SmoothL1Loss(beta=1.0, reduction='mean')

Smooth L1 Loss.

Parameters
  • beta (float, optional) – The threshold in the piecewise function. Defaults to 1.0.

  • reduction (str, optional) – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)

Forward function.

Parameters
  • pred (torch.Tensor) – The prediction.

  • target (torch.Tensor) – The learning target of the prediction.

  • weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.

  • avg_factor (float) – Normalization factor.
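
As an illustration of the piecewise function that beta controls, here is a small plain-Python sketch of the standard smooth L1 formula. The helper names are hypothetical, and the way weight and avg_factor are combined (weighted sum divided by avg_factor, falling back to the element count) is an assumption about the usual convention, not something confirmed by this reference.

```python
def smooth_l1(pred, target, beta=1.0):
    """Standard smooth L1 for one scalar: quadratic below beta, linear above."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def smooth_l1_loss(preds, targets, weights=None, avg_factor=None, beta=1.0):
    """Weighted sum of per-element losses, divided by a normalization factor."""
    if weights is None:
        weights = [1.0] * len(preds)
    total = sum(
        w * smooth_l1(p, t, beta) for p, t, w in zip(preds, targets, weights)
    )
    # Assumption: without avg_factor, fall back to a plain mean.
    denom = avg_factor if avg_factor is not None else max(len(preds), 1)
    return total / denom
```

For example, smooth_l1_loss([0.5, 2.0], [0.0, 0.0], weights=[1.0, 0.0], avg_factor=1.0) only counts the first element, which is how a weight of zero can be used to drop ignored predictions.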

class hat.models.losses.SoftTargetCrossEntropy(loss_name=None)

The losses of cross-entropy with soft target.

Parameters

loss_name (str) – The name of returned losses.

forward(input, target)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
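
For reference, cross-entropy with a soft target is usually computed as the negative sum of target probabilities times log-softmax of the logits. The sketch below shows that formula for one sample in plain Python; the function names are hypothetical and batch handling and the returned loss_name key are omitted.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def soft_target_cross_entropy(logits, soft_target):
    """Cross-entropy of one sample against a probability vector."""
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(soft_target, log_probs))
```

With a one-hot soft_target this reduces to ordinary cross-entropy.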

class hat.models.losses.SoftmaxFocalLoss(loss_name: str, num_classes: int, alpha: float = 0.25, gamma: float = 2.0, reduction: str = 'mean', weight: Union[float, Sequence] = 1.0)

Focal Loss.

Parameters
  • loss_name (str) – The key of loss in return dict.

  • num_classes (int) – Class number.

  • alpha (float, optional) – Alpha. Defaults to 0.25.

  • gamma (float, optional) – Gamma. Defaults to 2.0.

  • reduction (str, optional) – Specifies the reduction to apply to the output: 'mean' | 'sum'. Defaults to 'mean'.

  • weight (Union[float, Sequence], optional) – Weight to be applied to the loss of each input. Defaults to 1.0.

forward(logits, labels)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
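
To make the roles of alpha and gamma concrete, the sketch below computes a softmax focal loss for a single sample in plain Python, using the common form -alpha * (1 - p_t)^gamma * log(p_t). This is an assumption about the formula, not a transcription of the HAT code; the function names are hypothetical, and reduction and the per-input weight are not modeled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_focal_loss(logits, label, alpha=0.25, gamma=2.0):
    """Focal loss for one sample; label is the index of the true class."""
    p_t = softmax(logits)[label]
    # (1 - p_t) ** gamma down-weights samples that are already well classified.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

Setting gamma to 0 and alpha to 1 recovers plain cross-entropy for the sample.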

class hat.models.losses.YOLOV3Loss(num_classes: int, anchors: list, strides: list, ignore_thresh: float, loss_xy: dict, loss_wh: dict, loss_conf: dict, loss_cls: dict, lambda_loss: list)

The loss module of YOLOv3.

Parameters
  • num_classes (int) – Number of classes of the class branch.

  • anchors (list) – The anchors of YOLOv3.

  • strides (list) – The strides of feature maps.

  • ignore_thresh (float) – Ignore thresh of target.

  • loss_xy (dict) – Losses of xy.

  • loss_wh (dict) – Losses of wh.

  • loss_conf (dict) – Losses of conf.

  • loss_cls (dict) – Losses of cls.

  • lambda_loss (list) – The list of weighted losses.

forward(input, target=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.BBoxUpscaler(strides)

Parameters

strides (Sequence[int]) – A list containing the strides of the fcos_head output.

forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])

Do post process for model predictions.

Parameters
  • pred – Prediction tensors.

  • meta_data – Meta data used in post processor, e.g. image width, height.
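
A rough sketch of what upscaling bbox predictions by stride could look like is given below. It assumes that each feature level holds a list of per-point distance predictions and that upscaling means multiplying by that level's stride; the function name and the nested-list layout are illustrative and not taken from HAT.

```python
def upscale_bbox_preds(bbox_preds, strides):
    """Multiply each level's predicted distances by that level's stride."""
    if len(bbox_preds) != len(strides):
        raise ValueError("need exactly one stride per feature level")
    return [
        [[d * stride for d in box] for box in level]
        for level, stride in zip(bbox_preds, strides)
    ]
```

For two levels with strides 8 and 16, a prediction of 1.0 on the first level becomes 8.0 and a prediction of 0.5 on the second level also becomes 8.0.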

class hat.models.task_modules.fcos.DynamicFcosTarget(strides, topK, loss_cls, loss_reg, cls_out_channels, background_label)

Generate cls and reg targets for FCOS in training stage based on dynamic losses.

Parameters
  • strides (Sequence[int]) – Strides of points in multiple feature levels.

  • topK (int) – Number of positive samples to keep for each ground truth.

  • cls_out_channels (int) – Out_channels of cls_score.

  • background_label (int) – Label ID of background, set as num_classes.

  • loss_cls (nn.Module) – Loss for cls to choose positive target.

  • loss_reg (nn.Module) – Loss for reg to choose positive target.
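
One plausible reading of topK, sketched below, is that each ground truth keeps the candidate points with the lowest combined cls and reg loss as positives. This is an assumption for illustration only; the function name and the idea of a single per-candidate cost list are hypothetical.

```python
def select_topk_positives(costs, top_k):
    """Return indices of the top_k lowest-cost candidates, lowest cost first."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    return order[:top_k]
```

With costs [0.9, 0.1, 0.5, 0.3] and top_k=2, candidates 1 and 3 would be kept as positives for that ground truth.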

class hat.models.task_modules.fcos.FCOSDecoder(num_classes, strides, transforms=None, inverse_transform_key=None, nms_use_centerness=True, nms_sqrt=True, rescale=True, test_cfg=None, input_resize_scale=None, truncate_bbox=True, filter_score_mul_centerness=False, upscale_bbox_pred=False)

Parameters
  • num_classes (int) – Number of categories excluding the background category.

  • strides (Sequence[int]) – A list containing the strides of the fcos_head output.

  • transforms (Sequence[dict]) – A list containing the transform configs.

  • inverse_transform_key (Sequence[str]) – A list containing the inverse transform info keys.

  • nms_use_centerness (bool, optional) – If True, use centerness as a factor in nms post-processing.

  • nms_sqrt (bool, optional) – If True, use sqrt(score_thr * score_factors).

  • rescale (bool, optional) – Whether to map the prediction result to the original image.

  • test_cfg (dict, optional) – Cfg dict, including some configurations of nms.

  • truncate_bbox (bool, optional) – If True, truncate the predictive bbox out of image boundary. Default True.

  • filter_score_mul_centerness (bool, optional) – If True, filter out bbox by score multiply centerness, else filter out bbox by score. Default False.

forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])

Do post process for model predictions.

Parameters
  • pred – Prediction tensors.

  • meta_data – Meta data used in post processor, e.g. image width, height.
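
As background for the post-processing step, FCOS predicts for each point its distances to the four box sides, and a decoder turns these into corner coordinates. The sketch below shows that conversion for one point, with optional clipping to the image in the spirit of truncate_bbox. The function name, argument layout and (width, height) ordering are assumptions, and NMS, centerness and rescaling are not covered.

```python
def decode_bbox(point, distances, image_size=None):
    """Convert (left, top, right, bottom) distances at a point to (x1, y1, x2, y2)."""
    x, y = point
    left, top, right, bottom = distances
    x1, y1, x2, y2 = x - left, y - top, x + right, y + bottom
    if image_size is not None:
        # Clip the box so that it stays inside the image boundary.
        width, height = image_size
        x1 = min(max(x1, 0.0), width)
        y1 = min(max(y1, 0.0), height)
        x2 = min(max(x2, 0.0), width)
        y2 = min(max(y2, 0.0), height)
    return (x1, y1, x2, y2)
```

Calling decode_bbox((10, 10), (4, 5, 6, 7), image_size=(12, 12)) clips the lower-right corner to the 12 x 12 image.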

class hat.models.task_modules.fcos.FCOSHead(num_classes, in_strides, out_strides, stride2channels, upscale_bbox_pred, feat_channels=256, stacked_convs=4, use_sigmoid=True, share_bn=False, dequant_output=True, int8_output=True, share_conv=True, change_layout=False, deepcopy_share_conv=False)

Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.

Parameters
  • num_classes (int) – Number of categories excluding the background category.

  • in_strides (Sequence[int]) – A list containing the strides of feature maps from backbone or neck.

  • out_strides (Sequence[int]) – A list containing the strides that this head will output.

  • stride2channels (dict) – A stride to channel dict.

  • feat_channels (int) – Number of hidden channels.

  • stacked_convs (int) – Number of stacking convs of the head.

  • use_sigmoid (bool) – Whether the classification output is obtained using sigmoid.

  • share_bn (bool) – Whether to share bn between multiple levels. Default: False

  • upscale_bbox_pred (bool) – If true, upscale bbox pred by FPN strides.

  • dequant_output (bool) – Whether to dequant output. Default: True

  • int8_output (bool) – If True, output int8, otherwise output int32. Default: True

  • share_conv (bool) – If True, the branches share conv layers. It can be True only when all strides have the same number of channels. Default: True

forward(feats)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x, i, stride)

Forward features of a single scale level.

Parameters
  • x (Tensor) – FPN feature maps of the specified stride.

  • i (int) – Index of feature level.

  • stride (int) – The corresponding stride for feature maps, only used to upscale bbox pred when self.upscale_bbox_pred is True.

class hat.models.task_modules.fcos.FCOSTarget(strides, regress_ranges, cls_out_channels, background_label, norm_on_bbox=True, center_sampling=True, center_sample_radius=1.5, use_iou_replace_ctrness=False, task_batch_list=None)

Generate cls and reg targets for FCOS in training stage.

Parameters
  • strides (Sequence[int]) – Strides of points in multiple feature levels.

  • regress_ranges (tuple[tuple[int, int]]) – Regress range of multiple level points.

  • cls_out_channels (int) – Out_channels of cls_score.

  • background_label (int) – Label ID of background, set as num_classes.

  • center_sampling (bool) – If true, use center sampling.

  • center_sample_radius (float) – Radius of center sampling. Default: 1.5.

  • norm_on_bbox (bool) – If true, normalize the regression targets with FPN strides.

  • use_iou_replace_ctrness (bool) – If true, use iou as box quality assessment method, else use ctrness. Default: False.

  • task_batch_list ([int, int]) – Used to generate a mask when two datasets use the same head.
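
The two helpers below sketch, in plain Python, how regress_ranges and the centerness target are defined in the FCOS paper. They are written from the paper rather than from the HAT source, so the function names, the inclusive range check and the example ranges are assumptions.

```python
import math


def centerness(left, top, right, bottom):
    """Centerness target of a point from its distances to the four box sides."""
    lr = min(left, right) / max(left, right)
    tb = min(top, bottom) / max(top, bottom)
    return math.sqrt(lr * tb)


def level_for_box(left, top, right, bottom, regress_ranges):
    """Index of the first level whose range contains the largest distance."""
    max_dist = max(left, top, right, bottom)
    for level, (low, high) in enumerate(regress_ranges):
        if low <= max_dist <= high:
            return level
    return None  # The point is not a positive sample on any level.
```

A point at the exact center of a box has centerness 1.0, and the value drops toward 0 as the point moves to an edge.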

class hat.models.task_modules.seg.FRCNNSegHead(group_base: int, in_strides: List, in_channels: List, out_strides: List, out_channels: List, bn_kwargs: Dict, with_extra_conv: bool = False, use_bias: bool = True, linear_out: bool = True, argmax_output: bool = False, dequant_output: bool = True, int8_output: bool = False)

FRCNNSegHead module for segmentation task.

Parameters
  • group_base (int) – Group base of group conv

  • in_strides (list[int]) – The strides corresponding to the inputs of seg_head, the inputs usually come from backbone or neck.

  • in_channels (list[int]) – Number of channels of each input stride.

  • out_strides (list[int]) – List of output strides.

  • out_channels (list[int]) – Number of channels of each output stride.

  • bn_kwargs (dict) – Extra keyword arguments for bn layers.

  • with_extra_conv (bool) – Whether to use extra conv module.

  • use_bias (bool) – Whether to use bias in conv module.

  • linear_out (bool) – Whether NOT to use an activation after the pointwise (pw) conv.

  • argmax_output (bool) – Whether to conduct argmax on output.

  • dequant_output (bool) – Whether to dequant output.

  • int8_output (bool) – If True, output int8, otherwise output int32.

forward(x: List[torch.Tensor]) → List[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.seg.SegDecoder(out_strides: List[int], decode_strides: List[int], transforms: Optional[List[dict]] = None, inverse_transform_key: Optional[List[str]] = None, output_names: Optional[str] = 'pred_seg')

Semantic Segmentation Decoder.

Parameters
  • out_strides (list[int]) – List of output strides, represents the strides of the output from seg_head.

  • output_names (str or list[str]) – Keys of returned results dict.

  • decode_strides (int or list[int]) – Strides that need to be decoded, should be a subset of out_strides.

  • transforms (Sequence[dict]) – A list containing the transform configs.

  • inverse_transform_key (Sequence[str]) – A list containing the inverse transform info keys.

class hat.models.task_modules.seg.SegHead(num_classes, in_strides, out_strides, stride2channels, feat_channels=256, stride_loss_weights=None, stacked_convs=1, argmax_output=False, dequant_output=True, int8_output=True, upscale_stride4_to_stride2=False, output_with_bn=False, bn_kwargs=None, upsample_output_scale=None)

Head Module for segmentation task.

Parameters
  • num_classes (int) – Number of classes.

  • in_strides (list[int]) – The strides corresponding to the inputs of seg_head, the inputs usually come from backbone or neck.

  • out_strides (list[int]) – List of output strides.

  • stride2channels (dict) – A stride to channel dict.

  • feat_channels (int or list[int]) – Number of hidden channels (of each output stride).

  • stride_loss_weights (list[int]) – Loss weight of each stride.

  • stacked_convs (int) – Number of stacking convs of head.

  • argmax_output (bool) – Whether to conduct argmax on output. Default: False

  • dequant_output (bool) – Whether to dequant output. Default: True

  • int8_output (bool) – If True, output int8, otherwise output int32. Default: True

  • upscale_stride4_to_stride2 (bool) – If True, the stride-4 feature map is upsampled to stride 2, and a supervisory signal is added at stride 2. Default is False.

  • output_with_bn (bool) – Whether to add a bn layer to the output conv.

  • bn_kwargs (dict) – Extra keyword arguments for bn layers.

  • upsample_output_scale (int) – Output upsample scale, only used in qat model, default is None.
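
To illustrate what argmax_output implies, the sketch below takes per-class score maps and returns the index of the highest-scoring class at each pixel. The channel-first nested-list layout scores[c][y][x] and the function name are assumptions chosen for a dependency-free example.

```python
def argmax_per_pixel(scores):
    """Map class scores laid out as scores[c][y][x] to a class index map[y][x]."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [
            max(range(num_classes), key=lambda c: scores[c][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]
```

The result has one integer per pixel instead of one score per class per pixel, which is why argmax output is much smaller than the raw prediction.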

forward(feats)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x, stride_index=0)

Forward features of a single scale level.

Parameters
  • x (Tensor) – Feature maps of the specified stride.

  • stride_index (int) – Stride index of input feature map.

Returns

seg predictions of input feature maps.

Return type

tuple

class hat.models.task_modules.seg.SegTarget(ignore_index=255, label_name='gt_seg')

Generate training targets for Seg task.

Parameters
  • ignore_index (int, optional) – Index of ignore class.

  • label_name (str, optional) – The key corresponding to the gt seg in label.
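
A common use of ignore_index is to exclude pixels carrying that label when the loss is averaged. The sketch below shows that idea on flat lists; it is an assumption about typical usage rather than a description of what SegTarget does internally, and the function name is hypothetical.

```python
def masked_mean(per_pixel_loss, labels, ignore_index=255):
    """Average the loss over pixels whose label is not ignore_index."""
    kept = [
        loss
        for loss, label in zip(per_pixel_loss, labels)
        if label != ignore_index
    ]
    # Return 0.0 when every pixel is ignored, to avoid dividing by zero.
    return sum(kept) / len(kept) if kept else 0.0
```

Here a pixel labeled 255 contributes nothing to the average, regardless of how large its loss value is.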