hat.models¶

Models widely used by the upper-level modules in HAT.

Models¶

structures¶

`Classifier`	The basic structure of classifier.
`Segmentor`	The basic structure of segmentor.

detectors¶

`FCOS`	The basic structure of FCOS.

backbones¶

`EfficientNet`	A module of EfficientNet.
`MobileNetV1`	A module of mobilenetv1.
`ResNet18`	A module of resnet18.
`VargConvNet`	A module of vargconvnet.
`VargNetV2`	A module of vargnetv2.

necks¶

`BiFPN`	Weighted Bi-directional Feature Pyramid Network(BiFPN).
`DwUnet`	Unet segmentation neck structure.
`FPN`
`RetinaNetFPN`	FPN for RetinaNet.
`Unet`	Unet neck module.
`PAFPN`	Path Aggregation Network for Instance Segmentation.
`FastSCNNNeck`	Upper neck module for segmentation.

task_modules¶

fcos¶

`FCOSDecoder`	Parameter num_classes: number of categories, excluding the background.
`FCOSHead`	Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.
`FCOSTarget`	Generate cls and reg targets for FCOS in training stage.

seg¶

`SegDecoder`	Semantic Segmentation Decoder.
`SegHead`	Head Module for segmentation task.
`FRCNNSegHead`	FRCNNSegHead module for segmentation task.
`SegTarget`	Generate training targets for Seg task.

fcn¶

`FCNHead`(input_index, in_channels, …)	Head Module for FCN.
`DepthwiseSeparableFCNHead`(in_channels, …)
`FCNDecoder`([upsample_output_scale])	FCN Decoder.
`FCNTarget`(num_classes, need_oneshot)	Generate Target for FCN.

deeplab¶

`Deeplabv3plusHead`(in_channels, c1_index, …)	Head Module for Deeplab.

losses¶

`CEWithLabelSmooth`	Cross-entropy loss with label smoothing.
`CrossEntropyLoss`	Calculate cross entropy loss of multi stride output.
`CrossEntropyLossV2`	Calculate cross entropy loss of multi stride output.
`FocalLoss`	Sigmoid focal loss.
`SoftmaxFocalLoss`	Focal Loss.
`GIoULoss`	Generalized Intersection over Union Loss.
`SegLoss`	Segmentation loss wrapper.
`SmoothL1Loss`	Smooth L1 Loss.
`YOLOV3Loss`	The loss module of YOLOv3.

API Reference¶

class hat.models.structures.Classifier(backbone, losses=None)¶

The basic structure of classifier.

Parameters

backbone (torch.nn.Module) – Backbone module.
losses (torch.nn.Module) – Losses module.

forward(data)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.Segmentor(backbone, neck, head, losses=None)¶

The basic structure of segmentor.

Parameters

backbone (torch.nn.Module) – Backbone module.
neck (torch.nn.Module) – Neck module.
head (torch.nn.Module) – Head module.
losses (torch.nn.Module) – Losses module.

forward(data: dict)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.FCOS(backbone: Dict, neck: Optional[Dict] = None, head: Optional[Dict] = None, targets: Optional[Dict] = None, post_process: Optional[Dict] = None, loss_cls: Optional[Dict] = None, loss_reg: Optional[Dict] = None, loss_centerness: Optional[Dict] = None)¶

The basic structure of FCOS.

Parameters

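backbone (Dict) – dict for building backbone module.
neck (Dict) – dict for building neck module.
head (Dict) – dict for building head module.
targets (Dict) – dict for building target module.
post_process (Dict) – dict for building post_process module.
loss_cls (Dict) – dict for building loss_cls module.
loss_reg (Dict) – dict for building loss_reg module.
loss_centerness (Dict) – dict for building loss_centerness module.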

extract_feat(img)¶: Directly extract features from the backbone + neck.

forward(data: Dict)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.RetinaNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, anchors: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_reg: Optional[torch.nn.modules.module.Module] = None)¶

The basic structure of retinanet.

Parameters

backbone – backbone module or dict for building backbone module.
neck – neck module or dict for building neck module.
head – head module or dict for building head module.
anchors – anchors module or dict for building anchors module.
targets – targets module or dict for building target module.
post_process – post_process module or dict for building post_process module.
loss_cls – loss_cls module or dict for building loss_cls module.
loss_reg – loss_reg module or dict for building loss_reg module.

forward(data: Dict)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.YOLOV3(backbone: Optional[dict] = None, neck: Optional[dict] = None, head: Optional[dict] = None, anchor_generator: Optional[dict] = None, target_generator: Optional[dict] = None, loss: Optional[dict] = None, postprocess: Optional[dict] = None)¶

The basic structure of yolov3.

Parameters

backbone (torch.nn.Module) – Backbone module.
neck (torch.nn.Module) – Neck module.
head (torch.nn.Module) – Head module.
anchor_generator (torch.nn.Module) – Anchor generator module.
target_generator (torch.nn.Module) – Target generator module.
loss (torch.nn.Module) – Loss module.
postprocess (torch.nn.Module) – Postprocess module.

forward(data)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.EfficientNet(model_type: str, coefficient_params: tuple, num_classes: int, bn_kwargs: Optional[dict] = None, bias: bool = False, drop_connect_rate: float = 0.2, depth_division: int = 8, activation: str = 'relu', use_se_block: bool = False, blocks_args: Sequence[Dict] = (BlockArgs(kernel_size=3, num_repeat=1, in_filters=32, out_filters=16, expand_ratio=1, id_skip=True, strides=1, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=2, in_filters=16, out_filters=24, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=2, in_filters=24, out_filters=40, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=3, in_filters=40, out_filters=80, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=3, in_filters=80, out_filters=112, expand_ratio=6, id_skip=True, strides=1, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=4, in_filters=112, out_filters=192, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=1, in_filters=192, out_filters=320, expand_ratio=6, id_skip=True, strides=1, se_ratio=0.25)), include_top: bool = True, flat_output: bool = True, resolution: int = 0, use_drop_connect: bool = False)¶

A module of EfficientNet.

Parameters

model_type (str) – Which EfficientNet variant to use (B0-B7 or lite0-4). For an EfficientNet model, model_type must be one of: [‘b0’, ‘b1’, ‘b2’, ‘b3’, ‘b4’, ‘b5’, ‘b6’, ‘b7’]; for an EfficientNet-lite model, model_type must be one of: [‘lite0’, ‘lite1’, ‘lite2’, ‘lite3’, ‘lite4’].
coefficient_params (tuple) – Parameter coefficients of EfficientNet, including: width_coefficient (float): scaling coefficient for net width; depth_coefficient (float): scaling coefficient for net depth; default_resolution (int): default input image size; dropout_rate (float): dropout rate for final classifier layer.
num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
bias (bool) – Whether to use bias in module.
drop_connect_rate (float) – Dropout rate at skip connections.
depth_division (int) – Depth division, Defaults to 8.
activation (str) – Activation layer, defaults to ‘relu’.
use_se_block (bool) – Whether to use SEBlock in module.
blocks_args (list) – A list of BlockArgs to MBConvBlock modules.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
use_drop_connect (bool) – Whether to use drop connect.
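
The parameter list does not spell out how the width and depth coefficients in coefficient_params interact with depth_division. The sketch below follows the rounding rules of the reference EfficientNet implementation, on the assumption that HAT behaves similarly; round_filters and round_repeats are illustrative names, not HAT APIs.

```python
import math


def round_filters(filters: int, width_coefficient: float, depth_division: int = 8) -> int:
    """Scale a channel count and round it to a multiple of depth_division."""
    scaled = filters * width_coefficient
    rounded = max(
        depth_division,
        int(scaled + depth_division / 2) // depth_division * depth_division,
    )
    # Never round down by more than 10% of the scaled value.
    if rounded < 0.9 * scaled:
        rounded += depth_division
    return int(rounded)


def round_repeats(num_repeat: int, depth_coefficient: float) -> int:
    """Scale the number of block repeats, rounding up."""
    return int(math.ceil(depth_coefficient * num_repeat))
```

With a width coefficient of 1.1, round_filters(40, 1.1) gives 48, and with a depth coefficient of 1.2, round_repeats(2, 1.2) gives 3.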

forward(inputs)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.MobileNetV1(num_classes: int, bn_kwargs: dict, alpha: float = 1.0, bias: bool = True, dw_with_relu: bool = True, include_top: bool = True, flat_output: bool = True)¶

A module of mobilenetv1.

Parameters

num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
alpha (float) – Alpha for mobilenetv1.
bias (bool) – Whether to use bias in module.
dw_with_relu (bool) – Whether to use relu in dw conv.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.

forward(x)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.ResNet18(num_classes: int, bn_kwargs: dict, bias: bool = True, include_top: bool = True, flat_output: bool = True)¶

A module of resnet18.

Parameters

num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
bias (bool) – Whether to use bias in module.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.

class hat.models.backbones.ResNet50(num_classes: int, bn_kwargs: dict, bias: bool = True, include_top: bool = True, flat_output: bool = True)¶

A module of resnet50.

Parameters

num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
bias (bool) – Whether to use bias in module.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.

class hat.models.backbones.TinyVargNetV2(num_classes, bn_kwargs: dict, alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, extend_features: bool = False, disable_quanti_input: bool = False, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, input_sequence_length: int = 1, head_factor: int = 1, input_resize_scale: Optional[int] = None)¶

A module of TinyVargNetv2.

Parameters

num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
alpha (float) – Alpha for tinyvargnetv2.
group_base (int) – Group base for tinyvargnetv2.
factor (int) – Factor for channel expansion in basic block.
bias (bool) – Whether to use bias in module.
extend_features (bool) – Whether to extend features.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
input_channels (int) – Input channels of first conv.
input_sequence_length (int) – Length of input sequence.
head_factor (int) – Factor for channels expansion of stage1(mod2).
input_resize_scale (int) – The narrow model needs its input resized by a scale of 0.65 during int_infer, visualization, or evaluation.

class hat.models.backbones.VarGDarkNet53(max_channels: int, bn_kwargs: dict, num_classes: int, include_top: bool, flat_output: bool)¶

forward(x)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.VargConvNet(num_classes, bn_kwargs: dict, channels_list: list, repeats: list, group_list: int, factor_list: int, out_channels: int = 1024, bias: bool = True, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, deep_stem: bool = True)¶

A module of vargconvnet.

Parameters

num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
channels_list (list) – List for output channels
repeats (list) – Depth of each stage.
group_list (list) – Group of each stage.
factor_list (list) – Factor for each stage.
out_channels (int) – Output channels.
bias (bool) – Whether to use bias in module.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
input_channels (int) – Input channels of first conv.
deep_stem (bool) – Whether use deep stem.

forward(x)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.VargNetV2(num_classes, bn_kwargs: dict, model_type: str = 'VargNetV2', alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, extend_features: bool = False, disable_quanti_input: bool = False, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, input_sequence_length: int = 1, head_factor: int = 1, input_resize_scale: Optional[int] = None)¶

A module of vargnetv2.

Parameters

num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
model_type (str) – Choose to use VargNetV2 or TinyVargNetV2.
alpha (float) – Alpha for vargnetv2.
group_base (int) – Group base for vargnetv2.
factor (int) – Factor for channel expansion in basic block.
bias (bool) – Whether to use bias in module.
extend_features (bool) – Whether to extend features.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
input_channels (int) – Input channels of first conv.
input_sequence_length (int) – Length of input sequence.
head_factor (int) – Factor for channels expansion of stage1(mod2).
input_resize_scale (int) – The narrow model needs its input resized by a scale of 0.65 during int_infer, visualization, or evaluation.

forward(x)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

hat.models.backbones.get_vargnetv2_stride2channels(alpha: float, channels: Optional[List[int]] = None, strides: Optional[List[int]] = None) → Dict¶

Get the VargNetV2 stride-to-channel dict for the given channels and strides.

Parameters

alpha – channel multiplier.
channels – base channel of each stride.
strides – stride list corresponding to channels.

Returns: strides2channels: a stride to channel dict.
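
As a rough illustration of the mapping described above, the following sketch pairs each stride with its base channel count scaled by alpha. The function name stride2channels_sketch, the int() truncation, and the example values are assumptions; the real function also accepts None for channels and strides, and its defaults are not shown here.

```python
from typing import Dict, List


def stride2channels_sketch(
    alpha: float, channels: List[int], strides: List[int]
) -> Dict[int, int]:
    """Map each stride to its base channel count scaled by alpha."""
    if len(channels) != len(strides):
        raise ValueError("channels and strides must have the same length")
    # Truncating with int() is an assumption about the rounding rule.
    return {stride: int(alpha * channel) for stride, channel in zip(strides, channels)}


# Hypothetical base channels for strides 2, 4 and 8.
mapping = stride2channels_sketch(0.5, channels=[32, 64, 128], strides=[2, 4, 8])
```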

class hat.models.necks.BiFPN(in_strides, out_strides, stride2channels, out_channels, num_outs, stack=3, start_level=0, end_level=- 1, fpn_name='bifpn_sum')¶

Weighted Bi-directional Feature Pyramid Network(BiFPN).

This is an implementation of the BiFPN in EfficientDet: Scalable and Efficient Object Detection (https://arxiv.org/abs/1911.09070).

Parameters

in_strides (list[int]) – Stride of input feature map
out_strides (int) – Stride of output feature map
stride2channels (dict) – The key:value is stride:channel; the channels have been multiplied by alpha.
out_channels (int|dict) – Channel number of output layer, the key:value is stride:channel.
num_outs (int) – Number of BifpnLayer’s inputs; the value must be 5, because the bifpn layer is fixed.
stack (int) – Number of BifpnLayer
start_level (int) – Index of the start input backbone level used to build the feature pyramid. Default: 0.
end_level (int) – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, means the last level.
fpn_name (str) – the value must be one of ‘bifpn_sum’ or ‘bifpn_fa’.
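
The "weighted" part of BiFPN refers to learnable per-input fusion weights. The sketch below shows the fast normalized fusion rule from the EfficientDet paper on plain Python lists. It is not HAT's implementation, and which rule each fpn_name value selects is not stated here; the function name and the eps default are taken from the paper or chosen for illustration.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-sized feature vectors with non-negative, normalized weights."""
    # ReLU keeps every learnable weight non-negative.
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped) + eps
    fused = [0.0] * len(features[0])
    for feat, w in zip(features, clipped):
        for i, value in enumerate(feat):
            fused[i] += (w / total) * value
    return fused
```

Unlike a softmax over the weights, this normalization needs no exponentials, which is the motivation the paper gives for it.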

forward(inputs)¶

Forward features.

Parameters: inputs (list[tensor]) – Input tensors

Returns (list[tensor]): Output tensors

class hat.models.necks.DwUnet(base_channels: int, bn_kwargs: Optional[Dict] = None, act_type: torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.ReLU'>, use_deconv: bool = False, dw_with_act: bool = False, output_scales: Sequence = (4, 8, 16, 32, 64))¶

Unet segmentation neck structure.

Built with separable convolution layers.

Parameters

base_channels (int) – Output channel number of the output layer of scale 1.
bn_kwargs (Dict, optional) – Keyword arguments for BN layer. Defaults to {}.
use_deconv (bool, optional) – Whether to use deconv for the upsampling layer. Defaults to False.
dw_with_act (bool, optional) – Whether to use relu after the depthwise conv in SeparableConv. Defaults to False.
output_scales (Sequence, optional) – The scale of each output layer. Defaults to (4, 8, 16, 32, 64).

forward(inputs)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.FPN(in_strides: List[int], in_channels: List[int], out_strides: List[int], out_channels: List[int], fix_out_channel: Optional[int] = None, bn_kwargs: Optional[Dict] = None)¶

forward(features: List[torch.Tensor]) → List[torch.Tensor]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.FastSCNNNeck(in_channels: List[int], feat_channels: List[int], indexes: List[int], bn_kwargs: Optional[Dict] = None)¶

Upper neck module for segmentation.

Parameters

in_channels (list) – channels of each input feature map
feat_channels (list) – channels for feature maps.
indexes (list) – indexes of inputs.
bn_kwargs (dict) – Dict for BN layer.

forward(inputs)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.PAFPN(in_channels, out_channels, num_outs, start_level=0, end_level=- 1, add_extra_convs=False, relu_before_extra_convs=False)¶

Path Aggregation Network for Instance Segmentation.

This is an implementation of the PAFPN in Path Aggregation Network <https://arxiv.org/abs/1803.01534>.

Parameters

in_channels (List[int]) – Number of input channels per scale.
out_channels (int) – Number of output channels (used at each scale)
num_outs (int) – Number of output scales.
start_level (int) – Index of the start input backbone level used to build the feature pyramid. Default: 0.
end_level (int) – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, which means the last level.
add_extra_convs (bool | str) – If bool, it decides whether to add conv layers on top of the original feature maps. Defaults to False. If True, it is equivalent to add_extra_convs='on_input'. If str, it specifies the source feature map of the extra convs. Only the following options are allowed:
- 'on_input': Last feature map of neck inputs (i.e. backbone feature).
- 'on_lateral': Last feature map after lateral convs.
- 'on_output': The last output feature map after fpn convs.
relu_before_extra_convs (bool) – Whether to apply relu before the extra conv. Default: False.
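
The accepted values of add_extra_convs can be summarized with a small normalization helper. This is a sketch of the documented rules only; normalize_add_extra_convs and ALLOWED_SOURCES are hypothetical names, and PAFPN may validate its argument differently.

```python
from typing import Union

ALLOWED_SOURCES = ("on_input", "on_lateral", "on_output")


def normalize_add_extra_convs(add_extra_convs: Union[bool, str]) -> Union[bool, str]:
    """Return False, or the name of the source feature map for the extra convs."""
    if isinstance(add_extra_convs, bool):
        # True is documented as equivalent to 'on_input'.
        return "on_input" if add_extra_convs else False
    if add_extra_convs not in ALLOWED_SOURCES:
        raise ValueError(f"add_extra_convs must be a bool or one of {ALLOWED_SOURCES}")
    return add_extra_convs
```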

forward(inputs)¶: Forward function.

class hat.models.necks.RetinaNetFPN(in_strides: List[int], in_channels: List[int], out_strides: List[int], out_channels: List[int], fix_out_channel: Optional[int] = None)¶

FPN for RetinaNet.

Unlike FPN, RetinaNetFPN has two extra convs, corresponding to stride 64 and stride 128, in addition to the lateral convs.

Parameters

in_strides (list) – strides of each input feature map.
in_channels (list) – channels of each input feature map; the length of in_channels should be equal to that of in_strides.
out_strides (list) – strides of each output feature map; should be a subset of in_strides, and continuous (any subsequence of 2, 4, 8, 16, 32, 64 …). The largest stride in in_strides and out_strides should be equal.
out_channels (list) – channels of each output feature map; the length of out_channels should be equal to that of out_strides.
fix_out_channel (int, optional) – if set, there will be a 1x1 conv following each output feature map so that each final output has fix_out_channel channels.

forward(features: List[torch.Tensor]) → List[torch.Tensor]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights()¶: Initialize the weights of FPN module.

class hat.models.necks.Unet(in_strides: List[int], out_strides: List[int], stride2channels: Dict[int, int], factor: int = 2, use_bias: bool = False, bn_kwargs: Optional[Dict] = None, group_base: int = 8)¶

Unet neck module.

Parameters

in_strides – contains the strides of feature maps from backbone.
out_strides – contains the strides of feature maps the neck output.
stride2channels – stride to channel dict

forward(features)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.YoloGroupNeck(backbone_idx: list, in_channels_list: list, out_channels_list: list, bn_kwargs: dict, bias: bool = True, head_group: bool = True)¶

Neck module of YOLOv3.

Parameters

backbone_idx (list) – Index of backbone output for necks.
in_channels_list (list) – List of input channels.
out_channels_list (list) – List of output channels.
bn_kwargs (dict) – Config dict for BN layer.
bias (bool) – Whether to use bias in module.

forward(x)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.CEWithLabelSmooth(smooth_alpha=0.1)¶

Cross-entropy loss with label smoothing.

Parameters: smooth_alpha (float) – Alpha of label smoothing.
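
For readers unfamiliar with label smoothing, here is a minimal single-sample sketch in plain Python. It assumes the common formulation in which the one-hot target is mixed with a uniform distribution over the K classes; whether CEWithLabelSmooth uses exactly this variant is not confirmed here, and ce_with_label_smooth is an illustrative name.

```python
import math
from typing import Sequence


def ce_with_label_smooth(
    logits: Sequence[float], label: int, smooth_alpha: float = 0.1
) -> float:
    """Cross-entropy of one sample against a smoothed one-hot target."""
    num_classes = len(logits)
    # Numerically stable log-softmax.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    log_probs = [x - log_sum for x in logits]
    loss = 0.0
    for k, log_p in enumerate(log_probs):
        # Assumed target: (1 - alpha) on the true class plus alpha / K on every class.
        one_hot = 1.0 if k == label else 0.0
        target = (1.0 - smooth_alpha) * one_hot + smooth_alpha / num_classes
        loss -= target * log_p
    return loss
```

Setting smooth_alpha to 0 recovers the ordinary cross-entropy.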

forward(input, target)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.CrossEntropyLoss(loss_name, preds_name, label_name, weight_name='weight', avg_factor_name='avg_factor', use_sigmoid=False, reduction='mean', class_weight=None, loss_weight=1.0, ignore_index=- 1)¶

Calculate cross entropy loss of multi stride output.

This class will be deprecated in the future; use CrossEntropyLossV2 instead.

Parameters

loss_name (str) – The key of loss in return dict.
preds_name (str) – The key of pred in pred dict.
label_name (str) – The key of label in target dict.
weight_name (str) – The key of weight in target dict.
avg_factor_name (str) – The key of avg_factor in target dict.
use_sigmoid (bool) – Whether the logits tensor is converted to probabilities through sigmoid. Defaults to False. If True, use F.binary_cross_entropy_with_logits. If False, use F.cross_entropy.
reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].
class_weight (list[float]) – Weight of each class. Defaults to None.
loss_weight (float) – Global weight of loss. Defaults to 1.
ignore_index (int) – Only works when using cross_entropy.

Returns

A dict containing the calculated loss, the key of loss is loss_name.

Return type

dict

forward(pred_dict: Mapping, target_dict: Mapping)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.CrossEntropyLossV2(use_sigmoid: bool = False, reduction: str = 'mean', class_weight: Optional[List[float]] = None, loss_weight: float = 1.0, ignore_index: int = - 1, loss_name: Optional[str] = None, auto_class_weight: Optional[bool] = False, weight_min: Optional[float] = None, weight_noobj: Optional[float] = None, num_class: int = 0)¶

Calculate cross entropy loss of multi stride output.

Parameters

use_sigmoid (bool) – Whether the logits tensor is converted to probabilities through sigmoid. Defaults to False. If True, use F.binary_cross_entropy_with_logits. If False, use F.cross_entropy.
reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].
class_weight (list[float]) – Weight of each class. Defaults to None.
loss_weight (float) – Global weight of loss. Defaults to 1.
ignore_index (int) – Only works when using cross_entropy.
loss_name (str) – The key of loss in return dict. If None, return loss directly.

Returns

cross entropy loss
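
The reduction options interact with the weight and avg_factor arguments of forward. The sketch below shows one plausible convention, used by several detection codebases, in which avg_factor replaces the element count for the mean reduction. This is an assumption rather than HAT's confirmed behaviour, and reduce_loss is a hypothetical helper.

```python
from typing import List, Optional, Sequence, Union


def reduce_loss(
    losses: Sequence[float],
    weight: Optional[Sequence[float]] = None,
    reduction: str = "mean",
    avg_factor: Optional[float] = None,
) -> Union[float, List[float]]:
    """Apply optional per-element weights, then reduce the losses."""
    if weight is not None:
        weighted = [loss * w for loss, w in zip(losses, weight)]
    else:
        weighted = list(losses)
    if reduction == "none":
        return weighted
    if reduction == "sum":
        return sum(weighted)
    if reduction == "mean":
        # Assumed: avg_factor, when given, replaces the element count.
        denom = avg_factor if avg_factor is not None else len(weighted)
        return sum(weighted) / denom
    raise ValueError("reduction must be 'none', 'mean' or 'sum'")
```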

forward(pred, target, weight=None, avg_factor=None)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.FocalLoss(loss_name, num_classes, alpha=0.25, gamma=2.0, loss_weight=1.0, eps=1e-12, reduction='mean')¶

Sigmoid focal loss.

Parameters

loss_name (str) – The key of loss in return dict.
num_classes (int) – Num_classes including background, C+1, C is number of foreground categories.
alpha (float) – A weighting factor for pos-sample, (1-alpha) is for neg-sample.
gamma (float) – Gamma used in focal loss to compress the contribution of easy examples.
loss_weight (float) – Global weight of loss. Defaults to 1.0.
eps (float) – A small value to avoid zero denominator.
reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

Returns

A dict containing the calculated loss, the key of loss is loss_name.

Return type

dict

forward(pred, target, weight=None, avg_factor=None, points_per_strides=None, valid_classes_list=None)¶

Forward method.

Parameters

pred (Tensor) – Cls pred, with shape(N, C), C is num_classes of foreground.
target (Tensor) – Cls target, with shape(N,), values in [0, C-1] represent the foreground, C or negative value represent the background.
weight (Tensor) – The weight of loss for each prediction. Default is None.
avg_factor (float) – Normalized factor.
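
To make the roles of alpha and gamma concrete, here is a per-element sketch of the standard sigmoid focal loss. It omits eps, weight, avg_factor and the reduction step, and it is not a copy of HAT's code; sigmoid_focal_loss is an illustrative name.

```python
import math


def sigmoid_focal_loss(
    logit: float, is_positive: bool, alpha: float = 0.25, gamma: float = 2.0
) -> float:
    """Focal loss for one (logit, binary label) pair."""
    p = 1.0 / (1.0 + math.exp(-logit))
    if is_positive:
        # Positive samples are weighted by alpha; (1 - p) ** gamma shrinks easy ones.
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    # Negative samples are weighted by (1 - alpha); p ** gamma shrinks easy ones.
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
```

A confidently correct positive (a large positive logit) contributes far less than an uncertain one, which is the "compress the contribution of easy examples" effect described for gamma.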

class hat.models.losses.FocalLossV2(alpha: float = 0.25, gamma: float = 2.0, eps: float = 1e-12)¶

Focal Loss.

Parameters

alpha (float) – A weighting factor for pos-sample, (1-alpha) is for neg-sample.
gamma (float) – Gamma used in focal loss to compress the contribution of easy examples.
eps (float) – A small value to avoid zero denominator.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)¶

Forward method.

Parameters

pred (Tensor) – cls pred, with shape (B, N, C), C is num_classes of foreground.
target (Tensor) – cls target, with shape (B, N, C), C is num_classes of foreground.
weight (Tensor) – The weight of loss for each prediction. It is mainly used to filter the ignored box. Default is None.
avg_factor (float) – Normalized factor.

class hat.models.losses.GIoULoss(loss_name, loss_weight=1.0, eps=1e-06, reduction='mean')¶

Generalized Intersection over Union Loss.

Parameters

loss_name (str) – The key of loss in return dict.
loss_weight (float) – Global weight of loss. Defaults to 1.0.
eps (float) – A small value to avoid zero denominator.
reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

Returns

A dict containing the calculated loss, the key of loss is loss_name.

Return type

dict

forward(pred, target, weight=None, avg_factor=None)¶

Forward method.

Parameters

pred (torch.Tensor) – Predicted bboxes of format (x1, y1, x2, y2), represent upper-left and lower-right point, with shape(N, 4).
target (torch.Tensor) – Corresponding gt_boxes, the same shape as pred.
weight (torch.Tensor) – Element-wise loss weight, with shape(N,).
avg_factor (float) – Average factor that is used to average the loss.
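
The quantity behind this loss can be sketched for a single pair of boxes in the (x1, y1, x2, y2) format described above. The loss is taken as 1 - GIoU, following the original GIoU paper. Where eps enters the denominators is an assumption, and giou_loss is an illustrative name rather than HAT's function.

```python
from typing import Sequence


def giou_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-6) -> float:
    """1 - GIoU for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    # Intersection rectangle, clamped to zero when the boxes do not overlap.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = area_p + area_t - inter + eps
    iou = inter / union
    # Smallest axis-aligned box enclosing both boxes.
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    enclose = enclose_w * enclose_h + eps
    giou = iou - (enclose - union) / enclose
    return 1.0 - giou
```

Unlike plain IoU, the enclosing-box term still gives a useful signal when the boxes do not overlap: the loss grows as they move further apart.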

class hat.models.losses.SegLoss(loss: List[torch.nn.modules.module.Module])¶

Segmentation loss wrapper.

Parameters: loss (dict) – loss config.

Note

This class is not universal. Make sure you understand its limitations before using it.

forward(pred: Any, target: List[Dict]) → Dict¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.SmoothL1Loss(beta=1.0, reduction='mean')¶

Smooth L1 Loss.

Parameters

beta (float, optional) – The threshold in the piecewise function. Defaults to 1.0.
reduction (str, optional) – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)¶

Forward function.

Parameters

pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning target of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (float or torch.Tensor, optional) – Normalization factor used to average the loss. Defaults to None.
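
Example: the piecewise function controlled by beta can be sketched in plain Python as below. This uses the common Smooth L1 definition (quadratic below beta, linear above it). How HAT combines weight, reduction and avg_factor is not stated here, so that part of the sketch is an assumption, and `smooth_l1` is a hypothetical helper name.

```python
def smooth_l1(pred, target, beta=1.0, weight=None, reduction="mean", avg_factor=None):
    """Smooth L1 over flat lists of floats (hypothetical helper, not HAT's code)."""
    losses = []
    for i, (p, t) in enumerate(zip(pred, target)):
        diff = abs(p - t)
        # Quadratic below the threshold beta, linear above it.
        if diff < beta:
            loss = 0.5 * diff * diff / beta
        else:
            loss = diff - 0.5 * beta
        if weight is not None:
            loss *= weight[i]
        losses.append(loss)

    if reduction == "none":
        return losses
    total = sum(losses)
    if reduction == "sum":
        return total
    # "mean": divide by avg_factor when given (assumed), else by the element count.
    denom = avg_factor if avg_factor is not None else len(losses)
    return total / denom
```

With beta=1.0, an error of 0.5 falls in the quadratic region and an error of 3.0 in the linear region.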

class hat.models.losses.SoftTargetCrossEntropy(loss_name=None)¶

Cross-entropy loss with soft targets.

Parameters: loss_name (str) – The name of returned losses.

forward(input, target)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
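
Example: a minimal sketch of cross-entropy against soft (probability-valued) targets, written in plain Python. It assumes the usual formulation, the batch mean of the sum over classes of -target * log_softmax(input). The helper name is hypothetical and the reduction used by HAT is not confirmed here.

```python
import math


def soft_target_cross_entropy(logits, soft_targets):
    """Batch mean of sum_c(-target_c * log_softmax(logits)_c).

    Hypothetical helper over nested lists; not HAT's implementation.
    """
    batch_losses = []
    for row, tgt in zip(logits, soft_targets):
        # Numerically stable log-sum-exp.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        log_probs = [v - log_z for v in row]
        batch_losses.append(-sum(t * lp for t, lp in zip(tgt, log_probs)))
    return sum(batch_losses) / len(batch_losses)
```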

class hat.models.losses.SoftmaxFocalLoss(loss_name: str, num_classes: int, alpha: float = 0.25, gamma: float = 2.0, reduction: str = 'mean', weight: Union[float, Sequence] = 1.0)¶

Focal Loss.

Parameters

loss_name (str) – The key of loss in return dict.
num_classes (int) – Class number.
alpha (float, optional) – Alpha. Defaults to 0.25.
gamma (float, optional) – Gamma. Defaults to 2.0.
reduction (str, optional) – Specifies the reduction to apply to the output: 'mean' | 'sum'. Defaults to 'mean'.
weight (Union[float, Sequence], optional) – Weight to be applied to the loss of each input. Defaults to 1.0.

forward(logits, labels)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
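
Example: the roles of alpha and gamma can be illustrated with the standard focal loss on softmax probabilities, -alpha * (1 - p_t)^gamma * log(p_t), for a single sample. This is a sketch under that assumption. How HAT applies alpha per class, and how reduction and weight are applied, is not shown, and `softmax_focal_loss` is a hypothetical helper.

```python
import math


def softmax_focal_loss(logits, label, alpha=0.25, gamma=2.0):
    """Focal loss for one sample from raw logits (hypothetical helper)."""
    # Softmax probability of the true class, computed stably.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    p_t = exps[label] / sum(exps)
    # (1 - p_t) ** gamma down-weights samples that are already well classified.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

A confidently correct prediction contributes far less than an uncertain one, which is the point of the (1 - p_t)^gamma factor.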

class hat.models.losses.YOLOV3Loss(num_classes: int, anchors: list, strides: list, ignore_thresh: float, loss_xy: dict, loss_wh: dict, loss_conf: dict, loss_cls: dict, lambda_loss: list)¶

The loss module of YOLOv3.

Parameters

num_classes (int) – Num classes of class branch.
anchors (list) – The anchors of YOLOv3.
strides (list) – The strides of feature maps.
ignore_thresh (float) – Ignore thresh of target.
loss_xy (dict) – Losses of xy.
loss_wh (dict) – Losses of wh.
loss_conf (dict) – Losses of conf.
loss_cls (dict) – Losses of cls.
lambda_loss (list) – The list of weighted losses.

forward(input, target=None)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.BBoxUpscaler(strides)¶

Parameters: strides (Sequence[int]) – A list containing the strides of the fcos_head output.

forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])¶

Do post process for model predictions.

Parameters

pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.

class hat.models.task_modules.fcos.DynamicFcosTarget(strides, topK, loss_cls, loss_reg, cls_out_channels, background_label)¶

Generate cls and reg targets for FCOS in training stage based on dynamic losses.

Parameters

strides (Sequence[int]) – Strides of points in multiple feature levels.
topK (int) – Number of positive samples to keep for each ground truth.
cls_out_channels (int) – Out_channels of cls_score.
background_label (int) – Label ID of background, set as num_classes.
loss_cls (nn.Module) – Loss for cls to choose positive target.
loss_reg (nn.Module) – Loss for reg to choose positive target.
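
Example: the idea of choosing positives by loss can be sketched as follows. For one ground truth, each candidate point gets a cost from its cls and reg losses, and the topK candidates with the lowest cost are kept. Summing the two losses into a single cost is an assumption, and `select_topk_positives` is a hypothetical helper, not part of HAT.

```python
def select_topk_positives(cls_losses, reg_losses, top_k):
    """Return sorted indices of the top_k candidates with the lowest combined loss.

    Hypothetical helper: the cost is assumed to be cls loss + reg loss.
    """
    costs = [c + r for c, r in zip(cls_losses, reg_losses)]
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    return sorted(order[:top_k])


positives = select_topk_positives(
    cls_losses=[0.9, 0.2, 0.5, 0.1],
    reg_losses=[0.8, 0.3, 0.1, 0.9],
    top_k=2,
)
```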

class hat.models.task_modules.fcos.FCOSDecoder(num_classes, strides, transforms=None, inverse_transform_key=None, nms_use_centerness=True, nms_sqrt=True, rescale=True, test_cfg=None, input_resize_scale=None, truncate_bbox=True, filter_score_mul_centerness=False, upscale_bbox_pred=False)¶

Parameters

num_classes (int) – Number of categories excluding the background category.
strides (Sequence[int]) – A list containing the strides of the fcos_head output.
transforms (Sequence[dict]) – A list containing the transform configs.
inverse_transform_key (Sequence[str]) – A list containing the inverse transform info keys.
nms_use_centerness (bool, optional) – If True, use centerness as a factor in nms post-processing.
nms_sqrt (bool, optional) – If True, use sqrt(score_thr * score_factors).
rescale (bool, optional) – Whether to map the prediction result to the original image.
test_cfg (dict, optional) – Cfg dict, including some configurations of nms.
truncate_bbox (bool, optional) – If True, truncate predicted bboxes that extend beyond the image boundary. Default True.
filter_score_mul_centerness (bool, optional) – If True, filter out bboxes by score multiplied by centerness; otherwise filter out bboxes by score. Default False.

forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])¶

Do post process for model predictions.

Parameters

pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
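
Example: two of the post-processing options can be sketched in plain Python. The first helper combines a classification score with centerness, optionally taking the square root, which is one plausible reading of nms_use_centerness and nms_sqrt. The second clamps a box to the image, as truncate_bbox describes. Both helper names, and the choice to clamp to [0, width] and [0, height], are assumptions rather than HAT's actual behaviour.

```python
import math


def nms_score(cls_score, centerness, use_centerness=True, use_sqrt=True):
    """Score used for ranking boxes (hypothetical helper)."""
    if not use_centerness:
        return cls_score
    score = cls_score * centerness
    return math.sqrt(score) if use_sqrt else score


def truncate_bbox(bbox, img_w, img_h):
    """Clamp an (x1, y1, x2, y2) box to the image (assumed bounds)."""

    def clamp(value, upper):
        return min(max(value, 0.0), upper)

    x1, y1, x2, y2 = bbox
    return (clamp(x1, img_w), clamp(y1, img_h), clamp(x2, img_w), clamp(y2, img_h))
```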

class hat.models.task_modules.fcos.FCOSHead(num_classes, in_strides, out_strides, stride2channels, upscale_bbox_pred, feat_channels=256, stacked_convs=4, use_sigmoid=True, share_bn=False, dequant_output=True, int8_output=True, share_conv=True, change_layout=False, deepcopy_share_conv=False)¶

Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.

Parameters

num_classes (int) – Number of categories excluding the background category.
in_strides (Sequence[int]) – A list containing the strides of the feature maps from the backbone or neck.
out_strides (Sequence[int]) – A list containing the strides that this head will output.
stride2channels (dict) – A stride to channel dict.
feat_channels (int) – Number of hidden channels.
stacked_convs (int) – Number of stacking convs of the head.
use_sigmoid (bool) – Whether the classification output is obtained using sigmoid.
share_bn (bool) – Whether to share bn between multiple levels. Default: False
upscale_bbox_pred (bool) – If true, upscale bbox pred by FPN strides.
dequant_output (bool) – Whether to dequant output. Default: True
int8_output (bool) – If True, output int8, otherwise output int32. Default: True
share_conv (bool) – Whether the branches share conv layers. This can be True only when all strides have the same number of channels. Default: True

forward(feats)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x, i, stride)¶

Forward features of a single scale level.

Parameters

x (Tensor) – FPN feature maps of the specified stride.
i (int) – Index of feature level.
stride (int) – The corresponding stride for feature maps, only used to upscale bbox pred when self.upscale_bbox_pred is True.
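
Example: a minimal sketch of what upscaling bbox predictions by stride means, assuming each location predicts FCOS-style (left, top, right, bottom) distances in units of the feature-map stride. The helper name and the list-of-tuples layout are illustrative assumptions. The real head operates on tensors.

```python
def upscale_bbox_pred(bbox_pred, stride, upscale=True):
    """Scale per-location (l, t, r, b) distances by the level's stride.

    Hypothetical helper; returns the input unchanged when upscale is False.
    """
    if not upscale:
        return list(bbox_pred)
    return [tuple(d * stride for d in dists) for dists in bbox_pred]


# A distance of 1.0 on a stride-8 level corresponds to 8 input pixels.
pixels = upscale_bbox_pred([(1.0, 2.0, 0.5, 1.5)], stride=8)
```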

class hat.models.task_modules.fcos.FCOSTarget(strides, regress_ranges, cls_out_channels, background_label, norm_on_bbox=True, center_sampling=True, center_sample_radius=1.5, use_iou_replace_ctrness=False, task_batch_list=None)¶

Generate cls and reg targets for FCOS in training stage.

Parameters

strides (Sequence[int]) – Strides of points in multiple feature levels.
regress_ranges (tuple[tuple[int, int]]) – Regress range of multiple level points.
cls_out_channels (int) – Out_channels of cls_score.
background_label (int) – Label ID of background, set as num_classes.
center_sampling (bool) – If true, use center sampling.
center_sample_radius – Radius of center sampling. Default: 1.5.
norm_on_bbox (bool) – If true, normalize the regression targets with FPN strides.
use_iou_replace_ctrness (bool) – If true, use iou as box quality assessment method, else use ctrness. Default: False.
task_batch_list ([int, int]) – Two datasets use the same head, so a mask is generated from this list.
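
Example: the sketch below illustrates, for a single point and a single ground-truth box, the quantities these parameters refer to, following the FCOS paper rather than HAT's code: the (l, t, r, b) regression target, the regress_ranges test that assigns a point to a feature level, and the centerness target. All helper names are hypothetical, and center sampling and norm_on_bbox are left out.

```python
import math


def ltrb_targets(point, gt_box):
    """Distances from a point to the four sides of an (x1, y1, x2, y2) box."""
    x, y = point
    x1, y1, x2, y2 = gt_box
    return (x - x1, y - y1, x2 - x, y2 - y)


def in_regress_range(ltrb, regress_range):
    """A level handles a point if its largest distance falls in the level's range."""
    lower, upper = regress_range
    return lower <= max(ltrb) <= upper


def centerness(ltrb):
    """FCOS centerness target; assumes the point lies strictly inside the box."""
    l, t, r, b = ltrb
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


center = ltrb_targets((50, 50), (0, 0, 100, 100))
off_center = ltrb_targets((25, 50), (0, 0, 100, 100))
```

The box centre has centerness 1.0, and the value decays towards the box border.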

class hat.models.task_modules.seg.FRCNNSegHead(group_base: int, in_strides: List, in_channels: List, out_strides: List, out_channels: List, bn_kwargs: Dict, with_extra_conv: bool = False, use_bias: bool = True, linear_out: bool = True, argmax_output: bool = False, dequant_output: bool = True, int8_output: bool = False)¶

FRCNNSegHead module for segmentation task.

Parameters

group_base (int) – Group base of group conv
in_strides (list[int]) – The strides corresponding to the inputs of seg_head, the inputs usually come from backbone or neck.
in_channels (list[int]) – Number of channels of each input stride.
out_strides (list[int]) – List of output strides.
out_channels (list[int]) – Number of channels of each output stride.
bn_kwargs (dict) – Extra keyword arguments for bn layers.
with_extra_conv (bool) – Whether to use extra conv module.
use_bias (bool) – Whether to use bias in conv module.
linear_out (bool) – Whether NOT to apply an activation (act) to the pw conv.
argmax_output (bool) – Whether to conduct argmax on output.
dequant_output (bool) – Whether to dequant output.
int8_output (bool) – If True, output int8, otherwise output int32.

forward(x: List[torch.Tensor]) → List[torch.Tensor]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.seg.SegDecoder(out_strides: List[int], decode_strides: List[int], transforms: Optional[List[dict]] = None, inverse_transform_key: Optional[List[str]] = None, output_names: Optional[str] = 'pred_seg')¶

Semantic Segmentation Decoder.

Parameters

out_strides (list[int]) – List of output strides, represents the strides of the output from seg_head.
output_names (str or list[str]) – Keys of returned results dict.
decode_strides (int or list[int]) – Strides that need to be decoded, should be a subset of out_strides.
transforms (Sequence[dict]) – A list containing the transform configs.
inverse_transform_key (Sequence[str]) – A list containing the inverse transform info keys.
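
Example: the relationship between out_strides and decode_strides can be sketched as below. The helper picks the head outputs whose strides appear in decode_strides and turns each per-class score map into a label map with a per-pixel argmax. Treating decoding as an argmax over classes is an assumption, and `decode_seg` and the nested-list layout (classes x height x width) are hypothetical.

```python
def decode_seg(preds, out_strides, decode_strides):
    """Decode selected strides into label maps (hypothetical helper).

    preds[i] is a C x H x W nested list of scores for out_strides[i].
    """
    missing = set(decode_strides) - set(out_strides)
    if missing:
        raise ValueError(
            f"decode_strides must be a subset of out_strides, got {sorted(missing)}"
        )
    results = []
    for stride in decode_strides:
        scores = preds[out_strides.index(stride)]
        num_classes, h, w = len(scores), len(scores[0]), len(scores[0][0])
        # Per-pixel argmax over the class dimension.
        label_map = [
            [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(w)]
            for y in range(h)
        ]
        results.append(label_map)
    return results


stride4 = [[[0.9, 0.2]], [[0.1, 0.8]]]  # 2 classes, 1 x 2 pixels
stride8 = [[[0.3]], [[0.7]]]            # 2 classes, 1 x 1 pixel
decoded = decode_seg([stride4, stride8], out_strides=[4, 8], decode_strides=[4])
```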

class hat.models.task_modules.seg.SegHead(num_classes, in_strides, out_strides, stride2channels, feat_channels=256, stride_loss_weights=None, stacked_convs=1, argmax_output=False, dequant_output=True, int8_output=True, upscale_stride4_to_stride2=False, output_with_bn=False, bn_kwargs=None, upsample_output_scale=None)¶

Head Module for segmentation task.

Parameters

num_classes (int) – Number of classes.
in_strides (list[int]) – The strides corresponding to the inputs of seg_head, the inputs usually come from backbone or neck.
out_strides (list[int]) – List of output strides.
stride2channels (dict) – A stride to channel dict.
feat_channels (int or list[int]) – Number of hidden channels (of each output stride).
stride_loss_weights (list[int]) – Loss weight of each stride.
stacked_convs (int) – Number of stacking convs of head.
argmax_output (bool) – Whether to conduct argmax on output. Default: False
dequant_output (bool) – Whether to dequant output. Default: True
int8_output (bool) – If True, output int8, otherwise output int32. Default: True
upscale_stride4_to_stride2 (bool) – If True, the stride-4 feature map is upsampled to stride 2, and a supervisory signal is added at stride 2. Default: False
output_with_bn (bool) – Whether to add a bn layer to the output conv.
bn_kwargs (dict) – Extra keyword arguments for bn layers.
upsample_output_scale (int) – Output upsample scale, only used in qat model, default is None.

forward(feats)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x, stride_index=0)¶

Forward features of a single scale level.

Parameters

x (Tensor) – Feature maps of the specified stride.
stride_index (int) – Stride index of the input feature map.

Returns

Seg predictions of the input feature maps.

Return type

tuple

class hat.models.task_modules.seg.SegTarget(ignore_index=255, label_name='gt_seg')¶

Generate training targets for Seg task.

Parameters

ignore_index (int, optional) – Index of ignore class.
label_name (str, optional) – The key corresponding to the gt seg in label.
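
Example: a small sketch of how an ignore index is typically used with a ground-truth label map. Pixels labelled ignore_index are masked out so they do not contribute to training. This illustrates the convention only. What SegTarget actually returns is not described here, and the helper names are hypothetical.

```python
def valid_pixel_mask(gt_seg, ignore_index=255):
    """True where a pixel carries a real class label (hypothetical helper)."""
    return [[label != ignore_index for label in row] for row in gt_seg]


def count_valid_pixels(gt_seg, ignore_index=255):
    """Number of pixels that would contribute to the loss."""
    return sum(sum(row) for row in valid_pixel_mask(gt_seg, ignore_index))


gt_seg = [[0, 255], [1, 1]]
mask = valid_pixel_mask(gt_seg)
```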