hat.data

Main data module for training in HAT, which contains datasets, transforms, and samplers.

Datasets

ImageNet

ImageNet provides the method of reading ImageNet data from the target pack type.

ImageNetPacker

ImageNetPacker is used for converting the ImageNet dataset in torchvision to the target DataType format.

ImageNetFromImage

ImageNet from image by torchvision.

Coco

Coco provides the method of reading COCO data from the target pack type.

CocoDetectionPacker

CocoDetectionPacker is used for packing the COCO dataset to the target format.

CocoFromImage

Coco from image by torchvision.

Cityscapes

Cityscapes provides the method of reading Cityscapes data from the target pack type.

CityscapesPacker

CityscapesPacker is used for converting the Cityscapes dataset in torchvision to the target DataType format.

RandDataset

RepeatDataset

A wrapper that repeats a dataset.

ComposeDataset

Dataset wrapper for multiple datasets with precise batch size.

ResampleDataset

A wrapper that resamples a dataset.

ConcatDataset

Dataset as a concatenation of multiple datasets.

BatchTransformDataset

Transforms

common

DeleteKeys

Delete keys in input dict.

ListToDict

Convert list args to dict.

PILToTensor

Convert PIL Image to Tensor.

RenameKeys

Rename keys in input dict.

Undistortion

Convert a PIL Image or numpy.ndarray to an undistorted PIL Image or numpy.ndarray.

classification

BgrToYuv444

BgrToYuv444 is used for color format conversion.

ConvertLayout

ConvertLayout is used for layout conversion.

LabelSmooth

LabelSmooth is used for label smoothing.

OneHot

OneHot is used for converting labels to one-hot format.

segmentation

LabelRemap

Remap labels.

SegOneHot

OneHot is used for converting labels to one-hot format.

SegRandomAffine

Apply random affine transforms to both image and label.

Scale

Scale input according to a scale list.

SegRandomCrop

Random crop on data with gt_seg label; can only be used for segmentation tasks.

SegResize

SegReWeightByArea

Calculate the weight of each category according to the area of each category.

detection

Batchify

ColorJitter

Randomly change the brightness, contrast, saturation and hue of an image.

FixedCrop

Crop image with fixed position and size.

MinIoURandomCrop

Randomly crop the image & bboxes. The cropped patches must meet a minimum IoU requirement with the original image & bboxes; the IoU threshold is randomly selected from min_ious.

Normalize

Pad

RandomCrop

RandomExpand

Randomly expand the image & bboxes.

RandomFlip

Flip image & bbox & mask & seg.

Resize

Resize image & bbox & mask & seg.

ToTensor

Convert objects of various Python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.

Mosaic

Collates

collate_2d

Merge a list of samples to form a mini-batch of Tensor(s).

collate_3d

Merge a list of samples to form a mini-batch of Tensor(s).

Dataloaders

PassThroughDataLoader

Directly pass through input example.

API Reference

class hat.data.datasets.BatchTransformDataset(dataset, transforms_cfgs, epoch_steps)

class hat.data.datasets.Cityscapes(data_path: str, transforms: Optional[list] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

Cityscapes provides the method of reading Cityscapes data from the target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • pack_type (str) – The pack type.

  • transforms (list) – Transforms of Cityscapes data before use.

  • pack_kwargs (dict) – Kwargs for pack type.
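
A minimal usage sketch based on the signature above; the data path and pack type values are placeholders, not values from this page:

    from hat.data.datasets import Cityscapes

    # Read packed Cityscapes data; "./data/cityscapes_lmdb" and "lmdb"
    # are placeholders -- use whatever your packing step produced.
    dataset = Cityscapes(
        data_path="./data/cityscapes_lmdb",
        transforms=None,  # optionally a list of transform objects
        pack_type="lmdb",
    )
    sample = dataset[0]  # index access assumed, as for a torch Dataset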

class hat.data.datasets.CityscapesPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)

CityscapesPacker is used for converting the Cityscapes dataset in torchvision to the target DataType format.

Parameters
  • src_data_dir (str) – The dir of original cityscapes data.

  • target_data_dir (str) – Path for packed file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – Num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.
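
A construction sketch with placeholder paths; only the documented pack_data method is shown, since how a full packing run is launched is not specified on this page:

    from hat.data.datasets import CityscapesPacker

    packer = CityscapesPacker(
        src_data_dir="./data/cityscapes",          # original data root (placeholder)
        target_data_dir="./data/cityscapes_lmdb",  # output path (placeholder)
        split_name="train",
        num_workers=8,
        pack_type="lmdb",
        num_samples=None,  # None packs every sample
    )
    data = packer.pack_data(0)  # read and process one sample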

pack_data(idx)

Read original data from the folder with some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.Coco(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

Coco provides the method of reading COCO data from the target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • transforms (list) – Transforms of data before use.

  • pack_type (str) – The pack type.

  • pack_kwargs (dict) – Kwargs for pack type.

class hat.data.datasets.CocoDetection(root, annFile, num_classes=80, transform=None, target_transform=None, transforms=None)

Coco Detection Dataset.

Parameters
  • root (string) – Root directory where images are downloaded to.

  • annFile (string) – Path to json annotation file.

  • num_classes (int) – The number of COCO classes: 80 or 91.

  • transform (callable, optional) – A function transform that takes in a PIL image and returns a transformed version, e.g. transforms.ToTensor.

  • target_transform (callable, optional) – A function transform that takes in the target and transforms it.

  • transforms (callable, optional) – A function transform that takes input sample and its target as entry and returns a transformed version.

class hat.data.datasets.CocoDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_classes: int = 80, num_samples: Optional[int] = None, **kwargs)

CocoDetectionPacker is used for packing the COCO dataset to the target format.

Parameters
  • src_data_dir (str) – The dir of original coco data.

  • target_data_dir (str) – Path for packed file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – The num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_classes (int) – The number of classes produced.

  • num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.

pack_data(idx)

Read original data from the folder with some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.CocoFromImage(*args, **kwargs)

Coco from image by torchvision.

The params of CocoFromImage are the same as those of torchvision.datasets.CocoDetection.

class hat.data.datasets.ComposeDataset(datasets: List[Dict], batchsize_list: List[int])

Dataset wrapper for multiple datasets with precise batch size.

Parameters
  • datasets – Config for each dataset.

  • batchsize_list – Batch size for each task dataset.
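
A hedged composition sketch; the dict(type=...) config convention and its keys are assumptions about HAT's config style, not something documented on this page:

    from hat.data.datasets import ComposeDataset

    dataset = ComposeDataset(
        datasets=[
            dict(type="Coco", data_path="./data/coco_lmdb", pack_type="lmdb"),
            dict(type="Cityscapes", data_path="./data/cityscapes_lmdb", pack_type="lmdb"),
        ],
        batchsize_list=[16, 8],  # per-task batch sizes, one entry per dataset
    )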

class hat.data.datasets.ConcatDataset(datasets: Iterable[torch.utils.data.dataset.Dataset])

Dataset as a concatenation of multiple datasets.

This class is useful to assemble different existing datasets.

Parameters

datasets (sequence) – List of datasets to be concatenated.
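
A small self-contained sketch using torch TensorDatasets as stand-ins; the summed length mirrors torch.utils.data.ConcatDataset and is assumed here:

    import torch
    from torch.utils.data import TensorDataset
    from hat.data.datasets import ConcatDataset

    train_a = TensorDataset(torch.zeros(10, 3))
    train_b = TensorDataset(torch.ones(4, 3))
    combined = ConcatDataset([train_a, train_b])
    assert len(combined) == 14  # lengths add up under concatenation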

class hat.data.datasets.ImageNet(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

ImageNet provides the method of reading ImageNet data from the target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • transforms (list) – Transforms of data before use.

  • pack_type (str) – The pack type.

  • pack_kwargs (dict) – Kwargs for pack type.

class hat.data.datasets.ImageNetFromImage(transforms=None, *args, **kwargs)

ImageNet from image by torchvision.

The params of ImageNetFromImage are the same as those of torchvision.datasets.ImageNet.

class hat.data.datasets.ImageNetPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)

ImageNetPacker is used for converting the ImageNet dataset in torchvision to the target DataType format.

Parameters
  • src_data_dir (str) – The dir of original imagenet data.

  • target_data_dir (str) – Path for LMDB file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – Num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.

pack_data(idx)

Read original data from the folder with some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.RandDataset(length: int, example: Any, clone: bool = True)

class hat.data.datasets.RepeatDataset(dataset, times)

A wrapper that repeats a dataset.

Using RepeatDataset can reduce the data loading time between epochs.

Parameters
  • dataset (torch.utils.data.Dataset) – The dataset to repeat.

  • times (int) – Repeat times.
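
A sketch, assuming the usual wrapper semantics where one pass over the wrapper covers the base data 'times' times:

    import torch
    from torch.utils.data import TensorDataset
    from hat.data.datasets import RepeatDataset

    base = TensorDataset(torch.arange(8).float())
    repeated = RepeatDataset(base, times=10)
    # One pass over `repeated` should cover `base` 10 times,
    # avoiding per-epoch dataloader restarts.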

class hat.data.datasets.ResampleDataset(dataset, resample_interval: int = 1)

A wrapper that resamples a dataset.

Using ResampleDataset, you can resample the original dataset with a specific interval.

Parameters
  • dataset (dict) – The dataset to resample.

  • resample_interval (int) – resample interval.
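
A sketch; the parameter table types dataset as a dict (config), but a dataset instance is assumed here:

    import torch
    from torch.utils.data import TensorDataset
    from hat.data.datasets import ResampleDataset

    base = TensorDataset(torch.arange(8).float())
    # With resample_interval=2, the wrapper should expose roughly
    # every 2nd sample of `base`.
    resampled = ResampleDataset(base, resample_interval=2)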

class hat.data.transforms.common.AddKeys(kv: Dict[str, Any])

Add new key-value in input dict.

Frequently used when you want to add dummy keys to data dict but don’t want to change code.

Parameters

kv – key-value data dict.

class hat.data.transforms.common.CopyKeys(keys: List[str], split='|')

Copy keys in input dict.

Frequently used when you want to cache keys to data dict but don’t want to change code.

Parameters

keys – key list to copy, in “old_name | new_name” format.

class hat.data.transforms.common.DeleteKeys(keys: List[str])

Delete keys in input dict.

Parameters

keys – key list to delete.

class hat.data.transforms.common.ListToDict(keys: List[str])

Convert list args to dict.

Parameters

keys – keys for each object in args.

class hat.data.transforms.common.PILToTensor

Convert PIL Image to Tensor.

class hat.data.transforms.common.RenameKeys(keys: List[str], split='|')

Rename keys in input dict.

Parameters

keys – key list to rename, in “old_name | new_name” format.
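
A quick sketch of the “old_name | new_name” format; calling the transform directly on a data dict is an assumption:

    from hat.data.transforms.common import RenameKeys

    rename = RenameKeys(keys=["img | image", "gt_seg | labels"])
    data = {"img": 0, "gt_seg": 1}
    data = rename(data)  # keys become "image" and "labels"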

class hat.data.transforms.common.TensorToNumpy

Convert tensor to numpy.

class hat.data.transforms.common.Undistortion

Convert a PIL Image or numpy.ndarray to an undistorted PIL Image or numpy.ndarray.

class hat.data.transforms.classification.BgrToYuv444(rgb_input=False)

BgrToYuv444 is used for color format conversion.

Parameters

rgb_input (bool) – Whether the input is RGB.

class hat.data.transforms.classification.ConvertLayout(hwc2chw=True, keys=None)

ConvertLayout is used for layout conversion.

Parameters
  • hwc2chw (bool) – Whether to convert HWC to CHW.

  • keys (list) –

class hat.data.transforms.classification.LabelSmooth(num_classes, eta=0.1)

LabelSmooth is used for label smoothing.

Parameters
  • num_classes (int) – Number of classes.

  • eta (float) – Eta of label smooth.

class hat.data.transforms.classification.OneHot(num_classes)

OneHot is used for converting labels to one-hot format.

Parameters

num_classes (int) – Number of classes.
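
A sketch of a typical classification transform pipeline built from the classes above; the ordering and the plain-list composition are assumptions:

    from hat.data.transforms.classification import (
        BgrToYuv444,
        ConvertLayout,
        LabelSmooth,
        OneHot,
    )

    transforms = [
        ConvertLayout(hwc2chw=True),             # HWC -> CHW
        BgrToYuv444(rgb_input=False),            # BGR -> YUV444
        OneHot(num_classes=1000),                # labels -> one-hot
        LabelSmooth(num_classes=1000, eta=0.1),  # smooth the one-hot labels
    ]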

class hat.data.transforms.classification.TimmMixup(*args, **kwargs)

Mixup of timm.

Parameters

args – The same as timm.data.Mixup.

class hat.data.transforms.classification.TimmTransforms(*args, **kwargs)

Transforms of timm.

Parameters

args – The same as timm.data.create_transform.

class hat.data.transforms.segmentation.LabelRemap(mapping: Sequence)

Remap labels.

Parameters

mapping (Sequence) – Mapping from input to output.

class hat.data.transforms.segmentation.Scale(scales: Union[numbers.Real, Sequence], mode: str = 'nearest')

Scale input according to a scale list.

Parameters
  • scales (Union[Real, Sequence]) – The scales to apply on input.

  • mode (str) – algorithm used for upsampling: 'nearest' | 'bilinear' | 'area'. Default: 'nearest'
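
For example, producing outputs at 1/4 and 1/8 of the input resolution with bilinear interpolation (a sketch; how multiple scales are packaged in the output is not specified here):

    from hat.data.transforms.segmentation import Scale

    scale = Scale(scales=[0.25, 0.125], mode="bilinear")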

class hat.data.transforms.segmentation.SegOneHot(num_classes: int)

OneHot is used for converting labels to one-hot format.

Parameters

num_classes (int) – Number of classes.

class hat.data.transforms.segmentation.SegRandomAffine(degrees: Union[Sequence, float] = 0, translate: Optional[Tuple] = None, scale: Optional[Tuple] = None, shear: Optional[Union[Sequence, float]] = None, interpolation: torchvision.transforms.functional.InterpolationMode = <InterpolationMode.NEAREST: 'nearest'>, fill: Union[tuple, int] = 0, label_fill_value: Union[tuple, int] = -1)

Apply random affine transforms to both image and label.

Please refer to RandomAffine for details.

Parameters

label_fill_value (tuple or int, optional) – Fill value for label. Defaults to -1.

class hat.data.transforms.segmentation.SegRandomCrop(size, cat_max_ratio=1.0, ignore_index=255)

Random crop on data with gt_seg label; can only be used for segmentation tasks.

Parameters
  • size (tuple) – Expected size after cropping, (h, w).

  • cat_max_ratio (float, optional) – The maximum ratio that single category could occupy.

  • ignore_index (int, optional) – When considering the cat_max_ratio condition, the area corresponding to ignore_index will be ignored.

get_crop_bbox(data)

Randomly get a crop bounding box.

class hat.data.transforms.segmentation.SegReWeightByArea(seg_num_classes, lower_bound: float = 0.5, ignore_index: int = 255)

Calculate the weight of each category according to the area of each category.

For each category, the calculation formula of weight is as follows: weight = max(1.0 - seg_area / total_area, lower_bound)

Parameters
  • seg_num_classes (int) – Number of segmentation categories.

  • lower_bound (float) – Lower bound of weight.

  • ignore_index (int) – Index of ignore class.
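
A worked instance of the formula, independent of the class itself:

    # weight = max(1.0 - seg_area / total_area, lower_bound)
    lower_bound = 0.5
    total_area = 100.0
    for seg_area in (10.0, 80.0):
        weight = max(1.0 - seg_area / total_area, lower_bound)
        print(weight)  # 0.9 for a rare class, 0.5 (clamped) for a dominant one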

class hat.data.transforms.segmentation.SegResize(size, interpolation=<InterpolationMode.BILINEAR: 'bilinear'>)

forward(data)

Parameters

data (PIL Image or Tensor) – Image to be scaled.

Returns

Rescaled image.

Return type

PIL Image or Tensor

class hat.data.transforms.detection.AugmentHSV(hgain: float = 0.5, sgain: float = 0.5, vgain: float = 0.5)

Random add color disturbance.

Convert img to HSV, and then randomly change the hue, saturation and value.

Parameters
  • hgain (float) – Gain of hue.

  • sgain (float) – Gain of saturation.

  • vgain (float) – Gain of value.

class hat.data.transforms.detection.ColorJitter(brightness=0.5, contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=0.1)

Randomly change the brightness, contrast, saturation and hue of an image.

The main differences from ColorJitter in torchvision are the support for detection tasks and dict input; in addition, the default settings have been changed to the most common ones.

Parameters
  • brightness (float or tuple of float (min, max)) – How much to jitter brightness.

  • contrast (float or tuple of float (min, max)) – How much to jitter contrast.

  • saturation (float or tuple of float (min, max)) – How much to jitter saturation.

  • hue (float or tuple of float (min, max)) – How much to jitter hue.

class hat.data.transforms.detection.FixedCrop(size=None, min_area=-1, min_iou=-1, dynamic_roi_params=None)

Crop image with fixed position and size.

get_dynamic_roi_from_camera(image_key, camera_info, dynamic_roi_params, img_hw)

Get dynamic roi from camera info.

Parameters
  • image_key (str) – Image key.

  • camera_info (Dict) – Camera info.

  • dynamic_roi_params (Dict) – Must contain keys {‘w’, ‘h’, ‘fp_x’, ‘fp_y’}.

  • img_hw (List|Tuple) – Height and width of the image, as two ints.

Returns

Dynamic ROI coordinates [x1, y1, x2, y2] of the image.

Return type

dynamic_roi (List|Tuple)

inverse_transform(inputs, task_type, inverse_info)

Inverse of the transform, used to map the prediction back to the original image.

Parameters
  • inputs (array) – Prediction.

  • task_type (str) – detection or segmentation.

  • inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.

class hat.data.transforms.detection.MinIoURandomCrop(min_ious=(0.1, 0.3, 0.5, 0.7, 0.9), min_crop_size=0.3, bbox_clip_border=True, repeat_num=50)

Randomly crop the image & bboxes. The cropped patches must meet a minimum IoU requirement with the original image & bboxes; the IoU threshold is randomly selected from min_ious.

Parameters
  • min_ious (tuple) – Minimum IoU threshold for all intersections with bounding boxes.

  • min_crop_size (float) – Minimum crop size (i.e. h,w := a*h, a*w, where a >= min_crop_size).

  • bbox_clip_border (bool) – Whether clip the objects outside the border of the image. Defaults to True.

  • repeat_num (int) – Max number of repeats for finding an available bbox.

class hat.data.transforms.detection.RandomExpand(mean=(0, 0, 0), ratio_range=(1, 4), prob=0.5)

Randomly expand the image & bboxes.

Randomly place the original image on a canvas of ‘ratio’ x original image size filled with mean values. The ratio is in the range of ratio_range.

Parameters
  • mean (tuple) – Mean values used to fill the expanded canvas.

  • ratio_range (tuple) – Range of expand ratio.

  • prob (float) – Probability of applying this transformation.

class hat.data.transforms.detection.RandomFlip(px: Optional[float] = 0.5, py: Optional[float] = 0)

Flip image & bbox & mask & seg.

Parameters
  • px – Horizontal flip probability, range between [0, 1].

  • py – Vertical flip probability, range between [0, 1].

class hat.data.transforms.detection.Resize(img_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, max_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, multiscale_mode='range', ratio_range=None, keep_ratio=True)

Resize image & bbox & mask & seg.

Parameters
  • img_scale – Image scale(s) for resizing; see multiscale_mode below for how a scale is chosen.

  • max_scale – The max size of the image. If the image’s shape > max_scale, the image is resized to max_scale.

  • multiscale_mode (str) – Value must be one of “range” or “value”. This transform resizes the input image and bbox to the same scale factor. There are three multiscale modes (see the constructor sketches after this list):

    - ‘ratio_range’ is not None: randomly sample a ratio from the ratio range and multiply it with the image scale, e.g. Resize(img_scale=(400, 500), multiscale_mode=’range’, ratio_range=(0.5, 2.0)).

    - ‘ratio_range’ is None and ‘multiscale_mode’ == “range”: randomly sample a scale from a range; the length of img_scale[tuple] must be 2, representing the small and the large img_scale, e.g. Resize(img_scale=((100, 200), (400, 500)), multiscale_mode=’range’).

    - ‘ratio_range’ is None and ‘multiscale_mode’ == “value”: randomly sample a scale from multiple scales, e.g. Resize(img_scale=((100, 200), (300, 400), (400, 500)), multiscale_mode=’value’).

  • ratio_range (tuple[float]) – Value represent (min_ratio, max_ratio), scale factor range.

  • keep_ratio (bool) – Whether to keep the aspect ratio when resizing the image.
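
The three modes written out as constructor calls, taken directly from the docstring examples above:

    from hat.data.transforms.detection import Resize

    # ratio_range is not None: sample a ratio, multiply the image scale.
    resize_a = Resize(img_scale=(400, 500), multiscale_mode="range",
                      ratio_range=(0.5, 2.0))

    # ratio_range is None, multiscale_mode == "range": sample between
    # a small and a large scale.
    resize_b = Resize(img_scale=((100, 200), (400, 500)),
                      multiscale_mode="range")

    # ratio_range is None, multiscale_mode == "value": pick one of the
    # listed scales.
    resize_c = Resize(img_scale=((100, 200), (300, 400), (400, 500)),
                      multiscale_mode="value")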

inverse_transform(inputs, task_type, inverse_info)

Inverse of the transform, used to map the prediction back to the original image.

Parameters
  • inputs (array|Tensor) – Prediction.

  • task_type (str) – detection or segmentation.

  • inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.

class hat.data.transforms.detection.ToTensor(to_yuv=False)

Convert objects of various Python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.

Supported types are: numpy.ndarray, torch.Tensor, Sequence, int, float.

Parameters

to_yuv (bool) – If true, convert the img to yuv444 format.
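
A sketch of the tail of a detection pipeline; the plain-list composition is an assumption:

    from hat.data.transforms.detection import RandomFlip, ToTensor

    transforms = [
        RandomFlip(px=0.5, py=0.0),  # horizontal flips only
        ToTensor(to_yuv=True),       # to torch.Tensor, img in yuv444
    ]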

class hat.data.samplers.DistSamplerHook(dataset, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0, drop_last: bool = False)

The hook api for torch.utils.data.DistributedSampler. Used to get the local rank and num_replicas before creating the DistributedSampler.

Parameters
  • dataset – Compose dataset.

  • num_replicas – Same as DistributedSampler.

  • rank – Same as DistributedSampler.

  • shuffle – Whether to shuffle the data.

  • seed – Random seed.

hat.data.collates.collate_2d(batch: List[Any]) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

Used in 2d tasks for collating data with inconsistent shapes.

Parameters

batch (list) – list of data.

hat.data.collates.collate_3d(batch_data: List[Any])

Merge a list of samples to form a mini-batch of Tensor(s).

Used in bev tasks.

  • If the output tensor from the dataset has shape (n, c, h, w), concatenate on axis 0 directly.

  • If the output tensor from the dataset has shape (c, h, w), expand dims on axis 0 and then concatenate.

Parameters

batch_data (list) – list of data.
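
Both collates plug into a standard torch DataLoader as collate_fn. A self-contained sketch; the dict-sample format of the toy dataset is an assumption about what hat.data datasets yield:

    import torch
    from torch.utils.data import DataLoader, Dataset
    from hat.data.collates import collate_2d

    class ToyDataset(Dataset):
        """Stand-in dataset yielding dict samples of varying shape."""
        def __len__(self):
            return 4
        def __getitem__(self, idx):
            return {"img": torch.zeros(3, 8 + idx, 8)}

    loader = DataLoader(ToyDataset(), batch_size=2, collate_fn=collate_2d)
    batch = next(iter(loader))  # samples merged despite inconsistent shapes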

class hat.data.dataloaders.PassThroughDataLoader(data: Any, *, length: int, clone: bool = False)

Directly pass through input example.

Parameters
  • data (Any) – Input data.

  • length (int) – Length of the dataloader.

  • clone (bool, optional) – Whether to clone the input data.
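
A sketch, assuming iteration yields the stored example on every step; handy for smoke-testing a training loop:

    import torch
    from hat.data.dataloaders import PassThroughDataLoader

    example = {"img": torch.zeros(1, 3, 224, 224)}
    loader = PassThroughDataLoader(example, length=100, clone=True)
    for batch in loader:  # 100 iterations, each yielding `example`
        pass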