hat.data

Main data module for training in HAT, which contains datasets, transforms, and samplers.

Datasets

ImageNet

ImageNet provides the method of reading ImageNet data from the target pack type.

ImageNetPacker

ImageNetPacker is used for converting the ImageNet dataset in torchvision to the target DataType format.

ImageNetFromImage

ImageNet from image by torchvision.

Coco

Coco provides the method of reading COCO data from the target pack type.

CocoDetectionPacker

CocoDetectionPacker is used for packing the COCO dataset to the target format.

CocoFromImage

Coco from image by torchvision.

Cityscapes

Cityscapes provides the method of reading Cityscapes data from the target pack type.

CityscapesPacker

CityscapesPacker is used for converting the Cityscapes dataset in torchvision to the target DataType format.

RandDataset

RepeatDataset

A wrapper that repeats a dataset.

ComposeDataset

Dataset wrapper for multiple datasets with precise batch size.

ResampleDataset

A wrapper that resamples a dataset.

ConcatDataset

Dataset as a concatenation of multiple datasets.

BatchTransformDataset

Transforms

common

DeleteKeys

Delete keys in input dict.

ListToDict

Convert list args to dict.

PILToTensor

Convert PIL Image to Tensor.

RenameKeys

Rename keys in input dict.

Undistortion

Convert a PIL Image or numpy.ndarray to an undistorted PIL Image or numpy.ndarray.

classification

BgrToYuv444

BgrToYuv444 is used for color format conversion.

ConvertLayout

ConvertLayout is used for layout conversion.

LabelSmooth

LabelSmooth is used for label smoothing.

OneHot

OneHot is used for converting labels to one-hot format.

segmentation

LabelRemap

Remap labels.

SegOneHot

OneHot is used for converting labels to one-hot format.

SegRandomAffine

Apply random affine transforms to both image and label.

Scale

Scale input according to a scale list.

SegRandomCrop

Random crop on data with gt_seg label; can only be used for segmentation tasks.

SegResize

SegReWeightByArea

Calculate the weight of each category according to the area of each category.

detection

Batchify

ColorJitter

Randomly change the brightness, contrast, saturation and hue of an image.

FixedCrop

Crop image with fixed position and size.

MinIoURandomCrop

Randomly crop the image & bboxes. The cropped patches must meet a minimum IoU requirement with the original image & bboxes; the IoU threshold is randomly selected from min_ious.

Normalize

Pad

RandomCrop

RandomExpand

Randomly expand the image & bboxes.

RandomFlip

Flip image & bbox & mask & seg.

Resize

Resize image & bbox & mask & seg.

ToTensor

Convert objects of various Python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.

Mosaic

Collates

collate_2d

Merge a list of samples to form a mini-batch of Tensor(s).

collate_3d

Merge a list of samples to form a mini-batch of Tensor(s).

Dataloaders

PassThroughDataLoader

Directly pass through input example.

API Reference

class hat.data.datasets.BatchTransformDataset(dataset, transforms_cfgs, epoch_steps)

class hat.data.datasets.Cityscapes(data_path: str, transforms: Optional[list] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

Cityscapes provides the method of reading Cityscapes data from the target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • pack_type (str) – The pack type.

  • transforms (list) – Transforms of Cityscapes data before use.

  • pack_kwargs (dict) – Kwargs for pack type.
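
A minimal usage sketch based on the signature above; the data path and pack type values are placeholders, not values from this page:

    from hat.data.datasets import Cityscapes

    # Read packed Cityscapes data; "./data/cityscapes_lmdb" and "lmdb"
    # are placeholders -- use whatever your packing step produced.
    dataset = Cityscapes(
        data_path="./data/cityscapes_lmdb",
        transforms=None,  # optionally a list of transform objects
        pack_type="lmdb",
    )
    sample = dataset[0]  # index access assumed, as for a torch Dataset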

class hat.data.datasets.CityscapesPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)

CityscapesPacker is used for converting the Cityscapes dataset in torchvision to the target DataType format.

Parameters
  • src_data_dir (str) – The dir of original cityscapes data.

  • target_data_dir (str) – Path for packed file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – Num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.
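
A construction sketch with placeholder paths; only the documented pack_data method is shown, since how a full packing run is launched is not specified on this page:

    from hat.data.datasets import CityscapesPacker

    packer = CityscapesPacker(
        src_data_dir="./data/cityscapes",          # original data root (placeholder)
        target_data_dir="./data/cityscapes_lmdb",  # output path (placeholder)
        split_name="train",
        num_workers=8,
        pack_type="lmdb",
        num_samples=None,  # None packs every sample
    )
    data = packer.pack_data(0)  # read and process one sample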

pack_data(idx)

Read original data from the folder with some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.Coco(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

Coco provides the method of reading COCO data from the target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • transforms (list) – Transforms of data before use.

  • pack_type (str) – The pack type.

  • pack_kwargs (dict) – Kwargs for pack type.

class hat.data.datasets.CocoDetection(root, annFile, num_classes=80, transform=None, target_transform=None, transforms=None)

Coco Detection Dataset.

Parameters
  • root (string) – Root directory where images are downloaded to.

  • annFile (string) – Path to json annotation file.

  • num_classes (int) – The number of COCO classes: 80 or 91.

  • transform (callable, optional) – A function transform that takes in a PIL image and returns a transformed version, e.g. transforms.ToTensor.

  • target_transform (callable, optional) – A function transform that takes in the target and transforms it.

  • transforms (callable, optional) – A function transform that takes input sample and its target as entry and returns a transformed version.

class hat.data.datasets.CocoDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_classes: int = 80, num_samples: Optional[int] = None, **kwargs)

CocoDetectionPacker is used for packing the COCO dataset to the target format.

Parameters
  • src_data_dir (str) – The dir of original coco data.

  • target_data_dir (str) – Path for packed file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – The num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_classes (int) – The number of classes produced.

  • num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.

pack_data(idx)

Read original data from the folder with some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.CocoFromImage(*args, **kwargs)

Coco from image by torchvision.

The params of CocoFromImage are the same as those of torchvision.datasets.CocoDetection.

class hat.data.datasets.ComposeDataset(datasets: List[Dict], batchsize_list: List[int])

Dataset wrapper for multiple datasets with precise batch size.

Parameters
  • datasets – Config for each dataset.

  • batchsize_list – Batch size for each task dataset.
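
A hedged composition sketch; the dict(type=...) config convention and its keys are assumptions about HAT's config style, not something documented on this page:

    from hat.data.datasets import ComposeDataset

    dataset = ComposeDataset(
        datasets=[
            dict(type="Coco", data_path="./data/coco_lmdb", pack_type="lmdb"),
            dict(type="Cityscapes", data_path="./data/cityscapes_lmdb", pack_type="lmdb"),
        ],
        batchsize_list=[16, 8],  # per-task batch sizes, one entry per dataset
    )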

class hat.data.datasets.ConcatDataset(datasets: Iterable[torch.utils.data.dataset.Dataset])

Dataset as a concatenation of multiple datasets.

This class is useful to assemble different existing datasets.

Parameters

datasets (sequence) – List of datasets to be concatenated.
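
A small self-contained sketch using torch TensorDatasets as stand-ins; the summed length mirrors torch.utils.data.ConcatDataset and is assumed here:

    import torch
    from torch.utils.data import TensorDataset
    from hat.data.datasets import ConcatDataset

    train_a = TensorDataset(torch.zeros(10, 3))
    train_b = TensorDataset(torch.ones(4, 3))
    combined = ConcatDataset([train_a, train_b])
    assert len(combined) == 14  # lengths add up under concatenation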

class hat.data.datasets.ImageNet(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

ImageNet provides the method of reading ImageNet data from the target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • transforms (list) – Transforms of data before use.

  • pack_type (str) – The pack type.

  • pack_kwargs (dict) – Kwargs for pack type.

class hat.data.datasets.ImageNetFromImage(transforms=None, *args, **kwargs)

ImageNet from image by torchvision.

The params of ImageNetFromImage are the same as those of torchvision.datasets.ImageNet.

class hat.data.datasets.ImageNetPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)

ImageNetPacker is used for converting the ImageNet dataset in torchvision to the target DataType format.

Parameters
  • src_data_dir (str) – The dir of original imagenet data.

  • target_data_dir (str) – Path for LMDB file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – Num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.

pack_data(idx)

Read original data from the folder with some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.RandDataset(length: int, example: Any, clone: bool = True)

class hat.data.datasets.RepeatDataset(dataset, times)

A wrapper that repeats a dataset.

Using RepeatDataset can reduce the data loading time between epochs.

Parameters
  • dataset (torch.utils.data.Dataset) – The dataset to repeat.

  • times (int) – Repeat times.
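
A sketch, assuming the usual wrapper semantics where one pass over the wrapper covers the base data 'times' times:

    import torch
    from torch.utils.data import TensorDataset
    from hat.data.datasets import RepeatDataset

    base = TensorDataset(torch.arange(8).float())
    repeated = RepeatDataset(base, times=10)
    # One pass over `repeated` should cover `base` 10 times,
    # avoiding per-epoch dataloader restarts.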

class hat.data.datasets.ResampleDataset(dataset, resample_interval: int = 1)

A wrapper that resamples a dataset.

Using ResampleDataset, you can resample the original dataset with a specific interval.

Parameters
  • dataset (dict) – The dataset to resample.

  • resample_interval (int) – resample interval.
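
A sketch; the parameter table types dataset as a dict (config), but a dataset instance is assumed here:

    import torch
    from torch.utils.data import TensorDataset
    from hat.data.datasets import ResampleDataset

    base = TensorDataset(torch.arange(8).float())
    # With resample_interval=2, the wrapper should expose roughly
    # every 2nd sample of `base`.
    resampled = ResampleDataset(base, resample_interval=2)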

class hat.data.transforms.common.AddKeys(kv: Dict[str, Any])

Add new key-value in input dict.

Frequently used when you want to add dummy keys to data dict but don’t want to change code.

Parameters

kv – key-value data dict.

class hat.data.transforms.common.CopyKeys(keys: List[str], split='|')

Copy keys in input dict.

Frequently used when you want to cache keys to data dict but don’t want to change code.

Parameters

keys – key list to copy, in “old_name | new_name” format.

class hat.data.transforms.common.DeleteKeys(keys: List[str])

Delete keys in input dict.

Parameters

keys – key list to delete.

class hat.data.transforms.common.ListToDict(keys: List[str])

Convert list args to dict.

Parameters

keys – keys for each object in args.

class hat.data.transforms.common.PILToTensor

Convert PIL Image to Tensor.

class hat.data.transforms.common.RenameKeys(keys: List[str], split='|')

Rename keys in input dict.

Parameters

keys – key list to rename, in “old_name | new_name” format.
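
A quick sketch of the “old_name | new_name” format; calling the transform directly on a data dict is an assumption:

    from hat.data.transforms.common import RenameKeys

    rename = RenameKeys(keys=["img | image", "gt_seg | labels"])
    data = {"img": 0, "gt_seg": 1}
    data = rename(data)  # keys become "image" and "labels"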

class hat.data.transforms.common.TensorToNumpy

Convert tensor to numpy.

class hat.data.transforms.common.Undistortion

Convert a PIL Image or numpy.ndarray to an undistorted PIL Image or numpy.ndarray.

class hat.data.transforms.classification.BgrToYuv444(rgb_input=False)

BgrToYuv444 is used for color format conversion.

Parameters

rgb_input (bool) – Whether the input is RGB.

class hat.data.transforms.classification.ConvertLayout(hwc2chw=True, keys=None)

ConvertLayout is used for layout conversion.

Parameters
  • hwc2chw (bool) – Whether to convert HWC to CHW.

  • keys (list) –

class hat.data.transforms.classification.LabelSmooth(num_classes, eta=0.1)

LabelSmooth is used for label smoothing.

Parameters
  • num_classes (int) – Number of classes.

  • eta (float) – Eta of label smooth.

class hat.data.transforms.classification.OneHot(num_classes)

OneHot is used for converting labels to one-hot format.

Parameters

num_classes (int) – Number of classes.
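
A sketch of a typical classification transform pipeline built from the classes above; the ordering and the plain-list composition are assumptions:

    from hat.data.transforms.classification import (
        BgrToYuv444,
        ConvertLayout,
        LabelSmooth,
        OneHot,
    )

    transforms = [
        ConvertLayout(hwc2chw=True),             # HWC -> CHW
        BgrToYuv444(rgb_input=False),            # BGR -> YUV444
        OneHot(num_classes=1000),                # labels -> one-hot
        LabelSmooth(num_classes=1000, eta=0.1),  # smooth the one-hot labels
    ]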

class hat.data.transforms.classification.TimmMixup(*args, **kwargs)

Mixup of timm.

Parameters

args – The same as timm.data.Mixup.

class hat.data.transforms.classification.TimmTransforms(*args, **kwargs)

Transforms of timm.

Parameters

args – The same as timm.data.create_transform.

class hat.data.transforms.segmentation.LabelRemap(mapping: Sequence)

Remap labels.

Parameters

mapping (Sequence) – Mapping from input to output.

class hat.data.transforms.segmentation.Scale(scales: Union[numbers.Real, Sequence], mode: str = 'nearest')

Scale input according to a scale list.

Parameters
  • scales (Union[Real, Sequence]) – The scales to apply on input.

  • mode (str) – algorithm used for upsampling: 'nearest' | 'bilinear' | 'area'. Default: 'nearest'
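
For example, producing outputs at 1/4 and 1/8 of the input resolution with bilinear interpolation (a sketch; how multiple scales are packaged in the output is not specified here):

    from hat.data.transforms.segmentation import Scale

    scale = Scale(scales=[0.25, 0.125], mode="bilinear")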

class hat.data.transforms.segmentation.SegOneHot(num_classes: int)

OneHot is used for converting labels to one-hot format.

Parameters

num_classes (int) – Number of classes.

class hat.data.transforms.segmentation.SegRandomAffine(degrees: Union[Sequence, float] = 0, translate: Optional[Tuple] = None, scale: Optional[Tuple] = None, shear: Optional[Union[Sequence, float]] = None, interpolation: torchvision.transforms.functional.InterpolationMode = <InterpolationMode.NEAREST: 'nearest'>, fill: Union[tuple, int] = 0, label_fill_value: Union[tuple, int] = -1)

Apply random affine transforms to both image and label.

Please refer to RandomAffine for details.

Parameters

label_fill_value (tuple or int, optional) – Fill value for label. Defaults to -1.

class hat.data.transforms.segmentation.SegRandomCrop(size, cat_max_ratio=1.0, ignore_index=255)

Random crop on data with gt_seg label; can only be used for segmentation tasks.

Parameters
  • size (tuple) – Expected size after cropping, (h, w).

  • cat_max_ratio (float, optional) – The maximum ratio that single category could occupy.

  • ignore_index (int, optional) – When considering the cat_max_ratio condition, the area corresponding to ignore_index will be ignored.

get_crop_bbox(data)

Randomly get a crop bounding box.

class hat.data.transforms.segmentation.SegReWeightByArea(seg_num_classes, lower_bound: float = 0.5, ignore_index: int = 255)

Calculate the weight of each category according to the area of each category.

For each category, the calculation formula of weight is as follows: weight = max(1.0 - seg_area / total_area, lower_bound)

Parameters
  • seg_num_classes (int) – Number of segmentation categories.

  • lower_bound (float) – Lower bound of weight.

  • ignore_index (int) – Index of ignore class.
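
A worked instance of the formula, independent of the class itself:

    # weight = max(1.0 - seg_area / total_area, lower_bound)
    lower_bound = 0.5
    total_area = 100.0
    for seg_area in (10.0, 80.0):
        weight = max(1.0 - seg_area / total_area, lower_bound)
        print(weight)  # 0.9 for a rare class, 0.5 (clamped) for a dominant one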

class hat.data.transforms.segmentation.SegResize(size, interpolation=<InterpolationMode.BILINEAR: 'bilinear'>)

forward(data)

Parameters

data (PIL Image or Tensor) – Image to be scaled.

Returns

Rescaled image.

Return type

PIL Image or Tensor

class hat.data.transforms.detection.AugmentHSV(hgain: float = 0.5, sgain: float = 0.5, vgain: float = 0.5)

Random add color disturbance.

Convert img to HSV, and then randomly change the hue, saturation and value.

Parameters
  • hgain (float) – Gain of hue.

  • sgain (float) – Gain of saturation.

  • vgain (float) – Gain of value.

class hat.data.transforms.detection.ColorJitter(brightness=0.5, contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=0.1)

Randomly change the brightness, contrast, saturation and hue of an image.

The main differences from ColorJitter in torchvision are the support for detection tasks and dict input; in addition, the default settings have been changed to the most common ones.

Parameters
  • brightness (float or tuple of float (min, max)) – How much to jitter brightness.

  • contrast (float or tuple of float (min, max)) – How much to jitter contrast.

  • saturation (float or tuple of float (min, max)) – How much to jitter saturation.

  • hue (float or tuple of float (min, max)) – How much to jitter hue.

class hat.data.transforms.detection.FixedCrop(size=None, min_area=-1, min_iou=-1, dynamic_roi_params=None)

Crop image with fixed position and size.

get_dynamic_roi_from_camera(image_key, camera_info, dynamic_roi_params, img_hw)

Get dynamic roi from camera info.

Parameters
  • image_key (str) – Image key.

  • camera_info (Dict) – Camera info.

  • dynamic_roi_params (Dict) – Must contain keys {‘w’, ‘h’, ‘fp_x’, ‘fp_y’}.

  • img_hw (List|Tuple) – Height and width of the image, as two ints.

Returns

Dynamic ROI coordinates [x1, y1, x2, y2] of the image.

Return type

dynamic_roi (List|Tuple)

inverse_transform(inputs, task_type, inverse_info)

Inverse of the transform, used to map the prediction back to the original image.

Parameters
  • inputs (array) – Prediction.

  • task_type (str) – detection or segmentation.

  • inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.

class hat.data.transforms.detection.MinIoURandomCrop(min_ious=(0.1, 0.3, 0.5, 0.7, 0.9), min_crop_size=0.3, bbox_clip_border=True, repeat_num=50)

Randomly crop the image & bboxes. The cropped patches must meet a minimum IoU requirement with the original image & bboxes; the IoU threshold is randomly selected from min_ious.

Parameters
  • min_ious (tuple) – Minimum IoU threshold for all intersections with bounding boxes.

  • min_crop_size (float) – Minimum crop size (i.e. h,w := a*h, a*w, where a >= min_crop_size).

  • bbox_clip_border (bool) – Whether clip the objects outside the border of the image. Defaults to True.

  • repeat_num (int) – Max number of repeats for finding an available bbox.

class hat.data.transforms.detection.RandomExpand(mean=(0, 0, 0), ratio_range=(1, 4), prob=0.5)

Randomly expand the image & bboxes.

Randomly place the original image on a canvas of ‘ratio’ x original image size filled with mean values. The ratio is in the range of ratio_range.

Parameters
  • mean (tuple) – Mean values used to fill the expanded canvas.

  • ratio_range (tuple) – Range of expand ratio.

  • prob (float) – Probability of applying this transformation.

class hat.data.transforms.detection.RandomFlip(px: Optional[float] = 0.5, py: Optional[float] = 0)

Flip image & bbox & mask & seg.

Parameters
  • px – Horizontal flip probability, range between [0, 1].

  • py – Vertical flip probability, range between [0, 1].

class hat.data.transforms.detection.Resize(img_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, max_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, multiscale_mode='range', ratio_range=None, keep_ratio=True)

Resize image & bbox & mask & seg.

Parameters
  • img_scale – Image scale(s) for resizing; see multiscale_mode below for how a scale is chosen.

  • max_scale – The max size of the image. If the image’s shape > max_scale, the image is resized to max_scale.

  • multiscale_mode (str) – Value must be one of “range” or “value”. This transform resizes the input image and bbox to the same scale factor. There are three multiscale modes (see the constructor sketches after this list):

    - ‘ratio_range’ is not None: randomly sample a ratio from the ratio range and multiply it with the image scale, e.g. Resize(img_scale=(400, 500), multiscale_mode=’range’, ratio_range=(0.5, 2.0)).

    - ‘ratio_range’ is None and ‘multiscale_mode’ == “range”: randomly sample a scale from a range; the length of img_scale[tuple] must be 2, representing the small and the large img_scale, e.g. Resize(img_scale=((100, 200), (400, 500)), multiscale_mode=’range’).

    - ‘ratio_range’ is None and ‘multiscale_mode’ == “value”: randomly sample a scale from multiple scales, e.g. Resize(img_scale=((100, 200), (300, 400), (400, 500)), multiscale_mode=’value’).

  • ratio_range (tuple[float]) – Value represent (min_ratio, max_ratio), scale factor range.

  • keep_ratio (bool) – Whether to keep the aspect ratio when resizing the image.
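
The three modes written out as constructor calls, taken directly from the docstring examples above:

    from hat.data.transforms.detection import Resize

    # ratio_range is not None: sample a ratio, multiply the image scale.
    resize_a = Resize(img_scale=(400, 500), multiscale_mode="range",
                      ratio_range=(0.5, 2.0))

    # ratio_range is None, multiscale_mode == "range": sample between
    # a small and a large scale.
    resize_b = Resize(img_scale=((100, 200), (400, 500)),
                      multiscale_mode="range")

    # ratio_range is None, multiscale_mode == "value": pick one of the
    # listed scales.
    resize_c = Resize(img_scale=((100, 200), (300, 400), (400, 500)),
                      multiscale_mode="value")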

inverse_transform(inputs, task_type, inverse_info)

Inverse of the transform, used to map the prediction back to the original image.

Parameters
  • inputs (array|Tensor) – Prediction.

  • task_type (str) – detection or segmentation.

  • inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.

class hat.data.transforms.detection.ToTensor(to_yuv=False)

Convert objects of various Python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.

Supported types are: numpy.ndarray, torch.Tensor, Sequence, int, float.

Parameters

to_yuv (bool) – If true, convert the img to yuv444 format.
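
A sketch of the tail of a detection pipeline; the plain-list composition is an assumption:

    from hat.data.transforms.detection import RandomFlip, ToTensor

    transforms = [
        RandomFlip(px=0.5, py=0.0),  # horizontal flips only
        ToTensor(to_yuv=True),       # to torch.Tensor, img in yuv444
    ]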

class hat.data.samplers.DistSamplerHook(dataset, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0, drop_last: bool = False)

The hook api for torch.utils.data.DistributedSampler. Used to get the local rank and num_replicas before creating the DistributedSampler.

Parameters
  • dataset – Compose dataset.

  • num_replicas – Same as DistributedSampler.

  • rank – Same as DistributedSampler.

  • shuffle – Whether to shuffle the data.

  • seed – Random seed.

hat.data.collates.collate_2d(batch: List[Any]) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

Used in 2d tasks for collating data with inconsistent shapes.

Parameters

batch (list) – list of data.

hat.data.collates.collate_3d(batch_data: List[Any])

Merge a list of samples to form a mini-batch of Tensor(s).

Used in bev tasks.

  • If the output tensor from the dataset has shape (n, c, h, w), concatenate on axis 0 directly.

  • If the output tensor from the dataset has shape (c, h, w), expand dims on axis 0 and then concatenate.

Parameters

batch_data (list) – list of data.
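
Both collates plug into a standard torch DataLoader as collate_fn. A self-contained sketch; the dict-sample format of the toy dataset is an assumption about what hat.data datasets yield:

    import torch
    from torch.utils.data import DataLoader, Dataset
    from hat.data.collates import collate_2d

    class ToyDataset(Dataset):
        """Stand-in dataset yielding dict samples of varying shape."""
        def __len__(self):
            return 4
        def __getitem__(self, idx):
            return {"img": torch.zeros(3, 8 + idx, 8)}

    loader = DataLoader(ToyDataset(), batch_size=2, collate_fn=collate_2d)
    batch = next(iter(loader))  # samples merged despite inconsistent shapes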

class hat.data.dataloaders.PassThroughDataLoader(data: Any, *, length: int, clone: bool = False)

Directly pass through input example.

Parameters
  • data (Any) – Input data.

  • length (int) – Length of the dataloader.

  • clone (bool, optional) – Whether to clone the input data.
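
A sketch, assuming iteration yields the stored example on every step; handy for smoke-testing a training loop:

    import torch
    from hat.data.dataloaders import PassThroughDataLoader

    example = {"img": torch.zeros(1, 3, 224, 224)}
    loader = PassThroughDataLoader(example, length=100, clone=True)
    for batch in loader:  # 100 iterations, each yielding `example`
        pass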