hat.data¶
Main data module for training in HAT, which contains datasets, transforms, samplers.
Datasets¶
ImageNet | ImageNet provides the method of reading imagenet data from target pack type.
ImageNetPacker | ImageNetPacker is used for converting ImageNet dataset in torchvision to DataType format.
ImageNetFromImage | ImageNet from image by torchvision.
Coco | Coco provides the method of reading coco data from target pack type.
CocoDetectionPacker | CocoDetectionPacker is used for packing coco dataset to target format.
CocoFromImage | Coco from image by torchvision.
Cityscapes | Cityscapes provides the method of reading cityscapes data from target pack type.
CityscapesPacker | CityscapesPacker is used for converting Cityscapes dataset in torchvision to target DataType format.
RepeatDataset | A wrapper of repeated dataset.
ComposeDataset | Dataset wrapper for multiple datasets with precise batch size.
ResampleDataset | A wrapper of resample dataset.
ConcatDataset | Dataset as a concatenation of multiple datasets.
Transforms¶
common¶
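DeleteKeys | Delete keys in input dict.
ListToDict | Convert list args to dict.
PILToTensor | Convert PIL Image to Tensor.
RenameKeys | Rename keys in input dict.
Undistortion | Convert a PIL Image or numpy.ndarray to an undistorted PIL Image or numpy.ndarray.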
classification¶
BgrToYuv444 | BgrToYuv444 is used for color format conversion.
ConvertLayout | ConvertLayout is used for layout conversion.
LabelSmooth | LabelSmooth is used for label smoothing.
OneHot | OneHot is used to convert labels to one-hot format.
segmentation¶
LabelRemap | Remap labels.
SegOneHot | OneHot is used to convert labels to one-hot format.
SegRandomAffine | Apply a random affine transformation to both image and label.
Scale | Scale input according to a scale list.
SegRandomCrop | Random crop on data with gt_seg label; can only be used for segmentation tasks.
SegReWeightByArea | Calculate the weight of each category according to the area of each category.
detection¶
ColorJitter | Randomly change the brightness, contrast, saturation and hue of an image.
FixedCrop | Crop image with fixed position and size.
MinIoURandomCrop | Random crop the image & bboxes, the cropped patches have minimum IoU requirement with original image & bboxes, the IoU threshold is randomly selected from min_ious.
RandomExpand | Random expand the image & bboxes.
RandomFlip | Flip image & bbox & mask & seg.
Resize | Resize image & bbox & mask & seg.
ToTensor | Convert objects of various python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.
Collates¶
collate_2d | Merge a list of samples to form a mini-batch of Tensor(s).
collate_3d | Merge a list of samples to form a mini-batch of Tensor(s).
Dataloaders¶
PassThroughDataLoader | Directly pass through input example.
API Reference¶
- class hat.data.datasets.BatchTransformDataset(dataset, transforms_cfgs, epoch_steps)¶
- class hat.data.datasets.Cityscapes(data_path: str, transforms: Optional[list] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)¶
Cityscapes provides the method of reading cityscapes data from target pack type.
- Parameters
data_path (str) – The path of packed file.
pack_type (str) – The pack type.
transforms (list) – Transforms applied to the cityscapes data before use.
pack_kwargs (dict) – Kwargs for pack type.
- class hat.data.datasets.CityscapesPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)¶
CityscapesPacker is used for converting Cityscapes dataset in torchvision to target DataType format.
- Parameters
src_data_dir (str) – The dir of original cityscapes data.
target_data_dir (str) – Path for packed file.
split_name (str) – Split name of data, such as train, val and so on.
num_workers (int) – Num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_samples (int) – the number of samples you want to pack. You will pack all the samples if num_samples is None.
- pack_data(idx)¶
Read original data from the folder and apply some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.Coco(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)¶
Coco provides the method of reading coco data from target pack type.
- Parameters
data_path (str) – The path of packed file.
transforms (list) – Transforms applied to the data before use.
pack_type (str) – The pack type.
pack_kwargs (dict) – Kwargs for pack type.
- class hat.data.datasets.CocoDetection(root, annFile, num_classes=80, transform=None, target_transform=None, transforms=None)¶
Coco Detection Dataset.
- Parameters
root (string) – Root directory where images are downloaded to.
annFile (string) – Path to json annotation file.
num_classes (int) – The number of classes of coco. 80 or 91.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g. transforms.ToTensor.
target_transform (callable, optional) – A function transform that takes in the target and transforms it.
transforms (callable, optional) – A function transform that takes input sample and its target as entry and returns a transformed version.
- class hat.data.datasets.CocoDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_classes: int = 80, num_samples: Optional[int] = None, **kwargs)¶
CocoDetectionPacker is used for packing coco dataset to target format.
- Parameters
src_data_dir (str) – The dir of original coco data.
target_data_dir (str) – Path for packed file.
split_name (str) – Split name of data, such as train, val and so on.
num_workers (int) – The num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_classes (int) – The num of classes produced.
num_samples (int) – the number of samples you want to pack. You will pack all the samples if num_samples is None.
- pack_data(idx)¶
Read original data from the folder and apply some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.CocoFromImage(*args, **kwargs)¶
Coco from image by torchvision.
The parameters of CocoFromImage are the same as those of torchvision.datasets.CocoDetection.
- class hat.data.datasets.ComposeDataset(datasets: List[Dict], batchsize_list: List[int])¶
Dataset wrapper for multiple datasets with precise batch size.
- Parameters
datasets – config for each dataset.
batchsize_list – batchsize for each task dataset.
- class hat.data.datasets.ConcatDataset(datasets: Iterable[torch.utils.data.dataset.Dataset])¶
Dataset as a concatenation of multiple datasets.
This class is useful to assemble different existing datasets.
- Parameters
datasets (sequence) – List of datasets to be concatenated
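As an illustration of the concatenation idea, here is a minimal sketch of how a global index can be mapped onto one of several datasets. It assumes plain Python sequences stand in for datasets; `ConcatSketch` is a hypothetical name and not the HAT class.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatSketch:
    """Hypothetical stand-in that illustrates dataset concatenation."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Running totals of dataset lengths, e.g. [3, 5] for lengths 3 and 2.
        self.cumulative_sizes = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Find the dataset that the global index falls into.
        ds_idx = bisect_right(self.cumulative_sizes, idx)
        offset = self.cumulative_sizes[ds_idx - 1] if ds_idx > 0 else 0
        return self.datasets[ds_idx][idx - offset]


combined = ConcatSketch([["a", "b", "c"], ["d", "e"]])
```

Index 3 lands in the second dataset because the first one covers indices 0 to 2.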
- class hat.data.datasets.ImageNet(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)¶
ImageNet provides the method of reading imagenet data from target pack type.
- Parameters
data_path (str) – The path of packed file.
transforms (list) – Transforms applied to the data before use.
pack_type (str) – The pack type.
pack_kwargs (dict) – Kwargs for pack type.
- class hat.data.datasets.ImageNetFromImage(transforms=None, *args, **kwargs)¶
ImageNet from image by torchvision.
The parameters of ImageNetFromImage are the same as those of torchvision.datasets.ImageNet.
- class hat.data.datasets.ImageNetPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)¶
ImageNetPacker is used for converting ImageNet dataset in torchvision to DataType format.
- Parameters
src_data_dir (str) – The dir of original imagenet data.
target_data_dir (str) – Path for LMDB file.
split_name (str) – Split name of data, such as train, val and so on.
num_workers (int) – Num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_samples (int) – the number of samples you want to pack. You will pack all the samples if num_samples is None.
- pack_data(idx)¶
Read original data from the folder and apply some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.RandDataset(length: int, example: Any, clone: bool = True)¶
- class hat.data.datasets.RepeatDataset(dataset, times)¶
A wrapper of repeated dataset.
Using RepeatDataset can reduce the data loading time between epochs.
- Parameters
dataset (torch.utils.data.Dataset) – The datasets for repeating.
times (int) – Repeat times.
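The repeat wrapper can be pictured with the following minimal sketch. It assumes that repeating means the wrapped dataset is traversed `times` times within one pass; `RepeatSketch` is a hypothetical name, not the HAT implementation.

```python
class RepeatSketch:
    """Hypothetical stand-in that illustrates a repeated dataset."""

    def __init__(self, dataset, times):
        self.dataset = dataset
        self.times = times

    def __len__(self):
        # One pass over the wrapper covers the wrapped dataset `times` times.
        return self.times * len(self.dataset)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Wrap around to the start of the wrapped dataset.
        return self.dataset[idx % len(self.dataset)]


repeated = RepeatSketch([10, 20, 30], times=2)
```

Because one epoch over the wrapper spans several passes over the data, the loader is re-created less often, which is where the reduced loading time between epochs would come from.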
- class hat.data.datasets.ResampleDataset(dataset, resample_interval: int = 1)¶
A wrapper of resample dataset.
Using ResampleDataset, the original dataset can be resampled at a specific interval.
- Parameters
dataset (dict) – The datasets for resampling.
resample_interval (int) – resample interval.
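A minimal sketch of interval resampling follows. It assumes that resampling keeps every `resample_interval`-th sample starting from index 0; the actual selection rule in HAT may differ, and `ResampleSketch` is a hypothetical name.

```python
class ResampleSketch:
    """Hypothetical stand-in that keeps every k-th sample of a dataset."""

    def __init__(self, dataset, resample_interval=1):
        self.dataset = dataset
        # Assumption: start at index 0 and step by the interval.
        self.indices = list(range(0, len(dataset), resample_interval))

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, idx):
        return self.dataset[self.indices[idx]]


resampled = ResampleSketch(list(range(10)), resample_interval=3)
```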
- class hat.data.transforms.common.AddKeys(kv: Dict[str, Any])¶
Add new key-value in input dict.
Frequently used when you want to add dummy keys to data dict but don’t want to change code.
- Parameters
kv – key-value data dict.
- class hat.data.transforms.common.CopyKeys(keys: List[str], split='|')¶
Copy new key in input dict.
Frequently used when you want to cache keys to data dict but don’t want to change code.
- Parameters
keys – key list to copy.
- class hat.data.transforms.common.DeleteKeys(keys: List[str])¶
Delete keys in input dict.
- Parameters
keys – key list to delete.
- class hat.data.transforms.common.ListToDict(keys: List[str])¶
Convert list args to dict.
- Parameters
keys – keys for each object in args.
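The list-to-dict conversion can be sketched as pairing each positional object with its key. This is only an illustration under that assumption; `list_to_dict` is a hypothetical helper, not the HAT transform.

```python
def list_to_dict(args, keys):
    """Pair each object in `args` with the key at the same position."""
    if len(args) != len(keys):
        raise ValueError("args and keys must have the same length")
    return dict(zip(keys, args))


sample = list_to_dict(["pixels", 3], keys=["img", "label"])
```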
- class hat.data.transforms.common.PILToTensor¶
Convert PIL Image to Tensor.
- class hat.data.transforms.common.RenameKeys(keys: List[str], split='|')¶
Rename keys in input dict.
- Parameters
keys – key list to rename, in “old_name | new_name” format.
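A minimal sketch of the "old_name | new_name" convention is shown below. It assumes that whitespace around the separator is ignored and that the input dict is left untouched, neither of which is confirmed by this reference; `rename_keys` is a hypothetical helper.

```python
def rename_keys(data, keys, split="|"):
    """Return a copy of `data` with keys renamed per "old | new" entries."""
    out = dict(data)
    for entry in keys:
        # Assumption: surrounding whitespace is not part of the key names.
        old_name, new_name = (part.strip() for part in entry.split(split))
        out[new_name] = out.pop(old_name)
    return out


sample = {"image": "pixels", "label": 3}
renamed = rename_keys(sample, ["image | img", "label | gt_label"])
```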
- class hat.data.transforms.common.TensorToNumpy¶
Convert tensor to numpy.
- class hat.data.transforms.common.Undistortion¶
Convert a PIL Image or numpy.ndarray to an undistorted PIL Image or numpy.ndarray.
- class hat.data.transforms.classification.BgrToYuv444(rgb_input=False)¶
BgrToYuv444 is used for color format convert.
- Parameters
rgb_input (bool) – Whether the input is in RGB format.
- class hat.data.transforms.classification.ConvertLayout(hwc2chw=True, keys=None)¶
ConvertLayout is used for layout convert.
- Parameters
hwc2chw (bool) – Whether to convert hwc to chw.
keys (list) –
- class hat.data.transforms.classification.LabelSmooth(num_classes, eta=0.1)¶
LabelSmooth is used for label smoothing.
- Parameters
num_classes (int) – Number of classes.
eta (float) – Eta of label smoothing.
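For orientation, one common formulation of label smoothing is sketched below: the one-hot target is scaled by `1 - eta` and `eta / num_classes` is added to every entry. Whether HAT uses exactly this formula is an assumption; `smooth_label` is a hypothetical helper.

```python
def smooth_label(label, num_classes, eta=0.1):
    """Smooth a class index into a soft target (assumed formulation)."""
    one_hot = [1.0 if i == label else 0.0 for i in range(num_classes)]
    # Assumed formula: (1 - eta) * one_hot + eta / num_classes.
    return [(1.0 - eta) * v + eta / num_classes for v in one_hot]


smoothed = smooth_label(2, num_classes=4, eta=0.1)
```

With this formulation the entries still sum to 1, so the result remains a valid probability distribution.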
- class hat.data.transforms.classification.OneHot(num_classes)¶
OneHot is used to convert labels to one-hot format.
- Parameters
num_classes (int) – Num classes.
- class hat.data.transforms.classification.TimmMixup(*args, **kwargs)¶
Mixup of timm.
- Parameters
args – The parameters are the same as those of timm.data.Mixup.
- class hat.data.transforms.classification.TimmTransforms(*args, **kwargs)¶
Transforms of timm.
- Parameters
args – The parameters are the same as those of timm.data.create_transform.
- class hat.data.transforms.segmentation.LabelRemap(mapping: Sequence)¶
Remap labels.
- Parameters
mapping (Sequence) – Mapping from input to output.
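A minimal sketch of label remapping, assuming that `mapping` is indexed by the input label value so that the output label is `mapping[input]`. This reading of the parameter is an assumption, and `remap_labels` is a hypothetical helper operating on nested lists rather than tensors.

```python
def remap_labels(label_map, mapping):
    """Replace each label value v in a 2D label map by mapping[v]."""
    return [[mapping[v] for v in row] for row in label_map]


# Hypothetical mapping: class 0 becomes 255, classes 2 and 3 merge into 1.
mapping = [255, 0, 1, 1]
remapped = remap_labels([[0, 1], [2, 3]], mapping)
```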
- class hat.data.transforms.segmentation.Scale(scales: Union[numbers.Real, Sequence], mode: str = 'nearest')¶
Scale input according to a scale list.
- Parameters
scales (Union[Real, Sequence]) – The scales to apply on input.
mode (str) – algorithm used for upsampling: 'nearest' | 'bilinear' | 'area'. Default: 'nearest'
- class hat.data.transforms.segmentation.SegOneHot(num_classes: int)¶
OneHot is used to convert labels to one-hot format.
- Parameters
num_classes (int) – Num classes.
- class hat.data.transforms.segmentation.SegRandomAffine(degrees: Union[Sequence, float] = 0, translate: Optional[Tuple] = None, scale: Optional[Tuple] = None, shear: Optional[Union[Sequence, float]] = None, interpolation: torchvision.transforms.functional.InterpolationMode = <InterpolationMode.NEAREST: 'nearest'>, fill: Union[tuple, int] = 0, label_fill_value: Union[tuple, int] = -1)¶
Apply a random affine transformation to both image and label.
Please refer to RandomAffine for details.
- Parameters
label_fill_value (tuple or int, optional) – Fill value for label. Defaults to -1.
- class hat.data.transforms.segmentation.SegRandomCrop(size, cat_max_ratio=1.0, ignore_index=255)¶
Random crop on data with gt_seg label; can only be used for segmentation tasks.
- Parameters
size (tuple) – Expected size after cropping, (h, w).
cat_max_ratio (float, optional) – The maximum ratio that single category could occupy.
ignore_index (int, optional) – When considering the cat_max_ratio condition, the area corresponding to ignore_index will be ignored.
- get_crop_bbox(data)¶
Randomly get a crop bounding box.
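The role of cat_max_ratio and ignore_index can be sketched as an acceptance test on a candidate crop of the label map. The exact comparison and the handling of fully ignored crops in HAT are not documented here, so both are assumptions; `crop_is_acceptable` is a hypothetical helper.

```python
from collections import Counter


def crop_is_acceptable(label_crop, cat_max_ratio=1.0, ignore_index=255):
    """Check that no single category dominates the cropped label map."""
    # Pixels equal to ignore_index do not count towards any category.
    counts = Counter(v for row in label_crop for v in row if v != ignore_index)
    if not counts:
        # Assumption: a crop with only ignored pixels is rejected.
        return False
    return max(counts.values()) / sum(counts.values()) <= cat_max_ratio


# Category 0 covers 4 of the 5 valid pixels (ratio 0.8).
candidate = [[0, 0, 0], [0, 1, 255]]
```

A cropping loop would typically draw new crop boxes until this check passes or a retry limit is reached.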
- class hat.data.transforms.segmentation.SegReWeightByArea(seg_num_classes, lower_bound: int = 0.5, ignore_index: int = 255)¶
Calculate the weight of each category according to the area of each category.
For each category, the calculation formula of weight is as follows: weight = max(1.0 - seg_area / total_area, lower_bound)
- Parameters
seg_num_classes (int) – Number of segmentation categories.
lower_bound (float) – Lower bound of weight.
ignore_index (int) – Index of ignore class.
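The weight formula above can be worked through on a small label map. This sketch assumes that pixels equal to ignore_index are excluded from both seg_area and total_area, which the reference does not state; `weights_by_area` is a hypothetical helper on nested lists.

```python
from collections import Counter


def weights_by_area(label_map, seg_num_classes, lower_bound=0.5, ignore_index=255):
    """weight = max(1.0 - seg_area / total_area, lower_bound) per category."""
    # Assumption: ignored pixels are left out of every area.
    counts = Counter(v for row in label_map for v in row if v != ignore_index)
    total_area = sum(counts.values())
    weights = []
    for cls in range(seg_num_classes):
        ratio = counts.get(cls, 0) / total_area if total_area else 0.0
        weights.append(max(1.0 - ratio, lower_bound))
    return weights


# 7 valid pixels: class 0 covers 6, class 1 covers 1, class 2 is absent.
weights = weights_by_area([[0, 0, 0, 1], [0, 0, 0, 255]], seg_num_classes=3)
```

Large categories are clamped to lower_bound, while rare or absent categories keep weights close to 1.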
- class hat.data.transforms.segmentation.SegResize(size, interpolation=<InterpolationMode.BILINEAR: 'bilinear'>)¶
- forward(data)¶
- Parameters
img (PIL Image or Tensor) – Image to be scaled.
- Returns
Rescaled image.
- Return type
PIL Image or Tensor
- class hat.data.transforms.detection.AugmentHSV(hgain: float = 0.5, sgain: float = 0.5, vgain: float = 0.5)¶
Randomly add color disturbance.
Convert img to HSV, and then randomly change the hue, saturation and value.
- Parameters
hgain (float) – Gain of hue.
sgain (float) – Gain of saturation.
vgain (float) – Gain of value.
- class hat.data.transforms.detection.ColorJitter(brightness=0.5, contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=0.1)¶
Randomly change the brightness, contrast, saturation and hue of an image.
The main differences from ColorJitter in torchvision are that this transform supports detection data and dict input, and that the default settings have been changed to the most common ones.
- Parameters
brightness (float or tuple of float (min, max)) – How much to jitter brightness.
contrast (float or tuple of float (min, max)) – How much to jitter contrast.
saturation (float or tuple of float (min, max)) – How much to jitter saturation.
hue (float or tuple of float (min, max)) – How much to jitter hue.
- class hat.data.transforms.detection.FixedCrop(size=None, min_area=-1, min_iou=-1, dynamic_roi_params=None)¶
Crop image with fixed position and size.
- get_dynamic_roi_from_camera(image_key, camera_info, dynamic_roi_params, img_hw)¶
Get dynamic roi from camera info.
- Parameters
image_key (str) – Image key.
camera_info (Dict) – Camera info.
dynamic_roi_params (Dict) – Must contain keys {‘w’, ‘h’, ‘fp_x’, ‘fp_y’}.
img_hw (List|Tuple) – Two ints, the height and width of the image.
- Returns
Dynamic ROI coordinates [x1, y1, x2, y2] of the image.
- Return type
List|Tuple
- inverse_transform(inputs, task_type, inverse_info)¶
Inverse operation of the transform, used to map the prediction back to the original image.
- Parameters
inputs (array) – Prediction
task_type (str) – detection or segmentation.
inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.
- class hat.data.transforms.detection.MinIoURandomCrop(min_ious=(0.1, 0.3, 0.5, 0.7, 0.9), min_crop_size=0.3, bbox_clip_border=True, repeat_num=50)¶
Random crop the image & bboxes, the cropped patches have minimum IoU requirement with original image & bboxes, the IoU threshold is randomly selected from min_ious.
- Parameters
min_ious (tuple) – minimum IoU threshold for all intersections with bounding boxes.
min_crop_size (float) – minimum crop’s size (i.e. h,w := a*h, a*w, where a >= min_crop_size).
bbox_clip_border (bool) – Whether to clip the objects outside the border of the image. Defaults to True.
repeat_num (float) – Max repeat num for finding an available bbox.
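The minimum IoU requirement can be sketched as a check of a candidate patch against every bounding box. This assumes boxes in [x1, y1, x2, y2] format and that a patch is accepted only if its IoU with each box reaches the sampled threshold; `iou` and `patch_meets_min_iou` are hypothetical helpers.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def patch_meets_min_iou(patch, bboxes, min_iou):
    """Assumed rule: every box must overlap the patch with IoU >= min_iou."""
    return all(iou(patch, box) >= min_iou for box in bboxes)


patch = (0, 0, 10, 10)
bboxes = [(0, 0, 10, 10), (5, 0, 15, 10)]
```

In a full transform, a threshold would be drawn from min_ious and candidate patches sampled up to repeat_num times until one passes this check.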
- class hat.data.transforms.detection.RandomExpand(mean=(0, 0, 0), ratio_range=(1, 4), prob=0.5)¶
Random expand the image & bboxes.
Randomly place the original image on a canvas of ‘ratio’ x original image size filled with mean values. The ratio is in the range of ratio_range.
- Parameters
ratio_range (tuple) – range of expand ratio.
prob (float) – probability of applying this transformation
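The geometry of the expansion can be sketched without any image library: sample a ratio, size the canvas, choose where the original image lands, and shift the boxes by the same offset. The sampling details are assumptions, the `prob` gate is left out, and both helper names are hypothetical.

```python
import random


def random_expand_layout(img_h, img_w, ratio_range=(1, 4), rng=random):
    """Sample the canvas size and the top-left offset of the original image."""
    ratio = rng.uniform(*ratio_range)
    canvas_h, canvas_w = int(img_h * ratio), int(img_w * ratio)
    # The original image is placed somewhere inside the larger canvas.
    top = int(rng.uniform(0, canvas_h - img_h))
    left = int(rng.uniform(0, canvas_w - img_w))
    return canvas_h, canvas_w, top, left


def shift_bbox(bbox, top, left):
    """Move an [x1, y1, x2, y2] box by the placement offset."""
    x1, y1, x2, y2 = bbox
    return (x1 + left, y1 + top, x2 + left, y2 + top)


canvas_h, canvas_w, top, left = random_expand_layout(100, 200, rng=random.Random(0))
```

The rest of the canvas would be filled with the `mean` values before the image is pasted at (top, left).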
- class hat.data.transforms.detection.RandomFlip(px: Optional[float] = 0.5, py: Optional[float] = 0)¶
Flip image & bbox & mask & seg.
- Parameters
px – Horizontal flip probability, range between [0, 1].
py – Vertical flip probability, range between [0, 1].
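How a flip affects a bounding box can be sketched as a mirror about the image width or height. This assumes [x1, y1, x2, y2] boxes in continuous pixel coordinates (no off-by-one offset); the helper names are hypothetical.

```python
def hflip_bbox(bbox, img_w):
    """Mirror an [x1, y1, x2, y2] box horizontally within width img_w."""
    x1, y1, x2, y2 = bbox
    # The right edge becomes the new left edge and vice versa.
    return (img_w - x2, y1, img_w - x1, y2)


def vflip_bbox(bbox, img_h):
    """Mirror an [x1, y1, x2, y2] box vertically within height img_h."""
    x1, y1, x2, y2 = bbox
    return (x1, img_h - y2, x2, img_h - y1)


flipped = hflip_bbox((10, 20, 30, 40), img_w=100)
```

Applying the same flip twice returns the original box, which is a quick sanity check for such a transform.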
- class hat.data.transforms.detection.Resize(img_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, max_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, multiscale_mode='range', ratio_range=None, keep_ratio=True)¶
Resize image & bbox & mask & seg.
- Parameters
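img_scale – Image scale or scales to resize to; how it is interpreted depends on ratio_range and multiscale_mode (see multiscale_mode).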
max_scale – The max size of image. If the image’s shape > max_scale, The image is resized to max_scale
multiscale_mode (str) – Value must be one of “range” or “value”. This transform resizes the input image and bbox by the same scale factor. There are 3 multiscale modes:
- ‘ratio_range’ is not None: randomly sample a ratio from the ratio range and multiply it with the image scale, e.g. Resize(img_scale=(400, 500), multiscale_mode='range', ratio_range=(0.5, 2.0)).
- ‘ratio_range’ is None and ‘multiscale_mode’ == “range”: randomly sample a scale from a range; img_scale must contain exactly 2 scales, which represent the small img_scale and the large img_scale, e.g. Resize(img_scale=((100, 200), (400, 500)), multiscale_mode='range').
- ‘ratio_range’ is None and ‘multiscale_mode’ == “value”: randomly sample a scale from multiple scales, e.g. Resize(img_scale=((100, 200), (300, 400), (400, 500)), multiscale_mode='value').
ratio_range (tuple[float]) – Value represent (min_ratio, max_ratio), scale factor range.
keep_ratio (bool) – Whether to keep the aspect ratio when resizing the image.
- inverse_transform(inputs, task_type, inverse_info)¶
Inverse operation of the transform, used to map the prediction back to the original image.
- Parameters
inputs (array|Tensor) – Prediction.
task_type (str) – detection or segmentation.
inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.
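The three multiscale modes of Resize can be sketched as a single scale-sampling function. This is an illustration only: `sample_scale` is a hypothetical helper, and sampling each dimension independently in "range" mode is an assumption about a detail this reference does not specify.

```python
import random


def sample_scale(img_scale, multiscale_mode="range", ratio_range=None, rng=random):
    """Pick a target scale following the three documented modes."""
    if ratio_range is not None:
        # Mode 1: multiply a single base scale by a random ratio.
        ratio = rng.uniform(*ratio_range)
        return (int(img_scale[0] * ratio), int(img_scale[1] * ratio))
    if multiscale_mode == "range":
        # Mode 2: sample between the small and the large scale.
        small, large = img_scale
        return tuple(rng.randint(min(s, l), max(s, l)) for s, l in zip(small, large))
    if multiscale_mode == "value":
        # Mode 3: pick one of the listed scales.
        return tuple(rng.choice(list(img_scale)))
    raise ValueError("multiscale_mode must be 'range' or 'value'")


rng = random.Random(0)
by_ratio = sample_scale((400, 500), ratio_range=(0.5, 2.0), rng=rng)
by_range = sample_scale(((100, 200), (400, 500)), "range", rng=rng)
by_value = sample_scale(((100, 200), (300, 400), (400, 500)), "value", rng=rng)
```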
- class hat.data.transforms.detection.ToTensor(to_yuv=False)¶
Convert objects of various python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.
Supported types are: numpy.ndarray, torch.Tensor, Sequence, int, float.
- Parameters
to_yuv (bool) – If true, convert the img to yuv444 format.
- class hat.data.samplers.DistSamplerHook(dataset, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0, drop_last: bool = False)¶
The hook API for torch.utils.data.DistributedSampler. Used to get the local rank and num_replicas before creating the DistributedSampler.
- Parameters
dataset – compose dataset
num_replicas – same as DistributedSampler
rank – Same as DistributedSampler
shuffle – if shuffle data
seed – random seed
- hat.data.collates.collate_2d(batch: List[Any]) → Union[torch.Tensor, Dict]¶
Merge a list of samples to form a mini-batch of Tensor(s).
Used in 2d task, for collating data with inconsistent shapes.
- Parameters
batch (list) – list of data.
- hat.data.collates.collate_3d(batch_data: List[Any])¶
Merge a list of samples to form a mini-batch of Tensor(s).
Used in bev task.
- If the tensor output by the dataset has shape (n, c, h, w), concatenate on axis 0 directly.
- If the tensor output by the dataset has shape (c, h, w), expand a dimension on axis 0 and then concatenate.
- Parameters
batch_data (list) – list of data.
- class hat.data.dataloaders.PassThroughDataLoader(data: Any, *, length: int, clone: bool = False)¶
Directly pass through input example.
- Parameters
data (Any) – Input data
length (int) – Length of dataloader
clone (bool, optional) – Whether clone input data
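The pass-through behaviour can be pictured with the sketch below: the same example is yielded `length` times, optionally as a copy. Using `copy.deepcopy` for the clone is an assumption (tensors would more likely be cloned with their own method), and `PassThroughSketch` is a hypothetical name.

```python
import copy


class PassThroughSketch:
    """Hypothetical stand-in that yields one fixed example repeatedly."""

    def __init__(self, data, *, length, clone=False):
        self.data = data
        self.length = length
        self.clone = clone

    def __len__(self):
        return self.length

    def __iter__(self):
        for _ in range(self.length):
            # Assumption: cloning means handing out an independent copy.
            yield copy.deepcopy(self.data) if self.clone else self.data


shared = list(PassThroughSketch({"x": [1, 2]}, length=3))
cloned = list(PassThroughSketch({"x": [1, 2]}, length=3, clone=True))
```

Without cloning, every iteration returns the very same object, so in-place modifications in one step would be visible in the next.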