Data

Generic Interfaces

Dataset

class Dataset(data, transform=None)

Generic dataset for dictionary-format data; it can apply transforms to specific fields. For example, typical input data can be a list of dictionaries:

[{                            {                            {
     'img': 'image1.nii.gz',      'img': 'image2.nii.gz',      'img': 'image3.nii.gz',
     'seg': 'label1.nii.gz',      'seg': 'label2.nii.gz',      'seg': 'label3.nii.gz',
     'extra': 123                 'extra': 456                 'extra': 789
 },                           },                           }]

Parameters

data (Iterable) – input data to load and transform to generate the dataset for the model.
transform (Callable, optional) – transforms to execute on the input data.
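The behavior described above can be illustrated with a minimal stand-in. This is only a sketch, not the library's implementation: the class name `DictDataset` and the transform `upper_case_img` are hypothetical, and the example assumes the transform receives one dictionary and returns a new one.

```python
class DictDataset:
    """Hypothetical stand-in for a dictionary-based dataset (not the library class)."""

    def __init__(self, data, transform=None):
        self.data = list(data)
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        item = self.data[index]
        # Apply the transform only if one was supplied.
        return self.transform(item) if self.transform is not None else item


def upper_case_img(item):
    """Example transform: change only the 'img' field and leave the others alone."""
    out = dict(item)  # copy so the source list is not mutated
    out["img"] = out["img"].upper()
    return out


data = [
    {"img": "image1.nii.gz", "seg": "label1.nii.gz", "extra": 123},
    {"img": "image2.nii.gz", "seg": "label2.nii.gz", "extra": 456},
]
dataset = DictDataset(data, transform=upper_case_img)
first = dataset[0]
```

A real transform would typically load and process the files named in the chosen fields; the string operation here only shows that the other keys pass through unchanged.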

Patch-based dataset

GridPatchDataset

class GridPatchDataset(dataset, patch_size, start_pos=(), pad_mode='wrap', **pad_opts)

Yields patches from arrays read from an input dataset. The patches are chosen in a contiguous grid sampling scheme.

Initializes this dataset in terms of the input dataset and patch size. The patch_size is the size of the patch to sample from the input arrays. It is assumed that the first dimension of each array is the channel dimension, which is yielded in its entirety, so it should not be specified in patch_size. For example, for an input 3D array with 1 channel of size (1, 20, 20, 20), a regular grid sampling of eight patches of size (1, 10, 10, 10) would be specified by a patch_size of (10, 10, 10).

Parameters

dataset (Dataset) – the dataset to read array data from
patch_size (tuple of int or None) – size of patches to generate slices for, 0/None selects whole dimension
start_pos (tuple of int, optional) – starting position in the array, default is 0 for each dimension
pad_mode (str, optional) – padding mode, see numpy.pad
pad_opts (dict, optional) – padding options, see numpy.pad
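The (1, 20, 20, 20) example can be checked with a short sketch of grid sampling that keeps the channel dimension whole. The helper `grid_patch_slices` is hypothetical and ignores padding and start_pos; it assumes every spatial size is an exact multiple of the patch size.

```python
from itertools import product


def grid_patch_slices(shape, patch_size):
    """Slices for a regular grid of patches over the spatial dims of `shape`.

    `shape` includes the leading channel dimension; `patch_size` does not.
    Assumes each spatial size is an exact multiple of the patch size.
    """
    spatial = shape[1:]
    starts = [range(0, dim, size) for dim, size in zip(spatial, patch_size)]
    for corner in product(*starts):
        # slice(None) keeps the whole channel dimension in every patch.
        yield (slice(None),) + tuple(
            slice(c, c + s) for c, s in zip(corner, patch_size)
        )


patches = list(grid_patch_slices((1, 20, 20, 20), (10, 10, 10)))
print(len(patches))  # 2 * 2 * 2 grid positions
```

Indexing an array of shape (1, 20, 20, 20) with any of these tuples would give a (1, 10, 10, 10) patch, which is why the channel size is left out of patch_size.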

Nifti format handling

Reading

class NiftiDataset(image_files, seg_files=None, labels=None, as_closest_canonical=False, transform=None, seg_transform=None, image_only=True, dtype=None)

Loads image/segmentation pairs of Nifti files from the given filename lists. Transformations can be specified for the image and segmentation arrays separately.

Initializes the dataset with the image and segmentation filename lists. The transform argument is applied to the images and seg_transform to the segmentations.

Parameters

image_files (list of str) – list of image filenames
seg_files (list of str) – if in segmentation task, list of segmentation filenames
labels (list or array) – if in classification task, list of classification labels
as_closest_canonical (bool) – if True, load the image as closest to canonical orientation
transform (Callable, optional) – transform to apply to image arrays
seg_transform (Callable, optional) – transform to apply to segmentation arrays
image_only (bool) – if True, return only the image volume; otherwise return the image volume and header dict
dtype (np.dtype, optional) – if not None convert the loaded image to this data type

load_nifti(filename_or_obj, as_closest_canonical=False, image_only=True, dtype=None)

Loads a Nifti file from the given path or file-like object.

Parameters

filename_or_obj (str or file) – path to file or file-like object
as_closest_canonical (bool) – if True, load the image as closest to canonical axis format
image_only (bool) – if True, return only the image volume; otherwise return the image volume and header dict
dtype (np.dtype, optional) – if not None convert the loaded image to this data type

Returns

The loaded image volume if image_only is True, or a tuple containing the volume and the Nifti header in dict format otherwise

Note

header['original_affine'] stores the original affine loaded from filename_or_obj. header['affine'] stores the affine after the optional as_closest_canonical transform.

Writing

write_nifti(data, affine, file_name, target_affine=None, dtype='float32')

Write numpy data to a Nifti file on disk.

Parameters

data (numpy.ndarray) – input data to write to file.
affine (numpy.ndarray) – affine information for the data.
file_name (string) – name of the file to save on disk.
target_affine (numpy.ndarray, optional) – before saving the (data, affine), transform the data into the orientation defined by target_affine.
dtype (np.dtype, optional) – convert the image to save to this data type.

Synthetic

create_test_image_2d(width, height, num_objs=12, rad_max=30, noise_max=0.0, num_seg_classes=5, channel_dim=None)

Return a noisy 2D image with num_objs circles and a 2D mask image. The maximum radius of the circles is given by rad_max. The mask has num_seg_classes segmentation classes labeled sequentially from 1, plus a background class represented as 0. If noise_max is greater than 0, noise drawn from the uniform distribution on the range [0, noise_max) is added to the image. If channel_dim is None, the image is created without a channel dimension; otherwise it is created with the channel dimension as the first or last dimension.
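To make the relationship between the image, the mask, and the noise concrete, here is a pure-Python sketch of the same idea. It is not the library's algorithm: the function `make_test_image_2d`, its `seed` parameter, the random choice of label per circle, and the nested-list output are all assumptions made for illustration, and the channel dimension is left out.

```python
import random


def make_test_image_2d(width, height, num_objs=3, rad_max=5,
                       noise_max=0.0, num_seg_classes=2, seed=0):
    """Hypothetical sketch: return (image, mask) as nested lists indexed [x][y]."""
    rng = random.Random(seed)
    image = [[0.0] * height for _ in range(width)]
    mask = [[0] * height for _ in range(width)]  # 0 is the background class
    for _ in range(num_objs):
        # Choose a centre far enough from the border to keep the circle inside.
        cx = rng.randint(rad_max, width - rad_max - 1)
        cy = rng.randint(rad_max, height - rad_max - 1)
        rad = rng.randint(1, rad_max)  # radius never exceeds rad_max
        label = rng.randint(1, num_seg_classes)  # labels are drawn at random here
        for x in range(cx - rad, cx + rad + 1):
            for y in range(cy - rad, cy + rad + 1):
                if (x - cx) ** 2 + (y - cy) ** 2 <= rad ** 2:
                    mask[x][y] = label
                    image[x][y] = label / num_seg_classes
    if noise_max > 0:
        # Add uniform noise scaled by noise_max to every pixel.
        for row in image:
            for j in range(height):
                row[j] += rng.random() * noise_max
    return image, mask


image, mask = make_test_image_2d(32, 24, noise_max=0.1)
labels = {value for row in mask for value in row}
```

The mask stays free of noise, so it can serve as ground truth for the noisy image.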

create_test_image_3d(height, width, depth, num_objs=12, rad_max=30, noise_max=0.0, num_seg_classes=5, channel_dim=None)

Return a noisy 3D image and segmentation.

See also

create_test_image_2d()

Utilities

correct_nifti_header_if_necessary(img_nii)

Check the format of the Nifti object's header and update the header if needed. In the updated image, pixdim matches the affine.

Parameters

img_nii (nifti image object) – the Nifti image whose header is checked
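One way to read "pixdim matches the affine" is that the voxel spacing stored in the header equals the spacing implied by the affine, which is the Euclidean length of each of its first three columns. The sketch below illustrates that check on plain nested lists; the helper names and the tolerance are hypothetical, and the library's actual test may differ.

```python
import math


def voxel_sizes_from_affine(affine):
    """Spacing implied by a 4x4 affine: the length of each of the first three columns."""
    return tuple(
        math.sqrt(sum(affine[row][col] ** 2 for row in range(3)))
        for col in range(3)
    )


def pixdim_matches_affine(pixdim, affine, tol=1e-6):
    """True when the header spacing agrees with the spacing implied by the affine."""
    implied = voxel_sizes_from_affine(affine)
    return all(abs(p - v) <= tol for p, v in zip(pixdim, implied))


affine = [
    [2.0, 0.0, 0.0, -90.0],
    [0.0, 2.0, 0.0, -126.0],
    [0.0, 0.0, 3.0, -72.0],
    [0.0, 0.0, 0.0, 1.0],
]
spacing = voxel_sizes_from_affine(affine)
```

With this affine the implied spacing is 2 x 2 x 3, so a header reporting 1 x 1 x 1 would count as a mismatch.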

dense_patch_slices(image_size, patch_size, scan_interval)

Enumerate all slices defining 2D/3D patches of size patch_size from an image_size input image.

Parameters

image_size (tuple of int) – dimensions of image to iterate over
patch_size (tuple of int) – size of patches to generate slices for
scan_interval (tuple of int) – dense patch sampling interval

Returns

a list of slice objects defining each patch
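A sketch of dense sampling, under the assumption that patches start every scan_interval voxels and that the last start in each dimension is clamped so the patch stays inside the image. The function `dense_slices` is a hypothetical re-implementation for illustration only.

```python
import math
from itertools import product


def dense_slices(image_size, patch_size, scan_interval):
    """Hypothetical sketch: patch slices sampled every `scan_interval` voxels."""
    sizes = []
    starts_per_dim = []
    for size, patch, step in zip(image_size, patch_size, scan_interval):
        patch = min(patch, size)  # a patch cannot exceed the image
        count = math.ceil((size - patch) / step) + 1
        # Clamp the final start so the patch ends exactly at the image border.
        starts_per_dim.append([min(i * step, size - patch) for i in range(count)])
        sizes.append(patch)
    return [
        tuple(slice(c, c + p) for c, p in zip(corner, sizes))
        for corner in product(*starts_per_dim)
    ]


overlapping = dense_slices((6, 6), (4, 4), (2, 2))
clamped = dense_slices((6, 6), (4, 4), (4, 4))
```

In the second call an interval of 4 would push the second patch past the border, so its start is clamped to 2 and the two calls return the same four patches.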

get_random_patch(dims, patch_size, rand_state=None)

Returns a tuple of slices defining a random patch in an array of shape dims, with size patch_size or as close to it as possible within the given dimensions. It is expected that patch_size is a valid patch for a source of shape dims, as returned by get_valid_patch_size.

Parameters

dims (tuple of int) – shape of source array
patch_size (tuple of int) – shape of the patch to generate
rand_state (np.random.RandomState) – a random state object to generate random numbers from

Returns

a tuple of slice objects defining the patch

Return type

(tuple of slice)
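Random patch selection can be sketched as picking, in each dimension, a start index that leaves room for the whole patch. This sketch uses the standard library's `random.Random` rather than `np.random.RandomState`, and the name `random_patch_slices` is hypothetical; it assumes patch_size is already valid for dims.

```python
import random


def random_patch_slices(dims, patch_size, rng=None):
    """Hypothetical sketch: slices of a randomly placed patch inside `dims`."""
    if rng is None:
        rng = random.Random()
    slices = []
    for dim, size in zip(dims, patch_size):
        # Any start in [0, dim - size] keeps the patch inside the array.
        start = rng.randint(0, dim - size)
        slices.append(slice(start, start + size))
    return tuple(slices)


patch = random_patch_slices((20, 30), (8, 8), random.Random(42))
```

Passing a seeded generator, as in the last line, makes the patch position reproducible.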

get_valid_patch_size(dims, patch_size)

Given an image of dimensions dims, return a patch size tuple taking the dimension from patch_size if this is not 0/None. Otherwise, or if patch_size is shorter than dims, the dimension from dims is taken. This ensures the returned patch size is within the bounds of dims. If patch_size is a single number, it is interpreted as a patch with the same dimensionality as dims and that size in each dimension.
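The rules above translate almost directly into code. The following is a sketch written from the description, not the library source; `valid_patch_size` is a hypothetical name, and clipping with `min` is an assumption about how the result is kept within the bounds of dims.

```python
def valid_patch_size(dims, patch_size):
    """Hypothetical sketch of the patch size rules described in the text."""
    ndim = len(dims)
    if isinstance(patch_size, int):
        # A single number means the same size in every dimension.
        patch_size = (patch_size,) * ndim
    else:
        # Pad a short tuple with None so missing entries fall back to dims.
        patch_size = tuple(patch_size) + (None,) * (ndim - len(patch_size))
    # 0 or None selects the whole dimension; other values are clipped to it.
    return tuple(min(d, p) if p else d for d, p in zip(dims, patch_size))


print(valid_patch_size((20, 30, 40), (10, 0, None)))  # (10, 30, 40)
print(valid_patch_size((20, 30), (50,)))              # (20, 30)
print(valid_patch_size((20, 30), 16))                 # (16, 16)
```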

iter_patch(arr, patch_size, start_pos=(), copy_back=True, pad_mode='wrap', **pad_opts)

Yield successive patches of size patch_size from arr. The iteration can start from position start_pos in arr, but patches are drawn from a padded array extended by patch_size in each dimension, so these coordinates can be negative to start in the padded region. If copy_back is True, the values from each patch are written back to arr.

Parameters

arr (np.ndarray) – array to iterate over
patch_size (tuple of int or None) – size of patches to generate slices for, 0 or None selects whole dimension
start_pos (tuple of int, optional) – starting position in the array, default is 0 for each dimension
copy_back (bool) – if True data from the yielded patches is copied back to arr once the generator completes
pad_mode (str, optional) – padding mode, see numpy.pad
pad_opts (dict, optional) – padding options, see numpy.pad

Yields

Patches of array data from arr. The patches are views into a padded array and can be modified; if copy_back is True, these changes are reflected in arr once the iteration completes.

iter_patch_slices(dims, patch_size, start_pos=())

Yield successive tuples of slices defining patches of size patch_size from an array of dimensions dims. The iteration starts from position start_pos in the array, or at the origin if this isn't provided. Each patch is chosen in a contiguous grid, ordered with the first dimension as the least significant (fastest-changing) one.

Parameters

dims (tuple of int) – dimensions of array to iterate over
patch_size (tuple of int or None) – size of patches to generate slices for, 0 or None selects whole dimension
start_pos (tuple of int, optional) – starting position in the array, default is 0 for each dimension

Yields

Tuples of slice objects defining each patch
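The ordering is easiest to see on a small example. The generator below is a sketch that assumes "least significant" means the first dimension changes fastest; `patch_slices` is a hypothetical name, patch_size is assumed to be already valid (no 0/None entries), and patches at the far edge may extend past dims when the sizes do not divide evenly.

```python
from itertools import product


def patch_slices(dims, patch_size, start_pos=()):
    """Hypothetical sketch: grid of patch slices, first dimension varying fastest."""
    ndim = len(dims)
    # Missing start positions default to 0.
    start_pos = tuple(start_pos) + (0,) * (ndim - len(start_pos))
    ranges = [range(s, d, p) for s, d, p in zip(start_pos, dims, patch_size)]
    # product() varies its last argument fastest, so reverse the ranges to make
    # the first dimension the fastest-changing one.
    for position in product(*reversed(ranges)):
        corner = position[::-1]
        yield tuple(slice(c, c + p) for c, p in zip(corner, patch_size))


corners = [tuple(s.start for s in patch) for patch in patch_slices((4, 6), (2, 3))]
print(corners)  # [(0, 0), (2, 0), (0, 3), (2, 3)]
```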

list_data_collate(batch)

Enhancement of the default PyTorch DataLoader collate function. If the dataset already returns a list of batch data generated by its transforms, all of that data is first merged into a single list; the result is then collated with the default behavior.

Note

Use this collate function if you apply transforms that can generate batch data.
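The merging step can be sketched without PyTorch. The helper `flatten_batch` below is hypothetical and covers only the flattening; in practice the flat list would then be handed to the default collate function.

```python
def flatten_batch(batch):
    """Hypothetical sketch: merge per-item lists into one flat list."""
    flat = []
    for item in batch:
        if isinstance(item, list):
            # A transform produced several samples from one dataset item.
            flat.extend(item)
        else:
            flat.append(item)
    return flat


# Each dataset item yielded two samples, for example two patches per image.
batch = [
    [{"img": "a0"}, {"img": "a1"}],
    [{"img": "b0"}, {"img": "b1"}],
]
merged = flatten_batch(batch)
print(len(merged))  # 4
```

Without this step the default collate would see two lists of two samples each, rather than four samples.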

rectify_header_sform_qform(img_nii)

Look at the sform and qform of the Nifti object and correct them if they are incompatible with the pixel dimensions.

Adapted from https://github.com/NifTK/NiftyNet/blob/v0.6.0/niftynet/io/misc_io.py