Available with Spatial Analyst license.
Available with Image Analyst license.
Summary
Converts labeled vector or raster data into deep learning training datasets using a remote sensing image. The output is a folder of image chips and a folder of metadata files in the specified format.
Usage
This tool will create training datasets to support third-party deep learning applications, such as Google TensorFlow, Keras, PyTorch, or Microsoft CNTK.
Deep learning class training samples are based on small subimages containing the feature or class of interest, called image chips.
Use your existing classification training sample data, or GIS feature class data such as a building footprint layer, to generate image chips containing the class sample from your source image. Image chips are often 256 pixel rows by 256 pixel columns, unless the training sample size is larger. Each image chip can contain one or more objects. If the Labeled Tiles metadata format is used, there can only be one object per image chip.
By specifying the Reference System parameter, training data can be exported in map space or pixel space (raw image space) to use for deep learning model training.
This tool supports exporting training data from a collection of images. You can add an image folder as the Input Raster. If your Input Raster is a mosaic dataset or an image service, you can also specify Processing Mode to process the mosaic as one input or each raster item separately.
The cell size and extent can be adjusted using the geoprocessing environment settings.
For information about requirements for running this tool and issues you may encounter, see the Deep Learning Frequently Asked Questions.
Syntax
ExportTrainingDataForDeepLearning(in_raster, out_folder, in_class_data, image_chip_format, {tile_size_x}, {tile_size_y}, {stride_x}, {stride_y}, {output_nofeature_tiles}, {metadata_format}, {start_index}, {class_value_field}, {buffer_radius}, {in_mask_polygons}, {rotation_angle}, {reference_system}, {processing_mode}, {blacken_around_feature}, {crop_mode})
Parameter | Explanation | Data Type |
in_raster | The input source imagery, typically multispectral imagery. Examples of the type of input source imagery include multispectral satellite, drone, aerial, or National Agriculture Imagery Program (NAIP). The input can be a folder of images. | Raster Dataset; Raster Layer; Mosaic Layer; Image Service; MapServer; Map Server Layer; Internet Tiled Layer; Folder |
out_folder | The folder where the output image chips and metadata will be stored. The folder can also be a folder URL that uses a cloud storage connection file (*.acs). | Folder |
in_class_data | The training sample data, in either vector or raster form. Vector inputs should follow a training sample format as generated by the Training Sample Manager. Raster inputs should follow a classified raster format as generated by the Classify Raster tool. Following the proper training sample format will produce optimal results with the statistical information; however, the input can also be a point feature class without a class value field, or an integer raster without any class information. | Feature Class; Feature Layer; Raster Dataset; Raster Layer; Mosaic Layer; Image Service |
image_chip_format | Specifies the raster format for the image chip outputs. PNG and JPEG support up to 3 bands. | String |
tile_size_x (Optional) | The size of the image chips for the X dimension. | Long |
tile_size_y (Optional) | The size of the image chips for the Y dimension. | Long |
stride_x (Optional) | The distance to move in the X direction when creating the next image chips. When stride is equal to tile size, there will be no overlap. When stride is equal to half the tile size, there will be 50 percent overlap. | Long |
stride_y (Optional) | The distance to move in the Y direction when creating the next image chips. When stride is equal to tile size, there will be no overlap. When stride is equal to half the tile size, there will be 50 percent overlap. | Long |
output_nofeature_tiles (Optional) | Specifies whether image chips that do not capture training samples will be exported. | Boolean |
metadata_format (Optional) | Specifies the format of the output metadata labels. The five options for output metadata labels for the training data are KITTI rectangles, PASCAL VOC rectangles, Classified Tiles (a class map), RCNN Masks, and Labeled Tiles. If your input training sample data is a feature class layer, such as a building layer or standard classification training sample file, use the KITTI or PASCAL VOC rectangles option. The output metadata is a .txt file or .xml file containing the training sample data within the minimum bounding rectangle. The name of the metadata file matches the input source image name. If your input training sample data is a class map, use the Classified Tiles option as your output metadata format. For the KITTI metadata format, 15 columns are created, but only 5 of them are used in the tool. The first column is the class value. The next 3 columns are skipped. Columns 5-8 define the minimum bounding rectangle, which consists of 4 image coordinate locations: the left, top, right, and bottom pixels, respectively. The minimum bounding rectangle encompasses the training chip used in the deep learning classifier. The remaining columns are not used. For more information about the PASCAL VOC option, see PASCAL Visual Object Classes. | String |
start_index (Optional) | Legacy: This parameter has been deprecated. Use a value of 0 or # within Python. | Long |
class_value_field (Optional) | The field that contains the class values. If no field is specified, the system searches for a value or classvalue field. If the feature does not contain a class field, the system determines that all records belong to one class. | Field |
buffer_radius (Optional) | The radius for a buffer around each training sample to delineate a training sample area. This allows you to create circular polygon training samples from points. The linear unit of the in_class_data spatial reference is used. | Double |
in_mask_polygons (Optional) | A polygon feature class that delineates the area where image chips will be created. Only image chips that fall completely within the polygons will be created. | Feature Layer |
rotation_angle (Optional) | The rotation angle that will be used to generate additional image chips. An image chip will be generated with a rotation angle of 0, which means no rotation. It will then be rotated at the specified angle to create an additional image chip. The same training samples will be captured at multiple angles in multiple image chips for data augmentation. The default rotation angle is 0. | Double |
reference_system (Optional) | Specifies the type of reference system to be used to interpret the input image. The reference system specified must match the reference system used to train the deep learning model. | String |
processing_mode (Optional) | Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service. | String |
blacken_around_feature (Optional) | Specifies whether to blacken the pixels around each object or feature in each image tile. This parameter only applies when the metadata format is set to Labeled_Tiles and an input feature class or classified raster has been specified. | Boolean |
crop_mode (Optional) | Specifies whether to crop the exported tiles such that they are all the same size. This parameter only applies when the metadata format is set to Labeled_Tiles and an input feature class or classified raster has been specified. | String |
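The relationship between tile size, stride, and overlap described for stride_x and stride_y can be checked with a little arithmetic. The following is a minimal sketch in plain Python and is not part of arcpy. The helper names overlap_percent and chips_along_axis are hypothetical, and the chip count assumes a simple sliding window that keeps only chips lying fully inside the image, which may differ from how the tool handles image edges.

```python
def overlap_percent(tile_size, stride):
    """Percentage of a chip shared with the next chip along one axis."""
    if tile_size <= 0 or stride <= 0:
        raise ValueError("tile_size and stride must be positive")
    # A stride larger than the tile size leaves a gap, reported as 0 overlap.
    return max(0.0, (tile_size - stride) / tile_size * 100.0)


def chips_along_axis(image_size, tile_size, stride):
    """Number of full chips a sliding window yields along one axis.

    Assumption: only chips that fit entirely inside the image are counted.
    """
    if image_size < tile_size:
        return 0
    return (image_size - tile_size) // stride + 1


print(overlap_percent(256, 256))         # stride equals tile size: 0.0
print(overlap_percent(256, 128))         # stride is half the tile size: 50.0
print(chips_along_axis(1024, 256, 128))  # 7
```

Under these assumptions, halving the stride roughly doubles the number of chips along each axis, so the total chip count grows about fourfold.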
Code sample
This example creates training samples for deep learning.
# Import system modules
import arcpy
from arcpy.ia import *
# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")
ExportTrainingDataForDeepLearning("c:/test/image.tif", "c:/test/outfolder",
    "c:/test/training.shp", "TIFF", "256", "256", "128", "128",
    "ONLY_TILES_WITH_FEATURES", "Labeled_Tiles", 0, "Classvalue",
    0, None, 0, "MAP_SPACE", "PROCESS_AS_MOSAICKED_IMAGE", "NO_BLACKEN", "FIXED_SIZE")
This example creates training samples for deep learning, with the tool parameters set as local variables.
# Import system modules and check out ArcGIS Image Analyst extension license
import arcpy
arcpy.CheckOutExtension("ImageAnalyst")
from arcpy.ia import *
# Set local variables
inRaster = "c:/test/InputRaster.tif"
out_folder = "c:/test/OutputFolder"
in_training = "c:/test/TrainingData.shp"
image_chip_format = "TIFF"
tile_size_x = "256"
tile_size_y = "256"
stride_x = "128"
stride_y = "128"
output_nofeature_tiles = "ONLY_TILES_WITH_FEATURES"
metadata_format = "Labeled_Tiles"
start_index = 0
classvalue_field = "Classvalue"
buffer_radius = 0
in_mask_polygons = "MaskPolygon"
rotation_angle = 0
reference_system = "PIXEL_SPACE"
processing_mode = "PROCESS_AS_MOSAICKED_IMAGE"
blacken_around_feature = "NO_BLACKEN"
crop_mode = "FIXED_SIZE"
# Execute
ExportTrainingDataForDeepLearning(inRaster, out_folder, in_training,
    image_chip_format, tile_size_x, tile_size_y, stride_x,
    stride_y, output_nofeature_tiles, metadata_format, start_index,
    classvalue_field, buffer_radius, in_mask_polygons, rotation_angle,
    reference_system, processing_mode, blacken_around_feature, crop_mode)
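When the KITTI rectangles metadata format is used, each label line can be read as described for the metadata_format parameter: the first column is the class value, and columns 5-8 hold the left, top, right, and bottom pixel coordinates of the minimum bounding rectangle. The following is a minimal sketch in plain Python. The names KittiBox and parse_kitti_line, the space-delimited layout, and the sample line are assumptions for illustration, not output from the tool.

```python
from typing import NamedTuple


class KittiBox(NamedTuple):
    class_value: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_line(line):
    """Read the class value (column 1) and bounding box (columns 5-8)."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("expected at least 8 space-separated columns")
    # Columns 2-4 are skipped; columns 9 onward are not used.
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return KittiBox(fields[0], left, top, right, bottom)


# Hypothetical label line with 15 columns; only columns 1 and 5-8 are read.
sample = "1 0.0 0 0.0 12.0 30.0 140.0 98.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
box = parse_kitti_line(sample)
print(box)
```

The class value is kept as a string here because the sketch does not assume whether class values in a given dataset are numeric codes or names.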
Environments
Licensing information
- Basic: Requires Image Analyst or Spatial Analyst
- Standard: Requires Image Analyst or Spatial Analyst
- Advanced: Requires Image Analyst or Spatial Analyst
Related topics
- An overview of the Deep Learning toolset
- Install deep learning frameworks for ArcGIS
- Train Deep Learning Model
- Classify Objects Using Deep Learning
- Classify Pixels Using Deep Learning
- Detect Objects Using Deep Learning
- An overview of the Segmentation and Classification toolset in Image Analyst
- Find a geoprocessing tool