Selecting Training Samples

Producer Field Guide


It is important that training samples be representative of the class that you are trying to identify. This does not necessarily mean that they must contain a large number of pixels or be dispersed across a wide region of the data. The selection of training samples depends largely upon your knowledge of the data, of the study area, and of the classes that you want to extract.

Use ERDAS IMAGINE to identify training samples using one or more of the following methods:

  • using a vector layer
  • defining a polygon in the image
  • identifying a training sample of contiguous pixels with similar spectral characteristics
  • identifying a training sample of contiguous pixels within a certain area, with or without similar spectral characteristics
  • using a class from a thematic raster layer from an image file of the same area (that is, the result of an unsupervised classification)

Digitized Polygon

Training samples can be identified by their geographical location (training sites located from maps or ground truth data). The locations of the training sites can be digitized from maps with the ERDAS IMAGINE Insert Geometry drawing tools. Polygons representing these areas are then stored as vector layers, which can be used as input to create signatures from the training samples.

User-defined Polygon

Using your pattern recognition skills (with or without supplemental ground truth information), you can identify samples by examining a displayed image of the data and drawing a polygon around the training site or sites of interest. For example, if ground truth data show that oak trees reflect certain frequencies of green and infrared light, you may be able to base your sample selections on that information (taking atmospheric conditions, sun angle, time, date, and other variations into account). The area within the polygon or polygons is then used to create a signature.

Tip: Use the Insert Geometry drawing tools to define the polygons to be used as the training sample. Use the Signature Editor to create signatures from training samples that are identified with the polygons.
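To make the idea of creating a signature from a polygon concrete, the following Python sketch collects the pixels whose centers fall inside a polygon and averages them per band. It is an illustration only and is not how ERDAS IMAGINE or the Signature Editor is implemented. The function names, the list-of-rows image layout, and the reduction of a signature to per-band means plus a pixel count are all assumptions made for this example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if point (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_signature(image, polygon):
    """Return (per-band means, pixel count) for pixels inside the polygon.

    image:   list of rows; each pixel is a tuple of band values (assumed layout).
    polygon: list of (column, row) vertices in pixel coordinates (assumed).
    """
    sums = None
    count = 0
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            # A pixel belongs to the sample if its center is inside the polygon.
            if point_in_polygon(c + 0.5, r + 0.5, polygon):
                if sums is None:
                    sums = [0.0] * len(pixel)
                for b, value in enumerate(pixel):
                    sums[b] += value
                count += 1
    if count == 0:
        raise ValueError("polygon contains no pixel centers")
    return [s / count for s in sums], count
```

A full signature normally carries more statistics than the mean (for example, a covariance matrix), so treat the returned values as a simplified stand-in.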

Identify Seed Pixel

Using the Region Growing Properties dialog and the Insert Geometry drawing tools, you can use the cursor (crosshair) to identify a single pixel (seed pixel) that is representative of the training sample. This seed pixel is used as a model pixel, against which the pixels that are contiguous to it are compared based on parameters that you specify.

When one or more of the contiguous pixels is accepted, the mean of the sample is calculated from the accepted pixels. Then, the pixels contiguous to the sample are compared in the same way. This process repeats until no pixels that are contiguous to the sample satisfy the spectral parameters. In effect, the sample grows outward from the model pixel with each iteration. These homogeneous pixels are converted from individual raster pixels to a polygon and used as an AOI layer.

Select the Grow option to identify training samples with a seed pixel.
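The growing process described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ERDAS IMAGINE implementation: the spectral parameter is taken to be a maximum Euclidean distance from the current sample mean, contiguity is taken to mean 4-connected neighbors, and the names `grow_region` and `max_spectral_distance` are hypothetical. The final conversion of the sample to an AOI polygon is not shown.

```python
import math


def spectral_distance(a, b):
    """Euclidean distance between two band-value vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def grow_region(image, seed, max_spectral_distance):
    """Grow a training sample outward from a seed pixel.

    image: list of rows; each pixel is a tuple of band values (assumed layout).
    seed:  (row, col) of the model pixel.
    Returns the set of (row, col) positions in the grown sample.
    """
    rows, cols = len(image), len(image[0])
    sample = {seed}
    mean = list(image[seed[0]][seed[1]])  # the seed is the initial model
    while True:
        # Pixels 4-connected to the sample but not yet part of it (assumption).
        border = set()
        for r, c in sample:
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in sample:
                    border.add((nr, nc))
        # Accept border pixels that are spectrally close to the current mean.
        accepted = {
            p for p in border
            if spectral_distance(image[p[0]][p[1]], mean) <= max_spectral_distance
        }
        if not accepted:
            return sample  # no contiguous pixel satisfies the parameter
        sample |= accepted
        # Recalculate the sample mean from all accepted pixels.
        n = len(sample)
        mean = [sum(image[r][c][b] for r, c in sample) / n
                for b in range(len(mean))]
```

Because the mean is recalculated after every pass, the acceptance test drifts with the sample rather than staying fixed on the seed value, which matches the description above.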

Seed Pixel Method with Spatial Limits

The training sample identified with the seed pixel method can be limited to a particular region by defining the geographic distance and area.

Display vector layers (polygons or lines) as the top layer in the View, and then use their boundaries as an AOI for training samples defined under Growing Properties.
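One way to picture the distance and area limits is as an extra test applied to each candidate pixel before it is accepted into the sample. The Python sketch below is an assumption-laden illustration: it measures distance in a straight line from the seed pixel, measures area as pixel count times pixel area, and assumes square pixels. The function name and all parameter names are hypothetical and do not correspond to ERDAS IMAGINE settings.

```python
import math


def within_spatial_limits(seed, candidate, sample_size, pixel_size,
                          max_distance, max_area):
    """Return True if accepting `candidate` keeps the sample inside both limits.

    seed, candidate: (row, col) pixel positions.
    sample_size:     number of pixels already accepted into the sample.
    pixel_size:      edge length of a square pixel in map units (assumption).
    max_distance:    limit in map units, measured from the seed (assumption).
    max_area:        limit in square map units.
    """
    distance = math.hypot(candidate[0] - seed[0],
                          candidate[1] - seed[1]) * pixel_size
    area_if_added = (sample_size + 1) * pixel_size ** 2
    return distance <= max_distance and area_if_added <= max_area
```

In a region-growing loop, a candidate would have to pass both this spatial test and the spectral comparison before being added to the sample.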

Thematic Raster Layer

Define a training sample by using class values from a thematic raster layer. The data file values in the training sample are used to create a signature. The training sample can be defined by as many class values as desired.

The thematic raster layer must have the same coordinate system as the image file being classified.

The following table compares several training sample methods.

Method                | Advantages                                                   | Disadvantages
Digitized Polygon     | precise map coordinates, represents known ground information | may overestimate class variance, time-consuming
User-defined Polygon  | high degree of user control                                  | may overestimate class variance, time-consuming
Seed Pixel            | auto-assisted, less time                                     | may underestimate class variance
Thematic Raster Layer | allows iterative classifying                                 | must have previously defined thematic layer
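For the Thematic Raster Layer method described above, the following Python sketch shows how the pixels belonging to selected class values might be gathered and summarized. It is only an illustration: it assumes that the thematic layer and the image share the same pixel grid (a stronger condition than sharing a coordinate system), it reduces the signature to per-band means and a pixel count, and the function name is hypothetical.

```python
from statistics import mean


def signature_from_classes(image, thematic, class_values):
    """Return (per-band means, pixel count) for the selected thematic classes.

    image:        list of rows; each pixel is a tuple of band values (assumed).
    thematic:     list of rows of integer class values on the same grid (assumed).
    class_values: iterable of class values that define the training sample.
    """
    if len(image) != len(thematic) or any(
            len(a) != len(b) for a, b in zip(image, thematic)):
        raise ValueError("image and thematic layer must share the same grid")
    wanted = set(class_values)
    # Keep the image pixels whose thematic class is one of the wanted values.
    pixels = [
        pixel
        for image_row, class_row in zip(image, thematic)
        for pixel, cls in zip(image_row, class_row)
        if cls in wanted
    ]
    if not pixels:
        raise ValueError("no pixels belong to the requested classes")
    # zip(*pixels) regroups the values band by band.
    return [mean(band) for band in zip(*pixels)], len(pixels)
```

Passing several class values merges them into one training sample, which corresponds to defining the sample by as many class values as desired.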

Evaluating Training Samples

Selecting training samples is often an iterative process. To generate signatures that accurately represent the classes to be identified, you may have to repeatedly select training samples, evaluate the signatures that are generated from the samples, and then either take new samples or manipulate the signatures as necessary. Signature manipulation may involve merging or deleting signatures, or appending signatures from one file to another. It is also possible to perform a classification using the known signatures, and then mask out the areas that are not classified so that they can be used in gathering more signatures.
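The last step, classifying with the known signatures and isolating what remains unclassified, can be sketched as follows. The example assumes a simple minimum-distance rule with a distance threshold, which is only one of several possible decision rules and is chosen here for brevity. The function names, the `UNCLASSIFIED` value of 0, and the signature format (class ID mapped to a mean vector) are assumptions made for illustration.

```python
import math

UNCLASSIFIED = 0  # assumed marker; class IDs are expected to be nonzero


def classify_min_distance(image, signatures, max_distance):
    """Assign each pixel to the nearest signature mean within max_distance.

    image:      list of rows; each pixel is a tuple of band values (assumed).
    signatures: dict mapping class ID to a mean vector (simplified signature).
    Pixels farther than max_distance from every mean stay UNCLASSIFIED.
    """
    class_map = []
    for row in image:
        out_row = []
        for pixel in row:
            best_id, best_d = UNCLASSIFIED, max_distance
            for class_id, mean_vector in signatures.items():
                d = math.sqrt(sum((p - m) ** 2
                                  for p, m in zip(pixel, mean_vector)))
                if d <= best_d:
                    best_id, best_d = class_id, d
            out_row.append(best_id)
        class_map.append(out_row)
    return class_map


def unclassified_mask(class_map):
    """True where a pixel was left unclassified, for use in the next round."""
    return [[value == UNCLASSIFIED for value in row] for row in class_map]
```

The resulting mask marks the areas where further training samples could be gathered before the signatures are evaluated again.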

See Evaluating Signatures for methods of determining the accuracy of the signatures created from your training samples.