Tips for fixing errors in reference data

Reference data is one of the key elements in building a locator, because your geocoding experience is only as good as the primary reference data on which the locator is based. Errors in the reference data cause poor matching quality. For example, if the geometry of the reference features is incorrect, the addresses that are matched to them will also be spatially incorrect. If the name of a reference feature is misspelled, correctly spelled addresses may go unmatched. Several potential errors to keep in mind regarding your reference data are described below.

Incomplete geometry and address attributes

The world is constantly changing, and your reference data must be updated to reflect those changes if the locators you create are to return good results. For example, if a new housing tract is added to the city street network, the additional street segments, along with their house number ranges, street names, and other properties, need to be added to the reference data. A locator based on the street network will not find addresses in the new housing tract until the locator is also updated.

If the address attributes are incomplete or contain errors, such as incorrect address ranges or missing street names and ZIP Codes, matching addresses to those features may return unexpected results. For locators based on the Street Address role, features with empty street names cause the locator build to fail. It is therefore essential to review and correct address attribute errors in the reference data.
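As a rough illustration of the kind of review described above, the following sketch scans attribute records for empty street names, missing ZIP Codes, and house number range values that are not plain integers. It assumes the records are available as plain Python dictionaries; the field names STREET_NAME, ZIP, FROM_ADDR, and TO_ADDR are hypothetical and should be replaced with the fields in your own data. The numeric check is also an assumption and may be too strict for data that legitimately uses alphanumeric house numbers.

```python
def is_blank(value):
    """Return True for None, empty strings, and whitespace-only strings."""
    return value is None or str(value).strip() == ""


def find_attribute_problems(rows, name_field="STREET_NAME", zip_field="ZIP",
                            range_fields=("FROM_ADDR", "TO_ADDR")):
    """Return (row_index, description) pairs for suspicious address attributes.

    All field names are hypothetical defaults, not names required by any tool.
    """
    problems = []
    for index, row in enumerate(rows):
        if is_blank(row.get(name_field)):
            problems.append((index, "empty street name"))
        if is_blank(row.get(zip_field)):
            problems.append((index, "missing ZIP Code"))
        for field in range_fields:
            value = row.get(field)
            # Assumption: house numbers in range fields are non-negative integers.
            if is_blank(value) or not str(value).strip().isdigit():
                problems.append((index, f"non-numeric house number in {field}"))
    return problems
```

A check like this only flags values that are structurally suspect. It cannot tell whether a populated address range is actually correct on the ground.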

Learn more about updating your reference data

Spatial reference and geometry errors

Reference data, such as a street or point address feature class, is usually produced in a specific spatial reference. The coordinate system of the feature class determines how the features are georeferenced. When a locator is created, information about the spatial reference is stored in the locator, and the locations of addresses geocoded against the locator are returned in that same spatial reference. It is therefore important to make sure that the reference data has a defined spatial reference.

Learn more about spatial references

To draw a feature on a map, the feature must contain a valid shape or geometry. If the shape of a feature in the reference data is null or empty, the feature is skipped and is not included in a locator created with the Create Locator or Create Feature Locator tool. Geometry errors such as coincident line segments that are not completely snapped to a vertex; polygons with self-intersections; line segments with curves, such as cul-de-sacs; or incorrect ring ordering can prevent addresses from being matched to intersections. Such errors can also cause the locator build to fail. It is helpful to run tools such as Check Geometry and Repair Geometry on the reference data to find and repair geometry errors. The Planarize editing tool can also be used to modify line features if matches are not returned for intersection addresses because of invalid connectivity between lines and vertices in the data.
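To make the snapping problem concrete, the sketch below looks for line endpoints that lie very close to another line's endpoint without coinciding with it exactly. It is not a substitute for Check Geometry, Repair Geometry, or Planarize. It assumes lines are given as plain coordinate lists keyed by a hypothetical ID, that the tolerance is expressed in the data's coordinate units, and that only endpoints (not mid-line vertices) need to be compared.

```python
import math


def find_unsnapped_endpoints(lines, tolerance=0.5):
    """Return (id_a, id_b, gap) for endpoints that nearly, but not exactly, touch.

    `lines` maps a line ID to a list of (x, y) tuples. The default tolerance
    is an arbitrary illustration value in the data's coordinate units.
    """
    endpoints = []
    for line_id, coords in lines.items():
        endpoints.append((line_id, coords[0]))
        endpoints.append((line_id, coords[-1]))

    near_misses = []
    for i, (id_a, pt_a) in enumerate(endpoints):
        for id_b, pt_b in endpoints[i + 1:]:
            if id_a == id_b:
                continue  # ignore the two ends of the same line
            gap = math.dist(pt_a, pt_b)
            # A gap of exactly 0 means the endpoints are already snapped.
            if 0 < gap <= tolerance:
                near_misses.append((id_a, id_b, gap))
    return near_misses
```

The pairwise comparison is quadratic in the number of endpoints, which is acceptable for a small sample but would need a spatial index for a full street network.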

If the locator fails to build with line reference data, the cause may be line segments that contain curves. Run the Densify tool on the reference data, using the Offset densification method and the default Maximum Offset Deviation, to replace the curves with straight line segments.
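The idea behind offset-based densification can be illustrated with a circular arc: vertices are added so that no straight segment strays from the original curve by more than a maximum offset. The sketch below is a geometric illustration under that assumption, not the Densify tool's actual implementation. The function name and parameters are hypothetical, and angles are in radians.

```python
import math


def densify_arc(center, radius, start_angle, end_angle, max_offset):
    """Approximate a circular arc with straight segments.

    The largest gap between a chord and its arc is radius * (1 - cos(step / 2)),
    so the angular step is chosen to keep that gap at or below max_offset.
    """
    if radius <= 0 or max_offset <= 0:
        raise ValueError("radius and max_offset must be positive")

    sweep = end_angle - start_angle
    # Clamp so acos stays in its domain when max_offset is large relative to radius.
    ratio = max(-1.0, 1.0 - max_offset / radius)
    max_step = 2.0 * math.acos(ratio)
    segments = max(1, math.ceil(abs(sweep) / max_step))
    step = sweep / segments

    cx, cy = center
    return [
        (cx + radius * math.cos(start_angle + i * step),
         cy + radius * math.sin(start_angle + i * step))
        for i in range(segments + 1)
    ]
```

A smaller maximum offset produces more vertices and a closer fit to the curve, while a larger one produces fewer, longer segments.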

Features that have multiple parts, such as multipoints or multipart lines, are not supported geometry types for building locators. Such features can cause the locator build to fail or produce unexpected match and suggestion results when the locator is used. If your reference data contains features with multiple parts, convert them to single parts using the Multipart to Singlepart tool and use the single-part data to build the locator. To check whether point reference data has multiple parts, look in the Shape field of the attribute table for the Multipoint value. To check polygon or line features for multiple parts, run the Calculate Geometry Attributes tool to add the Number of parts geometry property to the reference data. A feature with more than one part is considered a multipart feature.

Note:

The tool adds a new field named PART_COUNT to its input features.
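The part-count check and the conversion to single parts can be sketched conceptually as follows. This is not the Multipart to Singlepart tool itself. It assumes each line feature is a plain dictionary with hypothetical "attributes" and "parts" keys, where "parts" is a list of coordinate lists, and it leaves aside polygon rings and holes.

```python
def part_count(feature):
    """Return the number of parts in a feature (more than 1 means multipart)."""
    return len(feature["parts"])


def to_single_parts(features):
    """Split every feature into one output feature per part.

    Attributes are copied to each output feature so that every single-part
    feature keeps the address information of its source feature.
    """
    singles = []
    for feature in features:
        for part in feature["parts"]:
            singles.append({
                "attributes": dict(feature["attributes"]),
                "parts": [part],
            })
    return singles
```

Because attributes are duplicated onto each part, address ranges on the resulting features may need review: a range that described the whole multipart feature now appears on every piece.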
