The Create Data Loading Workspace tool creates a data loading workspace, a data reference workbook, and mapping workbooks.
Data loading workspace
The data loading workspace contains folders, Microsoft Excel workbooks, and other items to assist in mapping data from a source to a target schema. The following describes the folder structure of a data loading workspace:
- The Data Mapping folder contains the following folders, depending on the source data:
- Points—Contains workbooks for each of the source point datasets that were mapped.
- Polygons—Contains workbooks for each of the source polygon datasets that were mapped.
- Polylines—Contains workbooks for each of the source polyline datasets that were mapped.
- Table—Contains workbooks for each of the source table datasets that were mapped.
- GlobalLookup—Contains the GlobalLookup.xlsx workbook that can be used for creating lookup tables to be used across all mapping workbooks.
- The Scripts folder contains the script files described below that you can use to process data using custom functions. These functions can be used in the Expression column of mapping workbooks.
- base.py—Use the basic functions from this file in the mapping workbooks.
- shape_operations.py—Use the shape manipulation functions from this file in the mapping workbooks.
- user.py—Use this file for custom functions.
- The Domains folder contains workbooks with schema information for all domains in the source and target workspaces.
- The DataReference.xlsx file contains paths to the source, target, and mapping workbooks, as well as the data loading settings.
Data reference workbook
The data reference workbook includes paths to the source, target, and mapping workbooks. If the location of any of these items changes, you can manually update the paths. Additional columns control how data can be loaded from source to target, including certain geoprocessing environment settings.
The following are the column definitions:
- Source—The source dataset location.
- SourceDefinitionQuery—Write an SQL query to select a subset of a source dataset to be mapped to the target dataset. For example, use Type = 0 to only select records in which the attribute type equals 0 to load into the target dataset.
- Target—The target dataset location.
- TargetDeleteQuery—Write an SQL query to selectively delete data from the target dataset before loading. For example, 1=1 removes all existing data from the target dataset.
- MappingWorkbook—The link to the location of the data mapping workbook in the data loading workspace.
- Enabled—A Boolean value specifying whether the mapping workbook will be processed or skipped during the data loading process. The default is True.
- MaintainAttachments—A Boolean value specifying whether attachments will be maintained during the data loading process. The default is True.
- PreserveGlobalIds— A Boolean value specifying whether global IDs will be preserved during the data loading process. The default is False. Global IDs cannot be preserved when loading to a service.
- GeographicTransformations—Define valid geographic transformations to be used when the data is loaded.
Data mapping workbooks
The data mapping workbooks are in the Data Mapping folder and are organized by geometry type. Each data mapping workbook corresponds to one of the source and target pairs defined in the data reference workbook. Use these workbooks to define how data will be mapped from the source to the target using the options described below.
Mapping worksheet
Use this worksheet to define field mapping. The worksheet is highlighted in green in the workbook.
The following are the column definitions:
- TargetField—The fields from the target dataset.
- FieldType—The field type of each field in the target dataset.
- Expression—Directly map fields by selecting a field from the drop-down menu, or write a Python expression. For example, this can reference an external Python function or a constant value such as a number or string.
- LookupSheet—Specify a lookup worksheet. This must be used with the LookupKeys and LookupValue columns.
- LookupKeys—Specify one or more columns from the lookup worksheet to use as keys. To use multiple columns, separate them with a comma.
- LookupValue—Specify a column from the lookup worksheet to be used as a target value.
- LookupDefault—When using a lookup worksheet, this field can be used to define a default value. Null values and values that are not matched in the lookup worksheet will be assigned this value. This column is optional and can be left blank if no default value is used.
Mapping worksheet validation
The conditional formatting in a workbook uses different colors to represent specific types of errors and warnings. The meanings of the different colors are as follows:
- Solid red—Cells highlighted in solid red signify that there is an invalid data type between the source and target fields. The source and target data types are highly unlikely to be compatible when processed and most likely will need to be changed.
- Solid orange—Cells highlighted in solid orange, signify that there is a potential invalid source field data type for the target field. The source and target data types are possibly incompatible and should be evaluated.
- Solid light blue—Cells highlighted in solid light blue, indicate that both the Expression column and any one of the lookup columns have been filled with values. Either the expression column or the lookup columns can have values, but not both at the same time.
- Red outline—Cells outlined in red, indicate that there are missing values in the lookup columns. The LookupSheet, LookupKeys, and LookupValue columns are all required to be completed with values when using a lookup sheet.
- Solid dark blue—Cells highlighted in solid dark blue indicate that a field name in the TargetField column is a duplicate. All values in this column must be unique and correspond with the schema of the target.
Lookup worksheet
Use lookup worksheets to define the source-to-target value mapping. They are used in the mapping worksheet as lookup keys and lookup values. A sheet is automatically created for all coded value domains in the source data, with columns containing domain codes and descriptions. Additional columns that represent fields in the source data can be added manually to support a multiple-value lookup. These worksheets are highlighted in blue in the workbook.
If the Create Data Loading Workspace tool's Predictive Field Matching Options parameter is set to Domain Coded Value Description Similarity, a lookup worksheet will be generated automatically between source and target domains. The source data columns will be added first, followed by any matching target data columns.
Global lookup workbook
The global lookup workbook operates in the same way as a local lookup worksheet but allows a single lookup worksheet to be used across many mapping workbooks. The global lookup workbook is located in the DataMapping\GlobalLookup directory of the data loading workspace. When referencing a global lookup worksheet, enclose the lookup sheet name in brackets in the LookupSheet column of the mapping workbook. For example, if the global lookup worksheet is named Materials, you would reference it as [Materials] in the LookupSheet column of the mapping worksheet. This communicates to the data loading tools to retrieve the values from the global lookup worksheet rather than the local lookup worksheet.
Schema worksheet
Use these worksheets to view schema information about the source and target datasets. These are used in building a mapping workspace and are not used in processing. These worksheets are highlighted in yellow in the workbook.
Info worksheet
Use this worksheet to optionally define the source subtype value. This worksheet is highlighted in orange in the workbook.
Data mapping options
There are a number of options to map data in the data mapping workbooks. The Lookup and Expression columns can be used, but only one method can be used per target field. Review all options to determine what will work best for the data loading process. The options described below are configured in the mapping worksheet.
Expression
The Expression column can be used to directly map fields with compatible data types. Some source and target fields are automatically matched if they have the same field names. Review any automatically matched fields to verify that the source and target data types match, and if coded domains are present, that the values are equivalent.
The Expression column can also be customized to use a Python expression written directly in Excel or to call a function in the Scripts folder. The following are scenarios when this is the best option:
- Converting a field value to a different data type
- Mapping all source values to one target value
- Creating a globally unique identifier (GUID)
- Performing shape manipulation to convert between geometry types
Mapping table
The mapping table allows you to add or override the matches made using the Create Data Loading Workspace tool. The mapping table also allows you to block matches. When you create the mapping table, specify how the source data will be mapped to the target dataset. This provides more control over the data loading process and ensures that the fields, attributes, and domain descriptions are mapped accurately. The use of a mapping table is optional, but when used, the Create Data Loading Workspace tool uses the mapping table as input. It takes into account the mappings specified in the table and uses them to assist in the matches and blocks that are created.
For more information about the mapping table, see the Generate Mapping Table tool documentation.