Packaging Python scripts

Dive-in:

Most Python script tools that run successfully on your computer will package and run successfully when unpacked on another machine, with no changes to the script. If you do encounter problems, the cause may be that your script uses hardcoded data paths or uses import statements to import Python modules you developed. In that case, this topic may help, as it covers the following details:

  1. How data with hardcoded paths in a script is found and included in your package.
  2. How imported modules are found and included in your package.
  3. How tool validation code is packaged.
  4. How third-party libraries are packaged.

How data in your script is found

Whenever you share a result, either as a package or as a service, and the result references a script tool, the script tool is scanned to discover any data used in the script. When a data path is found, it is consolidated into a temporary folder that is included in the package.

When your script is scanned, every quoted string (in single or double quotes) assigned to a Python variable or passed as an argument to a function is tested to see if it is a path to data that exists. Data, in this case, means any of the following:

  • A map layer
  • A folder
  • A file
  • A geodataset, such as a feature class, shapefile, geodatabase, or layer file

For the purposes of discussion, only data that is used as input to geoprocessing tools or paths referencing other Python modules is of interest. Output data is also consolidated.

Whenever a quoted string is found in the script, the test for data existence proceeds as follows:

  1. Does the string refer to a map layer?
  2. Does the string contain an absolute path to data, such as e:\Warehousing\ToolData\SanFrancisco.gdb\streets?
  3. Does the string reference data that can be found relative to the script location? The script location is defined as follows:
    • The folder containing the script.
    • If the script is embedded in the toolbox, the location is the folder containing the toolbox.
    • If the script is in a Python toolbox, the location is the folder containing the Python toolbox.

These tests are applied in the order listed. If a test passes and the data exists, the data is consolidated.
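The second and third tests can be approximated in plain Python. The sketch below is not the scanner that ArcGIS runs. `find_existing_data` is a hypothetical helper, and the map layer test is left out because it needs an open map. The only assumption is that a string counts as data when the path it names exists on disk.

```python
import os


def find_existing_data(text, script_folder):
    """Return the path that a quoted string resolves to, or None.

    Hypothetical helper: applies the absolute-path test, then the
    relative-to-script test. The map layer test is not covered.
    """
    if not text:
        return None
    # Test 2: an absolute path to data that exists.
    if os.path.isabs(text):
        return text if os.path.exists(text) else None
    # Test 3: a path relative to the script location.
    candidate = os.path.join(script_folder, text)
    return candidate if os.path.exists(candidate) else None
```

Running a helper like this over the strings in your script before packaging can give a rough idea of which folders and files are likely to be picked up.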

Note:

When folders are consolidated, only files and geodatasets within the folder are copied; no subfolders are copied. Some geodatasets, such as file geodatabases, rasters, and TINs, are technically folders, but because they are also geodatasets, they are copied. If the folder contains layer files, all data referenced by each layer file is also consolidated so that any arcpy.mp routines in the script can access the referenced data.

Tip:

Due to the way folders are consolidated, you should avoid cluttering the folder with large datasets and files that will never be used by your tool; it unnecessarily increases the size of the data to be packaged.
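To see roughly what a folder would contribute to a package, you can list its direct contents and separate files from plain subfolders. This is a minimal sketch under stated assumptions: `preview_folder_consolidation` is a hypothetical name, and the only folder-based geodataset it recognizes is a file geodatabase (a folder ending in .gdb). Rasters and TINs stored as folders would need arcpy to identify and are reported as skipped here.

```python
import os


def preview_folder_consolidation(folder):
    """Split a folder's direct contents into (kept, skipped) name lists.

    Assumption: files and .gdb folders are kept; every other
    subfolder is skipped. This only approximates the documented rule.
    """
    kept, skipped = [], []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        if os.path.isfile(full_path) or name.lower().endswith(".gdb"):
            kept.append(name)
        else:
            skipped.append(name)
    return kept, skipped
```

A long `kept` list with large files your tool never reads is a sign that the folder should be tidied before you share the result.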

Examples

The examples below are based on this folder structure:

Example project folder

Relative paths and folders

The following technique of finding data relative to the location of the script is a common pattern. For reference, the script code that follows resides in the Scripts folder illustrated above. The ToolData folder contains SanFrancisco.gdb. Within SanFrancisco.gdb is a feature class named Streets. In the code sample below, the path to the ToolData folder is constructed relative to the location of the script (the Scripts folder).


import arcpy
import os
import sys

# sys.path[0] is the folder containing this script (Scripts).
#  Its parent folder (Warehousing) is where ToolData is found.
#
scriptPath = sys.path[0]
thisFolder = os.path.dirname(scriptPath)

# Construct paths to ../ToolData/SanFrancisco.gdb/Streets and
#                    ../ToolData/Warehouse.lyr
#
toolDataPath = os.path.join(thisFolder, "ToolData")
streetFeatures = os.path.join(toolDataPath, "SanFrancisco.gdb", "Streets")
streetLyr = os.path.join(toolDataPath, "Warehouse.lyr")

In the code above, the string "ToolData" (an argument to the os.path.join function) is tested to see if it exists. In this case, there is a folder named ToolData relative to the location of the script. This ToolData folder will be consolidated—all its contents (with the exception of subfolders as noted above) will be packaged.

Note that folder contents are copied, not individual files. For example, in the above code, the path to the dataset e:/Warehousing/ToolData/SanFrancisco.gdb/Streets is constructed. The consolidation process does not isolate and copy only the Streets dataset—it copies the entire ToolData folder.
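It can help to trace the path arithmetic by hand. When a script is run, sys.path[0] holds the folder containing the script, so taking its dirname yields the parent folder. The sketch below uses the standard ntpath module so that Windows path rules apply on any platform. The value assigned to `script_folder` is an assumption standing in for what sys.path[0] would hold for a script in e:\Warehousing\Scripts.

```python
import ntpath  # Windows path rules, usable on any platform

# Assumption: what sys.path[0] would hold for a script in the Scripts folder.
script_folder = r"e:\Warehousing\Scripts"

# One level up from Scripts is the folder that also contains ToolData.
parent_folder = ntpath.dirname(script_folder)
tool_data = ntpath.join(parent_folder, "ToolData")
streets = ntpath.join(tool_data, "SanFrancisco.gdb", "Streets")

print(parent_folder)
print(tool_data)
print(streets)
```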

Absolute path to a geodataset

An absolute path is one that begins with a drive letter, such as e:/, as shown in the code sample below.


streetFeatures = 'e:/Warehousing/ToolData/SanFrancisco.gdb/Streets'

In the above code, the Streets dataset, and all other data it depends upon (such as relationship classes and domains), will be consolidated.

Hybrid example


toolDataPath = r'e:\Warehousing\ToolData'
warehouseLyr = os.path.join(toolDataPath, "Warehouse.lyrx")

In the above code, the entire contents of the ToolData folder are consolidated. Since the folder contents (minus subfolders) are consolidated, Warehouse.lyrx will be consolidated as well, along with any data referenced by Warehouse.lyrx.

Forward versus backward slashes

The Windows convention is to use a backward slash (\) as the separator in a path. UNIX systems use a forward slash (/).

Note:

Throughout ArcGIS, it doesn't matter whether you use a forward or backward slash in your path—ArcGIS will always translate forward and backward slashes to the appropriate operating system convention.

Backward slash in scripting

Programming languages that have their roots in UNIX and the C programming language, such as Python, treat the backslash (\) as the escape character. For example, \t signifies a tab. Since paths can contain backslashes, you need to prevent backslashes from being interpreted as escape characters. The easiest way is to write paths as Python raw strings using the r prefix, as shown below. This instructs Python to treat each backslash as a literal character.

thePath = r"E:\data\telluride\newdata.gdb\slopes"
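The effect of the escape character is easiest to see side by side. The path in this sketch is hypothetical and was chosen because both \t and \n are escape sequences; it compares a plain string, a raw string, a string with doubled backslashes, and a forward-slash version.

```python
# Hypothetical path: \t and \n are both escape sequences in a plain string.
plain = "E:\temp\new"      # \t becomes a tab, \n becomes a newline
raw = r"E:\temp\new"       # raw string: backslashes kept as typed
doubled = "E:\\temp\\new"  # each backslash escaped by hand
forward = "E:/temp/new"    # forward slashes need no escaping

print(repr(plain))     # shows the tab and newline that replaced the separators
print(raw == doubled)  # the raw and doubled forms are the same string
```

The raw string and the doubled-backslash string are equivalent; the plain string silently loses both separators, which is why a path written that way fails to resolve.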

Importing other Python modules

Your script may import other scripts that you developed. For example, the code below shows importing a Python module named myutils, which is found in the same directory as the parent script and contains a routine named getFIDName.


import arcpy
import myutils
inFeatures = arcpy.GetParameterAsText(0)
inFID = myutils.getFIDName(inFeatures)

Whenever an import statement is encountered, the following order is used to locate the script:

  • The same folder as the script. If the script is embedded in the toolbox, the folder containing the toolbox is used.
  • The folder referenced by the system's PYTHONPATH variable.
  • Any folder referenced by the system's PATH variable.

If the script to import is found in any of these folders, the script is consolidated. The scanning process is recursive: the imported script is also scanned for project data and imports using all the rules described above.
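To check where one of your own modules would be loaded from, you can ask Python's import machinery directly. This is a minimal sketch, and `locate_module` is a hypothetical helper. It reflects Python's own sys.path search at run time, which may not match the packaging scan exactly (for example, Python itself does not search the PATH variable), so treat the result as a hint only.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be imported from, or None.

    Hypothetical helper built on importlib.util.find_spec; it does not
    import the module, it only resolves its location.
    """
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# json ships with Python, so this prints a path inside the Python installation.
print(locate_module("json"))
```

If the reported location is outside the folders listed above, the module may need to be shared separately.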

Another technique for referencing modules to import is to use the sys.path.append method. This allows you to set a path to a folder containing scripts that you need to import.


import arcpy
import sys
import os

# Append the path to the utility modules to the system path
#  for the duration of this script.
#
myPythonModules = r'e:\Warehousing\Scripts'
sys.path.append(myPythonModules)
import myutils # a Python file within myPythonModules

In the above code, note that the sys.path.append method requires a folder as an argument. Since 'e:\Warehousing\Scripts' is a folder, the entire contents of the folder will be consolidated. The rules for copying folder contents apply here as well—everything in the folder is copied except for subfolders that are not geodatasets.

Note:

Python scripts within the folder are not scanned for project data or imported modules.
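Because sys.path.append accepts any string without checking it, a mistyped folder only shows up later as a failed import. The sketch below wraps the call in a hypothetical helper, `add_module_folder`, that fails early and avoids adding the same folder twice. It is an illustration only, and how the packaging scan treats a folder passed through a helper like this is not covered here.

```python
import os
import sys


def add_module_folder(folder):
    """Append folder to sys.path once; raise if it is not a folder.

    Hypothetical helper, not part of arcpy or the standard library.
    """
    if not os.path.isdir(folder):
        raise NotADirectoryError(folder)
    if folder not in sys.path:
        sys.path.append(folder)
    return folder
```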

Tool validation code

If you have experience writing script tools, you may be providing your own tool validation logic. Validation logic is implemented with Python, and your validation code will be scanned for project data and modules, just like any other Python script. For example, your validation logic may open a folder (for example, d:\approved_projections) containing projection files (.prj) to build a choice list of spatial references users can choose when they execute your tool. This folder is not a tool parameter; it is simply a data path used within your tool validation script. The same rules described above for Python scripts apply here, and the consequence is that the d:\approved_projections folder will be consolidated and included in the package.
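The choice list in that example could be built with a small function such as the one sketched below. `projection_choices` is a hypothetical name, and the function uses only the standard library; it returns the names of the .prj files in a folder without their extension.

```python
import os


def projection_choices(folder):
    """Return the base names of the .prj files directly inside folder.

    Hypothetical helper for illustration; subfolders are not searched.
    """
    names = []
    for name in sorted(os.listdir(folder)):
        base, extension = os.path.splitext(name)
        is_file = os.path.isfile(os.path.join(folder, name))
        if is_file and extension.lower() == ".prj":
            names.append(base)
    return names
```

In validation code, the returned list would typically be assigned to a parameter's filter list, and the folder string passed in (d:\approved_projections in the example above) is the kind of quoted path that the scan would pick up.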

Third-party libraries

Third-party modules (any module that is not part of the core Python installation) are not consolidated. You need to ensure the module is installed on the machine where the package is unpacked. You should provide documentation for your tool and package that specifies which third-party modules are required. This does not apply to the third-party modules that are installed with ArcGIS, including numpy, matplotlib, and pandas, among others.
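Alongside documentation, a script can check for its requirements when it starts and report anything that is missing. This is a minimal sketch: `missing_modules` is a hypothetical helper, it expects top-level module names only, and the names in the usage line are placeholders for whatever your tool needs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found.

    Hypothetical helper: resolves each name without importing it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Placeholder names: json ships with Python, the second name is made up.
print(missing_modules(["json", "no_such_module_xyz"]))
```

A tool could turn a nonempty result into a clear error message that names the modules to install, rather than failing partway through with an ImportError.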