Summary
Calculates summary statistics for fields in a feature class.
Usage
Summarize Attributes is a tabular analysis tool, not a spatial analysis tool. Inputs can be a tabular layer or a layer with geometry (points, lines, or polygons).
You can specify one or more fields to summarize by or summarize all features. When you summarize by fields, statistics are calculated for each unique combination of attribute values.
The output table will consist of fields containing the result of the statistical operation.
A field will be created for each specified statistic type using the following naming convention: sum_<field>, max_<field>, min_<field>, range_<field>, std_<field>, count_<field>, var_<field>, and any_<field> (where <field> is the name of the input field for which the statistic is computed). The statistics will be calculated on each group separately.
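The naming convention above can be sketched in a few lines of Python. This is only an illustration and not part of the tool: the helper name output_field_names and the statistic keywords in PREFIXES are assumptions (only COUNT appears in the code sample on this page), and the letter case of the fields the server actually writes may differ.

```python
# Hypothetical helper, for illustration only. The statistic keywords on the
# left are assumed spellings; the prefixes on the right come from the naming
# convention described above.
PREFIXES = {
    "SUM": "sum", "MAX": "max", "MIN": "min", "RANGE": "range",
    "STDDEV": "std", "COUNT": "count", "VAR": "var", "ANY": "any",
}

def output_field_names(summary_fields):
    """Return the expected output field name for each [field, statistic] pair."""
    names = []
    for field, statistic in summary_fields:
        prefix = PREFIXES[statistic.upper()]
        names.append(f"{prefix}_{field}")
    return names

# Same [field, statistic] pairs as the code sample on this page.
print(output_field_names([["Arrest", "COUNT"], ["District", "COUNT"]]))
# ['count_Arrest', 'count_District']
```

A lookup such as this can help when a later script needs to read specific statistic fields from the output table.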
If time is enabled on the input, you can apply time stepping to your analysis. Each time step is analyzed independently of features outside the time step. To use time stepping, your input data must be time enabled and represent an instant in time. When time stepping is applied, output features will be time intervals represented by the START_DATETIME and END_DATETIME fields.
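To make the time-stepping idea concrete, the following pure-Python sketch generates the time windows implied by an interval, a repeat, and a reference time. It is a rough model under stated assumptions, not the server's implementation: the function name time_steps is hypothetical, windows are assumed to start at the reference time plus whole multiples of the repeat, and each window is treated as half-open (start included, end excluded).

```python
from datetime import datetime, timedelta

def time_steps(reference, interval, repeat, data_start, data_end):
    """Return (start, end) windows, aligned to `reference`, that can contain
    timestamps between data_start and data_end.

    Assumption: window k starts at reference + k * repeat and lasts `interval`.
    """
    # Index of the first window whose end falls after data_start.
    first = (data_start - reference - interval) // repeat + 1
    windows = []
    k = first
    while reference + k * repeat <= data_end:
        start = reference + k * repeat
        windows.append((start, start + interval))
        k += 1
    return windows

# Daily steps, one day long, aligned to midnight on 1 January 2020.
steps = time_steps(
    reference=datetime(2020, 1, 1),
    interval=timedelta(days=1),
    repeat=timedelta(days=1),
    data_start=datetime(2020, 1, 1, 6),
    data_end=datetime(2020, 1, 3, 12),
)
for start, end in steps:
    print(start, "->", end)
```

In this model, each window would correspond to one START_DATETIME and END_DATETIME pair in the output, and features are summarized only with other features in the same window.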
You can apply this tool to spatial data, and you will get a tabular result. You can join your results to spatial data using Join Features.
The tables below illustrate the statistical calculations of a layer that is summarized using like values of fields. The VO2 field was used to calculate the numeric statistics (Count, Sum, Minimum, Maximum, Range, Mean, Standard Deviation, and Variance) for the layer. The Rating field was used to calculate the string statistics (Count and Any) for the layer.
The table above was summarized on the Designation field, and the VO2 field was used to calculate the numeric statistics (Count, Sum, Minimum, Maximum, Range, Mean, Standard Deviation, and Variance) for the layer. The Rating field was used to calculate the string statistics (Count and Any) for the layer. This results in a table with two features, representing the distinct values of Designation.
The following table represents what the first few fields look like when the layer is summarized using the Designation and Age Group fields. Statistics are calculated using the same methods as the previous example.
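The grouping behavior described in these examples can be approximated in plain Python. The sketch below is illustrative only: the sample rows and their values are hypothetical, the mean_ prefix is assumed by analogy with the other prefixes, and the use of sample (rather than population) standard deviation and variance is an assumption, since this page does not say which the tool uses.

```python
from collections import defaultdict
from statistics import mean, stdev, variance

# Hypothetical sample rows; the values are made up for illustration.
rows = [
    {"Designation": "Athlete", "Age Group": "20-29", "VO2": 60.0, "Rating": "Excellent"},
    {"Designation": "Athlete", "Age Group": "30-39", "VO2": 50.0, "Rating": "Good"},
    {"Designation": "Sedentary", "Age Group": "20-29", "VO2": 30.0, "Rating": "Poor"},
    {"Designation": "Sedentary", "Age Group": "20-29", "VO2": 40.0, "Rating": "Fair"},
]

def summarize(rows, group_fields, numeric_field, string_field):
    """Group rows by each unique combination of group_fields and compute statistics."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in group_fields)
        groups[key].append(row)

    result = {}
    for key, members in groups.items():
        values = [m[numeric_field] for m in members]
        several = len(values) > 1  # sample std/var need at least two values
        result[key] = {
            f"count_{numeric_field}": len(values),
            f"sum_{numeric_field}": sum(values),
            f"min_{numeric_field}": min(values),
            f"max_{numeric_field}": max(values),
            f"range_{numeric_field}": max(values) - min(values),
            f"mean_{numeric_field}": mean(values),  # prefix assumed
            f"std_{numeric_field}": stdev(values) if several else None,
            f"var_{numeric_field}": variance(values) if several else None,
            f"count_{string_field}": len(members),
            f"any_{string_field}": members[0][string_field],  # any one value
        }
    return result

by_designation = summarize(rows, ["Designation"], "VO2", "Rating")
by_both = summarize(rows, ["Designation", "Age Group"], "VO2", "Rating")
print(len(by_designation), len(by_both))
```

With these sample rows, summarizing on Designation alone yields two groups, while adding Age Group yields three, because each unique combination of values forms its own group.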
You can improve the performance of the Summarize Attributes tool by using the following tips:
- Set the extent environment so you only analyze data of interest.
- Use data that is local to where the analysis is being run.
This geoprocessing tool is powered by ArcGIS GeoAnalytics Server. Analysis is completed on your GeoAnalytics Server, and results are stored in your content in ArcGIS Enterprise.
When running GeoAnalytics Server tools, the analysis is completed on the GeoAnalytics Server. For optimal performance, make data available to the GeoAnalytics Server through feature layers hosted on your ArcGIS Enterprise portal or through big data file shares. Data that is not local to your GeoAnalytics Server will be moved to your GeoAnalytics Server before analysis begins. This means that it will take longer to run a tool, and in some cases, moving the data from ArcGIS Pro to your GeoAnalytics Server may fail. The threshold for failure depends on your network speeds, as well as the size and complexity of the data. Therefore, it is recommended that you always share your data or create a big data file share.
Learn more about sharing data to your portal
Learn more about creating a big data file share through Server Manager
Similar analysis can also be completed using the Summary Statistics tool in the Analysis toolbox.
Syntax
arcpy.geoanalytics.SummarizeAttributes(input_layer, output_name, fields, {summary_fields}, {data_store}, {time_step_interval}, {time_step_repeat}, {time_step_reference})
Parameter | Explanation | Data Type |
input_layer | The tabular, point, polyline, or polygon layer to be summarized. | Record Set |
output_name | The name of the output feature service. | String |
fields [fields,...] | A field or fields used to summarize similar features. For example, if you choose a single field called PropertyType with the values commercial and residential, all of the features with the value residential will be summarized together, with summary statistics calculated, and all of the features with the value commercial will be summarized together. This example will result in two rows in the output: one for commercial summary values and one for residential summary values. | Field |
summary_fields [summary_fields,...] (Optional) | The statistics that will be calculated on specified fields. | Value Table |
data_store (Optional) | Specifies the ArcGIS Data Store where the output will be saved. The default is SPATIOTEMPORAL_DATA_STORE. All results stored in a spatiotemporal big data store will be stored in WGS84. Results stored in a relational data store will maintain their coordinate system. | String |
time_step_interval (Optional) | A value that specifies the duration of the time step. This parameter is only available if the input is time enabled and represents an instant in time. | Time Unit |
time_step_repeat (Optional) | A value that specifies how often the time-step interval occurs. This parameter is only available if the input is time enabled and represents an instant in time. | Time Unit |
time_step_reference (Optional) | A date that specifies the reference time with which to align the time steps. The default is January 1, 1970, at 12:00 a.m. This parameter is only available if the input is time enabled and represents an instant in time. | Date |
Derived Output
Name | Explanation | Data Type |
output | The output table with summarized attributes. | Record Set |
Code sample
The following stand-alone script demonstrates how to use the SummarizeAttributes tool.
#-------------------------------------------------------------------------------
# Name: Summarize Attributes.py
# Description: Summarize crime data by year and beat.
#
# Requirements: ArcGIS GeoAnalytics Server
#-------------------------------------------------------------------------------

# Import system modules
import arcpy

# Set local variables
# This example uses a big data file share named "Crimes" with the dataset
# "Chicago" registered on the GeoAnalytics Server
inFeatures = "https://MyGeoAnalyticsMachine.domain.com/geoanalytics/rest/services/DataStoreCatalogs/bigDataFileShares_Crimes/BigDataCatalogServer/Chicago"
# Fields that define the groups: one output row per unique Year/Beat combination
summaryFields = ["Year", "Beat"]
# [field, statistic] pairs to calculate for each group
summaryStatistics = [["Arrest", "COUNT"], ["District", "COUNT"]]
outFS = "SummarizeCrimes"
dataStore = "SPATIOTEMPORAL_DATA_STORE"

# Execute SummarizeAttributes
arcpy.geoanalytics.SummarizeAttributes(inFeatures, outFS, summaryFields,
                                       summaryStatistics, dataStore)
Environments
- Output Coordinate System
The coordinate system that will be used for analysis. Analysis will be completed in the input coordinate system unless otherwise specified by this parameter. For GeoAnalytics Tools, final results will be stored in the spatiotemporal data store in WGS84.
Licensing information
- Basic: Requires ArcGIS GeoAnalytics Server
- Standard: Requires ArcGIS GeoAnalytics Server
- Advanced: Requires ArcGIS GeoAnalytics Server