Fall 2012 GIS Software Performance 31st Edition
This section shares lessons learned about selecting and building effective GIS design solutions that satisfy operational performance and scalability needs. Software technology allows us to model our work processes, and provide these models to computers to optimize user workflow performance. The complexity of these models, the functions selected to generate our display, and how application functions are orchestrated to analyze and present information processing needs have a significant impact on computer system workload and subsequent performance and scalability.
For many years we focused our system architecture design consulting efforts toward identifying and establishing a hardware infrastructure that would support a standard implementation of Esri software technology. We developed platform sizing models based on consulting experience and customer implementation success. We updated our sizing models based on relative performance benchmark testing which focused on quantifying changes in critical processing loads introduced with each new software release. Today we have a Capacity Planning Tool that automates our system architecture design analysis enabling more refined and accurate performance management.
There were examples of GIS deployments that did not take advantage of system architecture design best practices. Systems were deployed with unresolved performance issues, and scalability was not well understood. In some cases performance issues were not identified before the production system was under critical peak loads, and the platform solution or network infrastructure failed to meet performance needs. Resolving performance issues while in production can be expensive, both in terms of lost services and user productivity. Building a system design that addresses capacity planning needs during initial planning and throughout development and deployment can improve user productivity and reduce implementation risk.
- 1 Workflow baselines
- 2 Custom workflows
- 3 Software workflow recipe
- 4 Software technology selection
- 5 CPT Map document/Imagery selection
- 6 Data density makes a difference
- 7 Map cache can reduce the number of dynamic display layers
- 8 Display complexity
- 8.1 Tradeoff between map display quality and user performance
- 8.2 GIS user performance expectations
- 8.3 Map display complexity
- 8.4 Defining display complexity
- 8.5 Use map publishing tools to measure display complexity
- 8.6 Use MXDPerfStat to measure display complexity
- 9 Selecting the best map resolution
- 10 Selecting the best output format
- 11 Data source selection
- 12 Custom workflow processing loads
- 13 Software performance summary
- 14 CPT Video: Software Performance
- 15 Previous Editions
Workflow baselines provide a foundation for capacity planning. We discussed the various GIS software deployment patterns in Chapter 2 Software Technology. Each software deployment pattern provides a unique combination of hardware and network processing loads deployed within a component architecture that supports the system computing environment. Figure 3.1 provides an overview of the software and network components that impact the system architecture design. We will be discussing these components and their system configuration strategies throughout the System Design Strategies wiki. These are the primary components that work together to make up the system architecture design.
Software technology selection determines the software components that will participate in the selected workflow. Each software deployment pattern includes components that are installed on the computing system.
- ArcGIS for Desktop workstation deployments. Software includes the Client, SDE, and DBMS components. Client communicates over the network to the DBMS server.
- ArcGIS for Desktop Citrix deployments. Software includes the terminal Client, WTS, SDE, and DBMS components. Terminal client communicates over the network to the WTS server.
- ArcGIS for Server deployments. Software includes the Client, Web server, SOC (GIS Server), SDE, and DBMS components. Web client communicates over the network to the Web server.
The workflow baseline identifies the medium system processing loads for a specific software technology pattern. These processing loads are expressed as display service times and network traffic per display on the various system components. The load profile (how the load is distributed across the software components) is fairly consistent for each software technology selection. The processing and traffic load intensity will vary within each technology deployment pattern based on display complexity and some additional key software configuration parameters we will discuss in this chapter.
A workflow baseline is established for each software technology pattern, identifying a medium complexity load for each of the system components for that workflow, for capacity planning purposes. The software baseline loads are established from analysis of benchmark test results and software deployment experience.
Each of the boxes in Figure 3-2 represents key performance parameters that will adjust the workflow loads on the system. These key performance parameters are decisions you will make in developing the map service, are impacted by the complexity of your data model or application functions, or impact the display performance due to how they are presented in the final system architecture design. As you choose your display configuration the workflow processing loads will be adjusted accordingly.
Software workflow recipe
The software workflow recipe identifies the assumptions used to generate the workflow processing loads. The CPT Calculator is a tool developed for generating custom workflow service times (estimated system loads) directly from a defined software workflow baseline. The components of the workflow recipe track the most critical performance parameters evaluated during our performance testing.
This chapter will describe each of the critical recipe components used to establish software component workflow service times, and show how the CPT Calculator is used to select appropriate project workflow performance targets.
There are two workflow recipe formats, one for GIS workflows and another for Imagery workflows.
GIS workflow recipe
- The software selection uses the baseline workflow to establish a medium workflow performance baseline.
- The remaining recipe selections modify the performance baseline to represent the selected custom workflow.
Best Practice: Performance adjustments are selected from look-up tables established from analysis of benchmark test results.
Image service workflow recipe
- The MapDoc selection is replaced by an imagery dataset selection (mosaic dataset or raster dataset).
- Imagery workflows will always use a RASTER image density selection.
Best Practice: Performance adjustments are selected from look-up tables established from analysis of benchmark test results.
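The recipe components described in this chapter can be pictured as a small record. The following is a minimal sketch only: the class name, field names, and example values are hypothetical and are not the CPT's own recipe format. The one rule it enforces comes from the text above: imagery workflows always use the raster density selection.

```python
from dataclasses import dataclass
from typing import Tuple

# Imagery dataset selections named in this chapter.
IMAGERY_SOURCES = ("MosaicDS", "RasterDS")


@dataclass(frozen=True)
class WorkflowRecipe:
    """Hypothetical container for the recipe selections (not the CPT format)."""
    software: str                # software technology pattern (selects the baseline)
    source: str                  # "MSD"/"MXD" for GIS, "MosaicDS"/"RasterDS" for imagery
    density: str                 # "Vector" or "Raster"
    complexity: str              # for example "Light", "Medium", "Heavy"
    pct_dynamic: int             # percent of display layers rendered dynamically
    resolution: Tuple[int, int]  # output display size in pixels
    output_format: str           # for example "PNG24" or "JPEG"

    @property
    def is_imagery(self) -> bool:
        return self.source in IMAGERY_SOURCES

    def __post_init__(self) -> None:
        # Rule from the text: imagery workflows always use raster density.
        if self.is_imagery and self.density != "Raster":
            raise ValueError("imagery workflows always use the Raster density selection")


# Example selections (illustrative values only).
gis = WorkflowRecipe("ArcGIS for Server", "MSD", "Vector", "Medium", 100, (600, 400), "PNG24")
imagery = WorkflowRecipe("ArcGIS for Server", "MosaicDS", "Raster", "Medium", 100, (600, 400), "JPEG")
```

Keeping the selections in one immutable record makes it easy to compare two candidate workflows side by side before looking up their performance adjustments.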
Software technology selection
The GIS software technology patterns were introduced in Chapter 2. The CPT Calculator can generate workflow processing loads from a variety of software technology patterns. Your CPT Software technology selection identifies the performance baseline used for generating the selected workflow processing loads.
CPT Map document/Imagery selection
ArcGIS Server Optimized Map Service Document (MSD)
During map service publishing, the standard map document (MXD) is translated to an optimized map service description (MSD) for high-performance, high-quality map publishing. Estimated performance gains range from 30 - 70 percent.
The MSD MapDoc selection reduces SOC load in the CPT by 50 percent.
- MSD rendering engine performs better than ArcIMS AXL and MXD-based map services.
- Provides more consistent performance across Windows and Linux operating system environments.
- New cartographic engine significantly improves map display output quality.
Best Practice: Use MSD rendering for high-performance map publishing. ArcGIS 10.1 uses MSD rendering for all published map services.
ArcGIS imagery service patterns
ArcGIS for Server can publish a pre-processed mosaicked single image using a raster dataset or a mosaic dataset. Imagery can also be loaded into ArcSDE and served directly as a single image.
Best Practice: Mosaic dataset should be used for all imagery services. Mosaic datasets can significantly improve and simplify imagery management, and provide optimum dynamic performance from collections of raw imagery data formats.
The mosaic dataset, introduced with ArcGIS 10, fully integrates imagery processing within the ArcGIS software.
- The ArcGIS for Server imagery extension allows you to publish image services directly from raw imagery files using imagery processing tools associated with imagery managed by the mosaic dataset.
- An ArcGIS for Server image service provides processing of the requested imagery extent to include dynamic mosaicking and a variety of on-the-fly imagery processing functions.
The selection options are MosaicDS or RasterDS. The selection choice is included in the Workflow recipe.
Data density makes a difference
Figure 3.10 shows a vector and raster output of the same map along with the different traffic and travel time for each output format. Density of the data will make a difference in processing and output image compression.
- Results above are based on REST MXD service.
- Both images have a resolution of 600 x 400 pixels.
- JPEG compression works best with raster data (less traffic and faster display response time).
- PNG compression works best with vector data (less traffic and faster display response time).
PNG24 is the CPT Calculator default for vector density.
- PNG supports transparent overlay.
- PNG compresses common pixel values.
Best Practice: Use PNG8 for vector business layers when higher color depth is not required.
JPEG is the CPT Calculator default for raster imagery density.
- JPEG provides common compression across different pixel values.
Best Practice: Use JPEG output when imagery layers are included in the map display.
The density selection modifies traffic and rendering time processing loads.
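The default output formats described above can be expressed as a simple look-up. This is a sketch of the stated defaults only; the function and parameter names are hypothetical, and it does not reproduce the CPT look-up tables for traffic or rendering time.

```python
# Default output format per density selection, as stated in this section.
DEFAULT_OUTPUT = {"vector": "PNG24", "raster": "JPEG"}


def default_output_format(density: str, has_imagery_layer: bool = False) -> str:
    """Return the default output format for a density selection.

    Follows the best practice above: use JPEG whenever imagery layers
    are included in the map display.
    """
    if has_imagery_layer:
        return "JPEG"
    return DEFAULT_OUTPUT[density.lower()]


print(default_output_format("Vector"))        # PNG24
print(default_output_format("Raster"))        # JPEG
print(default_output_format("Vector", True))  # JPEG
```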
Map cache can reduce the number of dynamic display layers
- The technology has changed, but the procedure for building a map is much the same.
- Maps with a few layers require less processing than maps with many layers.
- The layer complexity (number of features, edges, symbology, tasks, etc.) impacts rendering time for each layer.
- Building a map display renders one layer at a time, joining the features (points, polygons, lines) in each layer sequentially, one on top of the other, until the final display is complete.
Parallel query displays can be published with ArcGIS for Server technology—but is the performance gain worth the use of extra shared infrastructure resources?
Warning: The extra network transport time and queue time to support the parallel display build consumed almost 50 percent of the parallel processing display performance gain.
Parallel processing may not always improve display speed, and in most cases could reduce overall system performance capacity.
Take advantage of caching (%DataCache)
A map cache reduces the number of dynamic display layers (less processing load).
Preprocessed basemap images are available in optimized map cache format.
- Maps can include dynamic and cached layers.
- Operational layers come from a dynamic data source.
- Basemaps come from a preprocessed map cache (static layers).
- Display combines operational layer rendered over cached basemap (mash-up/overlay).
- Cached basemaps are available from ArcGIS Online.
Dynamic layers = rapidly changing data
- Roads showing snow depth
- Electrical network showing latest posted work order
Static layers = slowly changing data
- Land use/land cover
- Road network
- Basemap data
Data classification is application-specific.
The quality of the fully cached map can be much higher than that of the medium dynamic display while map publishing performance stays the same. The difference is that the fully cached map processing was completed before posting on the website, so the final processing time for the cached map tiles is minimal.
Best Practice: The optimum web mapping display combines dynamic map services presented as a transparent image [mash-up (business layers)] over a map cache base layer.
- Dynamic map services are important for geographic analysis, editing, and geoprocessing functions, which require access to point, polygon, and line features rendered in a dynamic map display.
- Map cache tiles provide an optimum basemap foundation layer, combining high-quality map visualization with high display performance.
- A mash-up of dynamic operational layers over high-quality base map reference layer delivers the optimum combination of quality and performance.
The map cache setting identifies the percentage of display layers that will be pre-processed into a tiled map cache. The percent dynamic is calculated as (1 - %DataCache), and the resulting %Dyn percentage is included in the Workflow recipe.
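The percent dynamic calculation is simple arithmetic. A minimal sketch follows, with the cache percentage expressed as a fraction between 0 and 1; the function name and the layer count in the example are hypothetical.

```python
def percent_dynamic(pct_data_cache: float) -> float:
    """Return the dynamic fraction of display layers: 1 - %DataCache."""
    if not 0.0 <= pct_data_cache <= 1.0:
        raise ValueError("pct_data_cache must be a fraction between 0 and 1")
    return 1.0 - pct_data_cache


# Example: a 20-layer map with 75 percent of its layers served from a map cache.
total_layers = 20  # illustrative value
dyn = percent_dynamic(0.75)
print(dyn)                 # 0.25
print(total_layers * dyn)  # 5.0 layers still rendered dynamically
```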
Display complexity is an estimate of how much processing a computer system must do to complete a unit of work. Workflow display complexity includes a broad range of software and data design factors. The complexity determination used for initial capacity planning is often a rough estimate related to a standard (medium complexity) baseline workflow processing load profile. There will be opportunities to measure the display complexity during initial publishing and deployment of the map service.
Tradeoff between map display quality and user performance
Higher-quality map display:
- Shaded relief
- Transparent layers
- Dynamic Maplex labeling
- Slow performance
Lighter map display:
- Low-resolution relief
- Solid colors
- Simple annotation
- Fast performance
Both provide very similar information, but they show very different performance.
Best Practice: High-quality map features served as a cached basemap perform very well.
GIS user performance expectations
User display performance expectations in 2000 were around five seconds—a challenge for light map displays viewed in the computer room. The same map service today can be rendered in less than 0.25 seconds. These performance improvements open opportunities for:
- More complex dynamic map services.
- Deployment of ArcGIS for Server on easier-to-manage virtual server environments.
- Deployment of web services on a hosted cloud computing environment.
- The possibility of much richer dynamic services that employ more sophisticated statistical analysis or network routing algorithms (two to three times the complexity of current GIS workflow baselines).
These opportunities will introduce new challenges. As heavier processing options are introduced, it will be increasingly important to plan, set performance milestones, and manage compliance during system deployment.
At the same time, there are more opportunities than ever before to reduce the risk of deploying systems that do not meet your performance needs.
Map display complexity
GIS provides users with a geographic view of their business environment, and for many GIS workflows, the map display is used as a primary spatial information resource.
Warning: Not all map displays are created equal.
Display complexity is used to establish workflow performance targets; the complexity selection identifies the display processing load relative to a medium complexity baseline workflow.
Standard workflows are included on the CPT Workflow tab. You can select Standard Workflows on the Calculator tab for single workflow sizing.
High performance GIS mapping services
The first step in publishing a map service is to create a map document (MXD) representing the geographic information product you wish to publish. The number of layers in the display, the number of features in each layer, and the types of processing functions required to render the map display will contribute to display complexity.
- The software functions used to generate map documents used in the display can represent the heaviest processing loads within the user workflow. Other functions used in the analysis besides what is required to produce the display should also be considered.
- The types of functions, data source format, and design of the user display can make a big difference on the level of processing and network loads required to support a GIS user workflow.
The CPT Calculator provides seven choices for complexity.
- Light: Fifty percent of the loads of a medium workflow. Use for very simple map displays with a minimum number of focused layers and no heavy functions.
- Medium-light: Processing load between light and medium complexity.
- Medium: Standard workflow baseline, suitable for most dynamic mapping workflows that follow standard best practices for high-performance mapping.
- Medium-heavy: Processing load between medium and heavy complexity.
- Heavy: Heavy functions, large number of display layers, or large number of features require heavier processing loads for this workflow. Fifty percent more loads than a medium workflow.
- 2xMedium: Twice the loads of a medium workflow.
- 3xMedium: Three times the loads of a medium workflow.
The complexity of the authored map document will be a primary factor in determining map service performance and scalability. Light maps are rendered three times faster than heavy maps, and six times faster than the much heavier 3x Medium complexity maps.
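The complexity choices can be read as multipliers on the medium baseline load. The sketch below uses the values stated above for Light, Medium, Heavy, 2xMedium, and 3xMedium. The Medium-light and Medium-heavy values are assumptions (midpoints), since the text only says they fall between the neighboring levels. The names are hypothetical.

```python
# Load relative to a medium workflow (Medium = 1.0).
COMPLEXITY_FACTOR = {
    "Light": 0.5,
    "Medium-light": 0.75,   # assumed midpoint; the text says only "between"
    "Medium": 1.0,
    "Medium-heavy": 1.25,   # assumed midpoint; the text says only "between"
    "Heavy": 1.5,
    "2xMedium": 2.0,
    "3xMedium": 3.0,
}


def relative_load(complexity: str, reference: str = "Medium") -> float:
    """Return the processing load of one complexity level relative to another."""
    return COMPLEXITY_FACTOR[complexity] / COMPLEXITY_FACTOR[reference]


print(relative_load("Heavy", "Light"))     # 3.0
print(relative_load("3xMedium", "Light"))  # 6.0
```

The two printed ratios match the statement that light maps render three times faster than heavy maps and six times faster than 3xMedium maps.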
The following are some best practices for authoring high-performance web maps. Use these recommendations as a guide to build a map that will perform within your performance budgets.
Only show relevant data.
- Start simple.
- Use field visibility.
Use scale dependencies.
- Display appropriate data for the given scale.
- Aim to display roughly the same number of features at all scales.
Select the appropriate point representation.
- Use single-layer simple or character markers.
- Use EMF instead of bitmaps.
- Use integer fields for symbol values.
- Avoid halos, complex shapes, and masking.
Select the appropriate lines and polygons.
- Use the ESRI Optimized style.
- Avoid cartographic lines and polygon outlines.
Use appropriate text and labeling.
- Use annotation instead of labels.
- Use indexed fields.
- Use label and feature conflict weights sparingly.
- Avoid special effects (fill patterns, halos, callouts, backgrounds).
- Avoid very large text size (60+ pts).
- Avoid Maplex for dynamic labeling (avoid over-use).
Similar performance variations apply to imagery workflows.
High performance Imagery services
- Limit the maximum image size per request.
- Limit the maximum number of rasters per mosaic.
- Select the optimum resampling method: Nearest Neighbor or Majority for discrete data; Bilinear Interpolation or Cubic Convolution for continuous data.
- Use the optimum compression method.
- Set the optimum compression quality.
- Set the optimum mosaic method.
- Limit the maximum number of records returned per request (mosaic dataset only).
- Select the appropriate metadata level (basic, full, or none).
- Set the appropriate allowed fields (mosaic dataset only).
- Select appropriate output and virtual directories.
- Identify supported image return type.
Best Practice: Use the map publishing analysis tool to identify potential performance tuning opportunities. Warnings identify opportunities for performance tuning and show help documentation for adjusting complexity of your map service.
Defining display complexity
- Medium map display rendering time with a high-performance 2012 platform (Xeon E5-2643 3300 MHz processors) would be about 0.34 sec.
- The same map display rendering time measured on a 2006 Pentium D 3200 workstation would be about 1.7 sec.
Rendering time is a function of display complexity and platform performance.
- Platform performance is represented by vendor-published platform SPEC benchmark results.
- Faster computer platforms will render the map in much less time than slower platforms.
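One way to picture the relationship between rendering time and platform performance is inverse proportional scaling on a per-core performance rating. This is a sketch under that assumption; the function name and the rating values are hypothetical, chosen only so that their ratio reproduces the two rendering times quoted above. They are not published SPEC results.

```python
def scale_render_time(base_time_sec: float, base_perf: float, target_perf: float) -> float:
    """Estimate rendering time on a target platform.

    Assumes rendering time is inversely proportional to a per-core
    relative performance rating (a simplifying assumption).
    """
    if base_perf <= 0 or target_perf <= 0:
        raise ValueError("performance ratings must be positive")
    return base_time_sec * base_perf / target_perf


# Hypothetical per-core ratings: the 2012 platform rated five times the 2006 platform.
PERF_2012 = 50.0
PERF_2006 = 10.0

estimate = scale_render_time(0.34, PERF_2012, PERF_2006)
print(round(estimate, 2))  # 1.7 seconds on the slower platform
```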
Best Practice: Measure display performance during map publishing to verify that it falls within design performance targets. Dynamic map display results shown in the chart were rendered from a local file geodatabase data source using the MXDPerfStat performance measurement tool.
Warning: Display complexity is a relative term representing a variety of system performance variables. Workflow display complexity is important for capacity planning since it directly contributes to system performance and scalability.
ArcGIS for Desktop provides a map publishing Service Editor with specific performance tuning tools. With ArcGIS 10.1, the Analyze and Preview tools appear at the top of the Service Editor only during the map publishing process.
You must have access to ArcGIS for Server to use the map publishing Service Editor preview tool.
Map publishing services are described in the ArcGIS 10.1 online documentation available on the ArcGIS Resource Center.
Use map publishing tools to measure display complexity
1. Design and create a map document in ArcMap.
2. Analyze your ArcMap document by clicking Analyze on the map publishing Service Editor.
The Analyze function will identify errors, warnings, and messages.
- Errors are issues you must fix before publishing the map.
- Warnings are issues that may affect drawing performance and appearance.
- The warning messages section of the ArcGIS 10.1 help for the Prepare window provides a list of potential performance problems.
- Messages provide other information you may want to be aware of.
3. Use the ArcGIS 10.1 map publishing Preview tool to zoom and pan around your map to check performance.
- Rendering performance is provided at the top of the display.
- Set the planning image output format for your service.
- Move around the more dense areas to measure display complexity.
- If performance is not acceptable, review the Analyze function warnings for additional opportunities to improve performance.
Best Practice: Use the map publishing tools during the authoring phase to evaluate map display complexity. Render time targets can be established to author a published service within your performance budget. Early performance validation can reduce implementation risk.
Use MXDPerfStat to measure display complexity
The MXDPerfStat tool identifies display refresh times at multiple scales, shows layer refresh times for each map scale, and provides layer performance statistics (such as number of features, vectors, and labeling). It also breaks out display time for several key rendering phases (geography, graphics, cursor, and database).
Figure 3-24 shows some sample MXDPerfStat results run on a file geodatabase dataset. Results are summarized for display purposes.
Best Practice: Caching the basemap would reduce display complexity by 50 percent.
The tool also provides some high-level recommendations for performance tuning.
MXDPerfStat is an excellent tool for measuring map document display performance, since it lists layer statistics (render time, features, edges, projection, etc.) for each layer included within a complete series of map scales. The measured results can be used for evaluating map display complexity and tuning your map document for optimum display performance.
Best Practice: Measure display render performance during initial map authoring to validate compliance with design performance targets.
Once you see the layer processing time and the performance metrics, the layers with display problems are usually easy to spot.
Selecting the best map resolution
Figure 3-27 shows the variation in traffic and response time for 600 x 400 and 1280 x 1024 pixel display resolutions of the same map. Remote client performance with a 1280 x 1024 display on a dedicated T-1 connection would be over 6 times slower than a local user display. A display resolution of 600 x 400 is a common size for web browser map displays. Remote display performance over a T-1 line is less than 10 seconds.
Web mapping services produce map images that are sent to the client browser for display. Each user request will generate a new map image that must be delivered to the client browser. The size of the output image varies directly with the number of pixels, so higher resolution images generate much higher client traffic loads.
Warning: The required amount of traffic per display can have a significant impact on user performance over lower bandwidth.
Note: Server processing loads may also increase with higher resolution displays (higher resolution can result in larger map extent, increasing the number of features rendered for each client display).
Calculator resolution selection identifies the map output display size. Higher map resolutions require more traffic per display.
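The effect of resolution on traffic and transport time can be approximated with a short calculation. This sketch assumes traffic scales directly with pixel count, as described above, and that transport time is simply traffic divided by bandwidth. It ignores latency, queue time, and compression differences. The function names and the baseline traffic value are hypothetical; the T-1 rate of 1.544 Mbps is the standard line rate.

```python
from typing import Tuple

T1_MBPS = 1.544  # standard T-1 line rate in megabits per second


def scale_traffic(traffic_mb: float, base_res: Tuple[int, int], new_res: Tuple[int, int]) -> float:
    """Scale traffic per display (megabits) by the ratio of pixel counts."""
    base_pixels = base_res[0] * base_res[1]
    new_pixels = new_res[0] * new_res[1]
    return traffic_mb * new_pixels / base_pixels


def transport_time(traffic_mb: float, bandwidth_mbps: float) -> float:
    """Return network transport time in seconds (traffic / bandwidth)."""
    return traffic_mb / bandwidth_mbps


# Hypothetical baseline: 1.0 megabit per display at 600 x 400.
small = 1.0
large = scale_traffic(small, (600, 400), (1280, 1024))

print(round(large, 2))  # 5.46 times the pixels, so about 5.46 Mb per display
print(round(transport_time(small, T1_MBPS), 2), "sec at 600 x 400 over T-1")
print(round(transport_time(large, T1_MBPS), 2), "sec at 1280 x 1024 over T-1")
```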
Selecting the best output format
- With image services, each user request will generate a new map image that must be delivered to the client browser.
- The selected image type can have a direct impact on the volume of network traffic. Lighter images require less display traffic and heavier images require more display traffic.
- The required amount of traffic per display can have a significant impact on user performance over lower bandwidth.
Image output selection observations:
- Vector-only images compress better than images that include a Digital Ortho raster layer.
- JPEG image types provide the most consistent compression, with minimum variation between raster and vector images.
- PNG images compress much better with vector data than with raster—PNG supports transparencies and is the default ArcGIS for Server output format.
- PDF is a heavier output format used for high-quality map plotting.
Best Practice: Select the output format that provides the best display performance while meeting your business requirements.
Several output service formats are available for map publishing. Select the format that applies to your published service. Calculator will use the output look-up table to adjust traffic and loads based on your selection.
Data source selection
Data source is selected for each user workflow on both the CPT Calculator and the CPT Design tabs. Data source selection is configured separately from the workflow service times, so a defined project or standard workflow can be used to create multiple use cases on the CPT Design tab, each with a different data source. On the CPT Design tab, each workflow selection (row) in the requirements module will have a defined data source.
Selecting the best vector data source format
- Maintenance SDE geodatabase is needed for data integrity (some performance overhead).
- Simple feature SDE geodatabase provides improved performance for production database.
- A file geodatabase performs well in a static, read-only environment.
- Larger shapefile data formats incur increased traffic and performance overhead.
Note: Data source management and deployment options will be discussed in more detail in Lesson 5: GIS data administration.
Several vector data source formats are available for your selection. Calculator will use a look-up table to adjust output traffic and loads based on your data format selection.
Selecting the best imagery storage format
- TIFF uncompressed provides the best performance, but requires high storage traffic.
- TIFF LZW compression results in reasonable performance overhead with high storage traffic.
- TIFF JPEG compression results in good storage traffic reduction with reasonable performance overhead.
- JPEG 2000 provides good storage traffic compression but incurs high performance overhead.
- LizardTech MrSID compression reduces storage traffic with moderate performance overhead.
- Intergraph ECW format reduces storage traffic with moderate performance overhead.
- ERDAS IMG format requires moderate performance overhead and high storage traffic.
- Storing imagery in an SDE geodatabase reduces storage traffic and provides good performance. This format is read-only, and is no longer a preferred Imagery storage option.
There are many factors that drive how you may store your imagery. The CPT provides a limited sample of storage options, and performance metrics can change with each software release. Performance data provided is based on preliminary test results. Much better information on performance of ArcGIS for Server Image Services should be available as the ArcGIS technology matures.
Several imagery data source formats are available for your selection. Calculator will use a look-up table to adjust output traffic and loads based on your data format selection.
Custom workflow processing loads
Custom workflow performance loads are generated from baseline workflows. Selected software performance factors are applied to generate custom baseline service times. CPT Calculator workflows are generated from performance benchmark baselines.
Custom workflow service times created on the Calculator tab are copied to the CPT Workflow tab. Workflow service times can then be copied and included in Project Workflows. Calculator workflows provide a source for both standard and custom workflow performance targets.
Best Practice: Software technology baseline service time, traffic loads, and relative performance adjustments are derived from test benchmarks.
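The idea of applying performance factors to a baseline can be sketched as a chain of multipliers per software component. Everything here is illustrative: the baseline service times are hypothetical, the function name is hypothetical, and which components each factor applies to is an assumption. The two factor values come from this chapter (heavy complexity is 1.5 times medium, and the MSD selection reduces SOC load by 50 percent); the real CPT look-up tables are not reproduced.

```python
from math import prod
from typing import Dict, List


def apply_adjustments(baseline_sec: Dict[str, float],
                      adjustments: Dict[str, List[float]]) -> Dict[str, float]:
    """Multiply each component's baseline service time by its adjustment factors."""
    custom = {}
    for component, seconds in baseline_sec.items():
        factors = adjustments.get(component, [])  # no factors means no change
        custom[component] = seconds * prod(factors)
    return custom


# Hypothetical medium baseline service times (seconds per display).
baseline = {"web": 0.02, "soc": 0.20, "dbms": 0.04}

HEAVY = 1.5  # heavy complexity relative to medium (from this chapter)
MSD = 0.5    # MSD selection halves SOC load (from this chapter)

# Assumption: complexity applies to every component, MSD only to the SOC.
adjustments = {"web": [HEAVY], "soc": [HEAVY, MSD], "dbms": [HEAVY]}

custom = apply_adjustments(baseline, adjustments)
for component, seconds in custom.items():
    print(component, round(seconds, 3))
```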
Software performance summary
Experience suggests we can do a better job selecting and building software solutions. Understanding software performance can reduce implementation risk and save customer time and money. Projects can be delivered within project cost, time, and performance budgets.
- This video shows how to create custom workflows on the CPT Calculator tab and then move these workflows to your Project Workflow section on the Workflow tab for use in your design.
The next section will take a closer look at ArcGIS Server software performance.